llama_cpp_for_radxa_dragon_wing_q6a

History

fairydreaming 7c3f55c100 Add support for encoder-only T5 models (#8900)	2024-08-10 11:43:26 +02:00

* gguf-py : add T5ENCODER model architecture
* common : call llama_decode() during warmup only if the model has decoder
* convert-hf : add T5EncoderModel
* llama : add llama_model_has_decoder() API function
* llama : split build_t5() into build_t5_encoder() and build_t5_decoder()
* llama : add support for LLM_ARCH_T5ENCODER
* llama-embedding : add support for LLAMA_POOLING_TYPE_NONE
* llama-embedding : add support for encoder-only models

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
cmake
base64.hpp
build-info.cpp.in
CMakeLists.txt
common.cpp	Add support for encoder-only T5 models (#8900)	2024-08-10 11:43:26 +02:00
common.h	llama : better replace_all (cont) (#8926)	2024-08-09 18:23:52 +03:00
console.cpp
console.h
grammar-parser.cpp
grammar-parser.h
json-schema-to-grammar.cpp
json-schema-to-grammar.h
json.hpp
log.h	infill : assert prefix/suffix tokens + remove old space logic (#8351)	2024-07-08 09:34:35 +03:00
ngram-cache.cpp
ngram-cache.h	lookup: fibonacci hashing, fix crashes (#8548)	2024-07-17 23:35:44 +02:00
sampling.cpp	llama : move vocab, grammar and sampling into separate files (#8508)	2024-07-23 13:10:17 +03:00
sampling.h
stb_image.h
train.cpp
train.h