pingu_98/llama_cpp_for_radxa_dragon_wing_q6a

Georgi Gerganov 2f966b8ed8 clip : use FA (#16837)	2025-11-02 21:21:48 +01:00
* clip : use FA
* cont : add warning about unsupported ops
* implement "auto" mode for clip flash attn
* clip : print more detailed op support info during warmup
* cont : remove obsolete comment [no ci]
* improve debugging message
* trailing space
* metal : remove stray return
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
.gitignore
CMakeLists.txt
get-model.cpp
get-model.h
run-json-schema-to-grammar.mjs
test-alloc.cpp
test-arg-parser.cpp
test-autorelease.cpp
test-backend-ops.cpp	clip : use FA (#16837)	2025-11-02 21:21:48 +01:00
test-barrier.cpp
test-c.c
test-chat-parser.cpp
test-chat-template.cpp
test-chat.cpp	chat: Add LFM2 tool handling (#16763)	2025-10-27 23:54:01 +01:00
test-double-float.cpp
test-gbnf-validator.cpp
test-gguf.cpp
test-grammar-integration.cpp	grammar : use int64_t to avoid int overflows in int schema to grammar conversion logic (#16626)	2025-10-17 08:59:31 +03:00
test-grammar-llguidance.cpp
test-grammar-parser.cpp
test-json-partial.cpp
test-json-schema-to-grammar.cpp	grammar : support array references in json schema (#16792)	2025-10-28 09:37:52 +01:00
test-llama-grammar.cpp
test-log.cpp
test-lora-conversion-inference.sh
test-model-load-cancel.cpp
test-mtmd-c-api.c
test-opt.cpp
test-quantize-fns.cpp
test-quantize-perf.cpp
test-quantize-stats.cpp
test-regex-partial.cpp
test-rope.cpp	model: add support for qwen3vl series (#16780)	2025-10-30 16:19:14 +01:00
test-sampling.cpp
test-thread-safety.cpp	server : support unified cache across slots (#16736)	2025-11-02 18:14:04 +02:00
test-tokenizer-0.cpp
test-tokenizer-0.py
test-tokenizer-0.sh
test-tokenizer-1-bpe.cpp
test-tokenizer-1-spm.cpp
test-tokenizer-random.py
test-tokenizers-repo.sh