llama_cpp_for_radxa_dragon_wing_q6a

Radoslav Gerganov c556418b60 llama-bench : use local GPUs along with RPC servers (#14917)	2025-07-28 18:59:04 +03:00

Currently if RPC servers are specified with '--rpc' and there is a local GPU available (e.g. CUDA), the benchmark will be performed only on the RPC device(s) but the backend result column will say "CUDA,RPC" which is incorrect. This patch is adding all local GPU devices and makes llama-bench consistent with llama-cli.

batched-bench	llama : add high-throughput mode (#14363)	2025-07-16 16:35:42 +03:00
cvector-generator
export-lora	mtmd : fix 32-bit narrowing issue in export-lora and mtmd clip (#14503)	2025-07-25 13:08:04 +02:00
gguf-split
imatrix	imatrix: add option to display importance score statistics for a given imatrix file (#12718)	2025-07-22 14:33:37 +02:00
llama-bench	llama-bench : use local GPUs along with RPC servers (#14917)	2025-07-28 18:59:04 +03:00
main	llama : fix `--reverse-prompt` crashing issue (#14794)	2025-07-21 17:38:36 +08:00
mtmd	mtmd : add support for Voxtral (#14862)	2025-07-28 15:01:48 +02:00
perplexity
quantize	quantize : update README.md (#14905)	2025-07-27 23:31:11 +02:00
rpc
run	cmake : do not search for curl libraries by ourselves (#14613)	2025-07-10 15:29:05 +03:00
server	server : allow setting `--reverse-prompt` arg (#14799)	2025-07-22 09:24:22 +08:00
tokenize
tts
CMakeLists.txt