llama_cpp_for_radxa_dragon_.../tools
Radoslav Gerganov c556418b60
llama-bench : use local GPUs along with RPC servers (#14917)
Currently, if RPC servers are specified with '--rpc' and a local GPU is
available (e.g. CUDA), the benchmark runs only on the RPC device(s), but
the backend result column says "CUDA,RPC", which is incorrect. This patch
adds all local GPU devices and makes llama-bench consistent with
llama-cli.
2025-07-28 18:59:04 +03:00
batched-bench llama : add high-throughput mode (#14363) 2025-07-16 16:35:42 +03:00
cvector-generator
export-lora mtmd : fix 32-bit narrowing issue in export-lora and mtmd clip (#14503) 2025-07-25 13:08:04 +02:00
gguf-split
imatrix imatrix: add option to display importance score statistics for a given imatrix file (#12718) 2025-07-22 14:33:37 +02:00
llama-bench llama-bench : use local GPUs along with RPC servers (#14917) 2025-07-28 18:59:04 +03:00
main llama : fix --reverse-prompt crashing issue (#14794) 2025-07-21 17:38:36 +08:00
mtmd mtmd : add support for Voxtral (#14862) 2025-07-28 15:01:48 +02:00
perplexity
quantize quantize : update README.md (#14905) 2025-07-27 23:31:11 +02:00
rpc
run cmake : do not search for curl libraries by ourselves (#14613) 2025-07-10 15:29:05 +03:00
server server : allow setting --reverse-prompt arg (#14799) 2025-07-22 09:24:22 +08:00
tokenize
tts
CMakeLists.txt