llama_cpp_for_radxa_dragon_.../tools

Latest commit: 85a7d8677b by Georgi Gerganov — memory : remove KV cache size padding (#16812), 2025-10-28 20:19:44 +02:00

* memory : remove KV cache size padding
* cont : restore padding for n_kv tensor shape
* server : use slot context size instead of training context size
* server : simplify context limit logic
batched-bench
cvector-generator
export-lora
gguf-split
imatrix — Manually link -lbsd to resolve flock symbol on AIX (#16610) — 2025-10-23 19:37:31 +08:00
llama-bench — llama-bench : clarify benchmarked parts of the computation (#16823) — 2025-10-28 19:41:43 +02:00
main
mtmd — mtmd : fix idefics3 preprocessing (#16806) — 2025-10-27 23:12:16 +01:00
perplexity
quantize
rpc — rpc : report actual free memory (#16616) — 2025-10-17 18:02:52 +03:00
run — Manually link -lbsd to resolve flock symbol on AIX (#16610) — 2025-10-23 19:37:31 +08:00
server — memory : remove KV cache size padding (#16812) — 2025-10-28 20:19:44 +02:00
tokenize
tts
CMakeLists.txt