llama_cpp_for_radxa_dragon_wing_q6a

Daniel Bevenius 25f40ca65f completion : simplify batch (embd) processing (#19286)	2026-02-04 05:43:28 +01:00

* completion : simplify batch (embd) processing

This commit simplifies the processing of embd by removing the for loop that currently exists which uses params.n_batch as its increment. This commit also removes the clamping of n_eval as the size of embd is always at most the size of params.n_batch.

The motivation is to clarify the code as it is currently a little confusing when looking at this for loop in isolation and thinking that it can process multiple batches.

* add an assert to verify n_eval is not greater than n_batch

batched-bench
cli
completion	completion : simplify batch (embd) processing (#19286)	2026-02-04 05:43:28 +01:00
cvector-generator	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
export-lora	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
fit-params
gguf-split
imatrix
llama-bench
mtmd	mtmd: add min/max pixels gguf metadata (#19273)	2026-02-02 20:59:06 +01:00
perplexity	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
quantize
rpc
server	server: print actual model name in 'model not found" error (#19117)	2026-02-02 16:55:27 +01:00
tokenize
tts
CMakeLists.txt