llama_cpp_for_radxa_dragon_.../tools
Xuan-Son Nguyen 9ac2693a30
server: fix n_cmpl not skipping processing prompt (#18663)
* server: fix n_cmpl not skipping processing

* fix infinite loop on empty batch

* cont : init child samplers + modify child logic

* cont : cleanup

* cont : improve n_cmpl logic

- launch the parent task first so it finds the slot with best cache
- parent task waits for child tasks to be launched
- when a child task finishes - remove its cache

* cont : remove redundant function

* cont : reduce parent checks

* fix : nullptr task dereference

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2026-01-10 00:00:41 +01:00
..
batched-bench
cli
completion
cvector-generator
export-lora
fit-params
gguf-split
imatrix
llama-bench
mtmd mtmd: Add Gemma3n multimodal support with MobileNetV5 vision encoder (#18256) 2026-01-09 23:42:38 +01:00
perplexity
quantize
rpc
server server: fix n_cmpl not skipping processing prompt (#18663) 2026-01-10 00:00:41 +01:00
tokenize
tts
CMakeLists.txt cmake: only build cli when server is enabled (#18670) 2026-01-09 16:43:26 +01:00