llama_cpp_for_radxa_dragon_.../tools
Xuan-Son Nguyen c42712b056
server: support multiple generations from one prompt (OAI "n" option) (#17775)
* backend support

* server: support multiple generations from one prompt (OAI "n" option)

* fix invalid batch

* format oai

* clean up

* disable ctx shift

* add test

* update comments

* fix style

* add n_cmpl to docs [no ci]

* allow using both n_cmpl and n
2025-12-06 15:54:38 +01:00
batched-bench
cvector-generator
export-lora
gguf-split
imatrix
llama-bench
main               cli: add migration warning (#17620)  2025-11-30 15:32:43 +01:00
mtmd               mtmd: fix --no-warmup (#17695)  2025-12-02 22:48:08 +01:00
perplexity
quantize
rpc
run
server             server: support multiple generations from one prompt (OAI "n" option) (#17775)  2025-12-06 15:54:38 +01:00
tokenize
tts
CMakeLists.txt