llama_cpp_for_radxa_dragon_.../tools
Latest commit 6ce863c803 by Xuan-Son Nguyen:
server: prevent data race from HTTP threads (#18263)
* server: prevent data race from HTTP threads

* fix params

* fix default_generation_settings

* nits: make handle_completions_impl look less strange

* stricter const

* fix GGML_ASSERT(idx < states.size())

* move index to be managed by server_response_reader

* http: make sure req & res lifecycle are tied together

* fix compile

* fix buggy index handling

* fix data race for lora endpoint

* nits: fix shadow variable

* nits: revert redundant changes

* nits: correct naming for json_webui_settings
2025-12-22 14:23:34 +01:00
Name               Last commit                                                 Date
batched-bench      tool/ex/tests: consistently free ctx, then model (#18168)   2025-12-22 11:00:37 +01:00
cli                server: prevent data race from HTTP threads (#18263)        2025-12-22 14:23:34 +01:00
completion         arg: clarify auto kvu/np being set on server (#17997)       2025-12-16 12:01:27 +01:00
cvector-generator
export-lora
fit-params         llama-fit-params: QoL impr. for prints/errors (#18089)      2025-12-17 00:03:19 +01:00
gguf-split
imatrix
llama-bench        tool/ex/tests: consistently free ctx, then model (#18168)   2025-12-22 11:00:37 +01:00
mtmd               model : add ASR support for LFM2-Audio-1.5B (conformer) (#18106)   2025-12-19 00:18:01 +01:00
perplexity
quantize
rpc
run
server             server: prevent data race from HTTP threads (#18263)        2025-12-22 14:23:34 +01:00
tokenize
tts
CMakeLists.txt