llama_cpp_for_radxa_dragon_.../scripts
Ryan Goulden 26c9ce1288
server: Add cached_tokens info to oaicompat responses (#19361)
* tests : fix fetch_server_test_models.py

* server: to_json_oaicompat cached_tokens

Adds OpenAI and Anthropic compatible information about the
number of cached prompt tokens used in a response.
2026-03-19 19:09:33 +01:00
apple
hip
jinja
snapdragon
bench-models.sh
build-info.sh
check-requirements.sh
compare-commits.sh
compare-llama-bench.py
compare-logprobs.py
create_ops_docs.py
debug-test.sh
fetch_server_test_models.py
gen-authors.sh
gen-unicode-data.py
get-flags.mk
get-hellaswag.sh
get-pg.sh
get-wikitext-2.sh
get-winogrande.sh
get_chat_template.py
git-bisect-run.sh
git-bisect.sh
hf.sh
install-oneapi.bat
pr2wt.sh
serve-static.js
server-bench.py
server-test-model.py
sync-ggml-am.sh
sync-ggml.last
sync-ggml.sh
sync_vendor.py
tool_bench.py
tool_bench.sh
verify-checksum-models.py
xxd.cmake