llama_cpp_for_radxa_dragon_.../examples
Georgi Gerganov a10b36c91a
llama : refactor kv cache guard (#12695)
* llama : refactor kv cache guard

ggml-ci

* cont : fix comment [no ci]

* llama : fix kv_cache restore logic

ggml-ci

* context : simplify kv cache updates

ggml-ci

* cont : better name [no ci]

* llama : fix llama_decode return code when could not find KV slot

ggml-ci

* context : change log err -> warn [no ci]

* kv-cache : add comment + warning
2025-04-02 14:32:59 +03:00
batched common : refactor downloading system, handle mmproj with -hf option (#12694) 2025-04-01 23:44:05 +02:00
batched-bench common : refactor downloading system, handle mmproj with -hf option (#12694) 2025-04-01 23:44:05 +02:00
batched.swift
convert-llama2c-to-ggml
cvector-generator
deprecation-warning
embedding
eval-callback
export-lora common : refactor downloading system, handle mmproj with -hf option (#12694) 2025-04-01 23:44:05 +02:00
gbnf-validator
gen-docs
gguf
gguf-hash
gguf-split
gritlm common : refactor downloading system, handle mmproj with -hf option (#12694) 2025-04-01 23:44:05 +02:00
imatrix
infill
jeopardy
llama-bench
llama.android
llama.swiftui
llava common : refactor downloading system, handle mmproj with -hf option (#12694) 2025-04-01 23:44:05 +02:00
lookahead
lookup
main
parallel llama : refactor kv cache guard (#12695) 2025-04-02 14:32:59 +03:00
passkey common : refactor downloading system, handle mmproj with -hf option (#12694) 2025-04-01 23:44:05 +02:00
perplexity
quantize
quantize-stats
retrieval
rpc rpc : update README for cache usage (#12620) 2025-03-28 09:44:13 +02:00
run run: de-duplicate fmt and format functions and optimize (#11596) 2025-03-25 18:46:11 +01:00
save-load-state
server common : remove json.hpp from common.cpp (#12697) 2025-04-02 09:58:34 +02:00
simple
simple-chat
simple-cmake-pkg
speculative common : refactor downloading system, handle mmproj with -hf option (#12694) 2025-04-01 23:44:05 +02:00
speculative-simple common : refactor downloading system, handle mmproj with -hf option (#12694) 2025-04-01 23:44:05 +02:00
sycl
tokenize
tts common : refactor downloading system, handle mmproj with -hf option (#12694) 2025-04-01 23:44:05 +02:00
chat-13B.bat
chat-13B.sh
chat-persistent.sh
chat-vicuna.sh
chat.sh
CMakeLists.txt
convert_legacy_llama.py
json_schema_pydantic_example.py
json_schema_to_grammar.py
llama.vim
llm.vim
Miku.sh
pydantic_models_to_grammar.py
pydantic_models_to_grammar_examples.py
reason-act.sh
regex_to_grammar.py
server-llama2-13B.sh
server_embd.py
ts-type-to-grammar.sh