llama_cpp_for_radxa_dragon_.../examples
Daniel Bevenius a18f481f99
server : use common_token_to_piece instead of common_detokenize (#11740)
* server : use common_token_to_piece instead of common_detokenize

This commit replaces the call to common_detokenize with
common_token_to_piece in populate_token_probs.

The motivation for this change is to avoid an issue where
common_detokenize would remove the word boundary character for tokens,
which caused a regression in the server-generated token probabilities.

Resolves: https://github.com/ggerganov/llama.cpp/issues/11728

* squash! server : use common_token_to_piece instead of common_detokenize

Use common_token_to_piece for post_sampling_probs as well.
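
The word boundary issue described above can be illustrated with a toy sketch. This is not the llama.cpp API: the vocabulary and the helpers `toy_token_to_piece` and `toy_detokenize` are hypothetical stand-ins, and the sketch assumes a detokenizer that drops one leading space because it treats its input as the start of a text.

```python
# Hypothetical toy vocabulary: a leading space in a piece marks a word boundary.
TOY_VOCAB = {0: "Hello", 1: " world", 2: "!"}


def toy_token_to_piece(token):
    """Return the raw piece for a single token, keeping any leading space."""
    return TOY_VOCAB[token]


def toy_detokenize(tokens):
    """Join the pieces, then drop one leading space from the result.

    Assumption for illustration: the detokenizer treats its input as the
    start of a text, so a leading word boundary is removed.
    """
    text = "".join(TOY_VOCAB[t] for t in tokens)
    return text[1:] if text.startswith(" ") else text


tokens = [0, 1, 2]

# Converting each token on its own, as one would for per-token probabilities.
per_token_detok = [toy_detokenize([t]) for t in tokens]
per_token_piece = [toy_token_to_piece(t) for t in tokens]

print(per_token_detok)  # the boundary space of token 1 is lost
print(per_token_piece)  # the boundary space of token 1 is kept
```

In this sketch, joining the per-token pieces reproduces the original text, while joining the per-token detokenized strings does not, which is the kind of mismatch a client reading token probabilities would see.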
2025-02-11 14:06:45 +01:00
batched
batched-bench
batched.swift swift : fix llama-vocab api usage (#11645) 2025-02-04 13:15:24 +02:00
convert-llama2c-to-ggml
cvector-generator
deprecation-warning
embedding
eval-callback
export-lora export-lora : fix tok_embd tensor (#11330) 2025-01-21 14:07:12 +01:00
gbnf-validator Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639) 2025-01-30 19:13:58 +00:00
gen-docs
gguf
gguf-hash
gguf-split ci : use -no-cnv in gguf-split tests (#11254) 2025-01-15 18:28:35 +02:00
gritlm
imatrix
infill
jeopardy
llama-bench rpc : early register backend devices (#11262) 2025-01-17 10:57:09 +02:00
llama.android llama.android: add field formatChat to control whether to parse special tokens when send message (#11270) 2025-01-17 14:57:56 +02:00
llama.swiftui swift : fix llama-vocab api usage (#11645) 2025-02-04 13:15:24 +02:00
llava llava: add quantization for the visual projector LLAVA, Qwen2VL (#11644) 2025-02-05 10:45:40 +03:00
lookahead
lookup
main Update README.md [no ci] (#11781) 2025-02-10 09:05:57 +01:00
parallel
passkey
perplexity
quantize ci : use -no-cnv in gguf-split tests (#11254) 2025-01-15 18:28:35 +02:00
quantize-stats
retrieval
rpc
run There's a better way of clearing lines (#11756) 2025-02-09 10:34:49 +00:00
save-load-state
server server : use common_token_to_piece instead of common_detokenize (#11740) 2025-02-11 14:06:45 +01:00
simple
simple-chat Add Jinja template support (#11016) 2025-01-21 13:18:51 +00:00
simple-cmake-pkg cmake: add ggml find package (#11369) 2025-01-26 12:07:48 -04:00
speculative
speculative-simple
sycl
tokenize
tts tts : add guide tokens support (#11186) 2025-01-18 12:20:57 +02:00
chat-13B.bat
chat-13B.sh
chat-persistent.sh
chat-vicuna.sh
chat.sh
CMakeLists.txt
convert_legacy_llama.py
json_schema_pydantic_example.py
json_schema_to_grammar.py
llama.vim
llm.vim
Miku.sh
pydantic_models_to_grammar.py
pydantic_models_to_grammar_examples.py
reason-act.sh
regex_to_grammar.py
server-llama2-13B.sh
server_embd.py
ts-type-to-grammar.sh