llama_cpp_for_radxa_dragon_wing_q6a

pingu_98/llama_cpp_for_radxa_dragon_wing_q6a


Daniel Bevenius a18f481f99 server : use common_token_to_piece instead of common_detokenize (#11740)	2025-02-11 14:06:45 +01:00

* server : use common_token_to_piece instead of common_detokenize

This commit replaces the call to common_detokenize with common_token_to_piece in the populate_token_probs.

The motivation for this change is to avoid an issue where common_detokenize would remove the word boundary character for tokens, which caused a regression in the server generated token probabilities.

Resolves: https://github.com/ggerganov/llama.cpp/issues/11728

* squash! server : use common_token_to_piece instead of common_detokenize

Use common_token_to_piece for post_sampling_probs as well.
batched
batched-bench
batched.swift	swift : fix llama-vocab api usage (#11645)	2025-02-04 13:15:24 +02:00
convert-llama2c-to-ggml
cvector-generator
deprecation-warning
embedding
eval-callback
export-lora	export-lora : fix tok_embd tensor (#11330)	2025-01-21 14:07:12 +01:00
gbnf-validator	Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)	2025-01-30 19:13:58 +00:00
gen-docs
gguf
gguf-hash
gguf-split	ci : use -no-cnv in gguf-split tests (#11254)	2025-01-15 18:28:35 +02:00
gritlm
imatrix
infill
jeopardy
llama-bench	rpc : early register backend devices (#11262)	2025-01-17 10:57:09 +02:00
llama.android	llama.android: add field formatChat to control whether to parse special tokens when send message (#11270)	2025-01-17 14:57:56 +02:00
llama.swiftui	swift : fix llama-vocab api usage (#11645)	2025-02-04 13:15:24 +02:00
llava	llava: add quantization for the visual projector LLAVA, Qwen2VL (#11644)	2025-02-05 10:45:40 +03:00
lookahead
lookup
main	Update README.md [no ci] (#11781)	2025-02-10 09:05:57 +01:00
parallel
passkey
perplexity
quantize	ci : use -no-cnv in gguf-split tests (#11254)	2025-01-15 18:28:35 +02:00
quantize-stats
retrieval
rpc
run	There's a better way of clearing lines (#11756)	2025-02-09 10:34:49 +00:00
save-load-state
server	server : use common_token_to_piece instead of common_detokenize (#11740)	2025-02-11 14:06:45 +01:00
simple
simple-chat	Add Jinja template support (#11016)	2025-01-21 13:18:51 +00:00
simple-cmake-pkg	cmake: add ggml find package (#11369)	2025-01-26 12:07:48 -04:00
speculative
speculative-simple
sycl
tokenize
tts	tts : add guide tokens support (#11186)	2025-01-18 12:20:57 +02:00
chat-13B.bat
chat-13B.sh
chat-persistent.sh
chat-vicuna.sh
chat.sh
CMakeLists.txt
convert_legacy_llama.py
json_schema_pydantic_example.py
json_schema_to_grammar.py
llama.vim
llm.vim
Miku.sh
pydantic_models_to_grammar.py
pydantic_models_to_grammar_examples.py
reason-act.sh
regex_to_grammar.py
server-llama2-13B.sh
server_embd.py
ts-type-to-grammar.sh