llama_cpp_for_radxa_dragon_.../tools
Adrien Gallouët 463b6a963c
tools : enable kvu in perplexity for hellaswag, winogrande, multiple-choice (#19954)
llama-perplexity -hf unsloth/Qwen3-0.6B-GGUF:Q4_K_M -f winogrande-debiased-eval.csv --winogrande

    winogrande_score : tokenizing selected tasks
    winogrande_score : calculating winogrande score over selected tasks.
    split_equal: sequential split is not supported when there are coupled sequences in the input batch (you may need to use the -kvu flag)
    decode: failed to find a memory slot for batch of size 46
    failed to decode the batch, n_batch = 2048, ret = 1
    winogrande_score: llama_decode() failed

same for hellaswag:

    split_equal: sequential split is not supported when there are coupled sequences in the input batch (you may need to use the -kvu flag)
    decode: failed to find a memory slot for batch of size 99
    failed to decode the batch, n_batch = 2048, ret = 1
    hellaswag_score: llama_decode() failed

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2026-03-13 21:25:57 +01:00
..
batched-bench
cli common/parser: handle reasoning budget (#20297) 2026-03-11 10:26:12 +01:00
completion
cvector-generator
export-lora
fit-params
gguf-split
imatrix
llama-bench llama-bench: introduce -hf and -hff flags & use --mmap 1 by default (#20211) 2026-03-09 09:05:44 +08:00
mtmd mtmd : rename mtmd_get_audio_bitrate to mtmd_get_audio_sample_rate (#20105) 2026-03-13 12:30:02 +01:00
parser
perplexity tools : enable kvu in perplexity for hellaswag, winogrande, multiple-choice (#19954) 2026-03-13 21:25:57 +01:00
quantize llama-quant : fail early on missing imatrix, refactor type selection, code cleanup (#19770) 2026-03-10 08:16:05 +02:00
results llama: end-to-end tests (#19802) 2026-03-08 12:30:21 +01:00
rpc
server llama : fix pooling assertion crash in chunked GDN detection path (#20468) 2026-03-13 20:53:42 +02:00
tokenize
tts
CMakeLists.txt llama: end-to-end tests (#19802) 2026-03-08 12:30:21 +01:00