llama_cpp_for_radxa_dragon_.../tools
Gabe Goodhart c08002a198
chat : Granite Docling stopping (#16438)
* fix: Fix duplicate fake image before token on first slice

Branch: GraniteDoclingStopping

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix: Use double-newline before overview image

Branch: GraniteDoclingStopping

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix: Remove incorrect newline at the end of granite chat template gen prompt

There should not be one, even for the language models.

Branch: GraniteDoclingStopping

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* tests: Remove bad newline from granite chat template test (legacy)

Branch: GraniteDoclingStopping

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

---------

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-10-06 18:59:40 +02:00
batched-bench
cvector-generator
export-lora
gguf-split
imatrix
llama-bench        rpc : add support for multiple devices (#16276)              2025-10-04 12:49:16 +03:00
main
mtmd               chat : Granite Docling stopping (#16438)                     2025-10-06 18:59:40 +02:00
perplexity
quantize
rpc                rpc : add support for multiple devices (#16276)              2025-10-04 12:49:16 +03:00
run
server             server: update readme to mention n_past_max metric (#16436)  2025-10-06 10:53:31 +03:00
tokenize
tts
CMakeLists.txt