llama_cpp_for_radxa_dragon_wing_q6a

Johannes Gäßler c466abe158 llama: -fa 1/0/-1 aliases for -fa on/off/auto (#15746)		2025-09-02 18:17:26 +02:00
arg.cpp	llama: -fa 1/0/-1 aliases for -fa on/off/auto (#15746)	2025-09-02 18:17:26 +02:00
arg.h
base64.hpp
build-info.cpp.in
chat-parser.cpp	chat : support Granite model reasoning and tool call (#14864)	2025-08-06 20:27:30 +02:00
chat-parser.h
chat.cpp	chat : Seed OSS thinking + tool call support (#15552)	2025-08-29 14:53:41 +02:00
chat.h	chat : Seed OSS thinking + tool call support (#15552)	2025-08-29 14:53:41 +02:00
CMakeLists.txt
common.cpp	llama: use FA + max. GPU layers by default (#15434)	2025-08-30 16:32:10 +02:00
common.h	server : enable /slots by default and make it secure (#15630)	2025-08-31 20:11:58 +03:00
console.cpp
console.h
json-partial.cpp
json-partial.h
json-schema-to-grammar.cpp
json-schema-to-grammar.h
llguidance.cpp
log.cpp
log.h
ngram-cache.cpp
ngram-cache.h
regex-partial.cpp
regex-partial.h
sampling.cpp	sampling : optimize samplers by reusing bucket sort (#15665)	2025-08-31 20:41:02 +03:00
sampling.h	sampling : optimize samplers by reusing bucket sort (#15665)	2025-08-31 20:41:02 +03:00
speculative.cpp	sampling : optimize samplers by reusing bucket sort (#15665)	2025-08-31 20:41:02 +03:00
speculative.h	server : implement universal assisted decoding (#12635)	2025-07-31 14:25:23 +02:00