llama_cpp_for_radxa_dragon_.../common

Latest commit: 642330ac7c by Xuan Son Nguyen
llama : add enum for built-in chat templates (#10623)
2024-12-02 22:10:19 +01:00

Commit message:
* llama : add enum for supported chat templates
* use "built-in" instead of "supported"
* arg: print list of built-in templates
* fix test
* update server README
cmake/
arg.cpp                      llama : add enum for built-in chat templates (#10623)                    2024-12-02 22:10:19 +01:00
arg.h
base64.hpp
build-info.cpp.in
CMakeLists.txt               ggml : move AMX to the CPU backend (#10570)                              2024-11-29 21:54:58 +01:00
common.cpp                   ggml : move AMX to the CPU backend (#10570)                              2024-11-29 21:54:58 +01:00
common.h                     server: Add "tokens per second" information in the backend (#10548)      2024-12-02 14:45:54 +01:00
console.cpp
console.h
json-schema-to-grammar.cpp   grammar : fix JSON Schema for string regex with top-level alt. (#9903)   2024-10-16 19:03:24 +03:00
json-schema-to-grammar.h
json.hpp
log.cpp
log.h
ngram-cache.cpp
ngram-cache.h
sampling.cpp                 speculative : refactor and add a simpler example (#10362)                2024-11-25 09:58:41 +02:00
sampling.h                   speculative : refactor and add a simpler example (#10362)                2024-11-25 09:58:41 +02:00
speculative.cpp              server : add more information about error (#10455)                       2024-11-25 22:28:59 +02:00
speculative.h                speculative : refactor and add a simpler example (#10362)                2024-11-25 09:58:41 +02:00
stb_image.h