llama_cpp_for_radxa_dragon_wing_q6a

David Huang 7f323a589f Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386)		2025-05-11 14:18:39 +02:00
cmake
minja	sync: minja (#12739)	2025-04-04 21:16:39 +01:00
arg.cpp	Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386)	2025-05-11 14:18:39 +02:00
arg.h	common : add common_remote_get_content (#13123)	2025-04-26 22:58:12 +02:00
base64.hpp
build-info.cpp.in
chat.cpp	server : (webui) revamp the input area, plus many small UI improvements (#13365)	2025-05-08 15:37:29 +02:00
chat.h
CMakeLists.txt	chore(llguidance): use tagged version that does not break the build (#13413)	2025-05-09 23:15:39 +03:00
common.cpp	Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386)	2025-05-11 14:18:39 +02:00
common.h	Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386)	2025-05-11 14:18:39 +02:00
console.cpp
console.h
json-schema-to-grammar.cpp	grammar : handle maxItems == 0 in JSON schema (#13117)	2025-04-26 10:10:20 +02:00
json-schema-to-grammar.h
json.hpp
llguidance.cpp	llguidance : set tokenizer slices to default (#13424)	2025-05-10 17:19:52 +02:00
log.cpp
log.h
ngram-cache.cpp
ngram-cache.h
sampling.cpp	common : Add a warning when we can't match samplers from a string or char. (#13330)	2025-05-07 11:23:28 +03:00
sampling.h
speculative.cpp
speculative.h
stb_image.h