llama_cpp_for_radxa_dragon_wing_q6a

pingu_98/llama_cpp_for_radxa_dragon_wing_q6a

Aman Gupta 48e2fa9fb7 CUDA: add fp kernel for larger batch size MoE (#16512) * CUDA: kernel for larger batch sizes for MoE * WIP * WIP * WIP * WIP * WIP * WIP * fixup * tests * Move mmq_ids_helper to mmid * cleanup * Remove redundant checks	2025-10-14 13:15:15 +02:00
.gitignore	gitignore : Ignore vim swap files in tests (#15901)	2025-09-10 14:28:47 +03:00
CMakeLists.txt	devops: add s390x & ppc64le CI (#15925)	2025-09-27 02:03:33 +08:00
get-model.cpp
get-model.h
run-json-schema-to-grammar.mjs
test-alloc.cpp	ggml : fix graph reallocation with multiple chunks (#16396)	2025-10-03 13:49:08 +02:00
test-arg-parser.cpp	common : remove common_has_curl() (#16351)	2025-09-30 17:39:44 +03:00
test-autorelease.cpp
test-backend-ops.cpp	CUDA: add fp kernel for larger batch size MoE (#16512)	2025-10-14 13:15:15 +02:00
test-barrier.cpp	test-barrier : do not use more threads than physically available (#16389)	2025-10-02 20:10:12 +02:00
test-c.c	ggml : remove kompute backend (#14501)	2025-07-03 07:48:32 +03:00
test-chat-parser.cpp	common : handle unicode during partial json parsing (#16526)	2025-10-12 16:18:47 +03:00
test-chat-template.cpp	chat : Granite Docling stopping (#16438)	2025-10-06 18:59:40 +02:00
test-chat.cpp	chat : support Magistral thinking (#16413)	2025-10-03 21:51:48 +03:00
test-double-float.cpp
test-gbnf-validator.cpp
test-gguf.cpp
test-grammar-integration.cpp
test-grammar-llguidance.cpp
test-grammar-parser.cpp
test-json-partial.cpp	common : handle unicode during partial json parsing (#16526)	2025-10-12 16:18:47 +03:00
test-json-schema-to-grammar.cpp	json : support `enum` values within `allOf` (#15830)	2025-09-08 16:14:32 -05:00
test-llama-grammar.cpp
test-log.cpp
test-lora-conversion-inference.sh	scripts : make the shell scripts cross-platform (#14341)	2025-06-30 10:17:18 +02:00
test-model-load-cancel.cpp
test-mtmd-c-api.c
test-opt.cpp	tests : fix test-opt with GGML_BACKEND_DL (#15599)	2025-08-26 22:14:38 +02:00
test-quantize-fns.cpp
test-quantize-perf.cpp	ci: run the x64 and arm ci on the github machines instead (#16183)	2025-09-25 08:06:06 +03:00
test-quantize-stats.cpp
test-regex-partial.cpp
test-rope.cpp
test-sampling.cpp	sampling : optimize samplers by reusing bucket sort (#15665)	2025-08-31 20:41:02 +03:00
test-thread-safety.cpp	tests : update for LLAMA_SET_ROWS=1 (#14961)	2025-07-30 15:12:02 +03:00
test-tokenizer-0.cpp
test-tokenizer-0.py
test-tokenizer-0.sh	scripts : make the shell scripts cross-platform (#14341)	2025-06-30 10:17:18 +02:00
test-tokenizer-1-bpe.cpp
test-tokenizer-1-spm.cpp
test-tokenizer-random.py	requirements : update transformers/torch for Embedding Gemma (#15828)	2025-09-09 06:06:52 +02:00
test-tokenizers-repo.sh	devops: add s390x & ppc64le CI (#15925)	2025-09-27 02:03:33 +08:00