llama_cpp_for_radxa_dragon_.../tests
Aman Gupta a972faebed
CUDA: Add mul_mat_id support for the mmf kernel (#15767)
* CUDA: Add mul_mat_id support for the mmf kernel

Add support for mul_mat_id for bs < 16

* Review: use warp_size, fix should_use_mmf condition

* Launch one block per expert, stride along n_expert_used

* templatize mul_mat_id

* Pad shmem to 16 bytes, add helper function mul_mat_f_switch_ids

* Reduce compile times by dividing mmf into f16, bf16 and f32 variants

* Divide mmf by ncols_dst

* Add missing files

* Fix MUSA/HIP builds
2025-09-09 14:38:02 +08:00
.gitignore
CMakeLists.txt finetune: SGD optimizer, more CLI args (#13873) 2025-08-14 12:03:57 +02:00
get-model.cpp
get-model.h
run-json-schema-to-grammar.mjs
test-arg-parser.cpp
test-autorelease.cpp
test-backend-ops.cpp CUDA: Add mul_mat_id support for the mmf kernel (#15767) 2025-09-09 14:38:02 +08:00
test-barrier.cpp
test-c.c ggml : remove kompute backend (#14501) 2025-07-03 07:48:32 +03:00
test-chat-parser.cpp chat : Deepseek V3.1 reasoning and tool calling support (OpenAI Style) (#15533) 2025-09-08 16:59:48 +02:00
test-chat-template.cpp model : add support for Seed-OSS (#15490) 2025-08-23 15:21:52 +02:00
test-chat.cpp chat : Deepseek V3.1 reasoning and tool calling support (OpenAI Style) (#15533) 2025-09-08 16:59:48 +02:00
test-double-float.cpp
test-gbnf-validator.cpp
test-gguf.cpp
test-grammar-integration.cpp
test-grammar-llguidance.cpp
test-grammar-parser.cpp
test-json-partial.cpp
test-json-schema-to-grammar.cpp json : support enum values within allOf (#15830) 2025-09-08 16:14:32 -05:00
test-llama-grammar.cpp
test-log.cpp
test-lora-conversion-inference.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
test-model-load-cancel.cpp
test-mtmd-c-api.c
test-opt.cpp tests : fix test-opt with GGML_BACKEND_DL (#15599) 2025-08-26 22:14:38 +02:00
test-quantize-fns.cpp
test-quantize-perf.cpp
test-quantize-stats.cpp
test-regex-partial.cpp
test-rope.cpp
test-sampling.cpp sampling : optimize samplers by reusing bucket sort (#15665) 2025-08-31 20:41:02 +03:00
test-thread-safety.cpp tests : update for LLAMA_SET_ROWS=1 (#14961) 2025-07-30 15:12:02 +03:00
test-tokenizer-0.cpp
test-tokenizer-0.py
test-tokenizer-0.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00
test-tokenizer-1-bpe.cpp
test-tokenizer-1-spm.cpp
test-tokenizer-random.py requirements : update transformers/torch for Embedding Gemma (#15828) 2025-09-09 06:06:52 +02:00
test-tokenizers-repo.sh scripts : make the shell scripts cross-platform (#14341) 2025-06-30 10:17:18 +02:00