llama_cpp_for_radxa_dragon_.../tests
Jeff Bolz 2bbe4c2cf8
vulkan: Use VK_EXT_shader_64bit_indexing to handle large mat_mul(_id) (#18678)
This fixes incoherent output in Llama-4-Maverick-17B-128E-PAB-Q8_0, which
has a mul_mat_id with an A matrix that's Q8_0 8192 x 5120 x 128.

This should work when the number of blocks in the A matrix is less than 2^32
(for mul_mat_vec or mul_mm_cm2), or for mul_mm I think the limit is like
2^32*LOAD_VEC_A elements.

- Divide batch_stride by QUANT_K earlier, so the block index calculation works in 32b.
- Each vk_pipeline_struct has a linked list of pipelines that will allow it to handle
variants. So far this change just adds a single use case for this, compiling with the
e64BitIndexingEXT flag.
- Use the 64b indexing variant when the A matrix is larger than maxStorageBufferRange.

64-bit indexing has some cost - around 3-5% in MoE models, so it's worth the effort
to avoid enabling it unconditionally.
2026-01-12 12:32:13 +01:00
..
peg-parser
.gitignore
CMakeLists.txt tests : refactor test-backend-sampler (#18753) 2026-01-11 17:31:03 +02:00
get-model.cpp
get-model.h
run-json-schema-to-grammar.mjs
test-alloc.cpp
test-arg-parser.cpp vendor : update cpp-httplib to 0.30.0 (#18660) 2026-01-08 13:53:54 +01:00
test-autorelease.cpp
test-backend-ops.cpp vulkan: Use VK_EXT_shader_64bit_indexing to handle large mat_mul(_id) (#18678) 2026-01-12 12:32:13 +01:00
test-backend-sampler.cpp tests : refactor test-backend-sampler (#18753) 2026-01-11 17:31:03 +02:00
test-barrier.cpp
test-c.c
test-chat-parser.cpp
test-chat-peg-parser.cpp
test-chat-template.cpp
test-chat.cpp chat: make tool description and parameters optional per OpenAI spec (#18478) 2025-12-31 17:21:37 -06:00
test-double-float.cpp
test-gbnf-validator.cpp
test-gguf.cpp
test-grammar-integration.cpp
test-grammar-llguidance.cpp tool/ex/tests: consistently free ctx, then model (#18168) 2025-12-22 11:00:37 +01:00
test-grammar-parser.cpp
test-json-partial.cpp
test-json-schema-to-grammar.cpp
test-llama-grammar.cpp
test-log.cpp
test-lora-conversion-inference.sh
test-model-load-cancel.cpp
test-mtmd-c-api.c
test-opt.cpp
test-peg-parser.cpp
test-quantize-fns.cpp
test-quantize-perf.cpp
test-quantize-stats.cpp
test-regex-partial.cpp common/grammar : replace problematic backtracking regex [\s\S]* (#18342) 2026-01-03 16:02:43 -06:00
test-rope.cpp
test-sampling.cpp
test-state-restore-fragmented.cpp
test-thread-safety.cpp
test-tokenizer-0.cpp tool/ex/tests: consistently free ctx, then model (#18168) 2025-12-22 11:00:37 +01:00
test-tokenizer-0.py
test-tokenizer-0.sh
test-tokenizer-1-bpe.cpp tool/ex/tests: consistently free ctx, then model (#18168) 2025-12-22 11:00:37 +01:00
test-tokenizer-1-spm.cpp tool/ex/tests: consistently free ctx, then model (#18168) 2025-12-22 11:00:37 +01:00
test-tokenizer-random.py
test-tokenizers-repo.sh