llama_cpp_for_radxa_dragon_wing_q6a

pingu_98/llama_cpp_for_radxa_dragon_wing_q6a


Max Krasnyansky 39fb81f875 hexagon refactor all Ops to use local context struct (#19819)	2026-02-23 16:32:14 -08:00

* hexagon: refactor set/get/sum-rows ops to use local context
* hexagon: refactor ROPE and Softmax Ops to use local context
  Improves performance a bit by precomputing things and saving in the context.
* hexagon: refactor activation ops to use local context struct
* hexagon: refactor unary ops to use local context struct and DMA/VTCM
* hexagon: use aligned hvx_scale function
* hexagon: remove unused fields from op_context
* hexagon: rewrite ROPE to use DMA and VTCM scratchpad
* hex-rope: keep N rows in scratchpad (instead of just two)
* hex-rope: introduce rowidx cache
* hex-rope: remove unused fields
* hex-rope: rewrite dma prefetch logic to allow for multi-row fetch/compute
  also removes the need for fastdiv.
* hex-rope: minor formatting
* hex-rope: use indices and unroll the loops
* hex-rope: more updates to cleanup rope-block handling
* hexagon: cleanup supported type/dims checks
* hexagon: all reduce funcs replicated across lanes
  There is no need to explicitly replicate the first value.
* snapdragon: update adb and windows scripts to use ubatch-size 256
  Updated Ops support handles larger ubatches.
apple
jinja
snapdragon	hexagon refactor all Ops to use local context struct (#19819)	2026-02-23 16:32:14 -08:00
bench-models.sh
build-info.sh
check-requirements.sh
compare-commits.sh
compare-llama-bench.py
compare-logprobs.py
create_ops_docs.py
debug-test.sh
fetch_server_test_models.py
gen-authors.sh
gen-unicode-data.py
get-flags.mk
get-hellaswag.sh
get-pg.sh
get-wikitext-2.sh
get-wikitext-103.sh
get-winogrande.sh
get_chat_template.py
hf.sh
install-oneapi.bat
pr2wt.sh
serve-static.js
server-bench.py
sync-ggml-am.sh
sync-ggml.last
sync-ggml.sh
sync_vendor.py
tool_bench.py
tool_bench.sh
verify-checksum-models.py
xxd.cmake