llama_cpp_for_radxa_dragon_.../scripts
Max Krasnyansky 39fb81f875
hexagon refactor all Ops to use local context struct (#19819)
* hexagon: refactor set/get/sum-rows ops to use local context

* hexagon: refactor ROPE and Softmax Ops to use local context

Improves performance a bit by precomputing things and saving in the context.

* hexagon: refactor activation ops to use local context struct

* hexagon: refactor unary ops to use local context struct and DMA/VTCM

* hexagon: use aligned hvx_scale function

* hexagon: remove unused fields from op_context

* hexagon: rewrite ROPE to use DMA and VTCM scratchpad

* hex-rope: keep N rows in scratchpad (instead of just two)

* hex-rope: introduce rowidx cache

* hex-rope: remove unused fields

* hex-rope: rewrite dma prefetch logic to allow for multi-row fetch/compute

Also removes the need for fastdiv.

* hex-rope: minor formatting

* hex-rope: use indices and unroll the loops

* hex-rope: more updates to cleanup rope-block handling

* hexagon: cleanup supported type/dims checks

* hexagon: all reduce funcs replicated across lanes

There is no need to explicitly replicate the first value.

* snapdragon: update adb and windows scripts to use ubatch-size 256

Updated Ops support handles larger ubatches.
2026-02-23 16:32:14 -08:00
apple
jinja
snapdragon hexagon refactor all Ops to use local context struct (#19819) 2026-02-23 16:32:14 -08:00
bench-models.sh benches : update models + numbers (#19359) 2026-02-05 14:34:07 +02:00
build-info.sh
check-requirements.sh
compare-commits.sh
compare-llama-bench.py ggml-cuda: enable cuda-graphs for n-cpu-moe (#18934) 2026-01-24 14:25:20 +08:00
compare-logprobs.py scripts: add script to compare logprobs of llama.cpp against other frameworks (#17947) 2025-12-13 22:33:29 +01:00
create_ops_docs.py
debug-test.sh refactor : remove libcurl, use OpenSSL when available (#18828) 2026-01-14 18:02:47 +01:00
fetch_server_test_models.py
gen-authors.sh
gen-unicode-data.py
get-flags.mk
get-hellaswag.sh
get-pg.sh
get-wikitext-2.sh
get-wikitext-103.sh
get-winogrande.sh
get_chat_template.py
hf.sh
install-oneapi.bat
pr2wt.sh scripts : add support for forks in pr2wt.sh (#19540) 2026-02-12 13:14:28 +01:00
serve-static.js refactor : remove libcurl, use OpenSSL when available (#18828) 2026-01-14 18:02:47 +01:00
server-bench.py
sync-ggml-am.sh
sync-ggml.last sync : ggml 2026-02-15 22:24:29 +02:00
sync-ggml.sh
sync_vendor.py vendor : update cpp-httplib to 0.34.0 (#19830) 2026-02-23 21:05:48 +01:00
tool_bench.py refactor : remove libcurl, use OpenSSL when available (#18828) 2026-01-14 18:02:47 +01:00
tool_bench.sh
verify-checksum-models.py
xxd.cmake