llama_cpp_for_radxa_dragon_wing_q6a

pingu_98/llama_cpp_for_radxa_dragon_wing_q6a

History

nullname ed75977717 ggml-hexagon: create generalized functions for cpu side op (#17500 ) * refactor: replace ggml_hexagon_mul_mat with template-based binary operation for improved flexibility * refactor: replace ggml_hexagon_mul_mat_id with template-based binary operation for improved flexibility * refactor: initialize buffer types and streamline dspqueue_buffers_init calls for clarity * add comment * refactor: remove redundant buffer checks in hexagon supported operations * wip * add missing include to fix weak symbol warning * add ggml_hexagon_op_generic * refactor: simplify tensor operation initialization and buffer management in hexagon implementation * refactor: streamline hexagon operation initialization and buffer management * refactor: update function signatures and streamline request handling in hexagon operations * wip * ggml-hexagon: clean up code formatting and improve unary operation handling * wip * rename * fix: add support for permuted F16 tensors and enhance quantization checks in matrix operations * refactor: replace ggml_hexagon_mul_mat with template-based binary operation for improved flexibility refactor: replace ggml_hexagon_mul_mat_id with template-based binary operation for improved flexibility refactor: initialize buffer types and streamline dspqueue_buffers_init calls for clarity refactor: remove redundant buffer checks in hexagon supported operations add missing include to fix weak symbol warning add ggml_hexagon_op_generic refactor: simplify tensor operation initialization and buffer management in hexagon implementation refactor: streamline hexagon operation initialization and buffer management refactor: update function signatures and streamline request handling in hexagon operations ggml-hexagon: clean up code formatting and improve unary operation handling fix: add support for permuted F16 tensors and enhance quantization checks in matrix operations # Conflicts: # ggml/src/ggml-hexagon/ggml-hexagon.cpp * hexagon: fix merge conflicts * hexagon: minor cleanup for buffer support checks * hexagon: factor out op_desc and the overal op logging * hexagon: further simplify and cleanup op dispatch logic * snapdragon: update adb scripts to use llama-cli and llama-completion * fix pipeline failure --------- Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com>		2025-12-22 23:13:24 -08:00
..
apple
jinja
snapdragon	ggml-hexagon: create generalized functions for cpu side op (#17500 )	2025-12-22 23:13:24 -08:00
bench-models.sh	scripts : add script to bench models (#16894 )	2025-11-02 00:15:31 +02:00
build-info.sh
check-requirements.sh
compare-commits.sh
compare-llama-bench.py
compare-logprobs.py	scripts: add script to compare logprobs of llama.cpp against other frameworks (#17947 )	2025-12-13 22:33:29 +01:00
create_ops_docs.py
debug-test.sh
fetch_server_test_models.py
gen-authors.sh
gen-unicode-data.py
get-flags.mk
get-hellaswag.sh
get-pg.sh
get-wikitext-2.sh
get-wikitext-103.sh
get-winogrande.sh
get_chat_template.py
hf.sh
install-oneapi.bat
serve-static.js	ggml webgpu: add support for emscripten builds (#17184 )	2025-12-03 10:25:34 +01:00
server-bench.py
sync-ggml-am.sh
sync-ggml.last	sync : ggml	2025-12-14 08:33:51 +02:00
sync-ggml.sh
sync_vendor.py	server: introduce API for serving / loading / unloading multiple models (#17470 )	2025-12-01 19:41:04 +01:00
tool_bench.py
tool_bench.sh
verify-checksum-models.py
xxd.cmake