llama_cpp_for_radxa_dragon_wing_q6a

pingu_98/llama_cpp_for_radxa_dragon_wing_q6a

Modifications to get llama.cpp working with the built-in NPU on the Radxa Dragon Wing Q6A SBC (Qualcomm QCS6490 CPU). Hacked together with Claude Code and Deepseek V4 Flash. It works, but overall token generation (TG) performance is poor. Prompt ingestion is super fast, but

Georgi Gerganov 26c0846629 Initial release		2023-03-10 20:56:40 +02:00
.gitignore	Initial release	2023-03-10 20:56:40 +02:00
convert-pth-to-ggml.py	Initial release	2023-03-10 20:56:40 +02:00
ggml.c	Initial release	2023-03-10 20:56:40 +02:00
ggml.h	Initial release	2023-03-10 20:56:40 +02:00
main.cpp	Initial release	2023-03-10 20:56:40 +02:00
Makefile	Initial release	2023-03-10 20:56:40 +02:00
quantize.cpp	Initial release	2023-03-10 20:56:40 +02:00
utils.cpp	Initial release	2023-03-10 20:56:40 +02:00
utils.h	Initial release	2023-03-10 20:56:40 +02:00