Modifications to get llama.cpp working with the built-in NPU on the Radxa Dragon Wing Q6A SBC (Qualcomm QCS6490 CPU). Hacked together with Claude Code and DeepSeek V4 Flash. It works, but overall token generation (TG) performance is poor. Prompt ingestion is very fast, but
.gitignore Initial release 2023-03-10 20:56:40 +02:00
convert-pth-to-ggml.py Initial release 2023-03-10 20:56:40 +02:00
ggml.c Initial release 2023-03-10 20:56:40 +02:00
ggml.h Initial release 2023-03-10 20:56:40 +02:00
main.cpp Initial release 2023-03-10 20:56:40 +02:00
Makefile Initial release 2023-03-10 20:56:40 +02:00
quantize.cpp Initial release 2023-03-10 20:56:40 +02:00
utils.cpp Initial release 2023-03-10 20:56:40 +02:00
utils.h Initial release 2023-03-10 20:56:40 +02:00