pingu_98/llama_cpp_for_radxa_dragon_wing_q6a
dcad77cc3b
llama_cpp_for_radxa_dragon_.../ggml
Johannes Gäßler | 9725a313be | CUDA: reduce MMQ stream-k overhead (#22298) | 2026-04-25 14:15:03 +02:00
* CUDA: reduce MMQ stream-k overhead
* use 32 bit integers for kbc
cmake | ggml: backend-agnostic tensor parallelism (experimental) (#19378) | 2026-04-09 16:42:19 +02:00
include | CUDA: manage NCCL communicators in context (#21891) | 2026-04-15 15:58:40 +02:00
src | CUDA: reduce MMQ stream-k overhead (#22298) | 2026-04-25 14:15:03 +02:00
.gitignore
CMakeLists.txt | HIP: flip GGML_HIP_GRAPHS to default on (#22254) | 2026-04-23 02:34:31 +02:00