llama_cpp_for_radxa_dragon_.../ggml
Johannes Gäßler cb5fad4c6c
CUDA: refactor and optimize IQ MMVQ (#8215)
* CUDA: refactor and optimize IQ MMVQ

* uint -> uint32_t

* __dp4a -> ggml_cuda_dp4a

* remove MIN_CC_DP4A checks

* change default

* try CI fix
2024-07-01 20:39:06 +02:00
cmake
include
src                          CUDA: refactor and optimize IQ MMVQ (#8215)    2024-07-01 20:39:06 +02:00
CMakeLists.txt               ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CUDA_FORCE_CUBLAS (cmake) (#8140)    2024-06-26 21:34:14 +02:00
ggml_vk_generate_shaders.py