llama_cpp_for_radxa_dragon_wing_q6a

Georgi Gerganov 38566680cd ggml : add IQ2 to test-backend-ops + refactoring (#4990)	2024-01-17 18:54:56 +02:00
* ggml : add IQ2 to test-backend-ops + refactoring ggml-ci
* cuda : update supports_op for IQ2 ggml-ci
* ci : enable LLAMA_CUBLAS=1 for CUDA nodes ggml-ci
* cuda : fix out-of-bounds-access in `mul_mat_vec_q` ggml-ci
* tests : avoid creating RNGs for each Q tensor ggml-ci
* tests : avoid creating RNGs for each tensor ggml-ci
CMakeLists.txt	metal : create autorelease pool during library build (#4970)	2024-01-17 18:38:39 +02:00
test-autorelease.cpp	metal : create autorelease pool during library build (#4970)	2024-01-17 18:38:39 +02:00
test-backend-ops.cpp	ggml : add IQ2 to test-backend-ops + refactoring (#4990)	2024-01-17 18:54:56 +02:00
test-c.c
test-double-float.cpp
test-grad0.cpp	cuda : improve cuda pool efficiency using virtual memory (#4606)	2023-12-24 14:34:22 +01:00
test-grammar-parser.cpp
test-llama-grammar.cpp
test-opt.cpp
test-quantize-fns.cpp	ggml : SOTA 2-bit quants (add IQ2_XS) (#4856)	2024-01-11 21:39:39 +02:00
test-quantize-perf.cpp
test-rope.cpp
test-sampling.cpp
test-tokenizer-0-falcon.cpp
test-tokenizer-0-falcon.py
test-tokenizer-0-llama.cpp
test-tokenizer-0-llama.py
test-tokenizer-1-bpe.cpp
test-tokenizer-1-llama.cpp