llama_cpp_for_radxa_dragon_wing_q6a

pingu_98/llama_cpp_for_radxa_dragon_wing_q6a


slaren 0d56246f4b	2024-04-18 15:18:48 +02:00
ggml : group all experts in a single ggml_mul_mat_id (#6505)
* ggml : group all experts in a single ggml_mul_mat_id
  cuda : improve mmid row copy
* cuda : fix bin bcast with non-cont src0
* test-backend-ops : only run all mul mat tests for base types
* llama : disable moe offloading with SYCL
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
build-info.cmake
build-info.sh
check-requirements.sh	python : add check-requirements.sh and GitHub workflow (#4585)	2023-12-29 16:50:29 +02:00
ci-run.sh	ci : add model tests + script wrapper (#4586)	2024-01-26 14:18:00 +02:00
compare-commits.sh	ggml : group all experts in a single ggml_mul_mat_id (#6505)	2024-04-18 15:18:48 +02:00
compare-llama-bench.py	compare-llama-bench.py: fix long hexsha args (#6424)	2024-04-01 13:30:43 +02:00
convert-gg.sh
gen-authors.sh	license : update copyright notice + add AUTHORS (#6405)	2024-04-09 09:23:19 +03:00
gen-build-info-cpp.cmake
get-flags.mk	build : pass all warning flags to nvcc via -Xcompiler (#5570)	2024-02-18 16:21:52 -05:00
get-hellaswag.sh	scripts : add get-winogrande.sh	2024-01-18 20:45:39 +02:00
get-pg.sh	scripts : improve get-pg.sh (#4838)	2024-01-09 19:21:13 +02:00
get-wikitext-2.sh	model: support arch `DbrxForCausalLM` (#6515)	2024-04-13 11:33:52 +02:00
get-wikitext-103.sh	lookup: complement data from context with general text statistics (#5479)	2024-03-23 01:24:36 +01:00
get-winogrande.sh	scripts : add get-winogrande.sh	2024-01-18 20:45:39 +02:00
hf.sh	scripts : add --outdir option to hf.sh (#6600)	2024-04-11 16:22:47 +03:00
install-oneapi.bat	support SYCL backend windows build (#5208)	2024-01-31 08:08:07 +05:30
LlamaConfig.cmake.in	cuda : rename build flag to LLAMA_CUDA (#6299)	2024-03-26 01:16:01 +01:00
pod-llama.sh	cuda : rename build flag to LLAMA_CUDA (#6299)	2024-03-26 01:16:01 +01:00
qnt-all.sh
run-all-perf.sh
run-all-ppl.sh
run-with-preset.py	scripts : move run-with-preset.py from root to scripts folder	2024-01-26 17:09:44 +02:00
server-llm.sh	cuda : rename build flag to LLAMA_CUDA (#6299)	2024-03-26 01:16:01 +01:00
sync-ggml-am.sh	license : update copyright notice + add AUTHORS (#6405)	2024-04-09 09:23:19 +03:00
sync-ggml.last	sync : ggml	2024-04-09 20:29:06 +03:00
sync-ggml.sh	license : update copyright notice + add AUTHORS (#6405)	2024-04-09 09:23:19 +03:00
verify-checksum-models.py