llama_cpp_for_radxa_dragon_.../examples
Ivy233 02082f1519
clip: Fix llama-llava-clip-quantize-cli quantization error under CUDA backend (#12566)
* [Fix] Compiling clip-quantize-cli and running it in a CUDA environment will cause ggml_fp16_to_fp32 to report an error when trying to access video memory. You need to switch to the CPU backend to run quantize.
After the fix, it will automatically run in the CPU backend and will no longer be bound to CUDA.

* [Fix]Roll back the signature and implementation of clip_model_load, and change the call in clip_model_quantize to clip_init.
2025-03-26 15:06:04 +01:00
..
batched
batched-bench llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
batched.swift llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
convert-llama2c-to-ggml
cvector-generator llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
deprecation-warning
embedding llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
eval-callback
export-lora common : refactor '-o' option (#12278) 2025-03-10 13:34:13 +02:00
gbnf-validator
gen-docs
gguf
gguf-hash
gguf-split
gritlm llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
imatrix llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
infill llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
jeopardy
llama-bench llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
llama.android llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
llama.swiftui llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
llava clip: Fix llama-llava-clip-quantize-cli quantization error under CUDA backend (#12566) 2025-03-26 15:06:04 +01:00
lookahead llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
lookup llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
main docs : bring llama-cli conversation/template docs up-to-date (#12426) 2025-03-17 21:14:32 +01:00
parallel llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
passkey llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
perplexity llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
quantize ggml : portability fixes for VS 2017 (#12150) 2025-03-04 18:53:26 +02:00
quantize-stats llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
retrieval llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
rpc
run run: de-duplicate fmt and format functions and optimize (#11596) 2025-03-25 18:46:11 +01:00
save-load-state llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
server server : Add verbose output to OAI compatible chat endpoint. (#12246) 2025-03-23 19:30:26 +01:00
simple
simple-chat llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
simple-cmake-pkg
speculative speculative : fix seg fault in certain cases (#12454) 2025-03-18 19:35:11 +02:00
speculative-simple llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) 2025-03-13 12:35:44 +02:00
sycl [SYCL] Optimize mul_mat for Q4_0 on Intel GPU (#12035) 2025-02-24 22:33:23 +08:00
tokenize
tts llama-tts : avoid crashes related to bad model file paths (#12482) 2025-03-21 11:12:45 +02:00
chat-13B.bat
chat-13B.sh
chat-persistent.sh
chat-vicuna.sh
chat.sh
CMakeLists.txt
convert_legacy_llama.py
json_schema_pydantic_example.py
json_schema_to_grammar.py tool-call: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034) 2025-03-05 13:05:13 +00:00
llama.vim
llm.vim
Miku.sh
pydantic_models_to_grammar.py
pydantic_models_to_grammar_examples.py
reason-act.sh
regex_to_grammar.py
server-llama2-13B.sh
server_embd.py
ts-type-to-grammar.sh