llama_cpp_for_radxa_dragon_.../examples
Neo Zhang Jianyu 08d5986290
[SYCL] Optimize mul_mat for Q4_0 on Intel GPU (#12035)
* opt performance by reorder for Intel GPU

* detect hw type and save opt feature, and print opt feature

* correct name

* support optimize graph once when compute graph, record the opt status in tensor->extra, make CI passed

* add env variable GGML_SYCL_DISABLE_OPT for debug

* use syclex::architecture to replace the custom hw define, update the guide for GGML_SYCL_DISABLE_OPT

* add performance data

* mv getrows functions to separate files

* fix global variables

---------

Co-authored-by: arthw <14088817+arthw@users.noreply.github.com>
2025-02-24 22:33:23 +08:00
batched
batched-bench
batched.swift swift : fix llama-vocab api usage (#11645) 2025-02-04 13:15:24 +02:00
convert-llama2c-to-ggml
cvector-generator repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
deprecation-warning
embedding
eval-callback
export-lora export-lora : fix tok_embd tensor (#11330) 2025-01-21 14:07:12 +01:00
gbnf-validator Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639) 2025-01-30 19:13:58 +00:00
gen-docs
gguf
gguf-hash
gguf-split ci : use -no-cnv in gguf-split tests (#11254) 2025-01-15 18:28:35 +02:00
gritlm
imatrix examples: fix typo in imatrix/README.md (#11884) 2025-02-15 21:03:30 +02:00
infill
jeopardy
llama-bench llama-bench : fix unexpected global variable initialize sequence issue (#11832) 2025-02-14 02:13:43 +01:00
llama.android repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
llama.swiftui llama.swiftui : add "Done" dismiss button to help view (#11998) 2025-02-22 06:33:29 +01:00
llava llava: build clip image from pixels (#11999) 2025-02-22 15:28:28 +01:00
lookahead repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
lookup repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
main tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900) 2025-02-18 18:03:23 +00:00
parallel
passkey repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
perplexity Fix: Compile failure due to Microsoft STL breaking change (#11836) 2025-02-12 21:36:11 +01:00
quantize repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
quantize-stats
retrieval repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
rpc
run run: allow to customize prompt by env var LLAMA_PROMPT_PREFIX (#12041) 2025-02-23 17:15:51 +00:00
save-load-state
server server : disable Nagle's algorithm (#12020) 2025-02-22 11:46:31 +01:00
simple
simple-chat Add Jinja template support (#11016) 2025-01-21 13:18:51 +00:00
simple-cmake-pkg repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
speculative repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
speculative-simple
sycl [SYCL] Optimize mul_mat for Q4_0 on Intel GPU (#12035) 2025-02-24 22:33:23 +08:00
tokenize
tts tts : add guide tokens support (#11186) 2025-01-18 12:20:57 +02:00
chat-13B.bat
chat-13B.sh
chat-persistent.sh
chat-vicuna.sh
chat.sh
CMakeLists.txt
convert_legacy_llama.py
json_schema_pydantic_example.py
json_schema_to_grammar.py
llama.vim repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
llm.vim
Miku.sh
pydantic_models_to_grammar.py
pydantic_models_to_grammar_examples.py repo : update links to new url (#11886) 2025-02-15 16:40:57 +02:00
reason-act.sh
regex_to_grammar.py
server-llama2-13B.sh
server_embd.py
ts-type-to-grammar.sh