CMakeLists.txt
llama-adapter.cpp
llama-adapter.h
llama-arch.cpp
chat : fix kimi-k2 chat template (#14852)
2025-07-24 13:59:56 +02:00
llama-arch.h
model : add EXAONE 4.0 support (#14630)
2025-07-18 10:45:49 +02:00
llama-batch.cpp
llama : reuse compute graphs (#14482)
2025-07-17 19:08:33 +03:00
llama-batch.h
llama : reuse compute graphs (#14482)
2025-07-17 19:08:33 +03:00
llama-chat.cpp
chat : fix kimi-k2 chat template (#14852)
2025-07-24 13:59:56 +02:00
llama-chat.h
model : add EXAONE 4.0 support (#14630)
2025-07-18 10:45:49 +02:00
llama-context.cpp
llama : clarify comment about pp and tg graphs [no ci] (#14895)
2025-07-27 12:10:51 +02:00
llama-context.h
context : restore preemptive sched reset when LLAMA_SET_ROWS=0 (#14870)
2025-07-25 14:28:06 +03:00
llama-cparams.cpp
llama-cparams.h
llama : add high-throughput mode (#14363)
2025-07-16 16:35:42 +03:00
llama-grammar.cpp
llama-grammar.h
llama-graph.cpp
metal : fuse add, mul + add tests (#14596)
2025-07-18 20:37:26 +03:00
llama-graph.h
graph : refactor context to not pass gf explicitly (#14629)
2025-07-18 08:29:28 +03:00
llama-hparams.cpp
llama : add high-throughput mode (#14363)
2025-07-16 16:35:42 +03:00
llama-hparams.h
model : make rope_yarn_log_mul optional for deepseek2 (#14896)
2025-07-27 11:18:37 +03:00
llama-impl.cpp
llama-impl.h
llama-io.cpp
llama-io.h
llama-kv-cache-unified-iswa.cpp
llama : add high-throughput mode (#14363)
2025-07-16 16:35:42 +03:00
llama-kv-cache-unified-iswa.h
llama : add high-throughput mode (#14363)
2025-07-16 16:35:42 +03:00
llama-kv-cache-unified.cpp
kv-cache : fix k-shift for multiple streams (#14742)
2025-07-17 20:52:33 +03:00
llama-kv-cache-unified.h
llama : reuse compute graphs (#14482)
2025-07-17 19:08:33 +03:00
llama-kv-cells.h
kv-cache : use ggml_set_rows (#14285)
2025-07-03 10:53:35 +03:00
llama-memory-hybrid.cpp
llama : fix parameter order for hybrid memory initialization (#14725)
2025-07-16 21:17:25 +02:00
llama-memory-hybrid.h
kv-cache : use ggml_set_rows (#14285)
2025-07-03 10:53:35 +03:00
llama-memory-recurrent.cpp
memory : handle saving/loading null layers in recurrent memory (#14675)
2025-07-23 11:16:41 +03:00
llama-memory-recurrent.h
llama-memory.cpp
memory : correctly handle failure in apply() (#14438)
2025-06-30 18:03:03 +03:00
llama-memory.h
memory : correctly handle failure in apply() (#14438)
2025-06-30 18:03:03 +03:00
llama-mmap.cpp
llama-mmap.h
llama-model-loader.cpp
llama-model-loader.h
llama-model-saver.cpp
llama-model-saver.h
llama-model.cpp
model : make rope_yarn_log_mul optional for deepseek2 (#14896)
2025-07-27 11:18:37 +03:00
llama-model.h
model: add Ernie 4.5 MoE support (#14658)
2025-07-17 23:15:32 +02:00
llama-quant.cpp
quantize : fix minor logic flaw in --tensor-type (#14572)
2025-07-13 18:02:17 +02:00
llama-quant.h
llama-sampling.cpp
llama-sampling.h
llama-vocab.cpp
model : add EXAONE 4.0 support (#14630)
2025-07-18 10:45:49 +02:00
llama-vocab.h
Support diffusion models: Add Dream 7B (#14644)
2025-07-16 20:03:51 +08:00
llama.cpp
unicode-data.cpp
unicode-data.h
unicode.cpp
model : add Kimi-K2 support (#14654)
2025-07-15 21:54:22 +02:00
unicode.h
model : add Kimi-K2 support (#14654)
2025-07-15 21:54:22 +02:00