* Add AFMOE model support
* Update to vocab
* Add model sizing
* Undo Rope change for ARCEE model
* Address review comments
* Update modeling code is_sliding -> use_rope, replace hard-coded logic
* Fix AFMOE tokenizer
* Update convert_hf_to_gguf.py
* Update convert_hf_to_gguf.py
* Update AFMoE tokenizer class identification to be more unique

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

| Name | | |
|---|---|---|
| models | ||
| CMakeLists.txt | ||
| llama-adapter.cpp | ||
| llama-adapter.h | ||
| llama-arch.cpp | ||
| llama-arch.h | ||
| llama-batch.cpp | ||
| llama-batch.h | ||
| llama-chat.cpp | ||
| llama-chat.h | ||
| llama-context.cpp | ||
| llama-context.h | ||
| llama-cparams.cpp | ||
| llama-cparams.h | ||
| llama-grammar.cpp | ||
| llama-grammar.h | ||
| llama-graph.cpp | ||
| llama-graph.h | ||
| llama-hparams.cpp | ||
| llama-hparams.h | ||
| llama-impl.cpp | ||
| llama-impl.h | ||
| llama-io.cpp | ||
| llama-io.h | ||
| llama-kv-cache-iswa.cpp | ||
| llama-kv-cache-iswa.h | ||
| llama-kv-cache.cpp | ||
| llama-kv-cache.h | ||
| llama-kv-cells.h | ||
| llama-memory-hybrid.cpp | ||
| llama-memory-hybrid.h | ||
| llama-memory-recurrent.cpp | ||
| llama-memory-recurrent.h | ||
| llama-memory.cpp | ||
| llama-memory.h | ||
| llama-mmap.cpp | ||
| llama-mmap.h | ||
| llama-model-loader.cpp | ||
| llama-model-loader.h | ||
| llama-model-saver.cpp | ||
| llama-model-saver.h | ||
| llama-model.cpp | ||
| llama-model.h | ||
| llama-quant.cpp | ||
| llama-quant.h | ||
| llama-sampling.cpp | ||
| llama-sampling.h | ||
| llama-vocab.cpp | ||
| llama-vocab.h | ||
| llama.cpp | ||
| unicode-data.cpp | ||
| unicode-data.h | ||
| unicode.cpp | ||
| unicode.h | ||