llama_cpp_for_radxa_dragon_.../tests
jaime-m-p 37bef89433
tokenizer : BPE fixes (#7530)
* Random test: add_bos_token, add_eos_token
* Random test: add BPE models for testing
* Custom regex split fails with codepoint 0
* Fix falcon punctuation regex
* Refactor llm_tokenizer_bpe: move code to constructor
* Move 'add_special_bos/eos' logic to llm_tokenizer_bpe
* Move tokenizer flags to vocab structure.
* Default values for special_add_bos/eos
* Build vocab.special_tokens_cache using vocab token types
* Generalize 'jina-v2' per token attributes
* Fix unicode whitespaces (deepseek-coder, deepseek-llm)
* Skip missing byte tokens (falcon)
* Better unicode data generation
* Replace char32_t with uint32_t
2024-06-18 18:40:52 +02:00
.gitignore
CMakeLists.txt ggml : fix loongson compile warnings (#7537) 2024-05-31 14:17:10 +03:00
get-model.cpp
get-model.h
run-json-schema-to-grammar.mjs
test-autorelease.cpp
test-backend-ops.cpp Add support for sqrt on CUDA (#7953) 2024-06-17 00:23:04 +02:00
test-c.c
test-chat-template.cpp Fix phi3 chat template confusion with zephyr (#7449) 2024-05-23 16:15:15 +02:00
test-double-float.cpp
test-grad0.cpp ggml : refactor rope norm/neox (#7634) 2024-06-05 11:29:20 +03:00
test-grammar-integration.cpp Added support for . (any character) token in grammar engine. (#6467) 2024-06-06 06:08:52 -07:00
test-grammar-parser.cpp grammars: x{min,max} repetition operator (#6640) 2024-06-06 10:07:06 +01:00
test-json-schema-to-grammar.cpp tests : check the Python version (#7872) 2024-06-11 10:10:20 +03:00
test-llama-grammar.cpp
test-model-load-cancel.cpp
test-opt.cpp
test-quantize-fns.cpp
test-quantize-perf.cpp
test-rope.cpp ggml : refactor rope norm/neox (#7634) 2024-06-05 11:29:20 +03:00
test-sampling.cpp
test-tokenizer-0.cpp tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) 2024-05-04 08:32:32 +03:00
test-tokenizer-0.py py : logging and flake8 suppression refactoring (#7081) 2024-05-05 08:07:48 +03:00
test-tokenizer-0.sh tests : fix test-tokenizer-0.sh 2024-05-28 15:04:09 +03:00
test-tokenizer-1-bpe.cpp llama : lookup word in vocab before doing BPE merges (#7193) 2024-05-11 11:12:06 +03:00
test-tokenizer-1-spm.cpp
test-tokenizer-random.py tokenizer : BPE fixes (#7530) 2024-06-18 18:40:52 +02:00