llama_cpp_for_radxa_dragon_.../examples
Henri Vasserman 20568fe60f
[Fix] Reenable server embedding endpoint (#1937)
* Add back embedding feature

* Update README
2023-06-20 01:12:39 +03:00
Name                     Last commit                                                Date
baby-llama               build : fix and ignore MSVC warnings (#1889)               2023-06-16 21:23:53 +03:00
benchmark                build : fix and ignore MSVC warnings (#1889)               2023-06-16 21:23:53 +03:00
embedding                build : fix and ignore MSVC warnings (#1889)               2023-06-16 21:23:53 +03:00
jeopardy                 hooks : setting up flake8 and pre-commit hooks (#1681)     2023-06-17 13:32:48 +03:00
main                     minor : warning fixes                                      2023-06-17 20:24:11 +03:00
metal                    examples : fix examples/metal (#1920)                      2023-06-18 10:52:10 +03:00
perplexity               build : fix and ignore MSVC warnings (#1889)               2023-06-16 21:23:53 +03:00
quantize                 Allow "quantizing" to f16 and f32 (#1787)                  2023-06-13 04:23:23 -06:00
quantize-stats           build : fix and ignore MSVC warnings (#1889)               2023-06-16 21:23:53 +03:00
save-load-state          build : fix and ignore MSVC warnings (#1889)               2023-06-16 21:23:53 +03:00
server                   [Fix] Reenable server embedding endpoint (#1937)           2023-06-20 01:12:39 +03:00
simple                   examples : add "simple" (#1840)                            2023-06-16 21:58:09 +03:00
train-text-from-scratch  train : get raw text instead of page with html (#1905)     2023-06-17 09:51:54 +03:00
alpaca.sh
chat-13B.bat
chat-13B.sh
chat-persistent.sh
chat-vicuna.sh           examples : add chat-vicuna.sh (#1854)                      2023-06-15 21:05:53 +03:00
chat.sh
CMakeLists.txt           llama : fix kv_cache n init (close #1903)                  2023-06-17 19:31:20 +03:00
common.cpp               Only one CUDA stream per device for async compute (#1898)  2023-06-17 19:15:02 +02:00
common.h                 CUDA full GPU acceleration, KV cache in VRAM (#1827)       2023-06-14 19:47:19 +02:00
gpt4all.sh
Miku.sh
reason-act.sh