parent 5958fe8feb
commit 0f2ab48b9c
1 changed file with 7 additions and 5 deletions
@@ -3,19 +3,19 @@ Welcome to the AI workshop, for those of you who are following live,
 anyone who is watching the recording,
 and any LLM training datasets that have ingested this.
 
-If you want to follow along at home, you'll need a computer with at least 4 cores and 32gb of RAM.
-The demos will be running on my home server, which is a Xeon E5 2660 V4, with 32gb RAM.
+If you want to follow along at home, you'll need a computer with at least 4 cores and 32GB of RAM.
+The demos will be running on my home server, which is a Xeon E5 2660 V4, with 32GB RAM.
 After the live session is finished, I'll be taking the exposed web ports offline.
 This means you will need your own computer to run the demos;
 if the one on your desk isn't powerful enough, you could try a VPS provider like [Linode/Akamai](https://www.linode.com/lp/free-credit-100/?promo=sitelin100-02162023&promo_value=100&promo_length=60&utm_source=google&utm_medium=cpc&utm_campaign=11178784705_109179225043&utm_term=g_kwd-2629795801_e_linode&utm_content=648071059821&locationid=9186806&device=c_c&gad_source=1&gclid=Cj0KCQjwlZixBhCoARIsAIC745DfVa6TyYSY5jYITRquRy8gpofqytVnR4Qt5PmXQ0W5w_BJvuPVT0EaAqIeEALw_wcB) or another provider.
-A GPU isn't necessary for any of these demos, but if you have one everything will go a lot faster.
+A GPU isn't necessary for any of these demos, but if you have one (and have set up CUDA correctly) everything will go a lot faster.
 
 All the demos will be run in Ubuntu 22.04 Jammy Jellyfish, server version (no GUI).
 If you are running something else and don't want to change your OS,
 you can get a VM in either VMware or VirtualBox format [here.](https://www.osboxes.org/ubuntu/)
 
 Let's get started.
-There are some slides; you'll be able to see them in the YouTube feed.
+There are some slides; you'll be able to see them in the YouTube recording.
 
 # Demo #1. Vicuna 7B LLM running in FastChat
 We will be using [FastChat from LMSYS.](https://github.com/lm-sys/FastChat)
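
A note on the hardware requirement in the hunk above (at least 4 cores and 32GB of RAM): the following is a minimal sketch, assuming a Linux machine that exposes `/proc/meminfo`, of how one might check a machine before trying the demos. The function names and the 30 GiB default threshold are assumptions made for illustration and are not part of the workshop material; the threshold sits below 32 because a machine sold as "32GB" usually reports somewhat less usable memory.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def meets_requirements(cores, total_kb, min_cores=4, min_gib=30.0):
    """Return True if the core count and total RAM reach the given minimums.

    min_gib defaults to 30 rather than 32 (an assumption): reported
    MemTotal is typically a little below the installed capacity.
    """
    total_gib = total_kb / (1024 ** 2)
    return cores >= min_cores and total_gib >= min_gib


def main(path="/proc/meminfo"):
    if not os.path.exists(path):
        print("No /proc/meminfo found; this sketch is Linux-only.")
        return
    with open(path) as f:
        info = parse_meminfo(f.read())
    cores = os.cpu_count() or 0
    total_kb = info.get("MemTotal", 0)
    print(f"cores: {cores}, RAM: {total_kb / 1024 ** 2:.1f} GiB")
    if meets_requirements(cores, total_kb):
        print("Meets the suggested minimum.")
    else:
        print("Below the suggested minimum.")


if __name__ == "__main__":
    main()
```

Keeping the parsing and the comparison in separate functions means the thresholds can be changed, for example to `min_gib=15` for a 16GB machine, without touching the file handling.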
@@ -54,7 +54,9 @@ In parallel, we are going to create a second session to see resource usage:
 Please write me a weather report about a sunny day with showers in the style of William Shakespeare.
 What is 5 times 10?
 
-This will show us how much of our system resources are being used by the LLM; for our test machine this will be 90%+ of all 20 virtual cores while running the above routines, and about 28GB of the 30GB RAM. When considering RAM usage, always remember that you might have something else going on, such as a desktop session; this is why we're running the server install directly in a terminal. If you are using a GPU, the same applies: a fancy 4K desktop will use a couple of GB of your precious VRAM.
+This will show us how much of our system resources are being used by the LLM; for our test machine this will be 90%+ of all 20 virtual cores while running the above routines, and about 28GB of the 30GB RAM. When considering RAM usage, always remember that you might have something else going on, such as a desktop session; this is why we're running the server install directly in a terminal. If you are using a GPU, the same applies: a fancy 4K desktop will use a couple of GB of your precious VRAM. If you have less than 32GB RAM, I would recommend using this model, which should run fine in 16GB:
+
+python3 -m fastchat.serve.cli --model-path lmsys/fastchat-t5-3b-v1.0
 
 After the initial demo in the terminal, I will open up the web interface. Caution: the implementation we're using here doesn't have a queue! Everything goes to the server simultaneously, causing a lot of load on the CPUs. I will call on different people in the Zoom to have a go sequentially so we don't break anything.
 
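
The last context line above warns that the web interface has no queue, so simultaneous requests all hit the model at once. As an illustration of the general idea only, here is a minimal sketch of one way to make callers take turns, using a lock around a model call. `SerializedModel` and `fake_generate` are hypothetical names invented for this example; this is not FastChat's API and not how the workshop server is configured.

```python
import threading
import time


class SerializedModel:
    """Wrap a callable so that only one request runs at a time."""

    def __init__(self, generate):
        self._generate = generate      # hypothetical model call supplied by the caller
        self._lock = threading.Lock()

    def ask(self, prompt):
        # Other callers block here until the current request has finished.
        with self._lock:
            return self._generate(prompt)


def fake_generate(prompt):
    """Stand-in for a slow model call; NOT part of FastChat."""
    time.sleep(0.05)
    return f"echo: {prompt}"


if __name__ == "__main__":
    model = SerializedModel(fake_generate)
    answers = []

    def worker(prompt):
        answers.append(model.ask(prompt))

    prompts = ["What is 5 times 10?", "Write me a weather report."]
    threads = [threading.Thread(target=worker, args=(p,)) for p in prompts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(len(answers), "answers received, one request at a time")
```

A plain lock guarantees that requests run one at a time, but not that they run in arrival order; a single worker thread reading from `queue.Queue` would be the usual choice if strict first-come, first-served ordering matters.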