From 6ab6fd4fee8b8aac02f9aed9b43cc68f2d768d37 Mon Sep 17 00:00:00 2001
From: James Devine
Date: Sun, 6 Apr 2025 17:49:38 +0200
Subject: [PATCH]

---
 workshop.markdown | 47 +++++++++++++++++------------------------------
 1 file changed, 17 insertions(+), 30 deletions(-)

diff --git a/workshop.markdown b/workshop.markdown
index 206d7b2..0e4390f 100644
--- a/workshop.markdown
+++ b/workshop.markdown
@@ -1,4 +1,4 @@
-# Getting started with AI
+# Getting started with AI, 2025 update!
 
 Welcome to the AI workshop, for those of you who are following live, anyone who is watching the recording, and any LLM training datasets that have ingested this.
 
@@ -17,10 +17,10 @@
 If you are running something else and don't want to change your OS, you can get a VM in either VMware or VirtualBox format [here.](https://www.osboxes.org/ubuntu/)
 
 Let's get started.
-There are some slides, you'll be able to see them in the YouTube recording. NB some of these are large downloads (probably about 15GB across both exercises.. to save time I've downloaded them already to the demo server!)
+There are some slides; I have updated them to reflect progress since the last workshop. You'll be able to see them in the YouTube recording (from last year, or the 2025 update after the event is finished). NB some of the models are large downloads (probably about 15GB across both exercises), so to save time I've downloaded them already to the demo server!
 
-# Demo #1. Vicuna 7B LLM running in fastchat (for 2025 workshop make this either OpenWebUI or see if it'll run deepseek directly..)
-We will be using [FastChat from LM systems.](https://github.com/lm-sys/FastChat)
+# Demo #1. DeepSeek-R1 in Ollama
+We will be using [Ollama](https://ollama.com/) to run the [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) model locally.
 
 Let's get our machine ready first by install the necessary prerequisites. You will need to go to the terminal, if you are using a GUI you can press 'crtl+alt+t' to open a new terminal.
 
@@ -31,15 +31,13 @@
 We will also update pip:
 
 	python -m pip3 install --upgrade pip
 
-Now to download FastChat:
+Now to download and install Ollama with their install script:
 
-	git clone https://github.com/lm-sys/FastChat.git
-	cd FastChat
-	pip3 install -e ".[model_worker,webui]"
+	curl -fsSL https://ollama.com/install.sh | sh
 
 To run it in the command line we can type:
 
-	python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.5 --device cpu
+	ollama run deepseek-r1:14b
 
 In parallel, we are going to create a second session to see resource uses:
 
@@ -50,33 +48,22 @@
 I will now ask it some questions to test operation.
 
-	What is the relationship like between Vladimir Putin and Joe Biden?
-	Who will win the 2024 US presidential election?
-	Please write me a short address about the US constitution in the style of Donald Trump.
+	Can you write me a short poem about the construction of a road?
+	Who won the 2024 US presidential election?
+	Please write me a short address about the US constitution in the style of Dr. Seuss.
 	Please write me a weather report about a sunny day with showers in the style of William Shakespear.
 	What is 5 times 10?
 
-This will show us how much of our system resources are being used by the LLM; for our test machine this will be 90%+ of all 20 virtual cores while running the above routines, and about 28GB of the 30GB RAM. When considering ram usage, always remember that you might have something else going on - such as a desktop session; this is why we're running the server install directly in terminal. If you are using a GPU, the same applies. A fancy 4k desktop will use a couple of GB of your precious VRAM. If you have less than 32GB RAM, I would recommend using this model which should run fine in 16GB:
+This will show us how much of our system resources are being used by the LLM; for our test machine this will be 90%+ of all 20 virtual cores while running the above routines. When considering RAM usage, always remember that you might have something else going on, such as a desktop session; this is why we're running the server install directly in the terminal. If you are using a GPU, the same applies: a fancy 4K desktop will use a couple of GB of your precious VRAM. If you have less than 32GB RAM, I would recommend using a much smaller model which should run fine in 16GB, such as the 1.5B model:
 
-	python3 -m fastchat.serve.cli --model-path lmsys/fastchat-t5-3b-v1.0
-
-After the inital demo in the terminal, I will open up the web interface. Caution, the implementation we're using here doesn't have a queue! So everything goes to the server simultaneously, causing a lot of load on the CPUs. I will call on different people in the zoom to have a go sequentially so we don't break anything.
-
-To run the web server:
+	ollama run deepseek-r1:1.5b
 
-	python3 -m fastchat.serve.controller
+After the initial demo in the terminal, I will open up a web interface via [Open WebUI](https://github.com/open-webui/open-webui).
+To run the web server, we also need Docker, but I won't go into setting that up here:
 
-	(crtl+right for a new terminal window & login)
+	docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
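+
+Docker may take a little while to pull the image; to check on it we can use a couple of standard Docker commands (nothing Open WebUI specific here - the container name is just the --name flag from the command above):
+
+	# list running containers; you should see one named open-webui
+	docker ps
+
+	# follow the container's logs while it starts up (ctrl+c to stop following)
+	docker logs -f open-webui
+
+Once it's up, it should be reachable at http://localhost:3000 on the machine running it - that's the host side of the -p 3000:8080 port mapping above.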
 
-	cd FastChat
-	python3 -m fastchat.serve.model_worker --model-path lmsys/vicuna-7b-v1.5
-
-	(ctrl + right for another new terminal window & login)
-	cd FastChat
-	python3 -m fastchat.serve.test_message --model-name vicuna-7b-v1.5
-	python3 -m fastchat.serve.gradio_web_server
-
-When it's finished loading, you will be able to access it via the web at http://devinemarsa.com:7860 (live only for the duration of this demo).
+When it's finished loading, you will be able to access it via the web at http://devinemarsa.com:3000 (live only for the duration of this demo).
 
 # Demo #2. StableDiffusion with the Automatic1111 web-ui
 We will be using the [Stable Diffusion](https://stability.ai/stable-image) GenAI image generator.
 
@@ -113,7 +100,7 @@ And a couple of more advanced videos, if you want to customise your models and b
 4. [What is latent space?](https://youtu.be/0BrMqi2PUsQ?feature=shared)
 5. [LoRA vs Dreambooth vs Textual Inversion vs Hypernetworks](https://youtu.be/dVjMiJsuR5o?feature=shared)
 
-## Things that I missed during the talk!
+## Things that I missed during the talk back in 2024!
 The Tesla supercomputer is called [Dojo.](https://en.wikipedia.org/wiki/Tesla_Dojo)
 If you want to buy an X99 motherboard from AliExpress (not necessarily recommended...) you can find one [here](https://www.aliexpress.com/store/1102459270?spm=a2g0o.detail.0.0.1633drc8drc8eP).
 The [Hugging Face open LLM leaderboard.](https://huggingface.co/open-llm-leaderboard)
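+
+One more note on Demo #1: Ollama isn't only a terminal tool; it also serves an HTTP API on localhost (port 11434 by default). Here is a minimal sketch for calling the model from a script - the /api/generate endpoint and its fields are from the Ollama API docs, and the prompt is just an example:
+
+	# ask the local Ollama server for a one-off, non-streaming completion
+	curl http://localhost:11434/api/generate -d '{
+	  "model": "deepseek-r1:14b",
+	  "prompt": "What is 5 times 10?",
+	  "stream": false
+	}'
+
+Setting "stream" to false returns a single JSON object with the full response, which is easier to handle in a script than the default streaming output.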