Exploring Llama 2 on a CPU-only VM
Since Meta released the open-source large language model Llama 2, the efforts of the community have largely removed the barrier for developers and everyday users to access an LLM; this is the so-called democratisation of LLMs.
Let's explore running an LLM on a KVM-based VM.
The VM
Create a VM on a host with a 2.3 GHz AMD processor (no GPU available), giving it 24 vCPUs and 64 GB of memory, running Ubuntu 22.04 (Jammy Jellyfish).
Install the base build tools:
sudo apt update -y
sudo apt install -y build-essential
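Before building anything, it is worth confirming what the VM actually exposes. The checks below are an illustrative sketch (not part of the original setup); the SIMD flags matter because llama.cpp's CPU backend is considerably faster when AVX2 or AVX-512 is available:

```shell
# Report vCPU count, memory, and SIMD capabilities (illustrative checks)
nproc                                            # should report 24 on the VM above
free -h | awk 'NR==2 {print "memory:", $2}'      # total memory from the second line of free
grep -m1 -o 'avx512\|avx2' /proc/cpuinfo || echo "no AVX2/AVX-512 reported"
```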
The Engine to Run LLM
Thanks to the open-source community, and especially the llama.cpp project, it's now possible to run a quantised LLM (in the latest GGUF format) even without a GPU.
Let's build it.
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
make
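On a 24-vCPU VM, `make -j"$(nproc)"` speeds the build up considerably. For inference, llama.cpp's `main` binary takes `-t` (threads), `-m` (model path) and `-p` (prompt); one thread per vCPU is a common starting point on a CPU-only box. The snippet below is a sketch, and the model filename in it is hypothetical:

```shell
# Suggest an invocation using one thread per vCPU (the model path is hypothetical)
THREADS=$(nproc)
echo "./main -t $THREADS -m ./models/llama-2-7b.Q4_K_M.gguf -p \"Hello\""
```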
A sample program, main, is also built so you can exercise the LLM as a command-line tool; let's install it system-wide.
sudo cp main /usr/local/bin/llm
The LLM Model
We have the command-line engine; where do we get the quantised model files? TheBloke on Hugging Face, purveyor of fine local LLMs for your fun and profit…
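When choosing which quantisation to download, a back-of-envelope estimate helps: model file size is roughly parameters × bits-per-weight / 8. The figures below are approximations I am assuming for illustration (Q4_K_M averages around 4.5 bits per weight), not numbers from the original text:

```shell
# Rough size of a 7B-parameter model at ~4.5 bits/weight (planning estimate only)
awk 'BEGIN {params=7e9; bpw=4.5; printf "approx file size: %.1f GB\n", params*bpw/8/1e9}'
```

At roughly 4 GB, a Q4-quantised 7B model fits comfortably in this VM's 64 GB of memory, with plenty left over for the KV cache and the OS.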