Exploring Llama 2 on a CPU-only VM

Zhimin Wen
6 min read · Sep 11, 2023
Image by Regina from Pixabay

Since Meta released the open-source large language model Llama 2, the efforts of the community have largely removed the barrier for developers and everyday users to access an LLM, a development often called the democratisation of LLMs.

Let's explore running an LLM on a KVM-based VM.

The VM

Create a VM on a host machine with a 2.3 GHz AMD processor (no GPU available), giving it 24 vCPUs and 64 GB of memory, running Ubuntu 22.04 (Jammy Jellyfish).
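Before building anything, it is worth sanity-checking the VM from inside: llama.cpp is CPU-bound, so the thread count and SIMD support (AVX2 in particular) largely determine inference speed. A quick check:

```shell
# Report available CPU threads (should show 24 on this VM)
nproc
# Check for AVX2 support; llama.cpp still builds without it, just runs slower
grep -m1 -o 'avx2' /proc/cpuinfo || echo "no AVX2 detected"
```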

Install the base build tools:

sudo apt update -y
sudo apt install -y build-essential

The Engine to Run LLM

Thanks to the open-source community, especially the llama.cpp project, it is now possible to run a quantised LLM model (in the latest GGUF format) even without a GPU.

Let's build it:

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
make

The build also produces a sample program, main, for testing the LLM as a command-line tool. Let's install it at the system level:

sudo cp main /usr/local/bin/llm
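With the binary installed, a quantised model can be queried straight from the shell. A minimal sketch, assuming a GGUF model has already been downloaded to a hypothetical ./models path (the next section covers where to get one):

```shell
# Hypothetical model path; substitute the GGUF file you actually download
MODEL="./models/llama-2-7b-chat.Q4_K_M.gguf"
# -m: model file, -p: prompt, -n: max tokens to generate, -t: CPU threads
llm -m "$MODEL" -p "Explain KVM virtualisation in one sentence." -n 128 -t 24
```

Setting -t to the vCPU count (24 here) lets llama.cpp use all available cores.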

The LLM Model

We have the command-line engine; where do we get the quantised model files? TheBloke, the self-described "purveyor of fine local LLMs for your fun and profit", publishes quantised GGUF builds of Llama 2 and many other models on Hugging Face.
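A sketch of fetching one of those models (the repo and file names below are examples; check the model card for the current list of quantisation variants, and expect a download of a few GB):

```shell
# Example repo and file from TheBloke's Hugging Face account
REPO="TheBloke/Llama-2-7B-Chat-GGUF"
FILE="llama-2-7b-chat.Q4_K_M.gguf"
mkdir -p models
# -c resumes a partial download if interrupted
wget -c -O "models/$FILE" "https://huggingface.co/$REPO/resolve/main/$FILE"
```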
