Exploring Llama2 on CPU only VM

6 min read · Sep 11, 2023

Since Meta released the open-source large language model Llama 2, the community's work has largely removed the barrier that kept developers and everyday users from running an LLM themselves. This is the so-called democratised LLM.

Let's explore running an LLM on a KVM-based VM.

The VM

Create a VM with 24 vCPUs and 64 GB of memory, running Ubuntu 22.04 (Jammy Jellyfish), on a host with a 2.3 GHz AMD processor and no GPU.

Install the base build tools:

sudo apt update -y
sudo apt install -y build-essential
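
Before building anything, it can be worth confirming what the guest actually sees. The sketch below is not part of the original walkthrough: it prints the vCPU count, the total memory, and the SIMD flags exposed to the VM. The flag check is there because a KVM guest only sees the CPU features its configured CPU model passes through, and CPU inference generally runs much faster when AVX/AVX2 are available. The variable names are illustrative.

```shell
# Number of vCPUs visible to the guest (24 on the VM described above)
cores=$(nproc)
echo "vCPUs: ${cores}"

# Total memory; /proc/meminfo reports kB, so convert to GiB (rounded down)
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "Memory: $((mem_kb / 1024 / 1024)) GiB"

# SIMD extensions exposed to the guest; empty if the CPU model hides them
simd=$(grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' \
  | grep -E '^(avx|avx2|avx512f|fma|f16c)$' | sort -u | tr '\n' ' ')
echo "SIMD flags: ${simd:-none found}"
```

If no AVX flags show up, the VM's CPU model is the likely cause; host passthrough in the hypervisor configuration usually exposes them.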

The Engine to Run LLM

Thanks to the open-source community, and especially the llama.cpp project, it's now possible to run a quantised LLM (in the latest format, GGUF) even without a GPU.

Let's build it.

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
make
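
On a 24-vCPU machine a plain make uses only one core at a time. As an optional variation, assuming GNU make and that the current directory is the llama.cpp checkout, the build can be parallelised across all vCPUs. Note that the main binary name matches llama.cpp as of the article's date; newer releases have renamed the binary and moved to a CMake build, so this sketch applies to checkouts from that period.

```shell
# One build job per vCPU visible to the guest
jobs=$(nproc)

# Only build when run from inside the llama.cpp checkout
if [ -f Makefile ]; then
  # On success, list the sample binary that the build produced
  make -j"${jobs}" && ls -lh main
else
  echo "no Makefile here; cd into the llama.cpp checkout first"
fi
```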

The build also produces a sample program, main, that lets you test the LLM from the command line. Let's install it system-wide as llm:

sudo cp main /usr/local/bin/llm
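
Once a quantised GGUF model file is on disk, the installed binary can be exercised roughly as follows. This is a minimal sketch, not the author's exact command: the model path and the prompt are hypothetical placeholders, while -m (model file), -t (threads), -n (tokens to generate) and -p (prompt) are options of the llama.cpp main program.

```shell
# Hypothetical model path; point MODEL at whichever GGUF file you downloaded
MODEL="${MODEL:-./models/model.gguf}"

# Default to one thread per vCPU; override THREADS to experiment
THREADS="${THREADS:-$(nproc)}"

if [ -f "$MODEL" ] && command -v llm >/dev/null 2>&1; then
  # Generate up to 128 tokens for a single prompt
  llm -m "$MODEL" -t "$THREADS" -n 128 -p "What is a virtual machine?"
else
  echo "skipping: need llm on PATH and a GGUF file at $MODEL"
fi
```

The thread count is worth tuning on a VM: using every vCPU is a reasonable starting point, but a lower value can sometimes give better throughput.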

The LLM Model

We have the command-line engine, but where do we get the quantised model files? TheBloke, purveyor of fine local LLMs for your fun and profit…

Written by Zhimin Wen
