Retrieval-Augmented Generation with Postgres Vector DB and LlamaIndex
To let a Large Language Model work with data that was not available during pre-training, or with private data, one common technique that avoids retraining the model is Retrieval-Augmented Generation (RAG). In essence, this technique retrieves the relevant data and supplies it to the LLM so that it can generate the result.
The text data can be represented as embeddings and saved into a vector database. When required, embeddings similar to that of a question text can be retrieved from the database using a vector search algorithm. The retrieved results, together with the original question, form a relevant context from which the LLM can infer its answer.
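Before bringing in Postgres and LlamaIndex, here is a minimal, framework-free sketch of that retrieve-then-generate flow. The embed function below is a placeholder (a real system would call an embedding model, and pgvector would perform the nearest-neighbour search inside Postgres), but it shows the mechanics: embed the documents, embed the question, find the closest vectors, and build a prompt from the matches.

# Minimal sketch of retrieve-then-generate (toy embeddings, in-memory "vector DB").
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder: a real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=384)
    return v / np.linalg.norm(v)

documents = [
    "Postgres can store embeddings with the pgvector extension.",
    "LlamaIndex orchestrates retrieval and prompting for RAG.",
    "CloudNativePG manages Postgres clusters on OpenShift.",
]
doc_vectors = np.stack([embed(d) for d in documents])  # stands in for the vector DB

question = "How do I store embeddings in Postgres?"
q = embed(question)

# Cosine-similarity search: vectors are normalised, so a dot product suffices.
scores = doc_vectors @ q
top_k = np.argsort(scores)[::-1][:2]
context = "\n".join(documents[i] for i in top_k)

# The retrieved context plus the original question form the prompt sent to the LLM.
prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
print(prompt)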
Let's explore RAG with a Postgres vector DB. We will implement it using the LlamaIndex framework without excessive magic: instead of condensing everything into four lines of code, we will drill down a bit deeper to understand how a RAG system works.
Install the Postgres Vector DB on OpenShift
Install the Postgres DB with the CloudNativePG operator.
oc apply -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.20/releases/cnpg-1.20.2.yaml
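Before creating a database cluster, it is worth confirming that the operator has rolled out. The namespace and deployment name below assume a default CloudNativePG installation; adjust them if your install differs.

oc rollout status deployment/cnpg-controller-manager -n cnpg-system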