Tuesday, September 30, 2025

Building an LLM application with the Microsoft Phi-3 LLM.

To kick-start learning and understanding LLMs and LLM applications using Ollama, here are the steps I followed to gain confidence and keep moving.

Ollama is an LLM orchestration tool that lets you run LLMs on your local machine.

1. Download Ollama for Ubuntu Linux. This installs Ollama and the CLI used to pull and run LLMs.

 curl -fsSL https://ollama.com/install.sh | sh

You can get the code from GitHub here:

https://github.com/CodethinkerSP/ai/tree/master/Simple-RAG

In a terminal, type:

ollama serve

ollama pull phi3:latest

ollama run phi3:latest
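Once the model runs in the CLI, the same model can also be reached over Ollama's local REST API, which listens on port 11434 by default. Here is a minimal sketch using only the Python standard library; the function names and prompt text are my own, not part of the repo:

```python
import json
import urllib.request

def build_payload(prompt, model="phi3:latest"):
    # "stream": False asks Ollama for one complete JSON response
    # instead of a stream of chunks
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt, url="http://localhost:11434/api/generate"):
    # POST the JSON payload; requires `ollama serve` to be running locally
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# usage (with the server running):
# print(generate("Why is the sky blue?"))
```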

Then follow these steps in Visual Studio Code or any of your favourite editors. I personally prefer the Jupyter extension in VS Code.

Steps followed on my local machine

  1. Installed Ollama and pulled the phi3:latest LLM
  2. Ollama CLI: ollama serve, then ollama run phi3:latest
  3. Python code
    1. Load the text file
    2. Chunk and embed the data
    3. Store the embeddings in ChromaDB
    4. Hit the Ollama API endpoint --> localhost:11434/api/generate with the required JSON payload
from sentence_transformers import SentenceTransformer
import chromadb

dataset = []
# data loading: one chunk per line of the text file
with open("output.txt", 'r', encoding='utf-8') as f:
    dataset = f.readlines()

VECTOR_DB = []
EMBEDDING_MODEL = 'all-MiniLM-L6-v2'
LANGUAGE_MODEL = 'phi3:latest'

# initialize persistent vector db
chroma_client = chromadb.PersistentClient(path="./chroma_db3")
# create (or load) the collection
collection = chroma_client.get_or_create_collection(name="mydataset")
# embedding model
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

for i, data in enumerate(dataset):
    data = data.strip()
    if data:
        embedding = model.encode(data).tolist()
        # keep an in-memory copy and persist the embedding in ChromaDB
        VECTOR_DB.append((data, embedding))
        collection.add(ids=[str(i)], documents=[data], embeddings=[embedding])

print(f"Inserted {len(VECTOR_DB)} records into the vector database.")
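With the embeddings stored, the retrieval half of the RAG loop embeds the user's question, asks ChromaDB for the nearest chunks, and stuffs them into the prompt sent to Phi-3. A rough sketch, assuming the `collection` and `model` objects from the script above; the helper names and prompt wording are mine:

```python
def retrieve(collection, model, question, k=3):
    # embed the question and ask ChromaDB for the k nearest chunks
    q_emb = model.encode(question).tolist()
    result = collection.query(query_embeddings=[q_emb], n_results=k)
    return result["documents"][0]

def build_rag_prompt(question, contexts):
    # assemble the retrieved chunks into a grounded prompt for the LLM
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )

# usage (with the objects created above):
# contexts = retrieve(collection, model, "What is this document about?")
# prompt = build_rag_prompt("What is this document about?", contexts)
# then send {"model": "phi3:latest", "prompt": prompt, "stream": False}
# to localhost:11434/api/generate as in step 3.4
```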