Arun Pandian M

Android Dev | Full-Stack & AI Learner

Jun 5, 2026

Understanding LLMs, Ollama, and Inference

Before building AI applications, we need to understand three fundamental concepts:

LLM
↓
Ollama
↓
Inference

What is an LLM?

LLM stands for Large Language Model.

Examples:

Llama

Phi

Mistral

A language model predicts the next piece of text, called a token (a word or part of a word), from the text it has seen so far.

Example:

Input:

The capital of France is

Prediction:

Paris

Every response from an LLM is generated one token at a time.
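The token-by-token loop can be sketched with a toy model. This is only an illustration under loud assumptions: the lookup table is hand-written, the names NEXT_TOKEN_PROBS and generate are made up for this example, and the toy looks only at the last token, whereas a real LLM scores every token in its vocabulary using the whole preceding context.

```python
# Toy "model": for each token, hand-written probabilities for the token that follows.
# (Illustrative numbers only; a real LLM computes these with a neural network.)
NEXT_TOKEN_PROBS = {
    "The": {"capital": 0.9, "city": 0.1},
    "capital": {"of": 1.0},
    "of": {"France": 0.7, "Spain": 0.3},
    "France": {"is": 1.0},
    "is": {"Paris": 0.8, "big": 0.2},
    "Paris": {"<end>": 1.0},
}


def generate(prompt_tokens, max_new_tokens=10):
    """Append one token per step, always picking the most likely next token."""
    tokens = list(prompt_tokens)
    if not tokens:
        return tokens
    for _ in range(max_new_tokens):
        candidates = NEXT_TOKEN_PROBS.get(tokens[-1])
        if not candidates:
            break  # the toy table knows nothing about this token
        next_token = max(candidates, key=candidates.get)
        if next_token == "<end>":
            break  # the model signals that the answer is complete
        tokens.append(next_token)
    return tokens


print(" ".join(generate(["The", "capital", "of", "France", "is"])))
# prints: The capital of France is Paris
```

Each pass through the loop produces exactly one new token and feeds the longer sequence back in, which is why long answers take longer to generate than short ones.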

Training vs Inference

Two terms you’ll hear frequently:

Training

The model learns patterns from large amounts of text.

Books
Code
Articles
↓
Training
↓
Model

Inference

The trained model uses what it learned to answer new questions. Nothing new is learned at this stage.

Question
↓
Model
↓
Answer

As AI application engineers, we mostly perform inference.
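To make the split concrete, here is a minimal sketch that uses word-pair counting as a stand-in for real training. The function names train and infer and the three-sentence corpus are assumptions for illustration; actual LLM training adjusts billions of neural network weights rather than filling a table of counts.

```python
from collections import Counter, defaultdict


def train(corpus):
    """Training: read the text and record which word follows which."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def infer(model, word):
    """Inference: look up the most frequent follower; the model is not changed."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training data, standing in for "Books, Code, Articles".
corpus = [
    "the capital of France is Paris",
    "the capital of Italy is Rome",
    "Paris is the capital of France",
]

model = train(corpus)           # done once, up front
print(infer(model, "capital"))  # prints: of
print(infer(model, "of"))       # prints: France
```

Training runs once and is the expensive part; inference reuses the result as many times as needed, which matches what we do when we call a model that someone else has already trained.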

What is Ollama?

Think of Ollama as a runtime.

Java
↓
JVM

Python
↓
Interpreter

LLM
↓
Ollama

Ollama loads and runs models on your machine.

Example:

ollama run phi3:mini

Calling a Model

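Once Ollama is running and the ollama Python package is installed (pip install ollama), you can call the model from Python: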

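import ollama

# Send one user message to the locally running model.
response = ollama.chat(
    model="phi3:mini",  # must already be pulled, e.g. via "ollama run phi3:mini"
    messages=[
        {
            "role": "user",
            "content": "What is Kotlin?"
        }
    ]
)

# The generated text is in the "content" field of the returned message.
print(response["message"]["content"])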

Flow:

Python
↓
Ollama
↓
Model
↓
Response
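The "Python → Ollama" step in this flow is an HTTP request to the local Ollama server. The sketch below shows roughly what that request looks like using only the standard library. It assumes Ollama's default address http://localhost:11434 and its /api/chat endpoint; the helper names build_chat_payload and send_chat are made up for this example and are not part of any library.

```python
import json
import urllib.request

# Assumed default address of a locally running Ollama server.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_payload(model, prompt):
    """Build the JSON body for a single-turn chat request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one complete JSON reply instead of a stream
    }


def send_chat(model, prompt, url=OLLAMA_CHAT_URL, timeout=120):
    """POST the request to Ollama and return the generated text."""
    body = json.dumps(build_chat_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as http_response:
        reply = json.loads(http_response.read().decode("utf-8"))
    return reply["message"]["content"]
```

With the server running and the model pulled, send_chat("phi3:mini", "What is Kotlin?") should return the same kind of answer as the ollama.chat call above. In practice the ollama package is the more convenient choice; this version is only meant to show that nothing magical happens between Python and the model.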

Experiment

Try:

What is Android?

Then:

Explain Android to a beginner.

Notice how the model changes its answer based on the input.
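One way to run this experiment from a script is sketched below. The helpers ask and compare_prompts are hypothetical names for this example; the chat_fn parameter is an assumption added so the code can be tried with a stand-in function when no Ollama server is available.

```python
def ask(prompt, model="phi3:mini", chat_fn=None):
    """Send one prompt and return the model's text answer."""
    if chat_fn is None:
        # Imported lazily so the helper can be exercised without Ollama installed.
        import ollama
        chat_fn = ollama.chat
    response = chat_fn(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response["message"]["content"]


def compare_prompts(prompts, model="phi3:mini", chat_fn=None):
    """Ask each prompt separately and collect the answers by prompt."""
    return {prompt: ask(prompt, model=model, chat_fn=chat_fn) for prompt in prompts}


PROMPTS = [
    "What is Android?",
    "Explain Android to a beginner.",
]
```

With Ollama running, looping over compare_prompts(PROMPTS).items() and printing each prompt next to its answer makes the difference easy to see. Each prompt is sent as its own conversation, so the second answer is not influenced by the first.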
