Get the tech stack up and running
Install Docker, pull the required images, and launch the full local stack with a single command.
What you'll build
By the end of this chapter you'll have all four services running locally inside Docker containers:
- Ollama — local LLM inference
- Qdrant — vector database for semantic search
- Langflow — visual local pipeline builder
- Postgres — relational store used by Langflow
Step 1 — Install Docker
Linux (Ubuntu / Debian)
# Remove any old versions
sudo apt-get remove -y docker docker-engine docker.io containerd runc
# Add Docker's GPG key and repo
sudo apt-get update
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
| sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" \
| sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker Engine + Compose plugin
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io \
docker-buildx-plugin docker-compose-plugin
# Run Docker without sudo
sudo usermod -aG docker $USER
newgrp docker
(or log out and back in for the group change to take effect)

Verify:
docker --version
docker compose version

Step 2 — Create the project folder
mkdir local-pipeline && cd local-pipeline

Step 3 — Write the docker-compose.yml
Linux supports both CPU-only and NVIDIA GPU acceleration. Choose the variant that matches your machine.
CPU only
services:
  # ── Ollama (CPU) ────────────────────────────────────
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    restart: unless-stopped

  # ── Qdrant ─────────────────────────────────────────
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - qdrant_data:/qdrant/storage
    restart: unless-stopped

  # ── Postgres ────────────────────────────────────────
  postgres:
    image: postgres:16-alpine
    container_name: langflow_postgres
    environment:
      POSTGRES_USER: langflow
      POSTGRES_PASSWORD: langflow
      POSTGRES_DB: langflow
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped

  # ── Langflow ────────────────────────────────────────
  langflow:
    image: langflowai/langflow:latest
    container_name: langflow
    ports:
      - "7860:7860"
    environment:
      LANGFLOW_DATABASE_URL: postgresql://langflow:langflow@postgres:5432/langflow
      LANGFLOW_SUPERUSER: admin
      LANGFLOW_SUPERUSER_PASSWORD: changeme
    depends_on:
      - postgres
    volumes:
      - langflow_data:/app/langflow
    restart: unless-stopped

volumes:
  ollama_data:
  qdrant_data:
  postgres_data:
  langflow_data:

NVIDIA GPU
First, install the NVIDIA Container Toolkit:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
| sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
| sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
| sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Verify Docker can see your GPU:
docker run --rm --gpus all nvidia/cuda:12.3.0-base-ubuntu22.04 nvidia-smi

Then use this docker-compose.yml — identical to the CPU version except for the deploy block on Ollama:
services:
  # ── Ollama (NVIDIA GPU) ─────────────────────────────
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  # ── Qdrant ─────────────────────────────────────────
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - qdrant_data:/qdrant/storage
    restart: unless-stopped

  # ── Postgres ────────────────────────────────────────
  postgres:
    image: postgres:16-alpine
    container_name: langflow_postgres
    environment:
      POSTGRES_USER: langflow
      POSTGRES_PASSWORD: langflow
      POSTGRES_DB: langflow
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped

  # ── Langflow ────────────────────────────────────────
  langflow:
    image: langflowai/langflow:latest
    container_name: langflow
    ports:
      - "7860:7860"
    environment:
      LANGFLOW_DATABASE_URL: postgresql://langflow:langflow@postgres:5432/langflow
      LANGFLOW_SUPERUSER: admin
      LANGFLOW_SUPERUSER_PASSWORD: changeme
    depends_on:
      - postgres
    volumes:
      - langflow_data:/app/langflow
    restart: unless-stopped

volumes:
  ollama_data:
  qdrant_data:
  postgres_data:
  langflow_data:

A 4 GB VRAM card (e.g. RTX 3050) handles llama3.2:3b. Use 8 GB+ for the llama3.1:8b model. With a GPU, inference is roughly 10–20× faster than on CPU.
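As a rough sanity check on these VRAM figures: a 4-bit-quantized model needs about half a byte per weight, plus runtime overhead for the KV cache and buffers. A minimal sketch (the 4-bit and 20%-overhead figures are ballpark assumptions, not exact measurements):

```python
def est_model_gb(params_billions, bits_per_weight=4, overhead=1.2):
    """Rough estimate of memory needed by a quantized model, in GB.

    Assumes bits_per_weight quantization plus ~20% overhead for the
    KV cache and runtime buffers. A back-of-the-envelope figure only.
    """
    bytes_per_weight = bits_per_weight / 8
    return params_billions * bytes_per_weight * overhead

# 3B model at 4-bit: ~1.8 GB, consistent with the ~2 GB download size
print(round(est_model_gb(3), 1))
# 8B model at 4-bit: ~4.8 GB, fits in 8 GB VRAM with headroom for context
print(round(est_model_gb(8), 1))
```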
Change LANGFLOW_SUPERUSER_PASSWORD before exposing Langflow outside localhost.
Step 4 — Start everything
docker compose up -d

Docker pulls all images on first run (~3–4 GB download). Subsequent starts take only a few seconds.
Check every container is running:
docker compose ps

Expected output:
NAME                IMAGE                        STATUS
langflow            langflowai/langflow:latest   Up
langflow_postgres   postgres:16-alpine           Up
ollama              ollama/ollama:latest         Up
qdrant              qdrant/qdrant:latest         Up

Step 5 — Pull your first LLM
Ollama is running but has no models yet:
docker exec -it ollama ollama pull llama3.2:3b

Verify:
docker exec -it ollama ollama list

NAME          ID    SIZE    MODIFIED
llama3.2:3b   ...   2.0 GB  a few seconds ago

llama3.2:3b works on CPU with 8 GB RAM. With the NVIDIA GPU setup, try llama3.1:8b (requires 8 GB VRAM) for much better quality.
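If you prefer to script this check, Ollama also exposes the installed-model list over HTTP at GET /api/tags. A minimal Python sketch using only the standard library (assumes the stack from this chapter is up on localhost:11434; if it isn't, the script just says so):

```python
import json
import urllib.request

def parse_model_names(raw):
    # /api/tags returns {"models": [{"name": ..., "size": ...}, ...]}
    return [m["name"] for m in json.loads(raw)["models"]]

try:
    with urllib.request.urlopen("http://localhost:11434/api/tags", timeout=5) as r:
        print(parse_model_names(r.read()))
except OSError as e:
    print(f"Ollama not reachable: {e}")
```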
Step 6 — Verify all services
Open each URL in your browser:
| Service | URL | What you should see |
|---|---|---|
| Ollama API | http://localhost:11434 | Ollama is running |
| Qdrant Dashboard | http://localhost:6333/dashboard | Collections UI |
| Langflow | http://localhost:7860 | Login page |
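The same checks can be scripted instead of clicked through. A minimal Python sketch that probes each URL from the table above (the five-second timeout is an arbitrary choice; any HTTP response counts as "up"):

```python
import urllib.request

def service_urls(host="localhost"):
    # Ports match the docker-compose.yml in this chapter
    return {
        "ollama": f"http://{host}:11434",
        "qdrant": f"http://{host}:6333/dashboard",
        "langflow": f"http://{host}:7860",
    }

for name, url in service_urls().items():
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            print(f"{name:10s} UP   (HTTP {resp.status})")
    except OSError:
        print(f"{name:10s} DOWN ({url})")
```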
Quick test — chat with your LLM
curl http://localhost:11434/api/generate -d '{
"model": "llama3.2:3b",
"prompt": "In one sentence, what is a vector database?",
"stream": false
}'

You'll get a JSON response with the model's answer in the response field.
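That curl call translates directly into a few lines of Python, which is handy once you start wiring the model into your own scripts. A minimal standard-library sketch (the 120-second timeout is an assumption to accommodate slow CPU inference):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model, prompt):
    # Same fields as the curl example; stream=False returns one JSON object
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_payload("llama3.2:3b", "In one sentence, what is a vector database?")

try:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        print(json.loads(resp.read())["response"])
except OSError as e:
    print(f"Ollama not reachable: {e}")
```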
Summary
You now have a complete local stack running in Docker:
- Ollama serving a local LLM on localhost:11434
- Qdrant ready for vector embeddings on localhost:6333
- Langflow's visual editor available on localhost:7860
In the next chapter, we'll log into Langflow and build your first local pipeline.