
ChromaDB Tutorial: The Complete Beginner's Guide (2026)


What is ChromaDB? (Quick Answer)

ChromaDB is an open-source, in-process vector database for Python. It stores embeddings alongside metadata, enables fast similarity search, and requires zero infrastructure — pip install chromadb and you have a working vector database in five lines of code. It supports persistent storage, metadata filtering, and integrates directly into RAG pipelines.


ChromaDB is the fastest way to add vector search to a Python application. It runs entirely in-process — no Docker container, no cloud account, no server to manage. You pip install chromadb, write five lines of Python, and you have a working vector database.

That simplicity makes ChromaDB the standard starting point for learning vector databases and the go-to choice for RAG prototypes and small-to-medium production deployments. This tutorial covers everything you need: collections, embedding functions, metadata filtering, persistence, updates, deletions, and building a real document search pipeline.

If you are new to vector databases, read What is a Vector Database? first — this tutorial assumes you understand what embeddings and similarity search are.

What is ChromaDB and Why Use It?

ChromaDB is an open-source vector database designed to store and query high-dimensional embeddings. It handles the HNSW index, persistence, and metadata filtering so you can focus on your application logic. You install one Python package, choose a client mode (in-memory, persistent, or HTTP), and start adding documents — ChromaDB embeds them automatically using your chosen embedding function.


Installing ChromaDB


For sentence-transformers (the free local embedding model used in most examples below):


ChromaDB requires Python 3.8 or later. All examples in this tutorial were tested on Python 3.11.


ChromaDB Client Modes

ChromaDB has three operating modes:

Ephemeral (in-memory) — data exists only for the lifetime of your Python process. Fastest, but nothing is saved to disk. Useful for unit tests and quick experiments.


Persistent (local disk) — data is saved to a directory on disk and survives process restarts. The standard mode for development and small deployments.


HTTP Client — connects to a separately running ChromaDB server (started with chroma run --path ./chroma_data). Used when multiple processes or services need to share the same vector database.


For this tutorial, use PersistentClient — it behaves identically to a production setup but requires no separate service.


Collections: ChromaDB's Core Concept

In ChromaDB, a collection is the equivalent of a table in SQL. It stores:

  • Documents: the raw text (or other content) you want to search
  • Embeddings: the vector representations of those documents
  • Metadata: structured key-value pairs associated with each document
  • IDs: a unique string identifier for each item

You can have multiple collections in a single client, each with its own embedding function and distance metric.


Embedding Functions

An embedding function tells ChromaDB how to convert your documents into vector embeddings. ChromaDB ships with built-in support for several providers.

Option 1: Sentence Transformers (Free, Local)

This is the default and most convenient option for most use cases. The model downloads once (~80 MB) and then runs entirely offline.


Popular sentence-transformer models for different use cases:

| Model | Dimensions | Speed | Use case |
|---|---|---|---|
| all-MiniLM-L6-v2 | 384 | Very fast | General text, best starting point |
| all-mpnet-base-v2 | 768 | Moderate | Higher-quality general text |
| multi-qa-MiniLM-L6-cos-v1 | 384 | Very fast | Question/answer retrieval |
| paraphrase-multilingual-MiniLM-L12-v2 | 384 | Fast | Multilingual support |

Option 2: OpenAI Embeddings

Higher quality embeddings, especially for domain-specific content, at a cost per token.


Option 3: Custom Embedding Function

Wrap any model — Cohere, Anthropic, a local GGUF model — by subclassing EmbeddingFunction:


Lock Your Embedding Function

Once you add documents to a collection, you must always use the same embedding function to query it. Changing the model invalidates all existing embeddings. If you need to switch models, create a new collection and re-embed all documents.


Distance Metrics

Set the distance metric at collection creation time via the metadata parameter:

For text: always use cosine. For image embeddings from models like CLIP: l2 or ip, depending on whether the model normalises its outputs.


Adding Documents

IDs must be unique strings, and add() will not silently overwrite an existing ID. Use upsert() instead when you are not sure whether a document already exists.

Providing Pre-computed Embeddings

If you have already computed embeddings outside ChromaDB (e.g., in a batch preprocessing job), pass them directly:

When you provide embeddings directly, ChromaDB does not call the collection's embedding function for those documents.


Querying: The Core Operation

The results dictionary always has the same shape: results["documents"] is a list of lists, one inner list per query. If you pass multiple queries at once, each query gets its own inner list of results.

What can you include in results?

The include parameter controls what ChromaDB returns:

  • "documents" — original text
  • "embeddings" — raw vector arrays
  • "distances" — raw distance values (smaller = closer)
  • "metadatas" — metadata dictionaries
  • "uris" — URIs if you stored them
  • "data" — raw data if multi-modal

Distance vs Similarity

ChromaDB returns distances, not similarity scores. For cosine distance: similarity = 1 - distance. A distance of 0.0 means identical; 2.0 means opposite. For L2 distance, smaller is more similar — there is no simple normalisation to a 0–1 similarity score.


Metadata Filtering

Metadata filters let you combine vector similarity search with structured constraints. This is one of ChromaDB's most powerful features.

Basic Filters

Operators

ChromaDB supports a rich set of filter operators via the $ prefix:

Boolean Logic: $and, $or

Full-Text Filter: $contains

where_document filters on the actual document text, while where filters on metadata. Both can be used together in the same query.


CRUD Operations: Update and Delete

Upsert (Insert or Update)

upsert() inserts new documents or updates existing ones — the safest choice for most ingestion pipelines:

Update

update() modifies existing documents. It raises an error if an ID does not exist:

You can update documents, metadatas, or embeddings independently — you do not need to provide all three.

Get by ID

Delete

Deletions are Permanent

ChromaDB does not have a recycle bin or soft-delete. Deleted items are gone immediately. For production systems where auditability matters, consider soft-deleting by updating a metadata field (e.g., status: 'deleted') and filtering it out in queries, while keeping the actual record.


Inspecting and Managing Collections

Building a Document Search Pipeline

Here is a complete, production-style document ingestion and query pipeline — the kind you would use in a RAG application.

ChromaDB with LangChain and LlamaIndex

ChromaDB has first-class integrations with both major RAG frameworks. In LangChain, the langchain-chroma package provides a Chroma vector store that wraps an existing ChromaDB client or collection and exposes it as a retriever. In LlamaIndex, ChromaVectorStore plugs a ChromaDB collection into an index's storage context. Both integrations accept the PersistentClient or HttpClient you already have, so a collection built with this tutorial can be reused directly. These integration APIs change frequently; check each framework's current documentation for the exact import paths and constructor arguments.

HNSW Index Tuning

ChromaDB uses HNSW internally. For most use cases the defaults are fine. For large collections or when you need to optimise recall vs speed, tune the HNSW parameters at collection creation:

When to Tune HNSW

The defaults work well for collections under 100,000 documents. If queries return obviously poor results (low recall), increase hnsw:search_ef first — it costs query time but improves result quality. Increase hnsw:construction_ef only if recall is still poor even with a high search_ef; it affects how the index is built, so it only applies to vectors indexed after the change.


Key Takeaways

  • ChromaDB runs entirely in-process — no server, no Docker, no cloud account needed for development
  • Use PersistentClient for development and small production deployments; HttpClient when multiple services share one database
  • The embedding function must stay consistent — adding and querying must use the same model
  • Use cosine distance for all text-based collections
  • Combine vector search with metadata filters using the where parameter for scoped queries
  • Use upsert() rather than add() in production ingestion pipelines to handle re-runs safely
  • HNSW parameters can be tuned for recall vs speed trade-offs on large collections

What's Next in the Vector Database Series

This post is part of the Vector Database Series. Previous post: What is a Vector Database? The Complete Beginner's Guide.

If you want to apply ChromaDB in a full retrieval-augmented generation pipeline, see Claude RAG: Retrieval-Augmented Generation Guide for a step-by-step walkthrough using the Claude API.

For production deployments with ChromaDB, the official ChromaDB documentation covers Docker deployment, authentication, and the HTTP client in detail. The ChromaDB GitHub repository is also worth watching for releases and breaking changes.
