What are Vector Databases?
Vector databases store and search embeddings (high-dimensional vectors). Unlike traditional databases that search for exact matches, vector DBs find semantically similar content.
| Traditional Database | Vector Database |
|---|---|
| Exact keyword matching | Semantic similarity |
| "dog" ≠ "puppy" | "dog" ≈ "puppy" (similar meaning) |
| SQL queries | Vector similarity (cosine, Euclidean) |
| Fast for structured data | Fast for unstructured data (text, images) |
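Under the hood, "similar meaning" is just a distance computation between embedding vectors. A minimal sketch of cosine similarity, using the same all-MiniLM-L6-v2 model that appears in the examples below:

```python
# Minimal sketch: cosine similarity between sentence embeddings
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer('all-MiniLM-L6-v2')
vec_dog, vec_puppy, vec_pizza = model.encode(["dog", "puppy", "pizza"])

def cosine_similarity(a, b):
    # 1.0 = same direction (very similar), near 0 = unrelated
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(vec_dog, vec_puppy))   # relatively high
print(cosine_similarity(vec_dog, vec_pizza))   # noticeably lower
```

Vector databases run this kind of comparison at scale, using approximate indexes so they don't have to score every stored vector.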
🎯 Use Cases for RAG & AI
📚 Document Q&A
Store document embeddings, retrieve relevant chunks for LLM context
🔍 Semantic Search
Find similar products, articles, or content by meaning
💬 Chatbot Memory
Store conversation history, retrieve relevant context
🎨 Image Search
Search images by description or visual similarity
🛒 Recommendations
Find similar items based on embeddings
🔬 Anomaly Detection
Find outliers by distance in vector space (see the sketch below)
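The anomaly-detection idea is simple enough to sketch directly: embed "normal" examples, then flag anything that lands far from them. The texts and threshold below are illustrative assumptions, not values from this guide:

```python
# Hypothetical sketch: flag outliers by distance from the centroid of normal examples
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
normal = model.encode(["login succeeded", "user logged in", "session started"])
candidate = model.encode(["disk failure on node 7"])[0]

centroid = normal.mean(axis=0)
distance = np.linalg.norm(candidate - centroid)
print("outlier" if distance > 1.0 else "normal")  # threshold chosen for illustration
```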
🏆 Popular Vector Databases
Pinecone (Managed, Cloud-native)
```python
import pinecone
from sentence_transformers import SentenceTransformer

# Initialize
pinecone.init(api_key="your-key", environment="us-west1-gcp")

# Create index
pinecone.create_index("my-index", dimension=384, metric="cosine")
index = pinecone.Index("my-index")

# Embed and insert
model = SentenceTransformer('all-MiniLM-L6-v2')
texts = ["AI is amazing", "Machine learning rocks", "Pizza is delicious"]
embeddings = model.encode(texts)

# Upsert vectors
vectors = [
    (f"id-{i}", embedding.tolist(), {"text": text})
    for i, (embedding, text) in enumerate(zip(embeddings, texts))
]
index.upsert(vectors=vectors)

# Query
query_embedding = model.encode(["What is AI?"])[0]
results = index.query(vector=query_embedding.tolist(), top_k=2, include_metadata=True)
for match in results["matches"]:
    print(f"Score: {match['score']:.4f} - {match['metadata']['text']}")
```
- ✅ Fully managed, no infrastructure
- ✅ Auto-scaling, high performance
- ✅ Free tier: 1 index, 100K vectors
- ⚠️ Paid for production use
Chroma (Open-source, Local/Cloud)
```python
import chromadb
from chromadb.utils import embedding_functions

# Initialize
client = chromadb.Client()  # In-memory
# Or persistent:
# client = chromadb.PersistentClient(path="./chroma_db")

# Create collection
collection = client.create_collection(
    name="my_collection",
    embedding_function=embedding_functions.SentenceTransformerEmbeddingFunction(
        model_name="all-MiniLM-L6-v2"
    )
)

# Add documents (embeddings created automatically)
collection.add(
    documents=["AI is amazing", "Machine learning rocks", "Pizza is delicious"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}, {"source": "doc3"}],
    ids=["id1", "id2", "id3"]
)

# Query
results = collection.query(
    query_texts=["What is AI?"],
    n_results=2
)
print(results["documents"])  # Most similar documents
print(results["distances"])  # Distances (lower = more similar)
```
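If you use the persistent client shown in the comment above, the collection survives process restarts. A small sketch of reopening it in a later session (assuming the same ./chroma_db path and collection name):

```python
# Sketch: reopen a persisted collection in a new process
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection(
    name="my_collection",
    embedding_function=embedding_functions.SentenceTransformerEmbeddingFunction(
        model_name="all-MiniLM-L6-v2"
    )
)
print(collection.count())  # previously added documents are still there
```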
- ✅ Open-source, free
- ✅ Easy to get started
- ✅ Great for prototyping
- ⚠️ Basic scalability
Weaviate (Open-source, GraphQL)
```python
import weaviate

# Connect
client = weaviate.Client("http://localhost:8080")

# Create schema
schema = {
    "classes": [{
        "class": "Document",
        "vectorizer": "text2vec-openai",
        "properties": [
            {"name": "content", "dataType": ["text"]},
            {"name": "source", "dataType": ["string"]}
        ]
    }]
}
client.schema.create(schema)

# Insert data
client.data_object.create(
    data_object={"content": "AI is amazing", "source": "doc1"},
    class_name="Document"
)

# Query with GraphQL
result = client.query.get("Document", ["content", "source"]) \
    .with_near_text({"concepts": ["artificial intelligence"]}) \
    .with_limit(2) \
    .do()
print(result)
```
- ✅ Open-source with managed cloud option
- ✅ GraphQL API
- ✅ Built-in vectorization
- ✅ Hybrid search (vector + keyword, see the sketch below)
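Hybrid search blends BM25 keyword scoring with vector similarity in a single query. A sketch using the same v3 Python client as above; the alpha value, which weights vector versus keyword scores, is an illustrative choice:

```python
# Sketch: hybrid search with the v3 client; alpha=0.5 weights vector and keyword equally
result = client.query.get("Document", ["content", "source"]) \
    .with_hybrid(query="artificial intelligence", alpha=0.5) \
    .with_limit(2) \
    .do()
print(result)
```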
Qdrant (Open-source, Rust-based)
```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
from sentence_transformers import SentenceTransformer

# Initialize
client = QdrantClient(":memory:")  # In-memory
# Or: client = QdrantClient(path="./qdrant_db")

# Create collection
client.create_collection(
    collection_name="my_collection",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE)
)

# Insert vectors
model = SentenceTransformer('all-MiniLM-L6-v2')
texts = ["AI is amazing", "Machine learning rocks"]
embeddings = model.encode(texts)
points = [
    PointStruct(
        id=i,
        vector=embedding.tolist(),
        payload={"text": text}
    )
    for i, (embedding, text) in enumerate(zip(embeddings, texts))
]
client.upsert(collection_name="my_collection", points=points)

# Search
query_embedding = model.encode(["What is AI?"])[0]
results = client.search(
    collection_name="my_collection",
    query_vector=query_embedding.tolist(),
    limit=2
)
for result in results:
    print(f"Score: {result.score:.4f} - {result.payload['text']}")
```
- ✅ High performance (Rust)
- ✅ Rich filtering capabilities (see the sketch below)
- ✅ Open-source with cloud option
- ✅ Easy Docker deployment
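Qdrant's filters can be applied inside the vector search itself, so only points whose payload matches are scored. A minimal sketch continuing the example above (the payload key and value are the ones inserted earlier):

```python
# Sketch: restrict the search to points whose payload matches a condition
from qdrant_client.models import Filter, FieldCondition, MatchValue

results = client.search(
    collection_name="my_collection",
    query_vector=query_embedding.tolist(),
    query_filter=Filter(
        must=[FieldCondition(key="text", match=MatchValue(value="AI is amazing"))]
    ),
    limit=2
)
```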
🔧 Choosing the Right Vector DB
| Database | Best For | Pricing | Deployment |
|---|---|---|---|
| Pinecone | Production, no devops | Free tier → $70/mo | Managed only |
| Chroma | Prototyping, RAG apps | Free (open-source) | Local, self-hosted |
| Weaviate | Hybrid search, GraphQL | Free → Enterprise | Self-hosted or cloud |
| Qdrant | High performance, filtering | Free → Enterprise | Self-hosted or cloud |
| Milvus | Large scale (billions) | Free (open-source) | Self-hosted, complex |
🚀 Integration with LangChain
```python
from langchain.vectorstores import Chroma, Pinecone, Weaviate
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader

# Load documents
loader = TextLoader("document.txt")
documents = loader.load()

# Split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)

# Create embeddings
embeddings = OpenAIEmbeddings()

# Store in vector DB (choose one)
# Chroma
vectorstore = Chroma.from_documents(chunks, embeddings, persist_directory="./chroma_db")

# Pinecone
# import pinecone
# pinecone.init(api_key="...", environment="...")
# vectorstore = Pinecone.from_documents(chunks, embeddings, index_name="my-index")

# Weaviate
# vectorstore = Weaviate.from_documents(chunks, embeddings, weaviate_url="http://localhost:8080")

# Query
results = vectorstore.similarity_search("What is the main topic?", k=3)
for doc in results:
    print(doc.page_content)
```
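Once the documents are indexed, the store plugs straight into a retrieval chain. A minimal sketch using the same legacy LangChain API as above (requires an OpenAI API key; the retriever's k value is an illustrative choice):

```python
# Sketch: turn the vector store into a retriever and wire it to an LLM
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
qa_chain = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=retriever)
print(qa_chain.run("What is the main topic?"))
```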
⚡ Performance Optimization
1. Choose the Right Embedding Model
```python
# Small & Fast (384 dimensions)
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')  # ~80MB

# Better Quality (768 dimensions)
model = SentenceTransformer('all-mpnet-base-v2')  # ~420MB

# Best Quality (3072 dimensions)
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")  # API call
```
2. Indexing Strategy
- HNSW: Fast approximate search (most DBs use this)
- IVF: Good for very large datasets
- Flat: Exact search, slower but 100% accurate (all three are compared in the sketch below)
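The trade-off is easiest to see in a library like FAISS, which exposes all three index types directly (used here purely for illustration; the databases above choose and tune these indexes for you):

```python
# Illustrative sketch: the same 384-dim vectors indexed three different ways with FAISS
import faiss
import numpy as np

dim = 384
vectors = np.random.rand(10_000, dim).astype("float32")

flat = faiss.IndexFlatL2(dim)        # exact brute-force search
hnsw = faiss.IndexHNSWFlat(dim, 32)  # approximate graph index (M=32 neighbors per node)

quantizer = faiss.IndexFlatL2(dim)
ivf = faiss.IndexIVFFlat(quantizer, dim, 100)  # 100 inverted-list clusters
ivf.train(vectors)                   # IVF needs a training pass before adding

for index in (flat, hnsw, ivf):
    index.add(vectors)
```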
3. Metadata Filtering
```python
# Filter by metadata before the similarity search to narrow the candidate set
# (filter syntax varies by vector store; this Mongo-style form follows Pinecone's operators)
results = vectorstore.similarity_search(
    "AI safety",
    k=5,
    filter={"source": "research_papers", "year": {"$gte": 2023}}
)
```
4. Batch Operations
```python
# Insert in batches for better performance
batch_size = 100
for i in range(0, len(documents), batch_size):
    batch = documents[i:i+batch_size]
    vectorstore.add_documents(batch)
```
🎯 Key Takeaways
- Vector DBs enable semantic search for AI applications
- Essential for RAG - store and retrieve relevant context
- Chroma great for prototyping, Pinecone for production
- Embedding quality matters more than the database choice
- LangChain integration makes it easy to get started