What it's used for

Qdrant is a high-performance, open-source vector search engine written in Rust, designed for production workloads that demand fast filtering, efficient memory usage, and reliable performance under load. It stands out from other vector databases with its advanced payload filtering that executes during the search itself (not as a post-filter), ensuring accurate results even with complex filter conditions.

Advanced filtering: filter by metadata fields (geo, range, keyword, nested) during vector search rather than after it, so a restrictive filter does not cut the result set below the requested limit when enough points match
Quantization: reduce vector memory usage by roughly 4-32x with scalar, product, and binary quantization, typically with little loss in search quality
Sparse vectors: store and search sparse vectors (BM25, SPLADE) alongside dense vectors for hybrid retrieval
Multi-vector support: store multiple named vectors per point (e.g., title embedding + body embedding) and search each one independently
Payload indexing: create indexes on payload fields for fast filtered search across millions of points
Snapshot & replication: built-in snapshotting, replication, and sharding for production reliability
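To make the filtering point concrete, here is a minimal pure-Python sketch of why filtering after the search can return fewer results than requested. It uses brute-force cosine similarity over a toy dataset and does not reproduce Qdrant's actual index or filtering internals; the data and the function names (`post_filter_search`, `filtered_search`) are hypothetical and exist only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy dataset (illustrative only): (id, vector, payload)
POINTS = [
    (1, [1.0, 0.0], {"category": "api"}),
    (2, [0.9, 0.1], {"category": "api"}),
    (3, [0.8, 0.2], {"category": "api"}),
    (4, [0.5, 0.5], {"category": "ml"}),
    (5, [0.1, 0.9], {"category": "ml"}),
]

def post_filter_search(query, category, limit):
    """Take the top `limit` by similarity first, then drop non-matching points."""
    ranked = sorted(POINTS, key=lambda p: cosine(query, p[1]), reverse=True)
    return [p[0] for p in ranked[:limit] if p[2]["category"] == category]

def filtered_search(query, category, limit):
    """Apply the payload condition as part of the search, then take the top `limit`."""
    candidates = [p for p in POINTS if p[2]["category"] == category]
    ranked = sorted(candidates, key=lambda p: cosine(query, p[1]), reverse=True)
    return [p[0] for p in ranked[:limit]]

query = [1.0, 0.0]
print(post_filter_search(query, "ml", limit=2))  # [] (both top hits were "api")
print(filtered_search(query, "ml", limit=2))     # [4, 5]
```

In the post-filter variant the two nearest points are both in the `api` category, so nothing survives the filter even though two `ml` points exist. Applying the condition during the search avoids that shortfall.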

Backend engineers and ML teams building production RAG systems choose Qdrant when they need fast filtered search at scale — particularly in multi-tenant SaaS applications where every query must be scoped to a specific user, organization, or permission level. Qdrant's Rust-based engine delivers consistent low-latency performance.

Qdrant's combination of open-source self-hosting, a managed cloud, and advanced features like multi-vector search and built-in quantization makes it a strong choice for teams that need production-grade vector search with full control over their infrastructure.

Getting started

Run locally with Docker:

```
docker pull qdrant/qdrant
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
```

Or use Qdrant Cloud: create a managed cluster at cloud.qdrant.io (free tier: 1GB cluster).

Install the Python client:
```
pip install qdrant-client
```

Create a collection and upsert vectors:

```
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

# Connect to the local instance started with Docker (REST API on port 6333)
client = QdrantClient('localhost', port=6333)

# The vector size must match the output dimension of your embedding model
client.create_collection(
    collection_name='documents',
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
)

# The "..." below is a placeholder: each vector needs all 1536 values
client.upsert(
    collection_name='documents',
    points=[
        PointStruct(id=1, vector=[0.1, 0.2, ...], payload={'source': 'docs', 'category': 'ml'}),
        PointStruct(id=2, vector=[0.3, 0.4, ...], payload={'source': 'faq', 'category': 'api'})
    ]
)
```

Search with filtering:

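```
from qdrant_client.models import Filter, FieldCondition, MatchValue

# The filter is applied during the search; only points with category == 'ml' are considered
results = client.query_points(
    collection_name='documents',
    query=[0.1, 0.2, ...],  # placeholder: pass a full 1536-dimensional query embedding
    query_filter=Filter(
        must=[FieldCondition(key='category', match=MatchValue(value='ml'))]
    ),
    limit=5
)

# Each hit carries its id, similarity score, and stored payload
for point in results.points:
    print(point.id, point.score, point.payload)
```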
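For the multi-tenant case described earlier, the same kind of filter can combine a tenant condition with the category condition. The sketch below builds the equivalent JSON request body by hand, assuming it would be sent to the REST query endpoint (`POST /collections/documents/points/query` on port 6333; check the API reference for your server version). The `tenant_id` payload field, the `build_query_body` helper, and the four-dimensional vector are hypothetical, chosen only to keep the example short.

```python
import json

def build_query_body(vector, tenant_id, category, limit=5):
    """Build a query body whose filter requires BOTH conditions to match.

    `tenant_id` is a hypothetical payload field; use whatever field
    your application stores the tenant identifier in.
    """
    return {
        "query": vector,
        "filter": {
            "must": [
                {"key": "tenant_id", "match": {"value": tenant_id}},
                {"key": "category", "match": {"value": category}},
            ]
        },
        "limit": limit,
        "with_payload": True,
    }

# Illustrative 4-dimensional vector; a real query must match the collection's vector size
body = build_query_body([0.1, 0.2, 0.3, 0.4], tenant_id="acme", category="ml")
print(json.dumps(body, indent=2))
```

Building the tenant condition in one helper, rather than at each call site, makes it harder to issue a query that is accidentally unscoped.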

Pricing: Open-source: free, self-hosted. Qdrant Cloud Free: 1GB cluster at no cost. Cloud: from ~$25/month for small clusters. Enterprise: dedicated clusters with SLAs and premium support.

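Tip: Enable scalar quantization by passing quantization_config=ScalarQuantization(scalar=ScalarQuantizationConfig(type=ScalarType.INT8)) when creating the collection. Storing each vector component as an 8-bit integer instead of a 32-bit float reduces vector memory usage by ~4x, usually with minimal impact on search quality, which is especially valuable on cost-sensitive infrastructure. For multi-tenant applications, create a payload index on your tenant ID field (client.create_payload_index) so that filtered queries stay consistently fast.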
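The ~4x figure follows from simple arithmetic: a 32-bit float takes 4 bytes per component and an 8-bit code takes 1. The sketch below illustrates the idea with a plain min/max linear mapping in pure Python. It is not Qdrant's implementation (which, among other things, chooses its own bounds and encoding), and the helper names and the synthetic vector are hypothetical.

```python
from array import array

def quantize_8bit(vector):
    """Map floats linearly onto 0..255 using the vector's own min and max."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for a constant vector
    codes = array("B", (round((x - lo) / scale) for x in vector))
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Approximate reconstruction of the original floats."""
    return [lo + c * scale for c in codes]

# Synthetic 1536-dimensional vector with values in [-1.0, 0.99]
vector = [((i * 37) % 200 - 100) / 100 for i in range(1536)]

original = array("f", vector)  # 4 bytes per component
codes, lo, scale = quantize_8bit(vector)  # 1 byte per component

ratio = (len(original) * original.itemsize) / (len(codes) * codes.itemsize)
max_error = max(abs(a - b) for a, b in zip(vector, dequantize(codes, lo, scale)))

print(ratio)  # 4.0
print(max_error <= scale / 2 + 1e-9)  # True: error is bounded by half a quantization step
```

The saving applies to the stored vectors only; payloads and index structures are not reduced by this mapping.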
