Qdrant

High-performance vector search

Data Infrastructure

What it's used for

Qdrant is a high-performance, open-source vector search engine written in Rust, designed for production workloads that demand fast filtering, efficient memory usage, and reliable performance under load. It stands out from other vector databases with its advanced payload filtering that executes during the search itself (not as a post-filter), ensuring accurate results even with complex filter conditions.

  • Advanced filtering — filter by metadata fields (geo, range, keyword, nested) during vector search, not after, so a restrictive filter does not cut the result set below the requested limit as long as enough matching points exist
  • Quantization — reduce vector memory usage by roughly 4x (scalar) and up to 32x (binary), with product quantization as a further option, usually at a small cost in search accuracy
  • Sparse vectors — store and search sparse vectors (BM25, SPLADE) alongside dense vectors for hybrid retrieval
  • Multi-vector support — store multiple named vectors per point (e.g., title embedding + body embedding) with independent search
  • Payload indexing — create indexes on payload fields for fast filtered search across millions of points
  • Snapshot & replication — built-in snapshotting, replication, and sharding for production reliability
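
To illustrate why filtering during the search matters, here is a minimal pure-Python sketch. It is a brute-force toy and not how Qdrant works internally; the data, the function names, and the `category` field are all hypothetical. It only shows how post-filtering a fixed top-k list can return fewer results than requested, while filtering the candidates first does not.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy data: (id, vector, payload)
POINTS = [
    (1, [1.0, 0.0], {"category": "api"}),
    (2, [0.9, 0.1], {"category": "api"}),
    (3, [0.8, 0.2], {"category": "api"}),
    (4, [0.1, 0.9], {"category": "ml"}),
    (5, [0.0, 1.0], {"category": "ml"}),
]

def post_filter_search(query, category, limit):
    """Take the top `limit` points by similarity, then drop non-matching ones."""
    ranked = sorted(POINTS, key=lambda p: cosine(query, p[1]), reverse=True)
    return [p[0] for p in ranked[:limit] if p[2]["category"] == category]

def filtered_search(query, category, limit):
    """Apply the payload condition first, then rank only the matching points."""
    candidates = [p for p in POINTS if p[2]["category"] == category]
    ranked = sorted(candidates, key=lambda p: cosine(query, p[1]), reverse=True)
    return [p[0] for p in ranked[:limit]]

query = [1.0, 0.0]
print(post_filter_search(query, "ml", limit=2))  # the nearest two are both "api"
print(filtered_search(query, "ml", limit=2))     # ranks only the "ml" points
```

With this toy data the post-filtered call returns an empty list, because both of the nearest points belong to another category, while the filtered call returns the two matching points.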

Backend engineers and ML teams building production RAG systems choose Qdrant when they need fast filtered search at scale — particularly in multi-tenant SaaS applications where every query must be scoped to a specific user, organization, or permission level. Qdrant's Rust-based engine delivers consistent low-latency performance.

Qdrant's combination of open-source self-hosting, a managed cloud, and advanced features like multi-vector search and built-in quantization makes it a strong choice for teams that need production-grade vector search with full control over their infrastructure.

Getting started

  1. Run locally with Docker:
    docker pull qdrant/qdrant
    docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
  2. Or use Qdrant Cloud — create a managed cluster at cloud.qdrant.io (free tier: 1GB cluster).
  3. Install the Python client:
    pip install qdrant-client
  4. Create a collection and upsert vectors:
    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, VectorParams, PointStruct
    
    client = QdrantClient('localhost', port=6333)
    
    client.create_collection(
        collection_name='documents',
        vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
    )
    
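    # The vectors below are truncated placeholders ('...'); real vectors
    # must contain exactly 1536 floats to match the collection's size.
    client.upsert(
        collection_name='documents',
        points=[
            PointStruct(id=1, vector=[0.1, 0.2, ...], payload={'source': 'docs', 'category': 'ml'}),
            PointStruct(id=2, vector=[0.3, 0.4, ...], payload={'source': 'faq', 'category': 'api'})
        ]
    )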
  5. Search with filtering:
    from qdrant_client.models import Filter, FieldCondition, MatchValue
    
    # The query vector is a truncated placeholder and must also have 1536 floats.
    results = client.query_points(
        collection_name='documents',
        query=[0.1, 0.2, ...],
        query_filter=Filter(
            must=[FieldCondition(key='category', match=MatchValue(value='ml'))]
        ),
        limit=5
    )
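
To read the search results, note that in recent qdrant-client versions `query_points` returns a response object whose `points` attribute is a list of scored points, each with `id`, `score`, and `payload`. The sketch below assumes that shape. The helper name `summarize_hits` is hypothetical, and the response is a hand-built stand-in so the example runs without a Qdrant server.

```python
from types import SimpleNamespace

def summarize_hits(response):
    """Return (id, rounded score, payload) tuples from a query_points-style response."""
    return [(p.id, round(p.score, 3), p.payload) for p in response.points]

# Stand-in for the object returned by client.query_points(...); assumption:
# the real response exposes the same .points / .id / .score / .payload fields.
fake_response = SimpleNamespace(
    points=[
        SimpleNamespace(id=1, score=0.91234, payload={"source": "docs", "category": "ml"}),
    ]
)

for point_id, score, payload in summarize_hits(fake_response):
    print(point_id, score, payload)
```

Against a live cluster, the same helper would be called as `summarize_hits(results)` with the `results` value from step 5.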

Pricing: Open-source: free, self-hosted. Qdrant Cloud Free: 1GB cluster at no cost. Cloud: from ~$25/month for small clusters. Enterprise: dedicated clusters with SLAs and premium support.

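Tip: Enable scalar quantization when creating the collection (for example, quantization_config=ScalarQuantization(scalar=ScalarQuantizationConfig(type=ScalarType.INT8)), with all three names imported from qdrant_client.models) to reduce vector memory usage by ~4x, typically with minimal impact on search quality. This is especially valuable when running on cost-sensitive infrastructure. For multi-tenant applications, create a payload index on your tenant ID field so that filtered queries stay consistently fast.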
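
The ~4x figure in the tip follows from storing each vector component in one byte instead of a four-byte float32. The sketch below is a simplified min/max int8 quantizer written for illustration only; the function names are hypothetical, and Qdrant's actual scheme may differ in details such as how the value range is chosen.

```python
def quantize_int8(vector):
    """Map each float to a one-byte code (0..255) over the vector's min/max range."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero scale for constant vectors
    codes = bytes(min(255, max(0, round((x - lo) / scale))) for x in vector)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover approximate floats from the one-byte codes."""
    return [lo + c * scale for c in codes]

DIM = 1536                 # vector size used in the collection above
float32_bytes = DIM * 4    # 4 bytes per component
int8_bytes = DIM * 1       # 1 byte per component
print(float32_bytes, int8_bytes, float32_bytes / int8_bytes)

codes, lo, scale = quantize_int8([0.0, 0.25, 1.0])
restored = dequantize(codes, lo, scale)
print(restored)  # close to the input, within about half a quantization step
```

For a 1536-dimensional vector this works out to 6144 bytes as float32 versus 1536 bytes as one-byte codes, which is where the 4x reduction comes from. The rounding error per component is bounded by roughly half of `scale`.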
