What it's used for

Weaviate is an open-source vector database whose distinguishing feature is its built-in vectorization modules, which can generate embeddings automatically when you insert or query data. Instead of managing a separate embedding pipeline, you send raw text (or images, or audio) to Weaviate and it handles the embedding step using a configurable provider (OpenAI, Cohere, Hugging Face, or a local model).

- Auto-vectorization: configure a vectorizer module (e.g., text2vec-openai) and Weaviate generates embeddings automatically on insert and query
- Hybrid search: combine dense vector search with BM25 keyword search in a single query, which often retrieves better results than either method alone
- Generative search: run RAG queries natively in Weaviate by combining retrieval with LLM generation in a single API call
- Multi-modal: store and search across text, images, and other modalities using appropriate vectorizer modules
- GraphQL API: query your data using a rich GraphQL schema with filtering, aggregation, and pagination
- Multi-tenancy: native support for tenant isolation, suited to SaaS applications where each customer's data must be kept separate
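
As a rough illustration of the generative search item, the sketch below wraps a single retrieve-and-generate call with the v4 Python client. It assumes a collection handle whose collection was created with a generative module (as in the Getting started section); the helper name `answer_from_documents` and the prompt wording are hypothetical, not part of Weaviate.

```python
def answer_from_documents(collection, question, limit=3):
    """Retrieve the closest objects and ask the configured LLM for one answer.

    `collection` is assumed to be a Weaviate v4 collection handle with a
    generative module enabled. The helper name and prompt are illustrative.
    """
    response = collection.generate.near_text(
        query=question,   # text used for the vector search
        limit=limit,      # number of objects passed to the LLM
        grouped_task=f"Answer the question using only these documents: {question}",
    )
    # `generated` holds the answer produced over the whole result set;
    # `objects` holds the retrieved objects that were used as context.
    sources = [obj.properties for obj in response.objects]
    return response.generated, sources


# Usage (requires a running Weaviate instance and a valid provider API key):
# answer, sources = answer_from_documents(collection, "What is hybrid search?")
```

Returning the source properties alongside the answer makes it easier to show citations or to debug what the model actually saw.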

Full-stack developers, AI engineers, and teams building RAG applications choose Weaviate when they want a vector database that does more than just store vectors. The built-in vectorization and generative modules reduce the amount of code you need to write and maintain for a complete AI search pipeline.

Weaviate is available as open-source (self-hosted via Docker or Kubernetes), through Weaviate Cloud (managed service), or as an embedded library for development. This flexibility makes it popular with teams that want to start locally and scale to production without switching databases.

Getting started

Run locally with Docker:

```
docker run -d -p 8080:8080 -p 50051:50051 \
  -e ENABLE_MODULES='text2vec-openai,generative-openai' \
  -e OPENAI_APIKEY='your-key' \
  cr.weaviate.io/semitechnologies/weaviate:latest
```

Or use Weaviate Cloud: create a managed cluster at console.weaviate.cloud (free sandbox available).

Install the Python client:
```
pip install weaviate-client
```

Connect and create a collection:

```
import weaviate
import weaviate.classes as wvc

client = weaviate.connect_to_local()  # or connect_to_weaviate_cloud()

collection = client.collections.create(
    name='Documents',
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(),
    generative_config=wvc.config.Configure.Generative.openai()
)
```

Insert data (embeddings generated automatically):

```
collection.data.insert({'title': 'My doc', 'content': 'Some text...'})
```
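
For more than a handful of objects, one insert call per object becomes slow. A minimal sketch of batched insertion with the v4 client's dynamic batching follows; the helper name `insert_documents` is hypothetical, and `collection` is assumed to be a handle like the one created above.

```python
def insert_documents(collection, documents):
    """Queue each document dict for insertion using dynamic batching.

    `collection` is assumed to be a Weaviate v4 collection handle.
    Returns the objects that failed to import during this batch.
    """
    with collection.batch.dynamic() as batch:
        for doc in documents:
            # Embeddings are still generated server-side by the vectorizer.
            batch.add_object(properties=doc)
    # Leaving the `with` block sends anything still queued.
    return collection.batch.failed_objects


# Usage (requires a running Weaviate instance):
# failed = insert_documents(collection, [
#     {"title": "Doc 1", "content": "First text..."},
#     {"title": "Doc 2", "content": "Second text..."},
# ])
# if failed:
#     print(f"{len(failed)} objects failed to import")
```

Checking the returned failures matters because a batch does not raise an exception for individual objects that could not be imported.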

Search:

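```
results = collection.query.hybrid(
    query='machine learning basics',
    limit=5
)
for obj in results.objects:
    print(obj.properties)  # the stored fields of each match
client.close()  # release the connection when finished
```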
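
To build intuition for what a hybrid query balances, here is a small score-fusion sketch in plain Python. It is not Weaviate's internal implementation: the function names `min_max_normalize` and `fuse_scores` and the sample scores are made up for illustration. It assumes an `alpha` weight where 1.0 means pure vector search and 0.0 means pure keyword search.

```python
def min_max_normalize(scores):
    """Scale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every hit as equally strong.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse_scores(vector_scores, keyword_scores, alpha=0.5):
    """Blend normalized vector and keyword scores into one ranking."""
    v = min_max_normalize(vector_scores)
    k = min_max_normalize(keyword_scores)
    fused = {
        # A document missing from one result list contributes 0 from that side.
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * k.get(doc_id, 0.0)
        for doc_id in set(v) | set(k)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores: higher means more relevant in both inputs.
vector_scores = {"doc-a": 0.91, "doc-b": 0.85, "doc-c": 0.40}
keyword_scores = {"doc-b": 7.2, "doc-c": 6.8, "doc-d": 1.1}

ranked = fuse_scores(vector_scores, keyword_scores, alpha=0.5)
for doc_id, score in ranked:
    print(f"{doc_id}: {score:.3f}")
```

In the client, the same trade-off is exposed through the `alpha` argument of `collection.query.hybrid`, for example `alpha=0.75` to weight the vector side more heavily.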

Pricing: The open-source version is free to self-host. Weaviate Cloud Sandbox is free (14-day clusters for testing). Serverless starts at $25/month. Enterprise has custom pricing with SLAs and dedicated infrastructure.

Tip: For quick prototyping, try Embedded Weaviate, an experimental mode that runs Weaviate as an embedded process started from the client, with no Docker required. For production, consider hybrid search (vector plus keyword), which often improves RAG retrieval quality over pure vector search, particularly for queries that contain exact terms such as product names or identifiers.
