What it's used for

Langfuse is an open-source LLM observability and tracing platform that captures every prompt, completion, latency, cost, and error across your LLM application. It gives you full visibility into what your AI application is doing in production.

Key use cases include:

Tracing — capture the full execution trace of every LLM call, tool invocation, and retrieval step
Cost monitoring — track token usage and spend across providers and models
Quality evaluation — run LLM-as-judge evaluations and human annotation workflows
Prompt management — version and deploy prompts with A/B testing
Dataset management — build evaluation datasets from production traces
Debugging — quickly identify why specific requests failed or produced poor results

Langfuse is used by engineering teams running LLM applications in production who need to monitor quality, control costs, and debug issues. It integrates with LangChain, LlamaIndex, OpenAI, and other popular frameworks through decorators, callbacks, or the REST API.
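For setups where no SDK integration fits, the REST API route can be sketched as below. This is a minimal illustration, not the official client: it assumes the public API accepts HTTP Basic Auth with the public key as username and the secret key as password, and that traces are listed under `/api/public/traces`. The helper name `build_traces_request` is hypothetical. The request is only built, not sent, so the sketch runs without credentials.

```python
import base64
import os
import urllib.request


def build_traces_request(host, public_key, secret_key, limit=10):
    """Build (but do not send) an authenticated GET request for recent traces."""
    # Assumption: Basic Auth, public key as username and secret key as password.
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    # Assumption: traces are listed at /api/public/traces with a `limit` parameter.
    url = f"{host.rstrip('/')}/api/public/traces?limit={limit}"
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


request = build_traces_request(
    os.environ.get("LANGFUSE_HOST", "https://cloud.langfuse.com"),
    os.environ.get("LANGFUSE_PUBLIC_KEY", "pk-..."),
    os.environ.get("LANGFUSE_SECRET_KEY", "sk-..."),
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the call. Check the current API reference before relying on the path or parameters.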

Available as Langfuse Cloud (hosted) or self-hosted via Docker for full data control.

Getting started

Sign up at cloud.langfuse.com or self-host with Docker.
Install the SDK:
```
pip install langfuse
```

Set your keys:

```
export LANGFUSE_PUBLIC_KEY='pk-...'
export LANGFUSE_SECRET_KEY='sk-...'
export LANGFUSE_HOST='https://cloud.langfuse.com'
```
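Before instrumenting anything, it can help to confirm that these variables are actually visible to your Python process. The following is a small sketch using only the standard library. The helper name `missing_langfuse_vars` is hypothetical, and treating all three variables as expected is an assumption, since the SDK may fall back to a default host.

```python
import os

# The three variables exported above; treating the host as expected is an assumption.
EXPECTED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_HOST")


def missing_langfuse_vars(env=None):
    """Return the names of expected variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]


missing = missing_langfuse_vars()
if missing:
    print("Missing Langfuse settings:", ", ".join(missing))
else:
    print("All Langfuse settings are present.")
```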

Add tracing with the @observe() decorator:

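```
# Import path for the v2 Python SDK; newer SDK versions may expose
# the decorator as `from langfuse import observe` instead.
from langfuse.decorators import observe

@observe()
def my_llm_function(query):
    # Placeholder: replace this line with your actual LLM call
    response = f"Echo: {query}"
    return response
```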
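A common next step is decorating several functions that call each other, so that a retrieval step and the final answer appear as nested observations within one trace. The sketch below illustrates that pattern under a few assumptions: the function names `retrieve` and `answer` are hypothetical, the bodies are placeholders rather than real model calls, and a no-op stand-in decorator is defined if the import fails so the example still runs without the SDK installed.

```python
try:
    from langfuse.decorators import observe  # same import path as the snippet above
except ImportError:
    # Stand-in so this sketch runs without the SDK: a decorator that does nothing.
    def observe(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@observe()
def retrieve(query):
    # Placeholder for a retrieval step (for example, a vector store lookup).
    return [f"document about {query}"]


@observe()
def answer(query):
    # Calling one decorated function from another is what produces nesting.
    docs = retrieve(query)
    return f"Answer to {query!r} based on {len(docs)} document(s)"


print(answer("pricing"))
```

With the SDK installed and keys configured, each call to `answer` should show up in the dashboard with the `retrieve` call nested beneath it.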

View traces and analytics in the Langfuse dashboard.

Pricing: Free tier with 50K observations/month. Pro plan at $59/month. Self-hosting is free (MIT license). See langfuse.com/pricing.

Case studies

Real Langfuse projects

Submitted by verified specialists


Token Budget Regression Found — $2.3k/Month Saved

Series B AI startup, 6 production LLM pipelines

Challenge

An AI startup's inference costs grew 40% in a month with no engineering changes. The culprit was hidden somewhere in 6 interconnected LLM pipelines — impossible to debug without request-level tracing.

Solution

Instrumented all 6 pipelines with Langfuse traces, capturing prompt/completion tokens, latency, model version, and user context per request. Built cost dashboards by pipeline, model, and user segment.
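The per-pipeline cost breakdown described here can be approximated offline once token counts are captured per request. The sketch below is illustrative only: the records, the pipeline names, and the prices in `PRICE_PER_1K` are hypothetical values, not data from this case study, and `cost_by_pipeline` is not a Langfuse function.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens; real prices depend on the model.
PRICE_PER_1K = {"prompt": 0.01, "completion": 0.03}

# Hypothetical per-request records, shaped like fields captured in a trace.
records = [
    {"pipeline": "summarization", "prompt_tokens": 3000, "completion_tokens": 500},
    {"pipeline": "summarization", "prompt_tokens": 3200, "completion_tokens": 450},
    {"pipeline": "search", "prompt_tokens": 800, "completion_tokens": 200},
]


def cost_by_pipeline(rows):
    """Sum the estimated cost of each request, grouped by pipeline name."""
    totals = defaultdict(float)
    for row in rows:
        cost = (
            row["prompt_tokens"] * PRICE_PER_1K["prompt"]
            + row["completion_tokens"] * PRICE_PER_1K["completion"]
        ) / 1000
        totals[row["pipeline"]] += cost
    return dict(totals)


# Print pipelines from most to least expensive.
for pipeline, cost in sorted(cost_by_pipeline(records).items(), key=lambda kv: -kv[1]):
    print(f"{pipeline}: ${cost:.4f}")
```

Grouping by model or user segment works the same way: swap the grouping key for another field on the record.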

Results

Identified an inefficient system prompt in the document summarization pipeline generating 3x more tokens than necessary. Fix took 2 hours. Monthly inference cost reduced from $8.1k to $5.8k — $2.3k/month savings.
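The figures above can be cross-checked with a few lines of arithmetic. The annualized value is a simple extrapolation that assumes constant usage. Comparing the savings to the post-fix cost rests on the assumption that the fix returned spend to roughly its pre-regression level, which would be consistent with the 40% growth mentioned in the challenge.

```python
cost_before_fix = 8100  # USD per month, from the case study
cost_after_fix = 5800   # USD per month, from the case study

monthly_savings = cost_before_fix - cost_after_fix
reduction_pct = monthly_savings / cost_before_fix * 100

# Assumption: usage stays constant, so yearly savings are 12x the monthly figure.
annual_savings = monthly_savings * 12

# Assumption: the post-fix cost approximates the pre-regression baseline.
regression_increase_pct = monthly_savings / cost_after_fix * 100

print(f"Monthly savings: ${monthly_savings}")
print(f"Reduction relative to pre-fix cost: {reduction_pct:.1f}%")
print(f"Annualized savings: ${annual_savings}")
print(f"Implied increase over baseline: {regression_increase_pct:.1f}%")
```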

Tools

Langfuse, LangChain, LlamaIndex, Pinecone
