Langfuse

LLM observability & tracing

What it's used for

Langfuse is an open-source LLM observability and tracing platform. It captures prompts, completions, latency, cost, and errors across your LLM application, giving you visibility into what the application is doing in production.

Key use cases include:

  • Tracing — capture the full execution trace of every LLM call, tool invocation, and retrieval step
  • Cost monitoring — track token usage and spend across providers and models
  • Quality evaluation — run LLM-as-judge evaluations and human annotation workflows
  • Prompt management — version and deploy prompts with A/B testing
  • Dataset management — build evaluation datasets from production traces
  • Debugging — quickly identify why specific requests failed or produced poor results

Langfuse is used by engineering teams running LLM applications in production who need to monitor quality, control costs, and debug issues. It integrates with LangChain, LlamaIndex, the OpenAI SDK, and other major frameworks through decorators, callbacks, or the REST API.

Langfuse is available as Langfuse Cloud (hosted) or self-hosted via Docker for full data control.

Getting started

  1. Sign up at cloud.langfuse.com or self-host with Docker.
  2. Install the SDK:
    pip install langfuse
  3. Set your keys:
    export LANGFUSE_PUBLIC_KEY='pk-...'
    export LANGFUSE_SECRET_KEY='sk-...'
    export LANGFUSE_HOST='https://cloud.langfuse.com'
  4. Add tracing with the @observe() decorator:
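    # Import path for Python SDK v2; newer SDK versions (v3+) use `from langfuse import observe`.
    from langfuse.decorators import observe
    
    @observe()
    def my_llm_function(query):
        # Placeholder: replace this line with your actual LLM call.
        response = f"echo: {query}"
        return response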
  5. View traces and analytics in the Langfuse dashboard.
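Before running an instrumented app, it can help to confirm that the three environment variables from step 3 are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_langfuse_vars` is hypothetical and not part of the Langfuse SDK.

```python
import os

# Variable names taken from step 3 of the setup instructions.
REQUIRED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_HOST")


def missing_langfuse_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; pass a dict to check
    an arbitrary mapping instead. This helper is illustrative only.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_langfuse_vars()` at startup and failing fast on a non-empty result is one way to avoid silently running without tracing.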

Pricing: Free tier with 50K observations/month. Pro plan at $59/month. Self-hosting is free (MIT license). See langfuse.com/pricing.

Case studies

Real Langfuse projects

Token Budget Regression Found — $2.3k/Month Saved

Series B AI startup, 6 production LLM pipelines

Challenge

An AI startup's inference costs grew 40% in a month with no engineering changes. The culprit was hidden somewhere in 6 interconnected LLM pipelines — impossible to debug without request-level tracing.

Solution

Instrumented all 6 pipelines with Langfuse traces, capturing prompt/completion tokens, latency, model version, and user context per request. Built cost dashboards by pipeline, model, and user segment.
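The per-pipeline cost breakdown described here can be approximated offline from exported request records. The sketch below is an assumption about how such an aggregation might look, not the team's actual dashboard logic: the record fields, the model name `model-a`, and the prices in `PRICE_PER_1K` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens; real prices vary by provider and model.
PRICE_PER_1K = {"model-a": {"prompt": 0.5, "completion": 1.5}}


def request_cost(record):
    """Cost of one request, computed from its prompt and completion token counts."""
    price = PRICE_PER_1K[record["model"]]
    return (
        record["prompt_tokens"] * price["prompt"]
        + record["completion_tokens"] * price["completion"]
    ) / 1000


def cost_by_pipeline(records):
    """Sum request costs per pipeline name."""
    totals = defaultdict(float)
    for record in records:
        totals[record["pipeline"]] += request_cost(record)
    return dict(totals)


# Made-up sample records, shaped like the per-request fields the case study mentions.
sample = [
    {"pipeline": "summarization", "model": "model-a", "prompt_tokens": 1000, "completion_tokens": 2000},
    {"pipeline": "summarization", "model": "model-a", "prompt_tokens": 2000, "completion_tokens": 0},
    {"pipeline": "search", "model": "model-a", "prompt_tokens": 1000, "completion_tokens": 0},
]
totals = cost_by_pipeline(sample)
```

Grouping by `model` or a user-segment field instead of `pipeline` follows the same pattern; a pipeline whose total grows without a matching rise in request count is a candidate for the kind of regression described in this case study.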

Results

Identified an inefficient system prompt in the document summarization pipeline generating 3x more tokens than necessary. Fix took 2 hours. Monthly inference cost reduced from $8.1k to $5.8k — $2.3k/month savings.
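The reported figures can be checked with a few lines of arithmetic. The monthly numbers come from the case study; the annualized value is an extrapolation that assumes usage stays flat.

```python
# Figures from the case study, in whole US dollars per month.
cost_before = 8_100
cost_after = 5_800

# Monthly savings and the relative reduction in inference spend.
monthly_savings = cost_before - cost_after
reduction_pct = 100 * monthly_savings / cost_before

# Extrapolation only: assumes the same request volume for twelve months.
annual_savings = 12 * monthly_savings

print(f"${monthly_savings}/month saved, about {reduction_pct:.1f}% lower, ${annual_savings}/year if usage is flat")
```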
