Langfuse is an open-source LLM observability and tracing platform that captures every prompt, completion, latency, cost, and error across your LLM application. It gives you full visibility into what your AI application is doing in production.
Langfuse is used by engineering teams running LLM applications in production who need to monitor quality, control costs, and debug issues. It integrates with LangChain, LlamaIndex, OpenAI, and other major frameworks through decorators, callbacks, or the REST API.
It is available as Langfuse Cloud (hosted) or self-hosted via Docker for full data control.
Install the SDK:

    pip install langfuse

Set your credentials:

    export LANGFUSE_PUBLIC_KEY='pk-...'
    export LANGFUSE_SECRET_KEY='sk-...'
    export LANGFUSE_HOST='https://cloud.langfuse.com'

Trace a function with the @observe() decorator:

    # SDK v2 import path; newer SDK versions may expose this as `from langfuse import observe`
    from langfuse.decorators import observe

    @observe()
    def my_llm_function(query):
        response = ...  # Replace with your LLM call
        return response

Pricing: Free tier with 50K observations/month. Pro plan at $59/month. Self-hosting is free (MIT license). See langfuse.com/pricing.
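To make concrete what a tracing decorator such as @observe() records per call, here is a minimal stand-in written with the Python standard library only. It is not Langfuse's implementation: the `trace` decorator and the in-memory `TRACES` list are hypothetical names used for illustration, and a real tracer would send these records to a backend rather than keep them in memory.

```python
import functools
import time

# Hypothetical in-memory sink (assumption); a real tracer ships records to a backend.
TRACES = []

def trace(func):
    """Record name, input, output, latency, and errors for each call (illustrative only)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        record = {"name": func.__name__, "input": {"args": args, "kwargs": kwargs}}
        start = time.perf_counter()
        try:
            record["output"] = func(*args, **kwargs)
            return record["output"]
        except Exception as exc:
            # Keep the error on the record, then let it propagate unchanged.
            record["error"] = repr(exc)
            raise
        finally:
            # Runs on both success and failure, so every call is recorded.
            record["latency_s"] = time.perf_counter() - start
            TRACES.append(record)
    return wrapper

@trace
def my_llm_function(query):
    # Placeholder for a real LLM call.
    return f"echo: {query}"

my_llm_function("hello")
```

After the call, `TRACES` holds one record with the function name, its input, its output, and the measured latency. A failing call would carry an `error` field instead of `output`.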
Case studies
Series B AI startup, 6 production LLM pipelines
An AI startup's inference costs grew 40% in a month with no engineering changes. The cause was hidden somewhere in 6 interconnected LLM pipelines and was impractical to find without request-level tracing.
The team instrumented all 6 pipelines with Langfuse traces, capturing prompt and completion tokens, latency, model version, and user context per request. They then built cost dashboards by pipeline, model, and user segment.
Tracing identified an inefficient system prompt in the document summarization pipeline that was generating 3x more tokens than necessary. The fix took 2 hours. Monthly inference cost fell from $8.1k to $5.8k, a saving of $2.3k per month.
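The cost breakdown described in this case study can be sketched in a few lines of Python. Everything here is an assumption for illustration: the trace records, the field names, the `model-a` identifier, and the per-1K-token prices are hypothetical and are not taken from the case study or from Langfuse's data model. Only the monthly cost figures at the end come from the text above.

```python
from collections import defaultdict

# Hypothetical per-1K-token prices in USD (assumption; real prices vary by model).
PRICE_PER_1K = {"model-a": {"prompt": 0.0005, "completion": 0.0015}}

# Hypothetical per-request trace records (assumption, not real data).
traces = [
    {"pipeline": "summarization", "model": "model-a", "prompt_tokens": 3000, "completion_tokens": 500},
    {"pipeline": "summarization", "model": "model-a", "prompt_tokens": 3000, "completion_tokens": 400},
    {"pipeline": "search", "model": "model-a", "prompt_tokens": 800, "completion_tokens": 200},
]

def cost_usd(t):
    """Cost of one request from its token counts and the model's price table."""
    price = PRICE_PER_1K[t["model"]]
    return (t["prompt_tokens"] * price["prompt"]
            + t["completion_tokens"] * price["completion"]) / 1000

def cost_by(records, key):
    """Sum request costs grouped by a field such as 'pipeline' or 'model'."""
    totals = defaultdict(float)
    for t in records:
        totals[t[key]] += cost_usd(t)
    return dict(totals)

by_pipeline = cost_by(traces, "pipeline")
most_expensive = max(by_pipeline, key=by_pipeline.get)

# Figures from the case study: monthly cost before and after the fix, in USD.
before, after = 8100, 5800
savings = before - after
savings_pct = savings / before * 100
```

Grouping by `"model"` or a user-segment field instead of `"pipeline"` gives the other dashboard views mentioned above. With the case study's figures, `savings` is $2,300 per month, roughly 28% of the earlier monthly cost.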