(Deprecated) Callback-based LlamaIndex Integration
This integration is deprecated. We recommend using the new instrumentation-based integration with Langfuse as described here.
Add Langfuse to your LlamaIndex application
Make sure you have both llama-index and langfuse installed.
pip install llama-index langfuse
At the root of your LlamaIndex application, register Langfuse’s LlamaIndexCallbackHandler in the LlamaIndex Settings.callback_manager. When instantiating LlamaIndexCallbackHandler, make sure to configure it with your Langfuse API keys and the host URL.
LANGFUSE_SECRET_KEY="sk-lf-..."
LANGFUSE_PUBLIC_KEY="pk-lf-..."
LANGFUSE_HOST="https://cloud.langfuse.com" # 🇪🇺 EU region
# LANGFUSE_HOST="https://us.cloud.langfuse.com" # 🇺🇸 US region
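Because the handler is configured from these variables, a small startup check can surface a missing value before any traces are lost. The sketch below is illustrative only: missing_langfuse_vars is a hypothetical helper, not part of the Langfuse SDK, and treating all three variables as required is an assumption of this example.

```python
import os

# Variable names taken from the configuration above. Treating all three as
# required is an assumption of this sketch.
REQUIRED_VARS = ("LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_HOST")


def missing_langfuse_vars(env=None):
    """Return the names from REQUIRED_VARS that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment
example_env = {"LANGFUSE_PUBLIC_KEY": "pk-lf-...", "LANGFUSE_HOST": ""}
print(missing_langfuse_vars(example_env))
```

Calling missing_langfuse_vars() without arguments checks the real process environment, so it can run once at application start before the handler is created.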
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from langfuse.llama_index import LlamaIndexCallbackHandler
langfuse_handler = LlamaIndexCallbackHandler()
Settings.callback_manager = CallbackManager([langfuse_handler])
Done! Traces and metrics from your LlamaIndex application are now automatically tracked in Langfuse. If you construct a new index or query an LLM with your documents in context, your traces and metrics are immediately visible in the Langfuse UI.
Check out the notebook for end-to-end examples of the integration.
Additional configuration
Queuing and flushing
The Langfuse SDKs queue and batch events in the background to reduce the number of network requests and improve overall performance. In a long-running application, this works without any additional configuration.
If you are running a short-lived application, you need to flush the handler before the application exits so that all queued events are sent.
langfuse_handler.flush()
Learn more about queuing and batching of events here.
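For a short-lived script, one way to make sure the flush always happens is to wrap the work in try/finally. This is a minimal sketch, not part of the Langfuse SDK: run_and_flush is a hypothetical helper, and DemoHandler is a stand-in used only so the example runs without Langfuse installed. In a real application you would pass the registered LlamaIndexCallbackHandler instead.

```python
class DemoHandler:
    """Stand-in for the real handler so the sketch runs without Langfuse."""

    def __init__(self):
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1


def run_and_flush(handler, job):
    """Run job() and flush the handler afterwards, even if job() raises."""
    try:
        return job()
    finally:
        handler.flush()


handler = DemoHandler()
result = run_and_flush(handler, lambda: "query result")
print(result, handler.flush_calls)
```

The finally clause means events queued before an exception are still sent, which is often when a trace is most useful.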
Custom trace parameters
You can update trace parameters at any time to add context to a trace, such as a user ID, session ID, or tags. All traces created afterwards include the parameters you set. See the Python SDK Trace documentation for more information.
Property | Description |
---|---|
name | Identify a specific type of trace, e.g. a use case or functionality. |
metadata | Additional information that you want to see in Langfuse. Can be any JSON. |
session_id | The current session. |
user_id | The current user_id. |
tags | Tags to categorize and filter traces. |
version | The specified version to trace experiments. |
release | The specified release to trace experiments. |
sample_rate | Sample rate for tracing. |
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from langfuse.llama_index import LlamaIndexCallbackHandler

# Instantiate a new LlamaIndexCallbackHandler and register it in the LlamaIndex Settings
langfuse_handler = LlamaIndexCallbackHandler()
Settings.callback_manager = CallbackManager([langfuse_handler])

def my_func():
    # Set trace parameters before executing your LlamaIndex code
    langfuse_handler.set_trace_params(
        user_id="user-123",
        session_id="session-abc",
        tags=["production"]
    )

    # Your LlamaIndex code, trace will include the set parameters
Notes
- The params will be applied to all traces and spans created after the set_trace_params call. You can unset them by calling e.g. set_trace_params(user_id=None).
- If you run this in a Jupyter Notebook, you need to run set_trace_params in the same cell as your LlamaIndex code.
- When setting a root trace or span, this setting will have no effect as the root trace or span will be used. See next section for more information.
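Because parameters stay in effect until they are unset, it can help to scope them to a block of code. The following is a sketch under stated assumptions: trace_params is a hypothetical context manager, not a Langfuse API, and RecordingHandler is a stand-in that only records calls so the example runs without Langfuse. The only handler method it relies on is set_trace_params, used as described in the notes above.

```python
from contextlib import contextmanager


class RecordingHandler:
    """Stand-in that records set_trace_params calls; not the Langfuse handler."""

    def __init__(self):
        self.calls = []

    def set_trace_params(self, **params):
        self.calls.append(params)


@contextmanager
def trace_params(handler, **params):
    """Set trace params for the duration of a with-block, then unset them."""
    handler.set_trace_params(**params)
    try:
        yield handler
    finally:
        # Unset by passing None for each name, as described in the notes
        handler.set_trace_params(**{name: None for name in params})


handler = RecordingHandler()
with trace_params(handler, user_id="user-123", tags=["production"]):
    pass  # your LlamaIndex code would run here
print(handler.calls)
```

Note that this sketch resets each parameter to None on exit rather than restoring a previous value, so nesting two blocks that set the same parameter would clear it when the inner block ends.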
Interoperability with Langfuse SDK
The Langfuse Python SDK is fully interoperable with the LlamaIndex integration.
This is useful when your LlamaIndex executions are part of a larger application and you want to link all traces and spans together. This can also be useful when you’d like to group multiple LlamaIndex executions to be part of the same trace or span.
When using the Langfuse @observe() decorator, langfuse_context.get_current_llama_index_handler() exposes a callback handler scoped to the current trace context, in this case llama_index_fn(). Pass it to the LlamaIndex Settings.callback_manager to trace subsequent LlamaIndex executions.
from langfuse.decorators import langfuse_context, observe
from llama_index.core import Document, VectorStoreIndex
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager

# Placeholder documents; replace the text with your own content
doc1 = Document(text="...")
doc2 = Document(text="...")

@observe()
def llama_index_fn(question: str):
    # Set callback manager for LlamaIndex, will apply to all LlamaIndex executions in this function
    langfuse_handler = langfuse_context.get_current_llama_index_handler()
    Settings.callback_manager = CallbackManager([langfuse_handler])

    # Run application
    index = VectorStoreIndex.from_documents([doc1, doc2])
    response = index.as_query_engine().query(question)
    return response
Notes
- The LlamaIndex integration will not make any changes to your provided root trace or span. If you want to add additional context or input/output to your root trace or span, you can do so via the Python SDK.
- This uses context vars, so in Jupyter it works reliably when everything runs in the same cell.