Monitor LLM Security
LLM-based applications carry a host of potential safety and security risks, including prompt injection, leakage of personally identifiable information (PII), and harmful or inappropriate prompts. Langfuse can be used to monitor and protect against these risks, and to investigate incidents when they occur.
What is LLM Security?
LLM Security involves implementing protective measures to safeguard LLMs and their infrastructure from unauthorized access, misuse, and adversarial attacks, ensuring the integrity and confidentiality of both the model and data. This is crucial in AI/ML systems to maintain ethical usage, prevent security risks like prompt injections, and ensure reliable operation under safe conditions.
How does LLM Security work?
LLM Security can be addressed with a combination of
- LLM Security libraries for run-time security measures
- Langfuse for the ex-post evaluation of the effectiveness of these measures
1. Run-time security measures
There are several popular security libraries that can be used to mitigate security risks in LLM-based applications, including LLM Guard, Prompt Armor, NeMo Guardrails, Microsoft Azure AI Content Safety, and Lakera. These libraries help with security measures in the following ways (see the sketch below this list):
- Catching and blocking a potentially harmful or inappropriate prompt before it is sent to the model
- Redacting sensitive PII before the prompt is sent to the model and then un-redacting the PII in the response
- Evaluating prompts and completions on toxicity, relevance, or sensitive material at run-time and blocking the response if necessary
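For illustration, here is a minimal sketch of such a run-time check using LLM Guard's `scan_prompt` helper with its `PromptInjection` and `Toxicity` input scanners. The blocking behavior (raising an error) is an assumption you would adapt to your application.

```python
# pip install llm-guard
from llm_guard import scan_prompt
from llm_guard.input_scanners import PromptInjection, Toxicity

# Input scanners run before the prompt is sent to the LLM.
input_scanners = [PromptInjection(), Toxicity()]

def guarded_prompt(prompt: str) -> str:
    # scan_prompt returns the (possibly sanitized) prompt plus per-scanner
    # validity flags and risk scores.
    sanitized_prompt, results_valid, results_score = scan_prompt(input_scanners, prompt)

    if not all(results_valid.values()):
        # Block the request instead of forwarding a risky prompt to the model.
        raise ValueError(f"Prompt blocked by security checks: {results_score}")

    return sanitized_prompt
```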
2. Monitoring and evaluation of security measures with Langfuse
Use Langfuse tracing to gain visibility and confidence in each step of the security mechanism. These are common workflows:
- Manually inspect traces to investigate security issues.
- Monitor security scores over time in the Langfuse Dashboard.
- Validate security checks. You can use Langfuse scores to evaluate the effectiveness of security tools. Integrating Langfuse into your team's workflow helps identify which security risks are most prevalent and build more robust safeguards around those specific issues. There are two main workflows to consider:
- Annotations (in UI). If you establish a baseline by annotating a share of production traces, you can compare the security scores returned by the security tools with these annotations.
- Automated evaluations. Langfuse’s model-based evaluations will run asynchronously and can scan traces for things such as toxicity or sensitivity to flag potential risks and identify any gaps in your LLM security setup. Check out the docs to learn more about how to set up these evaluations.
- Track Latency. Some LLM security checks need to be awaited before the model can be called, while others block the response to the user. They can therefore quickly become an essential driver of the overall latency of an LLM application. Langfuse can help dissect the latencies of these checks within a trace to understand whether the checks are worth the wait (see the sketch below this list).
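A minimal sketch of this pattern, assuming the Python SDK's decorator-based tracing (`@observe`) and the `langfuse_context.score_current_trace` helper (method names may differ between SDK versions). The score name `security-check` and the wrapped check function are illustrative:

```python
# pip install langfuse llm-guard
from langfuse.decorators import observe, langfuse_context
from llm_guard import scan_prompt
from llm_guard.input_scanners import PromptInjection, Toxicity

input_scanners = [PromptInjection(), Toxicity()]

@observe()  # traced as its own observation, so its latency shows up in the trace
def security_check(prompt: str) -> str:
    sanitized_prompt, results_valid, results_score = scan_prompt(input_scanners, prompt)
    passed = all(results_valid.values())

    # Attach a score to the current trace; it can be monitored in the dashboard
    # and compared against manual annotations or automated evaluations later.
    langfuse_context.score_current_trace(
        name="security-check",  # illustrative score name
        value=1 if passed else 0,
        comment=str(results_score),
    )

    if not passed:
        raise ValueError("Prompt blocked by security checks")
    return sanitized_prompt

@observe()
def handle_request(prompt: str) -> str:
    safe_prompt = security_check(prompt)
    # ... call the LLM with safe_prompt and return the response ...
    return safe_prompt
```

Because `security_check` is its own observation, its duration is reported separately from the model call in the trace, which makes it easy to see how much latency the security checks add.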
Get Started
See how we use the open-source library LLM Guard to anonymize and deanonymize PII and trace the workflow with Langfuse. The example easily translates to other libraries.
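As a rough sketch of that pattern, assuming LLM Guard's `Anonymize`/`Deanonymize` scanners with a shared `Vault` and Langfuse's `@observe` decorator plus the traced OpenAI drop-in; the model name and call are indicative only:

```python
# pip install langfuse llm-guard openai
from langfuse.decorators import observe
from langfuse.openai import openai  # drop-in replacement that traces OpenAI calls
from llm_guard.input_scanners import Anonymize
from llm_guard.output_scanners import Deanonymize
from llm_guard.vault import Vault

vault = Vault()  # stores the mapping between placeholders and the original PII

@observe()
def anonymize(prompt: str) -> str:
    sanitized_prompt, is_valid, risk_score = Anonymize(vault).scan(prompt)
    return sanitized_prompt

@observe()
def deanonymize(sanitized_prompt: str, answer: str) -> str:
    sanitized_answer, is_valid, risk_score = Deanonymize(vault).scan(sanitized_prompt, answer)
    return sanitized_answer

@observe()
def answer_question(prompt: str) -> str:
    sanitized_prompt = anonymize(prompt)  # PII is redacted before the model sees it
    completion = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": sanitized_prompt}],
    )
    return deanonymize(sanitized_prompt, completion.choices[0].message.content)
```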