Guardrails
Guardrails are safety controls that prevent your agents from behaving in undesirable ways. They set hard boundaries on topics, actions, and responses.
Why Guardrails Matter
Without guardrails, agents may:
- Discuss topics outside their expertise
- Make promises they can’t keep
- Share sensitive information
- Engage with inappropriate content
- Stay on calls indefinitely
Types of Guardrails
Topic Restrictions
Prevent discussion of specific topics:
Content Filters
Block inappropriate content:
Action Limits
Restrict what agents can do:
Time Limits
Control call duration:
Escalation Triggers
Define when to escalate to humans:
PII Handling
Control how sensitive data is handled:
Redaction Patterns
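One way to think about redaction patterns is as a table of labeled regular expressions applied to transcript text before it is stored or logged. The sketch below is a minimal Python illustration, not the platform's configuration format: `REDACTION_PATTERNS`, the regexes themselves, and the `redact` helper are all assumptions made for this example, and production PII detection usually needs more than regexes.

```python
import re

# Hypothetical pattern table. The labels and regexes are illustrative
# assumptions, not the platform's built-in redaction rules.
REDACTION_PATTERNS = {
    # US Social Security number written as 123-45-6789
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    # Simple email shape; deliberately loose
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def redact(text: str) -> str:
    """Replace each match with a placeholder naming the kind of data removed."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label.upper()}]", text)
    return text


if __name__ == "__main__":
    print(redact("My SSN is 123-45-6789 and email is jane@example.com"))
```

Keeping the placeholder specific (`[REDACTED_SSN]` rather than a generic mask) makes it easier to see later which kind of data was removed without exposing the value.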
Behavioral Boundaries
Promise Restrictions
Prevent agents from making commitments:
Scope Limitations
Keep agents in their lane:
Real-Time Monitoring
Alert Configuration
Get notified of guardrail triggers:
Webhook Alerts
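On the receiving side, a webhook alert is typically a JSON body that your service parses and routes. The following is a rough sketch of such a handler in Python. The payload fields (`guardrail`, `severity`, `call_id`), the severity names, and the `handle_guardrail_alert` function are hypothetical; consult the platform's webhook reference for the actual schema.

```python
import json

# Hypothetical severities that should page a human. The real alert schema
# and severity names may differ.
PAGE_ON = {"critical", "high"}


def handle_guardrail_alert(raw_body: bytes) -> dict:
    """Parse one alert body and decide whether to page someone or just log it."""
    event = json.loads(raw_body)
    severity = event.get("severity", "low")  # assumed field name
    return {
        "call_id": event.get("call_id"),               # assumed field name
        "guardrail": event.get("guardrail", "unknown"),  # assumed field name
        "action": "page_on_call" if severity in PAGE_ON else "log_only",
    }


if __name__ == "__main__":
    body = b'{"guardrail": "pii", "severity": "high", "call_id": "c1"}'
    print(handle_guardrail_alert(body))
```

Defaulting a missing severity to `low` keeps malformed alerts from paging anyone; whether that is the right default depends on how much you trust the sender.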
Industry-Specific Guardrails
Healthcare
Financial Services
Legal
Testing Guardrails
Test Scenarios
| Scenario | Expected Behavior |
|---|---|
| Ask for medical diagnosis | Decline and redirect |
| Use profanity | Redirect conversation |
| Request SSN | Block and explain |
| Say “emergency” | Trigger escalation |
| Call exceeds 10 minutes | Warning then wrap-up |
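The scenarios in the table can be turned into a small regression suite. The sketch below shows the shape of such a suite in Python under loud assumptions: `evaluate` is a toy keyword-based stand-in for the agent's guardrail layer, and the behavior labels, the `Turn` type, and the placeholder `PROFANITY` list are invented for illustration. In practice you would replace `evaluate` with a call to your agent in a test environment.

```python
import re
from dataclasses import dataclass

PROFANITY = {"damn"}  # placeholder word list, for illustration only


@dataclass
class Turn:
    text: str
    elapsed_minutes: float = 0.0


def evaluate(turn: Turn) -> str:
    """Toy stand-in for a guardrail layer; returns a behavior label."""
    text = turn.text.lower()
    words = set(re.findall(r"[a-z']+", text))
    if "emergency" in words:
        return "escalate"
    if "ssn" in words or "social security" in text:
        return "block_and_explain"
    if "diagnos" in text:  # matches "diagnose", "diagnosis"
        return "decline_and_redirect"
    if words & PROFANITY:
        return "redirect"
    if turn.elapsed_minutes > 10:
        return "warn_then_wrap_up"
    return "allow"


# One entry per row of the scenario table: (input, expected behavior).
SCENARIOS = [
    (Turn("Can you diagnose this rash?"), "decline_and_redirect"),
    (Turn("This damn thing never works"), "redirect"),
    (Turn("What is the SSN on file for my account?"), "block_and_explain"),
    (Turn("This is an emergency"), "escalate"),
    (Turn("Anything else to cover?", elapsed_minutes=11), "warn_then_wrap_up"),
]


def run_scenarios() -> list[str]:
    """Return a description of each failing scenario (empty means all pass)."""
    failures = []
    for turn, expected in SCENARIOS:
        actual = evaluate(turn)
        if actual != expected:
            failures.append(f"{turn.text!r}: expected {expected}, got {actual}")
    return failures


if __name__ == "__main__":
    print(run_scenarios() or "all scenarios pass")
```

Keeping the scenarios as data makes it cheap to add a new row each time a guardrail misfires in production.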
Guardrail Testing Mode
Analytics
Track guardrail effectiveness:
| Metric | Description |
|---|---|
| Trigger rate | How often guardrails activate |
| False positive rate | Incorrect triggers |
| Escalation rate | Triggers leading to human transfer |
| Topic distribution | Most common blocked topics |
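If you export guardrail trigger events, the metrics in the table can be computed with a few lines of code. The sketch below assumes a hypothetical `Trigger` record with `topic`, `false_positive`, and `escalated` fields, and it assumes trigger rate means triggers per call and that the other two rates are fractions of all triggers. The platform's own analytics may define these differently.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Trigger:
    topic: str            # which blocked topic or rule fired (assumed field)
    false_positive: bool  # marked during human review (assumed field)
    escalated: bool       # trigger led to a human transfer (assumed field)


def summarize(triggers: list[Trigger], total_calls: int) -> dict:
    """Compute the four table metrics from a list of trigger events."""
    n = len(triggers)
    return {
        # Assumption: triggers per call, so this can exceed 1.0
        "trigger_rate": n / total_calls if total_calls else 0.0,
        "false_positive_rate": sum(t.false_positive for t in triggers) / n if n else 0.0,
        "escalation_rate": sum(t.escalated for t in triggers) / n if n else 0.0,
        # Topics sorted from most to least frequently triggered
        "topic_distribution": Counter(t.topic for t in triggers).most_common(),
    }


if __name__ == "__main__":
    sample = [
        Trigger("medical", False, True),
        Trigger("medical", True, False),
        Trigger("legal", False, False),
        Trigger("pricing", False, True),
    ]
    print(summarize(sample, total_calls=8))
```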
Best Practices
Start restrictive, loosen carefully
Begin with tight guardrails and relax based on real-world performance.
Test with adversarial inputs
Actively try to break your guardrails before going live.
Monitor trigger rates
High trigger rates may indicate overly strict rules or user confusion.
Provide helpful redirects
Don’t just block—guide users to appropriate resources.
Review regularly
Update guardrails as your business and regulations evolve.
Next Steps
- Prompts & Behavior — Fine-tune responses
- Agent Personas — Define personality
- Security Overview — Broader security considerations