For regulated, high-stakes use cases in telehealth, financial services, insurance, and other sectors, some customer queries are time-sensitive and must follow deterministic scripts exactly. Typical examples include regulatory compliance, financial advice disclaimers, and safety-critical escalations. When the stakes are high, deployed AI agents must perform these tasks accurately and precisely, 100% of the time.
Guardrails is the runtime layer in Lorikeet's defense-in-depth approach to AI accuracy. While other layers handle agent quality at the foundation, pre-deployment testing, and post-ticket QA, Guardrails operates in real time, evaluating every message and response as conversations happen.
Always-on protection
Every Lorikeet agent ships with built-in guardrails that run automatically. For example:
Response grounding ensures agent responses are based on your knowledge base, data, or instructions
Profanity filter prevents inappropriate language
Jailbreak detection blocks prompt injection attempts before they reach the agent
These guardrails are always on and need no configuration, protecting you from day one.
Custom guardrails for your business
In addition to always-on protection, every business has specific policies, industry regulations, or edge cases that matter to them. That's where custom guardrails come in.
Custom guardrails let you define your own checks. For example:
Financial services: "Agent must not provide specific investment advice"
Insurance: "Agent cannot estimate claim values"
Healthcare: "Escalate immediately if customer mentions self-harm"
Any industry: "Never mention competitor products by name"
Because each check runs outside the agent's reasoning loop, its result is independent of the agent. The agent can't talk itself out of a violation.
Two layers of checks
Message checks evaluate incoming customer messages before they reach the agent. Signs of financial vulnerability, legal threats, and life-or-death situations get flagged immediately, so you control what happens next.
Agent guardrails check outgoing responses before they're sent. If the agent is about to offer an unauthorized refund, share incorrect information, or respond in a way that violates policy, the guardrail blocks it.
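As a rough illustration of how the two layers fit around the agent, here is a minimal Python sketch. Every name in it (`handle_turn`, `Outcome`, the two toy checks) is hypothetical and not Lorikeet's actual API, and the keyword matching merely stands in for whatever detection a real guardrail uses. The sketch also assumes a flagged message stops the turn; in practice the configured action decides what happens next.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

# A check returns a reason string when it fires, or None when it stays quiet.
# (Hypothetical type for illustration only.)
Check = Callable[[str], Optional[str]]


def _words(text: str) -> set:
    """Lowercased whole words, so 'issue' does not match 'sue'."""
    return set(re.findall(r"[a-z]+", text.lower()))


def legal_threat_check(message: str) -> Optional[str]:
    """Toy message check: flags incoming messages that mention legal action."""
    if _words(message) & {"lawyer", "sue", "lawsuit"}:
        return "legal threat"
    return None


def unauthorized_refund_check(response: str) -> Optional[str]:
    """Toy agent guardrail: flags any draft response that mentions a refund."""
    if "refund" in _words(response):
        return "unauthorized refund offer"
    return None


@dataclass
class Outcome:
    stage: str               # "message_check", "agent_guardrail", or "sent"
    reason: Optional[str]    # why a check fired, if one did
    response: Optional[str]  # the text sent to the customer, if any


def handle_turn(
    message: str,
    agent: Callable[[str], str],
    message_checks: List[Check],
    agent_guardrails: List[Check],
) -> Outcome:
    # Layer 1: evaluate the customer message before the agent sees it.
    for check in message_checks:
        reason = check(message)
        if reason is not None:
            return Outcome("message_check", reason, None)

    # The agent drafts a response; nothing has been sent yet.
    draft = agent(message)

    # Layer 2: evaluate the draft outside the agent's own reasoning loop.
    for check in agent_guardrails:
        reason = check(draft)
        if reason is not None:
            return Outcome("agent_guardrail", reason, None)

    return Outcome("sent", None, draft)
```

In this sketch the draft only reaches the customer when both layers stay quiet, which mirrors the "before they reach the agent" and "before they're sent" ordering described above.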
What happens when a guardrail triggers
When a guardrail fires, you choose what happens:
Alert: Log for analytics without interrupting the conversation. One customer uses this to monitor how often users report app errors; spikes indicate a production issue.
Apply a tag: Categorize for routing or reporting.
Send Slack message: Ping a channel in real time.
Escalate: Hand off to a human immediately.
Guide the agent: Inject just-in-time instructions. If a customer mentions a specific error code, tell the agent exactly how to resolve it.
Run a workflow: Trigger a specific workflow for highly sensitive situations.
Silently escalate: Queue for human review but let the agent finish responding first.
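One way to picture the difference between these actions is as a small dispatch over ticket state. The Python sketch below is illustrative only: the `Action` enum, the `Ticket` fields, and `apply_action` are hypothetical names, not Lorikeet's data model, and external side effects (Slack, workflows, analytics) are reduced to log entries.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Action(Enum):
    ALERT = "alert"
    APPLY_TAG = "apply_tag"
    SEND_SLACK_MESSAGE = "send_slack_message"
    ESCALATE = "escalate"
    GUIDE_AGENT = "guide_agent"
    RUN_WORKFLOW = "run_workflow"
    SILENTLY_ESCALATE = "silently_escalate"


@dataclass
class Ticket:
    log: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    agent_instructions: List[str] = field(default_factory=list)
    handed_off_to_human: bool = False
    queued_for_review: bool = False
    agent_may_respond: bool = True


def apply_action(ticket: Ticket, action: Action, detail: str = "") -> Ticket:
    """Apply one guardrail action to a ticket (assumed semantics, for illustration)."""
    # Every action is recorded, so the trigger is visible later.
    ticket.log.append(f"{action.value}: {detail}")

    if action is Action.APPLY_TAG:
        ticket.tags.append(detail)
    elif action is Action.GUIDE_AGENT:
        # Just-in-time instructions the agent sees on its next step.
        ticket.agent_instructions.append(detail)
    elif action is Action.ESCALATE:
        # Immediate handoff: the agent stops responding.
        ticket.handed_off_to_human = True
        ticket.agent_may_respond = False
    elif action is Action.SILENTLY_ESCALATE:
        # Queue for human review, but let the agent finish its response.
        ticket.queued_for_review = True
    # ALERT, SEND_SLACK_MESSAGE, and RUN_WORKFLOW have effects outside the
    # ticket; in this sketch they are represented only by the log entry.
    return ticket
```

The contrast between `ESCALATE` and `SILENTLY_ESCALATE` is the main point: both bring in a human, but only the first interrupts the agent.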
Testing and iteration
Every custom guardrail can be tested against saved scenarios, each with exact customer messages and draft agent responses, to verify correct behavior. Coach helps refine detection criteria until guardrails trigger reliably on the right situations and stay quiet on the rest.
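The idea of saved scenarios can be sketched as a tiny regression harness. The Python below is a sketch under stated assumptions: `Scenario`, `run_scenarios`, the competitor names, and the keyword-based guardrail are all invented for illustration and do not reflect how Lorikeet stores or evaluates scenarios.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scenario:
    name: str
    customer_message: str  # exact customer message to replay
    draft_response: str    # draft agent response to evaluate
    should_trigger: bool   # expected guardrail behavior


# Hypothetical guardrail for "Never mention competitor products by name".
COMPETITORS = {"acme", "globex"}  # made-up names


def competitor_mention_guardrail(customer_message: str, draft_response: str) -> bool:
    """Return True when the draft response names a competitor."""
    words = set(re.findall(r"[a-z]+", draft_response.lower()))
    return bool(words & COMPETITORS)


def run_scenarios(
    guardrail: Callable[[str, str], bool],
    scenarios: List[Scenario],
) -> List[str]:
    """Return the names of scenarios where the guardrail's verdict differs from the expectation."""
    return [
        s.name
        for s in scenarios
        if guardrail(s.customer_message, s.draft_response) != s.should_trigger
    ]


SCENARIOS = [
    Scenario(
        name="names a competitor",
        customer_message="Is your plan cheaper than Acme's?",
        draft_response="Yes, our plan costs less than Acme.",
        should_trigger=True,
    ),
    Scenario(
        name="stays neutral",
        customer_message="Is your plan cheaper than Acme's?",
        draft_response="I can walk you through our pricing.",
        should_trigger=False,
    ),
]

failures = run_scenarios(competitor_mention_guardrail, SCENARIOS)
```

An empty `failures` list means the guardrail fires on the right situations and stays quiet on the rest; any name in the list points at a scenario whose detection criteria need refining.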
Configure via Coach or MCP
Following last week's launch of Lorikeet MCP, everything you can do with custom guardrails, including creating, testing, updating, and monitoring them, is accessible through both Lorikeet Coach and MCP. Use Coach for conversational configuration, or integrate directly via MCP for programmatic control.
Analytics and auditability
When guardrails trigger, you see exactly what happened: the blocked response, the explanation, and links to affected tickets. Analytics show trigger frequency over time, broken down by type and action.
This visibility feeds the broader quality flywheel: patterns surface, root causes get identified, fixes get validated through simulation, and monitoring confirms the improvement.
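To make "trigger frequency over time, broken down by type and action" concrete, here is a minimal Python sketch of that aggregation. The `TriggerEvent` record and its field names are assumptions made for the example, not Lorikeet's schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Tuple


@dataclass(frozen=True)
class TriggerEvent:
    day: date
    guardrail_type: str  # e.g. "message_check" or "agent_guardrail"
    action: str          # e.g. "alert", "escalate"
    ticket_id: str       # link back to the affected ticket


def trigger_counts(events: Iterable[TriggerEvent]) -> Counter:
    """Count triggers per (day, guardrail type, action)."""
    return Counter((e.day, e.guardrail_type, e.action) for e in events)


events = [
    TriggerEvent(date(2025, 1, 6), "message_check", "alert", "T-1"),
    TriggerEvent(date(2025, 1, 6), "message_check", "alert", "T-2"),
    TriggerEvent(date(2025, 1, 6), "agent_guardrail", "escalate", "T-3"),
    TriggerEvent(date(2025, 1, 7), "message_check", "alert", "T-4"),
]

counts = trigger_counts(events)
```

A sudden jump in one `(day, type, action)` bucket is the kind of pattern that would prompt a closer look at the linked tickets.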
Guardrails is one layer in Lorikeet's defense-in-depth architecture. Read the full framework to understand how training, simulation, runtime checks, and post-ticket QA work together.