Proactive Standards: How regulated companies stay in control of AI

Thomas Wing-Evans

Jul 1, 2026

For most of history, a standard was a physical thing. The official pound weight or yardstick, kept under lock by the king, that every scale and measure in the kingdom was checked against; a fixed reference, backed by authority, that everything else was held to. A merchant's weights meant nothing on their own. They were trusted only once they had been calibrated against the standard. Ride ye to the castle and standardize.

Nowadays, a regulated business has its standard written down: SOPs, disclosure requirements, tone rules, escalation triggers, its own definition of a good customer conversation. But it is written for a human team, and, like the merchants' weights, compliance with it is checked by sample. The question that plagued the kingdom's inspectors remains for AI: how do you check every single artifact against the standard, before it is used and continuously thereafter? Our answer is Proactive Standards.

The discipline already exists

Regulated industries have a settled way of trusting things that can fail. They test the design before relying on it, then test the operation continuously once they do.

An auditor evaluates a control twice. First for design effectiveness: could this control, as built, do its job? Then for operating effectiveness: did it keep doing its job, in production, across months of evidence? A SOC 2 Type 1 report attests to the first, at a point in time. A Type 2 report attests to the second, over a period of operation. Sophisticated buyers treat Type 1 as table stakes and require Type 2.

Bank model risk teams work the same way. The Federal Reserve's model risk guidance, SR 11-7, asks validators to assess a model's conceptual soundness before it goes into use, then monitor its performance on an ongoing basis once it does, with independent experts providing what the guidance calls "effective challenge." Validation does not stop at launch. It continues for as long as the model is relied on.

This is the discipline. Trust is a privilege you earn in the moment and re-earn ad infinitum.

From sample to census

Apply that two-phase discipline to our Concierge and it splits cleanly.

Before go-live, the question is design effectiveness: does your Concierge, as you have configured it, meet your standard across the situations your customers find themselves in? Lorikeet's Simulations answer it the way validation does for a risk model. Run your Concierge against hundreds of synthetic versions of real scenarios and edge cases, repeatedly, and try to make it fail in every way you can, so that a customer is never the one to find the failure.
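To make the pre-launch test concrete, here is a minimal sketch of a simulation gate. It is an illustration under stated assumptions, not Lorikeet's Simulations API: `stub_agent`, `SCENARIOS`, `run_simulations` and `ready_for_launch` are hypothetical names, and a simple phrase match stands in for whatever scoring a real standard would use.

```python
def run_simulations(agent, scenarios, runs_per_scenario=3):
    """Run each scenario several times and return its pass rate."""
    results = {}
    for name, (message, must_contain) in scenarios.items():
        passes = sum(
            must_contain.lower() in agent(message).lower()
            for _ in range(runs_per_scenario)
        )
        results[name] = passes / runs_per_scenario
    return results


def ready_for_launch(results, threshold=1.0):
    """The gate: every scenario must clear the bar, not just most of them."""
    return all(rate >= threshold for rate in results.values())


def stub_agent(message):
    """Hypothetical stand-in for the configured agent."""
    if "close my account" in message.lower():
        return "I will escalate this to a specialist."
    return "Happy to help with that."


# Hypothetical scenarios: (customer message, phrase the reply must contain).
SCENARIOS = {
    "account_closure": ("Please close my account today.", "escalate"),
    "bereavement": ("My husband died and I need access to his account.", "escalate"),
}

results = run_simulations(stub_agent, SCENARIOS)
launch_ok = ready_for_launch(results)
```

In this sketch the stub handles the account closure but misses the bereavement case, so the gate stays shut until the configuration is fixed and the run repeated.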

After go-live, the question becomes operating effectiveness: is your Concierge still meeting the standard? This is where AI changes what is possible. Our Automated QA removes the constraint that makes sampling necessary. It scores every conversation against your standard as it happens, and records the reasoning behind every score. It takes a census of all conversations and continues in perpetuity.
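The census idea can be sketched in a few lines. This is an illustration only, not how Automated QA is implemented: `Check`, `Score`, `score_conversation` and `census` are hypothetical names, and a required-phrase match is assumed as the simplest possible check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """One rule in the standard (hypothetical shape)."""
    name: str
    required_phrase: str  # text the reply must contain to pass


@dataclass(frozen=True)
class Score:
    conversation_id: str
    check: str
    passed: bool
    reasoning: str  # kept on file alongside the verdict


def score_conversation(conversation_id, reply, standard):
    """Score one reply against every check and record why."""
    scores = []
    for check in standard:
        passed = check.required_phrase.lower() in reply.lower()
        reasoning = (
            f"found required phrase {check.required_phrase!r}"
            if passed
            else f"required phrase {check.required_phrase!r} not found"
        )
        scores.append(Score(conversation_id, check.name, passed, reasoning))
    return scores


def census(conversations, standard):
    """Score every conversation, not a sample of them."""
    log = []
    for conversation_id, reply in conversations.items():
        log.extend(score_conversation(conversation_id, reply, standard))
    return log


# Hypothetical standard and conversations, for illustration only.
STANDARD = [Check("fee_disclosure", "a fee of")]
CONVERSATIONS = {
    "c1": "Transfers carry a fee of $2.",
    "c2": "Sure, the transfer is on its way.",
    "c3": "There is a fee of $2 for that.",
}

audit_log = census(CONVERSATIONS, STANDARD)
failures = [score for score in audit_log if not score.passed]
```

Every conversation produces a score and a reason, so the missed disclosure in `c2` is on the record whether or not anyone would have sampled it.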

Proactive Standards builds quality in before launch, and continuous scoring watches the live process. This matters in regulated industries, because you carry every unreviewed conversation as risk. Compliance breaches are not confined to the tickets someone happened to review.

The named person

Support and risk leaders at banks, insurers and healthcare companies have seen the demos and believe in the capability. But a named person is still personally accountable for every customer interaction, and no vendor can take that on.

UK financial regulation is the clearest version. Under the Senior Managers and Certification Regime, an individual senior manager is personally accountable for their area, and the conduct rules are explicit that delegating a responsibility does not discharge it. You must delegate to an appropriate person and oversee how it is carried out. Handing the work to an AI agent, or to the vendor behind it, changes nothing about who answers for the result.

The same principle holds across regimes. The US insurance regulators' model AI bulletin requires insurers to comply with the law "regardless of the tools and methods" they use, with no exemption for third-party systems. The EU AI Act puts obligations directly on the company deploying a high-risk system, including human oversight, monitoring its operation, and keeping its logs, with the deployer duties applying from August 2026. Australia's corporate regulator has warned of a "governance gap" that widens when AI adoption outpaces the controls around it.

A vendor can hold a clean SOC 2 report, be contractually warranted to the hilt, and be independently audited, and your senior manager is still the one the regulator calls. A vendor's assurances were never designed to discharge your obligations.

Standards you own

This is where Proactive Standards differs from certification.

External AI assurance is emerging, and it is useful. There are now standards bodies positioning an audit as the SOC 2 for AI agents, backed by insurers. But a certification is a point-in-time verdict, the Type 1, not the Type 2. It tells you the design passed on the day of the audit. It does not score the conversation your customer had this morning.

Proactive Standards is the operating practice underneath. It takes the bar you have already set, requires your Concierge to clear it before launch, holds every live interaction to it after, and keeps the reasoning on file. It is how you stay ready to pass the certification, not a substitute for it.

Owning the standard also means the standard can move. When a policy changes, you change the check and rerun the simulations, and the next thousand conversations are scored against the new bar. The record is what changes the conversation with a regulator or an auditor. Asked how you know the AI is behaving, you answer with evidence: the standard itself, every interaction scored against it, and a log of what failed and what changed. The same evidence covers the human team, held to the same bar.
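A minimal sketch of what a moving standard might look like as a record, assuming a versioned standard and an append-only ledger. `Standard`, `Ledger`, `change_check` and `score` are hypothetical names chosen for illustration, not a description of the product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Standard:
    version: int
    required_phrases: dict  # check name -> phrase a reply must contain


@dataclass
class Ledger:
    standard: Standard
    scores: list = field(default_factory=list)
    changes: list = field(default_factory=list)

    def change_check(self, name, phrase, reason):
        """Replace one check, bump the version, and log what changed and why."""
        old = self.standard.required_phrases.get(name)
        phrases = {**self.standard.required_phrases, name: phrase}
        self.standard = Standard(self.standard.version + 1, phrases)
        self.changes.append(
            {"version": self.standard.version, "check": name,
             "old": old, "new": phrase, "reason": reason}
        )

    def score(self, conversation_id, reply):
        """Score a reply against the standard in force right now."""
        for name, phrase in self.standard.required_phrases.items():
            self.scores.append(
                {"conversation": conversation_id, "check": name,
                 "standard_version": self.standard.version,
                 "passed": phrase.lower() in reply.lower()}
            )


# Hypothetical policy change: the disclosed fee moves from $2 to $3.
ledger = Ledger(Standard(1, {"fee_disclosure": "a fee of $2"}))
ledger.score("c1", "That transfer carries a fee of $2.")
ledger.change_check("fee_disclosure", "a fee of $3", reason="fee schedule updated")
ledger.score("c2", "That transfer carries a fee of $2.")
```

The same reply passes under version 1 and fails under version 2, and the ledger shows which bar each conversation was held to and why the bar moved.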

Raise your standard

Deployed this way, an AI agent asks a regulated business for less faith than its human operation ever did. You train a human team up front, sample its work occasionally, and trust the gaps between the samples. Lorikeet's instrumented Concierge proves its design before launch and keeps proving itself on every interaction after, with reasoning kept on file.

The flag and the measure were the same word for six hundred years. For industries that have always trusted evidence over assurances, raising the standard and proving you still clear it are now the same act.

ABN: 53 669 390 149
