The false promise of self-training AI

Steve Hind | Jul 14, 2025

"Our AI learns from every interaction!" It's the holy grail promise in customer support AI sales pitches. Set it up once, let it learn from your team's feedback, and watch it get smarter over time. No ongoing training needed. No expert configuration required. Just thumbs up, thumbs down, and AI magic.

Unfortunately, it's a pipe dream.

Self-training AI systems are a vendor convenience disguised as a customer benefit. They reduce the vendor's support costs because they don’t need to help you configure the system properly. But they shift all the risk to you—the customer whose reputation is on the line with every AI interaction.

Here's why self-training systems fail, why explicit instruction-based systems work better, and why this matters especially for businesses with complex compliance requirements.

"Garbage in, garbage out" becomes inevitable at scale

The fundamental problem with self-training systems is that they learn from human feedback. And the feedback humans give is incomplete and inconsistent, especially when it's aggregated across a whole team.

Picture this: Your support team has ten agents. Agent A thinks a response is great because it's friendly and resolves the issue. Agent B thinks the same response is poor because it doesn't follow the exact script. Agent C gives it a thumbs up because they're rushing through feedback and it seems "good enough."

The AI gets contradictory signals about the same type of response. Over time, it learns... what exactly? A muddled average of inconsistent human judgment.
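To make that "muddled average" concrete, here's a minimal sketch (the agents, rubrics, and numbers are hypothetical, not from any real system) of what aggregated thumbs-up/down signals look like when reviewers apply different personal rubrics to the same kind of response:

```python
import random

random.seed(0)

# Three hypothetical agents reviewing the same "friendly but off-script"
# style of response, each applying a different personal rubric before
# collapsing their judgment into a single bit.
def agent_a(response):
    return 1  # values friendliness and resolution: thumbs up

def agent_b(response):
    return 0  # values strict script adherence: thumbs down

def agent_c(response):
    return 1 if random.random() < 0.8 else 0  # rushing: "good enough"

agents = [agent_a, agent_b, agent_c]

# Simulate 300 reviews of the same type of response.
votes = [random.choice(agents)("friendly, off-script reply") for _ in range(300)]
print(f"average label: {sum(votes) / len(votes):.2f}")
# Prints roughly 0.6: neither clear praise nor a clear correction.
```

A system trained on that signal can't tell whether off-script friendliness is a strength or a defect; the label sits in the ambiguous middle.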

Quality assurance becomes impossible when you can't control what the AI is learning from. You're essentially crowdsourcing your customer experience standards from whoever happens to be giving feedback that day. And when busy, scaling teams provide feedback, there's zero guarantee that inconsistent guidance won't mislead the model in ways you'll never detect.

You also lose the ability to transform and improve. Chances are a scaled QA team is dutifully applying the existing QA rubric. Who is asking "how can we do much better?" and focusing feedback on that?

Thumbs up/down systems are fundamentally flawed

Binary feedback is nearly useless for training AI systems to handle complex customer interactions.

A thumbs up tells you nothing about why something worked. Was it the tone? The accuracy? The speed of resolution? The specific information provided? A self-trained AI system effectively has to guess, and it will guess wrong.
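To see how much a single bit throws away, consider a hypothetical structured review alongside the thumbs-up that actually gets recorded (the field names below are illustrative, not any vendor's schema):

```python
# What a reviewer actually noticed about one response. None of these
# fields exist in a thumbs-only system; the binary label discards them.
review = {
    "tone": "friendly",
    "accuracy": "cited the wrong refund window",
    "resolution": "customer's issue solved on first reply",
    "policy_adherence": "skipped the required identity check",
}

# What the self-training system receives instead:
thumbs_up = True

# From one bit, the model must infer that the tone and resolution were
# good AND that the accuracy and policy problems were acceptable.
```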

Worse, agents routinely give thumbs up to responses that are "good enough" but not actually great. They're busy, they want to move on, and the response didn't cause any obvious problems. So the AI learns to optimize for "didn't break anything" rather than "delivered an exceptional experience."

The AI draws its own conclusions from this feedback, which may be completely wrong. You end up with a system optimized for mediocrity, not excellence.

Black box learning eliminates control

When an AI system learns on its own, you lose the ability to understand or control what it's actually doing.

You can't inspect what the AI learned from the feedback or why it's making specific decisions. When something goes wrong you have no way to trace back to the root cause. Did it learn something incorrect from bad feedback? Is it applying a rule in the wrong context? You have no idea.

Debugging becomes impossible. You're reduced to hoping that the next round of feedback somehow fixes mysterious problems you can't even identify. Meanwhile, your customers are experiencing the consequences of these invisible failures in real time.

Explicit instructions beat "smart" systems

Instruction-based systems work because they're transparent and controllable.

Instead of hoping the AI figures out what you want from indirect feedback, you tell it exactly how to handle different situations. You specify your escalation criteria, your tone of voice, your policy exceptions, your compliance requirements.
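As an illustration (the structure and field names here are hypothetical, not any particular product's schema), explicit instructions look like a reviewable config rather than learned weights:

```python
# Hypothetical instruction set for a support AI. Every behavior is a
# line a subject matter expert can read, question, and change.
INSTRUCTIONS = {
    "tone": "warm, concise, no slang",
    "escalate_if": [
        "customer mentions legal action",
        "refund requested above $500",
        "account shows a fraud flag",
    ],
    "policy_exceptions": {
        "late_refund": "allow up to 14 days past the window for first-time requests",
    },
    "compliance": [
        "never give medical or investment advice",
        "always verify identity before discussing account details",
    ],
}
```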

Changes are transparent and auditable. You know exactly what you changed and why. When something isn't working, you can pinpoint the specific instruction that needs adjustment.

Subject matter experts—people who actually understand your business and customer needs—can review and refine the instructions. You're not outsourcing your customer experience standards to an algorithm's interpretation of thumbs up/down feedback.

You maintain control over your CX standards instead of abdicating them to a black box.

Compliance and risk management require predictability

For businesses in regulated industries, self-training systems are a compliance nightmare.

Financial services, healthcare, and other regulated businesses need to explain their AI's decision-making to auditors and regulators. "The AI learned from feedback" isn't an acceptable explanation when a regulator asks why a customer was denied service or given incorrect medical information.

Self-training systems make compliance audits nearly impossible because you can't document or explain the decision-making process. The AI's reasoning is buried in layers of algorithmic interpretation that even the vendor can't fully explain.

Explicit instructions create a clear audit trail. You can show exactly what rules the AI follows and why. Risk management requires knowing exactly how your AI will behave in edge cases, not hoping it learned the right lessons from past feedback.
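Here's a minimal sketch of what that audit trail could look like in practice (the rule IDs, ticket fields, and log format are assumptions for illustration, not a real system):

```python
import datetime

# Hypothetical explicit rules, each with a stable ID an auditor can cite.
RULES = {
    "ESC-01": lambda t: "legal action" in t["message"].lower(),
    "ESC-02": lambda t: t["refund_amount"] > 500,
}

def decide(ticket):
    """Escalate if any rule fires, logging exactly which rule and when."""
    for rule_id, predicate in RULES.items():
        if predicate(ticket):
            print(f"{datetime.datetime.now().isoformat()} ticket={ticket['id']} "
                  f"decision=escalate rule={rule_id}")
            return "escalate"
    print(f"{datetime.datetime.now().isoformat()} ticket={ticket['id']} decision=auto_resolve")
    return "auto_resolve"

decide({"id": "T-1042", "message": "I want a refund", "refund_amount": 750})
# -> decision=escalate rule=ESC-02: the auditor sees exactly which rule fired.
```

When a regulator asks why ticket T-1042 was escalated, the answer is a rule ID and a timestamp, not "the model learned it from feedback."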

The vendor incentive problem

Here's the part vendors won't tell you: self-training systems are primarily designed to reduce their operational costs, not improve your results.

When you buy a self-training system, the vendor doesn't need to invest in helping you configure it properly. They don't need subject matter experts who understand your business. They don't need to provide ongoing training optimization services.

They can just say "let it learn from your feedback" and walk away. It's sold as "advanced AI" but it's actually lazy product development.

The promise of "set it and forget it" AI is appealing but unrealistic, especially for complex businesses. Great customer experiences require intentional design and ongoing refinement.

Self-training isn't the future of AI—it's a shortcut that puts your customer relationships at risk. Don't let vendors convince you otherwise.

"Our AI learns from every interaction!" It's the holy grail promise in customer support AI sales pitches. Set it up once, let it learn from your team's feedback, and watch it get smarter over time. No ongoing training needed. No expert configuration required. Just thumbs up, thumbs down, and AI magic.

Unfortunately, it’s a pipedream.

Self-training AI systems are a vendor convenience disguised as a customer benefit. They reduce the vendor's support costs because they don’t need to help you configure the system properly. But they shift all the risk to you—the customer whose reputation is on the line with every AI interaction.

Here's why self-training systems fail, why explicit instruction-based systems work better, and why this matters especially for businesses with complex compliance requirements.

"Garbage in, garbage out" becomes inevitable at scale

The fundamental problem with self-training systems is that they learn from human feedback. And the feedback humans give is incomplete and inconsistent, especially when giving feedback across a team.

Picture this: Your support team has ten agents. Agent A thinks a response is great because it's friendly and resolves the issue. Agent B thinks the same response is poor because it doesn't follow the exact script. Agent C gives it a thumbs up because they're rushing through feedback and it seems "good enough."

The AI gets contradictory signals about the same type of response. Over time, it learns... what exactly? A muddled average of inconsistent human judgment.

Quality assurance becomes impossible when you can't control what the AI is learning from. You're essentially crowdsourcing your customer experience standards from whoever happens to be giving feedback that day. And when busy, scaling teams provide feedback, there's zero guarantee that inconsistent guidance won't mislead the model in ways you'll never detect.

You also lose the ability to transform and improve. Chances are a scaled QA team are dutifully applying the existing QA rubric. Who is asking “how can we do much better” and focusing feedback on that?

Thumbs up/down systems are fundamentally flawed

Binary feedback is nearly useless for training AI systems to handle complex customer interactions.

A thumbs up tells you nothing about why something worked. Was it the tone? The accuracy? The speed of resolution? The specific information provided? A self trainedThe AI system effectively has to guess, and it will guess wrong.

Worse, agents routinely give thumbs up to responses that are "good enough" but not actually great. They're busy, they want to move on, and the response didn't cause any obvious problems. So the AI learns to optimize for "didn't break anything" rather than "delivered an exceptional experience."

The AI draws its own conclusions from this feedback, which may be completely wrong. You end up with a system optimized for mediocrity, not excellence.

Black box learning eliminates control

When an AI system learns on its own, you lose the ability to understand or control what it's actually doing.

You can't inspect what the AI learned from the feedback or why it's making specific decisions. When something goes wrong you have no way to trace back to the root cause. Did it learn something incorrect from bad feedback? Is it applying a rule in the wrong context? You have no idea.

Debugging becomes impossible. You're reduced to hoping that the next round of feedback somehow fixes mysterious problems you can't even identify. Meanwhile, your customers are experiencing the consequences of these invisible failures in real time.

Explicit instructions beat "smart" systems

Instruction-based systems work because they're transparent and controllable.

Instead of hoping the AI figures out what you want from indirect feedback, you tell it exactly how to handle different situations. You specify your escalation criteria, your tone of voice, your policy exceptions, your compliance requirements.

Changes are transparent and auditable. You know exactly what you changed and why. When something isn't working, you can pinpoint the specific instruction that needs adjustment.

Subject matter experts—people who actually understand your business and customer needs—can review and refine the instructions. You're not outsourcing your customer experience standards to an algorithm's interpretation of thumbs up/down feedback.

You maintain control over your CX standards instead of abdicating them to a black box.

Compliance and risk management require predictability

For businesses in regulated industries, self-training systems are a compliance nightmare.

Financial services, healthcare, and other regulated businesses need to explain their AI's decision-making to auditors and regulators. "The AI learned from feedback" isn't an acceptable explanation when a regulator asks why a customer was denied service or given incorrect medical information.

Self-training systems make compliance audits nearly impossible because you can't document or explain the decision-making process. The AI's reasoning is buried in layers of algorithmic interpretation that even the vendor can't fully explain.

Explicit instructions create a clear audit trail. You can show exactly what rules the AI follows and why. Risk management requires knowing exactly how your AI will behave in edge cases, not hoping it learned the right lessons from past feedback.

The vendor incentive problem

Here's the part vendors won't tell you: self-training systems are primarily designed to reduce their operational costs, not improve your results.

When you buy a self-training system, the vendor doesn't need to invest in helping you configure it properly. They don't need subject matter experts who understand your business. They don't need to provide ongoing training optimization services.

They can just say "let it learn from your feedback" and walk away. It's sold as "advanced AI" but it's actually lazy product development.

The promise of "set it and forget it" AI is appealing but unrealistic, especially for complex businesses. Great customer experiences require intentional design and ongoing refinement.

Self-training isn't the future of AI—it's a shortcut that puts your customer relationships at risk. Don't let vendors convince you otherwise.

Blu background with lorikeet flypaths

Ready to deploy human-quality CX?

Businesses with the highest CX standards choose Lorikeet's AI agents to

solve the most complicated support cases in the most complex industries.

Blu background with lorikeet flypaths

Ready to deploy human-quality CX?

Businesses with the highest CX standards choose Lorikeet's AI agents to

solve the most complicated support cases in the most complex industries.