Keeping your CX up when cloud providers fall down

Steve Hind

Jun 16, 2025

Last week, a major Google Cloud Platform (GCP) outage disrupted services across the globe. Among those affected was Anthropic, whose systems went offline. Thankfully, our customers’ ticket processing remained uninterrupted: our systems automatically failed over to another provider, ensuring continuous support for our customers (and their customers).

This incident underscores a critical point I’ve emphasized before: a key way that application-layer vendors add value is by providing customers with higher reliability than the raw infrastructural building blocks offer alone. This ensures continuity for end customers even as infrastructure continues to struggle to scale under fast-growing loads.

Resilience to upstream outages like these is something we have invested in and will continue to invest in as we scale. It's an expensive investment, but one that our customers need and deserve.

The fragility of single provider dependence

Relying solely on a single AI infrastructure provider is akin to putting all your eggs in one basket. Customers don’t see the backend complexities; they see a service that’s suddenly unavailable, leading to long wait times, frustration, and potential loss of trust. At the risk of stating the obvious, AI agents have very different reliability characteristics from human ones. AI agents don’t call in sick or quit at short notice. But they can all go down at once if a single service fails, whereas human agents aren’t all going to call in sick on the same day.

Designing for resilience

Everything AI-related is growing so fast right now that the reliability of infrastructure providers and foundation models is suffering. For example, Anthropic has had 99.34% uptime over the last 90 days, significantly less than the 99.999% (or 'five nines' in tech lingo) reliability we’ve come to expect from technology providers.
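To make the gap concrete, here is a small back-of-the-envelope calculation that converts those two uptime percentages into downtime over a 90-day window. The function name is purely illustrative, and the arithmetic assumes uptime is measured against wall-clock time across the whole window.

```python
def downtime_hours(uptime_percent: float, window_days: int = 90) -> float:
    """Hours of downtime implied by an uptime percentage over a window."""
    window_hours = window_days * 24
    return window_hours * (1 - uptime_percent / 100)


observed = downtime_hours(99.34)     # the 90-day figure quoted above
five_nines = downtime_hours(99.999)  # the 'five nines' benchmark

print(f"99.34% over 90 days: about {observed:.1f} hours of downtime")
print(f"99.999% over 90 days: about {five_nines * 60:.1f} minutes of downtime")
```

In other words, roughly 14 hours of downtime versus a little over one minute across the same period.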

That’s why at Lorikeet, we’ve architected our systems with redundancy at their core. Our AI agents are designed to handle complex, multi-step support requests, and they do so by leveraging a multi-provider infrastructure. This means that if one provider experiences issues, our systems seamlessly transition to another, ensuring that our clients’ support operations remain unaffected.
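As a rough illustration of why redundancy helps, the sketch below estimates the availability of a setup that stays up as long as at least one provider is up. It rests on two strong assumptions that real systems only approximate: provider failures are independent, and failover is instant and always succeeds. The second provider's availability is a hypothetical number chosen for illustration, not a measured figure.

```python
def combined_availability(availabilities: list[float]) -> float:
    """Availability when the service is up if ANY provider is up.

    Assumes independent failures and perfect, instant failover.
    """
    all_down = 1.0
    for availability in availabilities:
        all_down *= 1 - availability  # probability this provider is down too
    return 1 - all_down


# 0.9934 is the uptime quoted above; the second value is hypothetical.
combined = combined_availability([0.9934, 0.9934])
print(f"single provider: {0.9934:.4%}, two providers: {combined:.4%}")
```

Correlated outages (for example, two vendors that both depend on the same cloud) would make the real-world improvement smaller than this idealized estimate suggests.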

Our automated failovers rely on knowing, at the level of each LLM call, what the next-best model is. We do this based on a robust abstraction framework and a set of evals. We've made this investment because we're acutely aware of the trust our customers put in us, and need to ensure we honor it, instead of relying on an easy out like "Anthropic went down".
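The general pattern of call-level failover can be sketched in a few lines. This is not Lorikeet's implementation: `ModelRoute`, `ProviderError`, and the stand-in client functions are all hypothetical names invented for this example, and the preference order is hard-coded here, whereas in practice it would presumably come from evals like those described above.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderError(Exception):
    """Raised by a (hypothetical) provider client when a call fails."""


@dataclass
class ModelRoute:
    name: str                   # illustrative label, not a real model ID
    call: Callable[[str], str]  # hypothetical function that sends a prompt


def call_with_failover(prompt: str, routes: Sequence[ModelRoute]) -> tuple[str, str]:
    """Try each route in preference order; return (route name, response)."""
    errors = []
    for route in routes:
        try:
            return route.name, route.call(prompt)
        except ProviderError as exc:
            errors.append(f"{route.name}: {exc}")  # record, then try the next route
    raise ProviderError("all routes failed: " + "; ".join(errors))


# Stand-in clients for demonstration only.
def primary(prompt: str) -> str:
    raise ProviderError("simulated outage")


def fallback(prompt: str) -> str:
    return f"handled: {prompt}"


routes = [ModelRoute("primary", primary), ModelRoute("fallback", fallback)]
used, reply = call_with_failover("Where is my refund?", routes)
print(used, reply)
```

Keeping the fallback decision inside a single wrapper means each call site only needs to supply its own ordered list of routes, which is one way to express "next-best model per call".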

The broader implications

GCP wasn't alone. In the last thirty days, we’ve seen outages from:

Cloudflare
OpenAI
IBM Cloud
Microsoft Azure
Pinecone
LangChain

If you're building your own solution, you will need to ensure it's robust against future outages like these, further increasing the cost of building versus buying.

Moving forward

As we continue to build out the Lorikeet platform, we won’t just focus on capabilities. We’ll maintain our deep investment in reliability. At the end of the day, our AI agents, no matter how advanced, are only as effective as the infrastructure supporting them.

Book a call

See what Lorikeet is capable of

ABN: 53 669 390 149
