Rising to the AI resilience challenge

  • June 2026

As if risk officers, crisis leaders, and operational resilience teams didn’t have enough to keep them awake at night, they now also need to stay in control of their AI-enabled services. Fortunately, with the right safeguards, they can build and maintain the AI resilience and trust their organisations will need as their AI footprint expands.

Keeping control as AI goes mainstream

AI is no longer sitting at the edge of the enterprise as an experimental technology. Increasingly it’s embedded in decision-making, workflows, and critical services. This shift from model risk to operational dependency changes the stakes: when AI fails, the consequences are no longer confined to technical teams. Operational continuity, service delivery, and decision quality can all be disrupted, ultimately impacting customer experience, leadership accountability, business outcomes, and stakeholder confidence.

As AI-driven autonomy – especially agentic AI – scales at pace, organisations will require a discipline to detect and contain failure, recover in a controlled manner, and retain clear accountability when systems behave in unanticipated ways. That discipline is AI resilience.

Why AI has become a core continuity issue

AI failures take many forms, from a system generating hallucinated results that go unchecked to the unexpected consequences of AI coding agents using unorthodox methods. Given that we’re at the early stages of the AI journey, these issues are likely to proliferate.

Often, AI failures don’t appear as obvious system outages. They can appear as: silent drift; diminished decision quality; brittle dependencies; overloaded autonomous processes; hidden supply-chain weaknesses; and loss of explainability under pressure. That’s why AI can’t be treated purely as a model governance issue. For crisis and resilience leaders, the real question is what AI failure means in practical terms for their organisation. What are its impacts? And can the organisation continue to operate with credibility when there’s uncertainty about how its AI will behave?

What do we mean by AI resilience?

Before digging deeper, it’s important to clarify our terminology. AI risk is focused on reducing the likelihood of disruption. AI resilience is the ability to detect and contain disruption when it occurs, continue operating, and recover, while staying within impact tolerances and preserving control, accountability, and decision integrity. And AI trust is built when organisations prove they can withstand failure without losing control or credibility under operational, board, customer, and regulatory scrutiny.

It’s not about eliminating all failure – that would be neither realistic nor practical. Traditional incident models simply aren’t designed for AI systems that can degrade silently or fail across dependencies. To close the gap, AI resilience deploys practical capabilities across architecture, evaluations, incident readiness, and supply-chain oversight. This is how organisations earn AI trust at scale.

Three steps to build AI resilience in practice

Every organisation can get started by taking three practical steps to manage AI risk, build an effective AI resilience culture, and establish strong foundations for AI trust.

  1. Enable operational and architectural readiness
    Resilient AI architecture protects continuity, not just performance. That’s why it’s imperative to adopt an AI-resilience mindset and focus on design choices that allow systems to fail safely rather than expansively. As with cyber-resilience, there’s no silver bullet. But the better prepared your organisation is with clear guardrails and a comprehensive, well-rehearsed plan B, the better able you’ll be to retain control and recover from a compromise.

    It’s also about who takes ownership – for example, is it a CTO, CISO, or CRO issue? In practice, you need a plan with effective checks and balances. Typically, these will include fallback pathways; rollback and reversion routes; human intervention points; defined escalation thresholds; scenario testing for drift, overload, corruption and dependency failure; safe degraded modes of operation; and clear ownership of decision override and intervention authority.
  2. Embrace behavioural evaluation and resilience testing
    By its nature, AI resilience can only be proven in adverse conditions – whether stress, ambiguity, misuse, or any number of challenges. Organisations need an initial view of where AI is embedded, where dependencies are forming, and whether existing controls, ownership and escalation routes remain fit for purpose. Rigorous AI behavioural evaluation and testing can then provide a more dynamic way to understand how AI-enabled systems perform over time and ensure AI does not become a blind spot in wider operational or cyber-resilience initiatives.

    Adversarial testing and stress-testing are critical to probe dependency weaknesses, pinpoint vulnerabilities, and check if controls are strong enough to surface weak signals before they escalate. The potential for agentic misalignment also needs to be identified early: for instance, where an agentic system could contravene company policy, sidestep ethical boundaries, or operate counter to corporate culture to achieve a goal.
  3. Embed incident readiness and supply-chain oversight
    Modern operating environments are complex and multifaceted. What if an AI agent used by a third- or fourth-party supply chain partner isn’t delivering the outcomes you’d expect, or introduces toxicity that becomes a wider risk? Or if the AI running a core part of your business is no longer supported by its provider? Resilient organisations don’t just build AI into their operations. They prepare to investigate, contain, and explain it so that they can respond to and recover from AI failure.

    Comprehensive incident readiness means embedding clear triggers and escalation routes along with forensic capture of logs, lineage, and model states to support investigation, attribution, and defensible decision-making. It also demands accountability across models, APIs, cloud services, and open-source components; clarity around crisis roles and hand-offs; continuity checks on suppliers and critical dependencies; and evidence and disclosure readiness under pressure.
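
For teams that want to see how a few of these safeguards might fit together, the Python sketch below is one possible illustration rather than a recommended design. It wraps a call to an AI service with a defined escalation threshold, a fallback pathway for safe degraded operation, and an audit record that captures the model version and the route taken. Every name in it (`guarded_call`, `AuditRecord`, the `0.7` threshold, the `hypothetical-v1` version label) is an assumption made for illustration, and it assumes the AI service returns a confidence score alongside its answer.

```python
import json
import time
from dataclasses import dataclass, asdict
from typing import Callable, List, Tuple


@dataclass
class AuditRecord:
    """One forensic entry per request (hypothetical schema)."""
    timestamp: float
    model_version: str
    request: str
    route: str          # "ai", "human_review" or "fallback"
    confidence: float
    detail: str


def guarded_call(
    request: str,
    ai_model: Callable[[str], Tuple[str, float]],  # returns (answer, confidence)
    fallback: Callable[[str], str],                # safe degraded mode
    audit_log: List[AuditRecord],
    model_version: str = "hypothetical-v1",        # assumed label
    escalation_threshold: float = 0.7,             # assumed threshold
) -> str:
    """Use the AI answer only when it clears the threshold; otherwise degrade safely."""
    try:
        answer, confidence = ai_model(request)
        route, detail = "ai", "confidence at or above threshold"
        if confidence < escalation_threshold:
            # Below threshold: flag for human review and do not use the AI answer.
            route, detail = "human_review", "confidence below threshold"
    except Exception as exc:
        # Model or dependency failure: switch to the fallback pathway.
        answer, confidence = "", 0.0
        route, detail = "fallback", f"model error: {exc}"

    if route != "ai":
        answer = fallback(request)

    # Record the route taken so the decision can be investigated later.
    audit_log.append(
        AuditRecord(time.time(), model_version, request, route, confidence, detail)
    )
    return answer


if __name__ == "__main__":
    log: List[AuditRecord] = []
    guarded_call("approve claim?", lambda r: ("approve", 0.4), lambda r: "refer to handler", log)
    print(json.dumps(asdict(log[-1]), indent=2))
```

In this sketch a low-confidence answer and an outright failure both end in the same safe response, but they are logged as different routes, so an investigation can distinguish a model that was unsure from a dependency that was down.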

Of course, all of these steps require resilience leaders to look inward. As AI becomes embedded across operations, the boundaries between operational resilience, cyber resilience, business continuity and crisis management become less distinct. AI can strengthen signal detection, monitoring, triage, analysis and response coordination, but only if the resilience function itself is integrated, enabled and able to maintain control over the AI-enabled environment it increasingly relies on.

Scale AI with confidence

AI resilience is not a brake on innovation. On the contrary, it’s key to building the AI trust every organisation will need to embrace and scale AI with confidence. Because ultimately, it won’t be the organisations that deploy AI fastest that benefit from it the most. It will be the ones that can absorb AI shocks without losing control, continuity, accountability, or trust.

Are you ready to rise to the AI resilience challenge?

Contact us

Neil Houston

Director, PwC United Kingdom

Tel: +44 (0)7808 105638

James Houston

Crisis and resilience Partner, PwC United Kingdom

Tel: +44 (0)7876 207850
