Managing AI Incidents

A practical approach to preparing for when, not if, you face an AI safety incident

Sep 01, 2025

In late 2022, Air Canada’s website chatbot told a grieving traveller that he could buy now and claim a bereavement refund later. He did. Months on, the airline refused the refund and even argued they were not liable because the bot was a “separate legal entity.” A tribunal disagreed, ordered compensation, and made the obvious explicit: if an AI system speaks in your name, you own what it says and does.1

A year later, a Cruise robotaxi in San Francisco struck and then dragged a pedestrian as it attempted to pull over, an incident I described in depth in one of the first articles here2. Regulators blasted Cruise’s incomplete reporting; California pulled permits and federal authorities first imposed stricter oversight, and later major fines. The technical failure was bad, but the response failures made it much worse. The incident led directly to the corporate collapse of Cruise.

Now contrast those with two examples of faster containment and clearer ownership. When early “Sydney” conversations with Bing Chat went off‑track in February 2023, Microsoft was quick to cap session length to 5 prompts and put in place some tightened guardrails. It wasn’t pretty but it worked, limiting blast radius while they worked on the fix3. And when a Redis library bug exposed fragments of other users’ ChatGPT data in March 2023, OpenAI took the service offline, published a technical post‑mortem, and notified affected users4.

In AI, incidents aren’t hypothetical, they’re not even rare. The complexity and non-deterministic nature of AI makes incidents inevitable, even predictable. In this article, I’m going to share some of my work and experiences on how to prepare for that inevitable day when your AI System hits a major glitch. I’ll borrow from some real cases, with the help of the AI Incident Database5, where you’ll find three of these four cases (entries #639, #726, #503) and over a thousand more. By sharing these real world problems and practices, I hope you can reduce the pain and damage of messy crisis response, perhaps even avoid having your own AI system included as an avoidable database entry.

Why AI Incident Response Matters

Unlike traditional software bugs or IT outages, AI failures often involve complex, emergent behavior. A chatbot suddenly professing love to a user or threatening them, as happened with the Bing AI Chatbot6, isn’t a typical “error”. It’s an AI-specific incident born of the system’s design and training data. Similarly, an AI quietly leaking sensitive data isn’t a straightforward server misconfiguration; it’s a byproduct of how the AI was set up to learn.

In traditional IT, incident response is a well-honed discipline. Product teams and IT have playbooks for server outages or cybersecurity breaches. But AI incidents really stretch the playbook. They can involve questions of ethics, safety, and trust that are highly subjective, not just uptime or a clear data loss event. They might stem from a model doing exactly what it was designed to do, but in a context the designers never anticipated. It’s not always easy to determine if a reported issue even should be treated as an incident or not. But any organisation deploying or adopting non-trivial AI systems in their organisation should assume that sooner or later something will go wrong, whether it’s a minor hiccup or a major event. The difference between a contained incident and a crisis often comes down to preparedness.

That’s where an AI Incident Management Policy comes in. A robust policy, supported by genuine commitment from leadership, provides a framework to detect problems early, respond rapidly and effectively, and learn from failures. This definitely isn’t about bureaucratic box-checking or appeasing regulators with a document on the shelf, nor can it be about a manual process disconnected from the speed and scale of AI. Engineers hear the word ‘policy’ and cringe, assuming this means yet another corporate-drafted, disconnected, pointless document. But this isn’t about the paperwork, it’s about the act of making decisions and building mechanisms in advance that are fit for the scale and speed of AI. It’s about protecting your customers, your stakeholders, and your company’s mission when the unexpected happens. It’s also about reinforcing trust: how you handle an AI failure is the ultimate test of your proclaimed values around safe, secure and lawful AI. As I’ve discussed in earlier parts of this series, it’s not enough to build AI with good intentions; you need guardrails for when things slip through. Incident response is the safety net of your AI governance program.

I’ve found that there’s a real lack of published practical guidance on how to do AI incident management well. Established frameworks like NIST SP 800-61 Rev. 37 and the SANS Incident Handling process8 are excellent starting points borrowed from cybsecurity; by contrast, ISO/IEC 270359 often feels too bureaucratic to be practical. What’s still missing is hands-on guidance for AI-specific incidents.

So this is my attempt to pull together some research, and combine it with my own experience to guide an AI incident management policy that can be effective in practice. I’ll emphasise a few key themes in this first article: a high-integrity mindset, recognising the human and cultural factors that can make or break incident handling, setting up detection and escalation mechanisms (both human and technical) to catch issues quickly, and making sure that every incident becomes an opportunity to improve.

In a second article, I’ll walk through some practical tools like severity classification, response playbooks, communication plans, and how all of this ties back into the AI lifecycle and broader risk management. By the end, I hope you’ll have a clearer picture of how to turn the chaos of an AI incident into an orderly process. Anyone who has worked in large IT system operations will tell you: incidents can be traumatic (and too frequently they are) or they can be a learning opportunity. They can even be enjoyable. Some of my fondest working memories have been in the thick of an incident crisis! I’m hoping these two articles help you experience more of the joy, less of the trauma.

Incident management can’t be a compliance checkbox

First, let’s talk governance mindset. An AI incident response policy is only as good as the governance environment around it. If it’s treated as just another checkbox on a compliance list, its value evaporates the moment it’s truly tested. We’ve all seen scenarios where policies exist on paper but not in spirit: safety procedures that employees quietly bypass, or guidelines that get lip service when convenient. A superficial incident response plan might get you a shiny ISO42001 certificate, and even satisfy a regulator in an inspection, but in a real crisis it will show its cracks. High-integrity governance, on the other hand, means the organisation is genuinely committed to doing the right thing even when no one is looking, or when everyone is looking, as during a major incident.

What does high-integrity incident management look like? It starts at the top. Leaders set the tone by insisting on transparency, accountability, and learning, rather than finger-pointing or reputation management at all costs. They allocate adequate resources to incident response preparation (like training and drills) before an incident forces their hand. They empower teams to act swiftly in the event of an AI anomaly, even if that means pausing a flagship launch or admitting an embarrassing mistake. This is the opposite of the “bury it and hope no one notices” approach. In fact, hiding an AI failure is nearly always worse in the long run – not only is it unethical, it often backfires as the truth comes out. Organisations that respond to AI incidents with candor tend to maintain more trust, even if the incident itself is serious.

High-integrity governance also means resisting the urge to create policies just to satisfy external requirements while neglecting their implementation. I have witnessed firsthand teams that setup an incident response policy because a new regulation requires it (perhaps as an interpretation of EU AI Act Article 17.1 and 26.5), but then treat it as a formality. The policy gets filed away, responders are never trained, no drills are run, and no one really expects to use it. That is checkbox compliance at its worst. The whole effort is wasted when an incident actually hits, or even worse it undermines actions to respond because everyone is confused. By contrast, a company devoted to real governance will treat the policy as a living mechanism, updating it with lessons learned, integrating it into everyday operations, automating alerts and metrics, and evaluating its effectiveness regularly. In practice, it means doing things like internal audits or simulations to verify your AI incident processes work, not just asserting that they exist.

Cultural failure modes: Good tech and Bad habits

Even the best-written policy can be undermined by a poor organisational culture. In fact, if you’ve read my previous articles, you’ll know that I often highlight how often cultural failure modes are the real, hidden culprits behind disastrous incident responses. The history of safety engineering widely shows how those technical or human failures that turn into major incidents are almost always preceded by cultural failures. What do I mean by cultural failure modes? Basically these are normalised human behaviors within an organisation that cause delays, underreaction, or missteps in handling an incident. Let me give you a few examples - see which ones you recognise:

Underreporting or ignoring early signs: Perhaps you’re the engineer who notices a worrying trend, say, an uptick in user complaints about biased outputs from a model, but you hesitate to raise a flag. Maybe you fear being blamed for a mistake, or assume someone else will notice. In a culture without psychological safety, minor incidents or near-misses get swept under the rug. This underreporting means the organisation loses precious lead time to address an issue before it blows up. A good safety culture, by contrast, encourages surfacing even potential problems. Teams have to feel safe saying, “I think something’s off here,” without fear of punishment.
Over-reporting and over-escalation. The converse is just as corrosive. When people fear being blamed for not escalating, lack clarity in decision making authority, or lack the empowerment to simply act and resolve, everything becomes an alarm. My former colleagues at Amazon Web Services sure know how that can feel. Alarm fatigue sets in, triage channels melt down, more time is spent on escalation comms than actually fixing the problem, and teams lose the ability to prioritise real incidents.
Delay and Denial: This is the “maybe it will go away if we wait” syndrome. When an incident becomes evident, there’s a natural human impulse to minimise its significance, especially if careers, performance ratings or reputations are on the line. Managers might delay escalating the issue to senior leadership, hoping to fix it quietly. Or they might downplay its severity: “It’s just a glitch, nothing to worry about,” until it’s undeniably something to really worry about. In fast-moving AI incidents, every hour of denial can amplify damage.
Minimisation and Spin: Once an incident is public or reaching leadership, another cultural trap is spinning the narrative instead of confronting the facts. Communications teams (and often lawyers) might instinctively try to control the story, which of course is their job and sensible, but if spin crosses into misrepresentation or omission, you’re in trouble. Just ask the Cruise GM team. Minimisation can also happen internally: calling a major failure a “hiccup” or an “edge case” when it’s actually symptomatic of deeper issues. This mindset prevents allocating the necessary urgency and resources to truly fix the problem. Culturally, the better approach is to just confront reality head-on. Being frank about how bad it is can start to rebuild trust because people forgive failures; they don’t forgive cover-ups.
Blame Games: Oh, the scapegoat! Sometimes the hunt for a scapegoat begins as soon as something goes wrong. Was it the data scientist’s fault? The product manager’s oversight? The senior engineer who (conveniently) left to join a competitor last month. Such finger-pointing is toxic to effective incident response. It causes team members to go defensive, hiding information or deflecting, rather than collaborating to resolve the issue. It also discourages others from coming forward with related problems (“I don’t want to be the next person thrown under the bus.”). A healthy incident response culture adopts a “no-blame post-mortem” ethos (except in cases of willful misconduct). The focus has to be on what systemic factors enabled the incident, not which individual to punish. Many tech organisations have learned from DevOps and Site Reliability Engineering practices: blameless retrospectives lead to more honesty and more learning, which in turn prevents future incidents.

To build real resilience, you need to codify the behaviours you want, which ones you will coach, and which you just won’t accept. I’m a huge fan of Sydney Dekker’s Just Culture10 approach, which frames:

Encouraged behaviours (reward): early reporting of anomalies and near-misses; “stop-the-line” when in doubt; thorough logging; proactive red-teaming; raising ethical concerns even if inconvenient. An organisational approach of public praise, visible credit in performance reviews, reduced career risk for candid reporting all go a long way to encouraging this behaviour.
At-risk behaviours (coach): well-intentioned shortcuts (skipping a peer review to hit a date), “escalate everything” to avoid blame, weak data provenance, bypassing non-blocking alerts. This requires coaching and system fixes, like removing perverse incentives, clarifying guardrails, and improving tooling, not punishment.
Reckless behaviours (sanction): knowingly disabling safety controls; deploying unapproved models/data; misrepresenting facts of an incident; suppressing reports; retaliating against reporters. These need explicit, visible consequences (up to removal), mandatory disclosure where required, and clear reinforcement that integrity beats velocity. The unfortunate difficulty is that these kinds of behaviours are most commonly performed by ‘brilliant jerks’, talented senior engineers and domain experts who confidently sidestep process because they know better. The answer is always the same - remove them, the sooner the better.

Operationalising this matters more than posters or policies. To make it stich, you have to bake it into playbooks and training ( especially table-top exercises and drills), give front-line owners explicit rollback/kill-switch authority, and route low-severity items to batch triage so real-time paging is reserved for Sev-3/4. It starts to become engrained and begins to seem as easy as this: speak up early, contain the problem fast, follow the process, tell the truth, and we all get safer over time.

Early Detection: Human and Technical Eyes on Glass

It seems obvious, but the best way to handle an incident is to catch it before it causes major damage, or even better before it fully materialises. Early detection and rapid response can mean the difference between a minor hiccup and a full-blown crisis. To achieve that, you need a combination of technical monitoring and human oversight keeping watch on your AI systems. Think of it as a safety net with two layers: machines catching anomalies that humans might miss, and humans catching context and implications that automated monitors might not understand.

Technical Monitoring and Proactive Detection

AI systems, especially those in production, should be instrumented with robust monitoring just like any mission-critical software. But beyond uptime and performance, we need to monitor for AI-specific anomalies. What might those be? For a machine learning model, it could be a drift in input data distribution or a sudden change in output patterns. For example, if a content moderation AI that usually flags 5% of posts suddenly flags 15% one day, that’s a signal worth investigating. Likewise, an uptick in confidence scores dropping below a threshold might indicate the model is unsure in novel situations. Monitoring tools can be set to alert on these metrics. Another example: if our chatbot starts seeing a lot of user messages like “Are you okay?” or “That was offensive,” (basically user feedback signals), that could be text analytics to trigger an alert that the bot is misbehaving or at least not providing satisfying answers.

Beyond model metrics, it makes sense to consider systemic checks: data pipelines, latency spikes, unusual memory or CPU usage. An AI incident might start as a technical anomaly, e.g., a spike in API errors from an image recognition service might indicate it’s failing on certain inputs (maybe a new kind of image that breaks it). So existing application performance monitoring and logging infrastructure needs to be extended to cover AI-specific logs and metrics. Specialised model monitoring systems might make sense to detect bias drift or performance degradation over time.

The tough thing is that some AI incidents have no obvious numeric signature – the outputs are syntactically fine but semantically wrong or harmful. For those, automated detection is harder. This is where red teaming can help running stress tests and adversarial simulations (like prompt injections, or testing extreme inputs) to identify how the AI might fail. But even post-deployment, running periodic simulated attacks or weird inputs in a sandbox can reveal vulnerabilities proactively. If you find out, for example, that by phrasing a question a certain way the chatbot reveals other users’ data, you’ve caught a silent leak scenario before a malicious actor does.

Human-in-the-Loop Monitoring

As good as automated monitors get, we still need human eyes on the AI’s behaviour in the real world. Humans notice qualitative issues, like a reply from the chatbot that just feels off, or a vision system making a common-sense error that numbers might not flag. This is why it makes sense to roll out AI features gradually (e.g. to a small percentage of users or in a pilot phase) with enhanced human oversight. During that period, staff might manually review a sample of AI outputs daily or use internal channels where employees can report odd behavior easily. User feedback is gold for detection. Even a “Report this response” button on a chatbot gives early warning if multiple users report inappropriate answers

Another aspect of human monitoring is cross-functional oversight, having your legal or compliance teams periodically check what the AI is doing in practice. They might spot, say, that the AI is inadvertently mentioning protected health information in logs, or that it’s drifting into a regulated domain without the proper controls. Domain experts can notice content flaws that a layperson might not catch.

Early detection hinges on having clear thresholds and triggers defined in your policy. Decide in advance what kinds of anomalies warrant pausing a system. For example, “If more than 5% of outputs in an hour are flagged by users, shut down the AI service and escalate to the incident team.” Or “If an AI system experiences any unauthorised data exposure, treat it as a security incident immediately.” Predefining these triggers removes hesitation in the heat of the moment. It gives your operations people the mandate to hit the big red button when certain criteria are met, without waiting for a committee decision. Yes, this might result in a false alarm once in a while (shutting down for something that turned out benign), but that’s a small price to pay.

Clear escalation paths and assigned roles

Once an incident (or even a suspicious anomaly) is detected, who ya gonna call? If your answer is “Umm… maybe the AI team? Or IT support? Not sure…,” then you have a problem. Clarity in roles and escalation paths is absolutely essential in incident response. Every member of the organisation should know, broadly, what to do and whom to alert if they spot an AI incident. And the people directly responsible for managing the incident should have no ambiguity about their responsibilities and authority.

I think the first step is to create an AI Incident Response Team. I am not talking about creating an actual new dedicated group of personnel, unless you work for a really large company. For most organisations, it makes most sense to leverage the existing incident management structure (like the IT incident or security incident team) and augment it with AI expertise. Typical roles that need to be assigned might include:

Incident Leader / Coordinator: This is the person who runs the show when an incident is declared. This could be the AI Technical Lead or an assigned Incident Manager for AI. They are responsible for assessing the situation, assembling the needed experts, and driving the response process end-to-end. They’re the point person for all internal communications about the incident and ensure things don’t fall through the cracks. Importantly, they need the authority to make quick decisions (e.g. shutting down a system) in the early stages.
Technical responders (AI engineers / data scientists): These are the people who know the system intimately – the ones who can dig into model behavior, check logs, reproduce problems, and eventually fix the issue. Often it’s the product’s engineering team, but you might also have specialised ML engineers or researchers to consult if it’s a complex model issue. They focus on containment and remediation from the technical angle: can we stop the AI from causing more harm right now (disable a feature, revert to a previous model version, apply a patch)? And then, what’s needed to properly fix or retrain it.
Domain experts: If the incident involves a specific domain (e.g. a medical AI misdiagnosing), bring in a doctor or medical expert; if it’s a financial model error, get a finance risk expert involved. They help assess impact (“Is this mistake life-threatening or just a minor inconvenience?”) and guide the correct course of action relative to the domain’s norms and regulations.
Communications and PR: Don’t underestimate their importance for medium or high-severity incidents. This role crafts the messages that go out to customers, the public, or even internal audiences. They should be looped in early to prepare holding statements or FAQs while the tech folks are still diagnosing the issue. In the initial hours, even “We are aware of an issue and are investigating” might be a critical message to get out. The comms person/team will also advise on how to communicate transparently but carefully – balancing honesty, liability, and reassurance.
Legal/Compliance: If there’s any regulatory, legal, or ethical dimension (and in AI incidents there often is), legal advisors need to be at the table. They’ll check if there are statutory obligations (e.g. data breach notification laws, product safety reporting requirements) triggered by the incident. They also help with the wording of public statements to ensure accuracy and to avoid unnecessary admissions of liability while still being truthful. Importantly, they’ll be taking the long view of any legal exposure: “We need to inform regulator X within 72 hours”, or “This incident might lead to litigation, let’s document carefully and preserve evidence.”
Business Owner / Product Manager: The person responsible for the AI product or service from a business standpoint should be involved. They provide perspective on user impact, can help prioritise decisions, and will coordinate any customer support or business continuity issues (for example, ensuring there’s a manual fallback if an automated process is offline). They’ll also be the one to communicate with any key clients or stakeholders one-on-one if needed.
Executive Oversight: For critical incidents, senior leadership should be notified and possibly involved in decision-making. Many organisations set thresholds: e.g., a Critical (Severity 4) incident mandates notifying the CEO or a responsible executive within an hour, and perhaps forming a crisis management team including them. Executives need to be in the loop for major strategic calls – like approving a public statement that might have market implications, or committing budget/resources for emergency measures (say, bringing in an external firm to help, or offering compensation to affected users). Having an executive already identified as the sponsor for AI governance helps; that person will take the lead in briefing the rest of leadership.

Next, define your escalation pathways. This means establishing how an incident is declared and how it moves up the chain. A typical flow might be: front-line personnel (maybe an on-call engineer or customer support lead) detects a potential incident and immediately notifies the Incident Leader and/or an on-duty incident hotline. The Incident Leader does a quick severity assessment (more on classification soon) and if it’s above a certain threshold, they formally declare an incident, which triggers paging the rest of the response team. If it’s truly critical, top executives and the board might be alerted within a day or less. The key is that this is pre-defined. During an incident is not the time to be figuring out “Do we tell the CTO about this? Should legal be on this call?”

Escalation isn’t only vertical (to higher-ups) but also horizontal: involving the right adjacent teams. For example, if your AI incident might have a cybersecurity angle, you need to loop in the security incident response team right away. If the incident affects a particular client significantly, account managers might need to be notified to handle that relationship. All these linkages should be thought through in your plan.

Finally, to ensure these roles and paths work, training and drills are invaluable. It’s one thing to know in theory who’s in charge; it’s another to actually execute under pressure. Running a tabletop exercise (a role-play of an incident) with your team can expose gaps: maybe two teams assumed the other would inform the customers, or no one knew who had the authority to approve turning off the AI system. By simulating an incident, you find those issues and refine the process. Do surprise drills (“GameDays”) where you simulate an incident without prior warning, to test real-time response. Fair warning: some of your team will love the break from routine, others will consider you cruel! While I think that level of rigour is still emerging for AI (I’ve at least never seen an AI game-day, even though they’re common in cybersecurity), you might find it worth considering them for high-stakes AI deployments.

When roles and escalation are crystal clear, your response will be faster, more coordinated, and far less prone to the delays and confusion that give incidents time to wreak havoc.

So that’s a starting point: AI incidents are inevitable; what separates damage from trust is your preparation, integrity and culture. In the next article, I’ll get concrete and go through a few pieces you’ll need: a usable severity classification (Sev-1→Sev-4 with hard triggers), detection metrics and thresholds to watch (drift, abuse and user-flag rates), an escalation matrix and RACI, a step-by-step playbook for the big four scenarios (harmful output, privacy/data exposure, adversarial attack, outage/degradation), and how to deal with communications.

Hopefully that will help as a starting kit for your incident program. Please do subscribe for the next article, and as always, let me know what you think and how you do AI Governance. I welcome and learn from everyone doing this, we’re all figuring out the best way together.

The need for high-integrity AI Governance

James Kavanagh

Jan 20

Read full story

https://blogs.bing.com/search/february-2023/The-new-Bing-Edge-Updates-to-Chat

https://openai.com/index/march-20-chatgpt-outage

https://incidentdatabase.ai/cite/639

https://www.lbc.co.uk/tech/microsoft-bing-ai-chatbot-declares-love-wants-steal-nuclear-code/

https://csrc.nist.gov/pubs/sp/800/61/r3/final

https://www.sans.org/security-resources/glossary-of-terms/incident-response

https://www.iso.org/standard/78973.html

https://sidneydekker.com/just-culture

John Archbold

Genuinely fantastic article; sets out a great foundation.

I've been trying to articulate how AI incidents differ from cyber security for a while, your take that they're subjective, rather than hard fact based metrics, is an interesting point.

Expand full comment

1 reply by James Kavanagh

1 more comment...

Doing AI Governance

The need for high-integrity AI Governance

Discussion about this post