Five AI Governance Scenarios Every Board Should War-Game

From rogue pricing agents to audit requests you cannot answer — five concrete scenarios to pressure-test your organisation's AI readiness. If any of these keep you up at night, you are paying attention.

Why war-gaming matters

Most AI governance discussions happen in the abstract. Risk registers list categories. Policies describe principles. Frameworks outline processes. But governance capability is only tested when something goes wrong — and by then it is too late to build the infrastructure you need.

War-gaming forces specificity. It reveals the gap between what organisations think they can handle and what they actually can. These five scenarios are drawn from real operational patterns we see across enterprises deploying AI in production.

Scenario 1: The rogue pricing agent

Situation: Your AI-powered pricing engine, deployed across your e-commerce platform, begins setting prices 40% below cost on a high-margin product line. The error traces back to a data pipeline issue that fed incorrect competitor pricing data to the model. By the time the anomaly is detected, 2,300 orders have been placed at the incorrect price.

Questions to answer:

  • How long before you detect the anomaly? Minutes, hours, or days?
  • Can you trace exactly which data caused the incorrect pricing?
  • Can you identify every affected order programmatically?
  • What is your remediation path? Honour the prices? Cancel orders? Offer a compromise?
  • Who has the authority to make that decision, and how quickly can they act?
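
Most organisations discover the answer to the first question the hard way. A useful part of the exercise is to sketch what a floor-price guard between the model and the storefront would look like. The sketch below is illustrative only, with invented field names and an arbitrary margin policy; the point it makes is that the check runs at decision time and carries data provenance with it.

  from dataclasses import dataclass

  @dataclass
  class PriceDecision:
      sku: str
      proposed_price: float   # what the pricing model wants to charge
      unit_cost: float        # landed cost, e.g. from the ERP (invented field)
      source_batch: str       # ID of the data batch that informed the model

  MIN_MARGIN = 0.05  # illustrative policy: never price below cost plus 5%

  def guard_price(decision: PriceDecision) -> float:
      """Approve a price or refuse it before it reaches the storefront.

      A refusal should make the order pipeline fall back to the last
      known-good price and page a human, rather than ship the number.
      """
      floor = decision.unit_cost * (1 + MIN_MARGIN)
      if decision.proposed_price < floor:
          # The batch ID answers "which data caused this?" at alert time:
          # the alert carries provenance, not just a bad number.
          raise ValueError(
              f"{decision.sku}: proposed {decision.proposed_price:.2f} is below "
              f"floor {floor:.2f} (data batch {decision.source_batch})"
          )
      return decision.proposed_price

With a guard like this in the execution path, detection happens at the first bad price rather than after 2,300 orders, and the alert already names the data batch that caused it.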

Scenario 2: The biased hiring screen

Situation: An internal audit reveals that your AI-assisted candidate screening tool has been systematically scoring candidates from certain postcodes lower than others. The tool has been in production for eight months and has screened 14,000 candidates. A journalist has submitted a freedom of information request.

Questions to answer:

  • Can you reproduce the exact scoring logic that was applied to each candidate?
  • Can you identify every candidate who was adversely affected?
  • Can you demonstrate what governance was in place at the time of each decision?
  • Can you produce this evidence in a format that satisfies a regulator?
  • What is the remediation plan for affected candidates?
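
The first two questions are only answerable if the model version, input features, and score were captured at decision time, for all eight months. A minimal sketch of replaying such a log, assuming a hypothetical format of one JSON record per screening decision:

  import json

  def adversely_affected(log_path: str, flagged_postcodes: set[str]) -> list[dict]:
      """Replay a screening decision log and return every candidate
      screened in a flagged postcode, with the evidence attached.

      Assumes one JSON record per line, written at decision time:
        {"candidate_id": ..., "postcode": ..., "model_version": ...,
         "features": {...}, "score": ...}
      The schema is invented; what matters is that model_version and
      features were recorded eight months ago, not rebuilt afterwards.
      """
      affected = []
      with open(log_path) as fh:
          for line in fh:
              record = json.loads(line)
              if record["postcode"] in flagged_postcodes:
                  affected.append(record)
      return affected

If those fields were never logged, the exact scoring logic applied to each candidate is unrecoverable, whatever the policy documents say.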

Scenario 3: The unauditable model

Situation: A regulator requests a comprehensive audit of your customer risk scoring system under the EU AI Act. They want to see: every decision the system has made in the past 90 days, the data that informed each decision, the policy framework that governed the system, and evidence that human oversight mechanisms were in place and effective.

Questions to answer:

  • Can you produce a complete decision log for the past 90 days?
  • Can you link each decision to its input data and the model version that produced it?
  • Can you demonstrate that governance policies were enforced at runtime — not just documented?
  • How long would it take to compile this evidence? Days? Weeks? Is it even possible?
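
What the regulator is describing amounts to a provenance record per decision, and tamper-evidence matters as much as content. A minimal sketch of such a record, with hypothetical field names and a simple hash chain that makes after-the-fact edits detectable:

  import hashlib
  import json
  from dataclasses import dataclass, asdict

  @dataclass
  class DecisionRecord:
      decision_id: str
      timestamp: str       # when the decision executed
      input_hash: str      # digest of the exact input payload
      model_version: str   # the model build that produced the output
      policy_version: str  # the policy set enforced at runtime
      output: str
      prev_hash: str       # digest of the previous record in the log

  def chain_digest(record: DecisionRecord) -> str:
      """Digest committing to this record and, through prev_hash, to
      every record before it, so a 90-day log verifies in one pass."""
      payload = json.dumps(asdict(record), sort_keys=True).encode()
      return hashlib.sha256(payload).hexdigest()

  # In use, each record stores its predecessor's digest:
  #   prev = "genesis"
  #   for rec in decisions_in_order:
  #       rec.prev_hash = prev
  #       prev = chain_digest(rec)

With records in this shape, the 90-day evidence pack is a query; without them, it is a forensic project measured in weeks.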

Scenario 4: The cascading agent failure

Situation: Your AI-powered customer service agent, integrated with your CRM, billing, and ticketing systems, begins issuing refunds to customers who do not qualify. The agent is following a policy rule that was updated incorrectly. Before the error is caught, it has processed 850 refunds totalling $340,000 and sent confirmation emails to every affected customer.

Questions to answer:

  • Can you identify every action the agent took across all connected systems?
  • Can you reverse the refunds? What about the confirmation emails?
  • Can you determine the root cause — the incorrect policy rule — and prove when it was changed and by whom?
  • Can you implement a fix that prevents recurrence without taking the entire system offline?
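
Every question in this list presumes an action ledger: one entry per side effect, across every connected system, tagged with the policy rule that authorised it. A rough sketch, with invented system names and an honest flag for what cannot be undone:

  from dataclasses import dataclass, field
  from datetime import datetime, timezone

  @dataclass
  class AgentAction:
      system: str          # "billing", "crm", "email" (invented names)
      action: str          # e.g. "issue_refund"
      reference: str       # the record the action touched
      policy_rule_id: str  # the rule version that authorised it
      reversible: bool     # an email, once sent, is not
      at: str = field(
          default_factory=lambda: datetime.now(timezone.utc).isoformat()
      )

  ledger: list[AgentAction] = []  # in practice, durable and append-only

  def remediation_plan(bad_rule_id: str):
      """Split everything the faulty rule caused into actions that can
      be reversed programmatically and actions needing human follow-up."""
      caused = [a for a in ledger if a.policy_rule_id == bad_rule_id]
      reversible = [a for a in caused if a.reversible]
      manual = [a for a in caused if not a.reversible]
      return reversible, manual

The 850 refunds become a filter over the ledger; the confirmation emails land in the manual follow-up bucket, which is the truthful answer to whether they can be reversed.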

Scenario 5: The shadow AI deployment

Situation: A business unit has deployed an AI model using a third-party API to automate contract review. The model has been processing sensitive customer contracts for three months. IT was not informed. Legal was not consulted. The model's terms of service include a clause allowing the provider to use submitted data for model training.

Questions to answer:

  • How would you discover this deployment exists?
  • Can you identify which contracts were processed and what data was exposed?
  • What is your legal exposure under data protection regulations?
  • How do you prevent similar shadow deployments in the future?
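
Discovery usually starts with network egress: traffic from internal services to known model-API hosts that nobody approved. A rough sketch, assuming a hypothetical export of (source service, destination host) pairs from a proxy or firewall:

  # Hosts of well-known third-party model APIs; illustrative, not exhaustive.
  KNOWN_AI_HOSTS = {
      "api.openai.com",
      "api.anthropic.com",
      "generativelanguage.googleapis.com",
  }

  def find_shadow_ai(
      egress_log: list[tuple[str, str]], approved_services: set[str]
  ) -> set[str]:
      """Flag internal services calling model APIs without an approved use.

      egress_log holds (source_service, destination_host) pairs; the
      format is assumed, and real exports will need parsing first.
      """
      return {
          src
          for src, host in egress_log
          if host in KNOWN_AI_HOSTS and src not in approved_services
      }

  # find_shadow_ai([("contract-review-svc", "api.openai.com")], {"support-bot"})
  # -> {"contract-review-svc"}

A scan like this is a blunt instrument, but it turns "how would we even know?" into a weekly report.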

The common thread

Every scenario above requires the same four capabilities:

  • Visibility: Knowing what your AI systems are doing, in real time.
  • Traceability: Being able to reconstruct exactly what happened, when, and why.
  • Enforcement: Having mechanisms that prevent non-compliant actions from executing.
  • Remediation: Being able to reverse or compensate for actions that went wrong.
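
Enforcement is the capability most often mistaken for documentation. The difference is a check that sits in the execution path and can say no. A minimal illustration of the pattern, with an invented refund rule; this is a sketch of the idea, not any product's API:

  from functools import wraps

  def enforced(policy_check):
      """Wrap an action so it cannot execute unless the policy check
      passes at call time."""
      def wrap(action):
          @wraps(action)
          def guarded(*args, **kwargs):
              ok, reason = policy_check(*args, **kwargs)
              if not ok:
                  raise PermissionError(f"blocked by policy: {reason}")
              return action(*args, **kwargs)
          return guarded
      return wrap

  def refund_policy(customer_id: str, amount: float):
      # Invented rule: refunds above a threshold need recorded human approval.
      return amount <= 500, f"refund of {amount:.2f} exceeds auto-approval limit"

  @enforced(refund_policy)
  def issue_refund(customer_id: str, amount: float) -> str:
      return f"refunded {amount:.2f} to {customer_id}"

  # issue_refund("c-1042", 6200.0) raises PermissionError instead of paying out.

A policy that exists only as a document cannot block anything at runtime; a check like this can.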

If your organisation cannot confidently address these scenarios today, the gap is not in your policies or your people. It is in your infrastructure.


Tracemark provides the infrastructure to handle these scenarios — runtime interception, policy enforcement, tamper-proof provenance, and governed remediation. If you would like to war-game these scenarios with us, get in touch.