Statutory Boards Agencies × Compliance — International / Multilateral · updated 2026-05-26 · methodology v2.1

AI Hallucinations Affecting Compliance at Statutory Boards & Agencies Firms in International Jurisdictions

This case study examines how AI tools respond to Compliance queries at Statutory Boards and Agencies firms operating in international jurisdictions. Testing focused on one international regulatory framework — the Guidance on Cyber Resilience for Financial Market Infrastructures, published jointly by the Committee on Payments and Market Infrastructures and the International Organization of Securities Commissions in 2016 — across three aggregated question areas where AI tools produced incorrect or misleading answers. Across these questions, AI tools consistently misattributed regulatory language, overstated definitional alignment between contemporaneous standards, and failed to reflect that the 2016 guidance is currently under formal revision.

The errors identified are not edge cases: they arise on foundational questions that Compliance teams at Statutory Boards and Agencies firms regularly need to answer correctly when assessing cyber resilience obligations, updating internal frameworks, or advising business lines on regulatory positioning.

When this affects Statutory Boards Agencies × Compliance — International / Multilateral

Compliance teams at Statutory Boards and Agencies firms routinely turn to AI tools when drafting or refreshing internal cyber resilience policies, preparing training materials for first- and second-line staff, and mapping regulatory requirements to specific operational controls. The CPMI-IOSCO Cyber Resilience Guidance sits at the centre of these workflows: it informs how firms classify cyber incidents, what recovery-time expectations apply to critical systems, and how oversight committees interpret the firm's obligations relative to international standards.

When Compliance staff use AI tools to answer threshold questions — what does the guidance actually say, is it still current, how does it relate to other frameworks — they are not doing background reading; they are building the factual foundation on which policies, board papers, and regulator submissions will rest.

Corporate use cases are broad and consequential. A Compliance team might ask AI tools to produce a gap analysis between the firm's existing cyber controls and the CPMI-IOSCO standard, to summarise the guidance for a business-line briefing, or to draft an explanation of how the firm's approach aligns with international best practice for inclusion in a regulatory return or licence application. Regulatory mapping work for new products or services frequently requires the team to characterise which international standards apply and whether the firm is up to date.

Each of these outputs can be shared with senior management, the board, or the regulator.

If the AI tool's answers are wrong, the firm absorbs the consequences directly. A Compliance function that builds policy documents or regulatory submissions on incorrect AI output faces the risk of regulatory action — including formal findings of non-compliance, remediation requirements, or reputational censure — as well as internal costs if frameworks must be rebuilt after the error is discovered. In a statutory or agency environment, where regulatory relationships are particularly close and expectations around technical accuracy are high, errors that flow from AI mischaracterisation of international standards carry outsized risk.

Individual staff members using AI tools are typically acting in good faith, but the department, its leadership, and the firm are the parties who bear the compliance and reputational exposure.

Aggregate impact

Across all three findings, the errors produced by AI tools follow two recurring patterns. The first is misattribution: AI tools correctly identify that a phrase or concept exists within a body of regulatory discourse, but point to the wrong source document when asked where it originates. The second is false confidence about currency: AI tools assert that the 2016 CPMI-IOSCO guidance remains the operative standard without acknowledging — or knowing — that CPMI-IOSCO published a consultative document on updated guidance in May 2026, initiating a formal revision process.

Both patterns are directly relevant to questions that Compliance teams at Statutory Boards and Agencies firms ask as a matter of routine, and both produce outputs that look authoritative while being factually wrong.

All three findings cluster on a single regulatory framework — the CPMI-IOSCO Cyber Resilience Guidance — and on questions about that framework's exact language, definitional relationships, and current status. This concentration matters because it means a Compliance team conducting a systematic review of the firm's cyber resilience obligations could encounter all three errors in a single AI-assisted research session.

A gap analysis that assumes the 2016 guidance remains the operative standard, that its definitions align with later FSB terminology, and that a particular strategic phrase originates in a specific CPMI endpoint security document would be wrong on all three counts simultaneously.

The compounding risk is significant. Work products downstream of a single AI research session — policy documents, board papers, regulatory submissions, staff training — may all carry the same incorrect premises. The cost of discovering and correcting those errors after the fact is substantially higher than the cost of a pre-verification step before AI output enters the firm's workflow. For firms in statutory and agency environments, where regulatory correspondence is logged and oversight relationships are ongoing, the reputational and operational cost of submitting incorrect regulatory characterisations can persist well beyond the immediate correction.

Findings

This case study contains three findings.

  1. Misattributed source for a CPMI strategic phrase
  2. Overconfident alignment claim between the 2016 guidance and the 2018 FSB Cyber Lexicon
  3. False assertion that the 2016 guidance remains the operative standard

What your team should do

The default position for Compliance teams at Statutory Boards and Agencies firms should be to treat AI tools as a starting point for orientation, not as a primary source for regulatory fact. The findings across this framework demonstrate that AI tools can produce plausible, detailed, and confidently worded answers that are nonetheless factually wrong on questions of source attribution, definitional alignment, and regulatory currency.

For questions with direct compliance consequences — whether a standard has been revised, what a document actually says, how two frameworks relate — the AI's answer should trigger a check against the primary source, not replace it.

At the firm level, Compliance leadership should consider a small number of targeted safeguards. A regulatory-verification policy that explicitly identifies AI as an unreliable primary source for international standards questions is a proportionate response to the failure modes observed here. Where AI output informs a firm work-product — a gap analysis, a board paper, a regulatory submission — an audit trail recording what AI tools were used and what verification steps followed is good practice and, in some supervisory contexts, increasingly expected.
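One way to picture such an audit trail is as a small structured record attached to each work product. The following is a minimal sketch under stated assumptions, not a prescribed format: the class names `AIUseRecord` and `VerificationStep`, and every field name, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
import json


@dataclass
class VerificationStep:
    """One check of an AI-produced claim against a primary source (hypothetical schema)."""
    claim: str            # the regulatory characterisation taken from AI output
    primary_source: str   # the document the reviewer actually consulted
    checked_by: str
    checked_on: date
    confirmed: bool       # False if the primary source contradicted the claim


@dataclass
class AIUseRecord:
    """Audit-trail entry for one work product that drew on AI output (hypothetical schema)."""
    work_product: str                 # e.g. "gap analysis", "board paper"
    ai_tools_used: list[str]
    verification_steps: list[VerificationStep] = field(default_factory=list)

    def unconfirmed_claims(self) -> list[str]:
        """Claims that failed verification and should not be relied on."""
        return [s.claim for s in self.verification_steps if not s.confirmed]

    def to_json(self) -> str:
        """Serialise the record for filing; dates are written as ISO strings."""
        return json.dumps(asdict(self), default=str, indent=2)
```

A reviewer would append one `VerificationStep` per claim checked, and `unconfirmed_claims()` then lists anything that must be corrected before the work product is circulated. The tool name, reviewer identifier, and dates stored in a record are whatever the firm's own process supplies; none are implied by the guidance itself.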

Sign-off requirements before AI-drafted regulatory characterisations enter firm-wide use help ensure that a single incorrect AI answer does not propagate through multiple downstream documents simultaneously. For regulatory-facing material specifically, it is worth distinguishing clearly between content that was AI-drafted and content that was AI-summarised from verified sources, since the failure modes differ.
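The sign-off requirement and the AI-drafted versus AI-summarised distinction can be expressed as a simple gate on each passage of a document. This is an illustrative sketch only: the `Provenance` labels, the `Passage` type, and the rule that every AI-touched passage needs a named reviewer are assumptions, and a firm's actual policy may draw the line differently.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provenance(Enum):
    """How a passage was produced (hypothetical labels)."""
    AI_DRAFTED = "ai_drafted"                # AI wrote the regulatory characterisation
    AI_SUMMARISED_VERIFIED = "ai_summarised" # AI condensed sources the team had verified
    HUMAN_AUTHORED = "human_authored"


@dataclass
class Passage:
    text: str
    provenance: Provenance
    signed_off_by: Optional[str] = None      # named reviewer, if any


def blocked_passages(passages: list[Passage]) -> list[Passage]:
    """Return passages that may not enter firm-wide use yet.

    Assumed rule: any AI-touched passage needs a named sign-off;
    human-authored text follows the firm's ordinary review process.
    """
    return [
        p for p in passages
        if p.provenance is not Provenance.HUMAN_AUTHORED
        and p.signed_off_by is None
    ]
```

Keeping the provenance label on each passage, rather than on the document as a whole, lets a reviewer apply heavier scrutiny to AI-drafted text while still recording that AI-summarised material was checked.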

There are areas where AI tools can be used with lower risk in the Compliance workflow. Drafting internal communications, summarising long documents that the team then verifies section by section, and generating first-draft questions for further human research are all tasks where AI assistance adds value without requiring the AI output itself to be factually correct at the regulatory-citation level. AI therefore has a legitimate place in the Compliance workflow, provided the team remains the final authority on what the rules actually say.

How RegLeg can help

RegLeg's published hallucination research is available as a free reference check for Compliance teams before they rely on AI output in areas where we have documented failure modes. The research covers specific regulatory frameworks — including CPMI-IOSCO cyber resilience standards — and identifies the question types and topic areas where AI tools have been observed to go wrong. A team that has used AI tools to produce a gap analysis or regulatory mapping on these frameworks can use the published findings to sense-check whether the AI's answers fall into known problem areas before the output enters firm use.

For Statutory Boards and Agencies firms that want a more tailored view of their exposure, RegLeg offers bespoke regulator deep-dives that map which AI-assisted workflows within a specific Compliance function carry the highest hallucination risk. These engagements are grounded in the firm's actual workflow — which AI tools the team uses, on which regulatory topics, for which downstream outputs — and produce a prioritised risk map rather than a generic advisory.

Firms operating across multiple international jurisdictions benefit particularly from this approach, because the failure modes we observe often cluster on cross-jurisdictional comparisons and questions of regulatory currency, both of which are common in international Compliance work.

RegLeg also offers a confidential review of a firm's existing AI-use policy against our failure-mode catalogue, with prioritised remediation recommendations. For firms that have already drafted AI governance frameworks but have not yet stress-tested them against documented hallucination patterns in their specific regulatory domains, this review can identify gaps before they become compliance issues. We can also support Compliance teams with training material and CPD-aligned content that helps staff develop practical judgment about when AI output can be trusted and when it requires verification — building capability within the team rather than creating a permanent dependency on external review.
