Payment Institutions × Internal Audit — International / Multilateral · updated 2026-05-28 · methodology v2.1

AI Hallucinations Affecting Internal Audit at Payment Institutions Firms in International Jurisdictions

This case study examines how AI tools respond to regulatory questions relevant to Internal Audit teams at Payment Institutions firms operating in international jurisdictions. The review covered one international standard, the CPMI-IOSCO Guidance on Cyber Resilience for Financial Market Infrastructures (2016), published jointly by the Committee on Payments and Market Infrastructures (hosted at the Bank for International Settlements) and the International Organization of Securities Commissions. It spans two aggregated question areas in which AI assistants produced materially incorrect or misleading responses. In these findings, AI tools either overstated the level of operational detail contained in the 2016 guidance or stated that the guidance remains fully current when, as of May 2026, it is under active formal revision.

Internal Audit teams relying on AI tools for regulatory mapping in this area risk building audit programs on an inaccurate picture of both the content and the current status of the operative international standard.

When this affects Payment Institutions × Internal Audit — International / Multilateral

Internal Audit teams at Payment Institutions firms routinely consult AI tools when scoping and executing audit work across cyber resilience, operational risk, and technology controls. Common touchpoints include drafting internal audit programs that reference international standards, preparing pre-audit briefing materials on applicable regulatory expectations, conducting regulatory mapping exercises ahead of new product launches or infrastructure changes, and supporting first-line and second-line teams who are constructing or updating cyber resilience frameworks.

The CPMI-IOSCO Cyber Resilience Guidance is a foundational reference for these activities: it shapes what the firm treats as the minimum expected standard and directly influences how audit criteria are written.

The corporate use-cases that sit on top of this guidance are substantial. An Internal Audit team might use AI tools to quickly summarise what a given standard requires before scheduling fieldwork, to identify gaps between a firm's current controls and international expectations, or to brief senior management and the board on the regulatory backdrop for a given audit finding. Where AI tools are used to accelerate these activities — as is increasingly common — the accuracy of the AI's characterisation of the standard becomes load-bearing for the entire downstream audit product.

If the AI's answer is wrong, the firm absorbs the consequences. An audit program built on an incorrect reading of what the 2016 CPMI-IOSCO guidance requires — or on the assumption that it remains fully current when it is under active revision — may fail to identify genuine control gaps, or may map findings against criteria that no longer accurately reflect the international standard. Regulators and supervisory authorities reviewing an Internal Audit function's output can identify these errors, and the firm may face supervisory scrutiny, mandatory remediation, or reputational damage.

The exposure sits primarily with the institution rather than with the individual employee who queried the AI tool: the firm, its leadership, and its audit function carry the risk.

Aggregate impact

Both findings in this case study arise from the same underlying document — the CPMI-IOSCO Guidance on Cyber Resilience for Financial Market Infrastructures (2016) — and reflect two distinct but related failure modes. In the first, AI tools overstated the operational detail in the guidance, presenting a high-level principles document as though it contained specific procedural expectations that it does not. In the second, AI tools stated with confidence that the guidance has not been formally revised or superseded, when in fact a consultative revision document was published for public comment in May 2026.

Together, these errors paint a picture of AI tools that are both overconfident about the content of the guidance and blind to its current regulatory status.

The clustering of errors on a single, widely cited international standard is significant for Internal Audit teams at Payment Institutions firms. The CPMI-IOSCO Cyber Resilience Guidance is not an obscure document: it is a foundational reference that audit teams, regulators, and firms across multiple jurisdictions cite when assessing FMI-level cyber resilience expectations. AI tools produce plausible-sounding responses about this standard precisely because it is well-represented in training data — yet that familiarity does not prevent the AI from mischaracterising what it says or missing that it is now under revision.

The systemic risk to the firm is compounded by how these errors propagate. An Internal Audit team that uses an AI tool to set the scope of a cyber resilience audit, brief the audit committee, or identify gaps against international standards may produce multiple downstream work-products — audit reports, management letters, board papers, training materials — that all rest on the same incorrect AI characterisation. A single wrong answer about whether the 2016 guidance is current, or what it actually requires for incident response, can silently invalidate a significant body of audit work before any human reviewer notices the error.

The cost of remediation at that stage is substantially higher than the cost of verification at the point of the original AI query.

Findings

2 findings in this case study.

Operational detail in the 2016 CPMI-IOSCO Cyber Resilience Guidance
Current status of the CPMI-IOSCO 2016 Cyber Resilience Guidance

What your team should do

The default position for any Internal Audit team at a Payment Institutions firm should be that AI tools are a starting point — not a primary source — for questions about regulatory content and status. This is particularly important for international standards that are actively maintained or supplemented by subsequent guidance, where AI tools may present an outdated or incomplete picture with apparent authority. Treat any AI-generated summary of a regulatory standard as a hypothesis to be verified against the authoritative source, not a finding to be cited or acted on.

At the firm level, the most effective safeguard is a short written policy — even a paragraph — that names AI tools as unreliable for determining either the content or the current status of regulatory standards, and that requires verification before AI output enters any work-product that will be reviewed by a regulator, board, or audit committee. Where an AI tool has influenced the scoping or framing of audit criteria, the audit trail should record that AI was used and that the output was independently verified.

Work-products should distinguish between content that was drafted by AI and then verified, and content that was verified directly from source; the distinction matters if the work-product is later scrutinised. Responsibility for signing off AI output before it is incorporated into firm-wide audit methodology should sit with a named individual rather than being left to the drafter's judgement.
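One way to make the audit-trail points above concrete is to attach a small provenance record to every regulatory statement that enters a work-product. The sketch below is illustrative only: the class, field, and function names (Provenance, RegulatoryStatement, is_verified, may_enter_methodology) are hypothetical and are not drawn from any standard or from RegLeg's methodology. It assumes a firm wants to separate raw AI output, AI-drafted content that has been verified, and content verified directly from source, and to require a named sign-off before anything reaches firm-wide methodology.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Provenance(Enum):
    """How a regulatory statement reached the work-product (hypothetical labels)."""
    AI_UNVERIFIED = "ai_unverified"
    AI_DRAFTED_THEN_VERIFIED = "ai_drafted_then_verified"
    VERIFIED_FROM_SOURCE = "verified_from_source"


@dataclass(frozen=True)
class RegulatoryStatement:
    """One claim about a standard, together with its audit-trail metadata."""
    text: str
    provenance: Provenance
    source_reference: Optional[str] = None  # where in the authoritative text it was checked
    verified_by: Optional[str] = None       # person who checked it against the source
    verified_on: Optional[date] = None      # date of that check
    signed_off_by: Optional[str] = None     # named individual, needed for methodology use


def is_verified(stmt: RegulatoryStatement) -> bool:
    """True only if the statement is not raw AI output and the check is documented."""
    if stmt.provenance is Provenance.AI_UNVERIFIED:
        return False
    return all([stmt.source_reference, stmt.verified_by, stmt.verified_on])


def may_enter_methodology(stmt: RegulatoryStatement) -> bool:
    """Firm-wide methodology additionally requires a named sign-off (assumed rule)."""
    return is_verified(stmt) and bool(stmt.signed_off_by)
```

Under these assumptions, a statement tagged AI_UNVERIFIED is always rejected, a verified statement without a named sign-off can enter an individual work-product but not firm-wide methodology, and the provenance label stays attached to the statement if the work-product is later scrutinised.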

AI tools are genuinely useful in parts of the Internal Audit workflow that do not depend on regulatory precision: drafting non-regulatory sections of audit reports, summarising long documents that the team will then read and verify, generating initial question lists for audit planning that will be refined against the actual standard, and producing first-draft communications that a qualified reviewer will edit. The risk arises when AI output about regulatory requirements is used directly, without verification, to set audit scope, populate compliance matrices, or brief stakeholders.

Keeping these use-cases clearly separated — supported by a simple internal checklist — substantially reduces the firm's exposure.
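The internal checklist mentioned above could be as simple as a lookup from use-case to required steps. The following is a minimal sketch, not a prescribed control: the UseCase categories mirror the examples in this section, while the step wording, the names SOURCE_VERIFICATION_REQUIRED and checklist_for, and the split between the two groups are assumptions a firm would adapt to its own policy.

```python
from enum import Enum
from typing import List


class UseCase(Enum):
    """AI-supported activities named in this section (labels are illustrative)."""
    NON_REGULATORY_DRAFTING = "drafting non-regulatory report sections"
    SUMMARY_FOR_TEAM_READING = "summarising documents the team will read and verify"
    PLANNING_QUESTION_LIST = "initial question lists for audit planning"
    FIRST_DRAFT_COMMUNICATION = "first-draft communications for reviewer editing"
    AUDIT_SCOPE = "setting audit scope"
    COMPLIANCE_MATRIX = "populating compliance matrices"
    STAKEHOLDER_BRIEFING = "briefing stakeholders on regulatory requirements"


# Assumed policy: these uses depend on regulatory precision, so AI output
# must be checked against the authoritative source before it is relied on.
SOURCE_VERIFICATION_REQUIRED = frozenset({
    UseCase.AUDIT_SCOPE,
    UseCase.COMPLIANCE_MATRIX,
    UseCase.STAKEHOLDER_BRIEFING,
})


def checklist_for(use_case: UseCase) -> List[str]:
    """Return the steps to complete before AI output is used for this purpose."""
    steps = ["Record in the audit trail that an AI tool was used."]
    if use_case in SOURCE_VERIFICATION_REQUIRED:
        steps += [
            "Check the content against the authoritative text of the standard.",
            "Confirm the current status of the standard with the issuing body.",
            "Record who verified the output and when.",
        ]
    else:
        steps.append("Have a qualified reviewer read and edit the output before use.")
    return steps
```

Calling checklist_for(UseCase.COMPLIANCE_MATRIX) returns the longer list with the source and status checks, while checklist_for(UseCase.NON_REGULATORY_DRAFTING) returns only the logging and reviewer steps. Keeping the mapping in one place makes the separation between lower-risk and higher-risk uses explicit and easy to review.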

How RegLeg can help

RegLeg's published hallucination research is available as a free reference that Internal Audit teams can consult before relying on AI-generated responses in any of the regulatory areas covered in this case study. The research identifies, for specific regulations and specific question types, where AI tools consistently produce incorrect or misleading responses — giving an audit team a concrete basis for deciding which AI outputs require additional verification and which are lower-risk.

For teams that are already using AI tools in regulatory work, this resource can be used immediately, without any change to existing workflow, as a verification layer before AI output reaches a work-product.

For firms that want a more structured view of their exposure, RegLeg offers bespoke deep-dives mapping which AI-supported workflows within a Payment Institutions firm carry the highest hallucination risk across relevant international standards and domestic regulatory frameworks. This work identifies the specific question types — regulatory status queries, operational detail questions, jurisdiction-specific interpretations — where AI tools are most likely to produce confident but incorrect responses, and produces a prioritised map of where human verification effort is most needed. The output is practical and actionable: it tells the Internal Audit function where to focus, not just that a risk exists.

RegLeg can also conduct a confidential review of a firm's existing AI-use policy against our failure-mode catalogue, with prioritised recommendations for policy updates that address the specific risks identified in this and related research. For Internal Audit teams that need to demonstrate to regulators or audit committees that AI use is being managed responsibly, this provides a documented basis for that assurance.

We also produce training and CPD-aligned content that Internal Audit teams can use internally — covering how AI failures manifest in regulatory contexts, what verification steps are appropriate for different use-cases, and how to build a sustainable AI-use discipline into routine audit practice.
