This case study examines AI hallucination risk for Operations teams at Payment Institution firms operating under international regulatory frameworks. Testing covered the Guidance on Cyber Resilience for Financial Market Infrastructures, published jointly by CPMI and IOSCO in 2016, a foundational document for firms whose payment infrastructure is subject to international oversight standards. Across the question set, AI tools produced incorrect or materially misleading answers on one aggregated question, in an area where regulatory accuracy has direct consequences for operational practice.
The finding documented here reflects a consistent pattern: AI tools overstate the detail and completeness of a regulatory document, failing to recognise when a later publication filled a significant operational gap. Operations teams that rely on these AI responses without independent verification risk building internal processes and compliance frameworks on a misreading of what the 2016 guidance actually requires.
Operations teams at Payment Institution firms routinely turn to AI tools when scoping or updating internal incident response and business continuity frameworks. When a new product line is being launched, a third-party technology partner is being onboarded, or leadership has asked for a rapid regulatory mapping exercise, the Operations function is often the first department to ask what international standards actually require for cyber resilience and recovery capability.
AI tools appear well-suited to these tasks — they return confident, structured answers quickly — and staff may not have the background to identify when a response has conflated two separate regulatory documents or overstated the prescriptiveness of one source.
The corporate use cases that sit directly on top of this topic area include drafting and reviewing the firm's cyber incident response plan, setting or reviewing recovery time objectives for critical payment operations, designing training material on operational resilience for compliance and risk staff, and responding to regulator questionnaires about the firm's alignment with international cyber standards. In each of these scenarios, an AI tool that asserts "the 2016 CPMI-IOSCO guidance already contains detailed operational expectations for response and recovery" is likely to cause the drafter to stop looking, so the operationally richer FSB guidance published in 2020 (Effective Practices for Cyber Incident Response and Recovery) never enters the picture.
If the firm acts on that incomplete picture, the consequences extend well beyond a document-drafting error. Regulatory examiners and supervisors reviewing a Payment Institution's cyber resilience framework will assess it against the full body of applicable guidance, including later-published materials. A firm whose response and recovery arrangements reflect only the higher-level 2016 text — without incorporating the more detailed expectations published in 2020 — may find its framework judged inadequate, triggering remediation requirements, supervisory scrutiny, or reputational exposure in the event of an actual cyber incident.
The firm, its leadership, and its board bear those costs directly. The employee who queried the AI tool is unlikely to carry personal liability, but the department's credibility and the firm's regulatory standing are both at risk.
The finding documented in this case study reflects a recurring class of AI error: overclaiming the completeness and specificity of a foundational regulatory document. When AI tools encounter a well-established guidance text — particularly one that is widely cited in industry literature — they tend to characterise it as more comprehensive than it actually is. In this instance, AI tools described the 2016 CPMI-IOSCO Cyber Resilience Guidance as containing "detailed expectations" for incident response and recovery, when the source document is comparatively high-level on those topics and a materially more detailed treatment was published four years later by the FSB.
The error is not a fabricated citation or an invented rule; it is a confident overstatement of depth that causes the practitioner to stop searching for the fuller picture.
For Operations teams at Payment Institution firms, this pattern is especially consequential because cyber resilience planning is not a theoretical exercise. Firms are expected to demonstrate, to regulators and to counterparties, that their recovery frameworks meet current international expectations, not just those that existed when an earlier document was published. An AI tool that presents the 2016 guidance as self-sufficient gives the Operations team a false sense of completeness and suppresses the discovery of subsequent, operationally richer material.
The systemic risk compounds quickly. A single AI query that returns an overconfident summary can influence the firm's incident response plan, its recovery time objective documentation, its training syllabus, and its regulatory self-assessment — all of which may be drafted by different staff members who each independently rely on the same flawed AI output. When an examiner or a major incident later exposes the gap, the remediation effort spans multiple documents and processes simultaneously.
The cost of correcting one AI error at the source is trivial; the cost of correcting it after it has propagated across a firm's operational documentation is not.
The default position for Operations teams should be that AI tools are a starting point for regulatory research, not a reliable primary source. This is especially true for international guidance frameworks where the substantive expectations have evolved across multiple publications over several years. When an AI tool returns a confident answer about what a specific document requires, that answer should trigger a verification step, not reliance. The firm's regulatory library, the relevant regulator's published portal, and the primary source documents should be the final authority, not the AI summary.
At a firm level, several practical safeguards are worth implementing now. A regulatory-verification policy that explicitly names AI tools as unreliable sources for rule interpretation in cyber resilience, operational continuity, and response-and-recovery topics will set the right expectation before errors propagate. Any AI output that influences an internal work-product — a policy document, a training slide deck, a regulatory self-assessment — should carry a clear audit trail flagging its AI-assisted origin and the verification steps taken before it entered firm-wide use.
Requiring sign-off before AI-drafted regulatory content is distributed internally or included in external submissions will reduce the risk that a single unverified query shapes multiple downstream documents. Operations leaders should also maintain a clear distinction between content that was AI-drafted and content that was AI-summarised from sources the team has independently verified.
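The audit trail and sign-off safeguards described above could be captured in a simple record attached to each work-product. The following is a minimal sketch in Python, not a prescribed implementation: the class names, field names, and the two origin labels are hypothetical and chosen only for illustration. It shows one way to record AI-assisted origin and the verification steps taken, and to block distribution until at least one primary-source check and a sign-off are both present.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

# Hypothetical labels mirroring the distinction between AI-drafted content
# and content AI-summarised from independently verified sources.
ALLOWED_ORIGINS = {"ai_drafted", "ai_summarised"}


@dataclass
class VerificationStep:
    """One check of an AI-derived claim against a primary source."""
    source: str       # title of the primary document consulted
    checked_by: str   # person who performed the check
    checked_on: date


@dataclass
class WorkProductRecord:
    """Audit-trail entry for a document influenced by AI output."""
    title: str
    ai_origin: str
    verification_steps: list[VerificationStep] = field(default_factory=list)
    signed_off_by: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject labels outside the assumed two-value scheme.
        if self.ai_origin not in ALLOWED_ORIGINS:
            raise ValueError(f"unknown ai_origin: {self.ai_origin!r}")

    def ready_for_distribution(self) -> bool:
        # Require at least one recorded verification step AND a sign-off.
        return bool(self.verification_steps) and self.signed_off_by is not None


# Illustrative usage with made-up names and dates.
record = WorkProductRecord(
    title="Cyber incident response plan (draft)",
    ai_origin="ai_drafted",
)
record.verification_steps.append(
    VerificationStep(
        source="CPMI-IOSCO Guidance on Cyber Resilience for FMIs (2016)",
        checked_by="operations analyst",
        checked_on=date(2024, 1, 15),
    )
)
record.signed_off_by = "head of operations"
```

Keeping the gate as a single method means the distribution rule lives in one place. A firm with a stricter policy, for example one requiring a check against every cited document, would tighten that method rather than rely on each drafter to remember the rule.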
There are areas where AI tools add genuine value in an Operations workflow without introducing regulatory risk. Drafting non-regulatory internal communications, generating a first set of research questions to guide a subject-matter expert's inquiry, and summarising long documents that the team will verify section by section are all appropriate uses. The risk concentrates at the point where AI output is treated as a resolved answer to a regulatory question — particularly one involving the scope, detail level, or currency of a specific guidance document.
RegLeg's published hallucination research gives Operations teams at Payment Institutions firms a practical, free pre-flight check before relying on any AI response in international cyber resilience and operational continuity rule areas. The research is topic- and regulator-specific, which means teams can look up the precise guidance framework they are working with and see, in plain language, where AI tools have been shown to produce inaccurate or incomplete answers. Using that research as a standing check before AI output enters a firm work-product requires no additional tooling — it is a habit, not a system.
For teams that want a deeper view of their own exposure, RegLeg offers bespoke regulator deep-dives that map which AI-supported workflows in a Payment Institution carry the highest hallucination risk. The cyber resilience and incident response space, where guidance has evolved materially across multiple publications from CPMI-IOSCO, the FSB, and national regulators, is one area where this mapping exercise consistently surfaces material gaps between what AI tools claim and what the current regulatory expectation actually is.
A structured review of the firm's AI-supported workflows against RegLeg's failure-mode catalogue gives Operations leadership a clear, prioritised picture of where verification requirements need to be tightest.
RegLeg can also provide a confidential review of the firm's existing AI-use policy, benchmarked against the failure modes documented in the research programme, with prioritised remediation guidance specific to the Operations function. For teams that need to build internal capability, RegLeg's training material and CPD-aligned content can be adapted for use in Operations team sessions — giving staff a working vocabulary for AI reliability limitations and a practical framework for applying appropriate scepticism in their day-to-day regulatory work.