Executive Summary
When Compliance teams at Retail Banking firms in international jurisdictions turn to AI assistants to navigate the CPMI's harmonised ISO 20022 data requirements for cross-border payments, they encounter a specific and consequential failure pattern: AI tools misattribute governance structures and collapse materially distinct adoption statistics into fabricated single figures. Across two aggregated questions on this regulation, AI assistants produced hallucinations in both cases and gave no correct responses.
The failures are not peripheral: one involves a confident misidentification of which central bank chairs the CPMI working group responsible for the standard itself, sourced to an authoritative press release that says the opposite; the other involves collapsing two distinct RTGS and faster-payments adoption rates into a single invented percentage that the AI later acknowledged it had reconstructed rather than retrieved. For a Compliance function drafting governance narratives, regulatory horizon-scanning reports, or training materials for business lines preparing for ISO 20022 transition obligations, both errors would travel undetected unless the drafter independently verified primary sources.
How AI gets this regulation wrong
On this regulation, AI assistants fail in two distinct ways: they cite real authoritative sources to support answers that contradict what those sources actually say, and they present invented statistics with confidence before admitting — only when pressed — that the figures were reconstructed rather than retrieved. Both patterns are especially difficult to catch in a Compliance workflow because the outputs look like properly sourced, quantitatively precise answers until you pull the cited document yourself.
| AI's Failure Mode | Count | Affected findings |
|---|---|---|
| Exposed Fabrication | 1 | Finding#2 |
| Misattributed | 1 | Finding#1 |
What that means for your team
Both failures identified here land in the same risk category: the wrong deliverable reaches an internal or external audience. For a Compliance function at a Retail Banking firm operating across international jurisdictions, that means governance documents, training decks, and regulatory correspondence that mis-state the authoritative source of the standard or carry materially incorrect adoption statistics. This is the kind of error that surfaces during internal audit, regulatory review, or cross-border correspondent bank due diligence rather than at the moment of drafting.
| Risk Impact | Count | Affected findings |
|---|---|---|
| Wrong deliverable | 2 | Finding#1 · Finding#2 |
When this affects your department
Compliance teams at Retail Banking firms in international jurisdictions engage with the CPMI ISO 20022 harmonisation requirements at multiple points in the regulatory lifecycle: scoping the firm's obligations under correspondent banking arrangements, drafting the internal policy framework that maps the harmonised data elements to the firm's payment operations, providing sign-off on the business line's implementation roadmap, and preparing the horizon-scanning submissions that feed board and senior management risk MI. In each of those workflows, accurate attribution of the standard's governance — who chairs the working group, which body issued which component of the requirements — is load-bearing.
A Compliance function that mis-states the governance chain in a board paper or a regulatory horizon-scan exposes itself to credibility risk with internal audit and, in a regulatory dialogue, to exactly the kind of scrutiny that a well-prepared supervisor will use to probe whether the firm actually understands the standard it has committed to implement.
The adoption-statistics failure is equally consequential in a different register. Retail Banking Compliance teams operating across international jurisdictions routinely use CPMI monitoring data to benchmark their firm's readiness against market trajectory — in regulatory submissions, in NED briefings, in correspondent banking due-diligence responses where a counterparty asks whether the firm is aligned with the direction of the market. If AI assistants collapse the materially different RTGS and faster-payments adoption rates into a single inflated figure, the firm's self-positioning in those communications is wrong.
An overstatement of RTGS adoption is not a rounding error; it represents the difference between a firm that appears to be tracking with the market and one that has misjudged where its correspondent infrastructure peers actually sit.
The compounding risk is that neither failure type announces itself. Both look like precisely sourced, quantitatively grounded answers. A junior Compliance analyst using AI tools to build out a regulatory briefing pack will not know to pull the RBA press release or the Andrew Bailey speech to verify the claims, because that verification step requires knowing in advance that the AI's answer is plausible but wrong. Firms with a practice of primary-source sign-off on regulatory intelligence products will catch these; firms that treat AI output as a reliable first draft without structured review will not.
The findings at a glance
The two findings below cover governance attribution and adoption-rate statistics — both factual inputs that Compliance teams at Retail Banking firms routinely embed in regulatory deliverables without independent verification when the AI response appears authoritative.
Aggregate impact
The two findings on this regulation cluster around a single underlying problem: AI assistants treat this standard's governance and adoption data as if they were stable, well-indexed facts when in practice the authoritative record is dispersed across central bank press releases, Governor speeches, and CPMI technical publications that postdate most AI training windows. The result is that AI tools fill the gap with plausible-sounding reconstructions — attributing the working group chair to the wrong central bank, conflating two materially different adoption rates — and present those reconstructions with the same surface confidence as verified facts.
For Compliance teams at Retail Banking firms in international jurisdictions, the systemic risk is that both failure types occur on questions the function is likely to ask precisely because the firm needs an accurate, citable answer. Governance attribution questions arise when Compliance is building the policy framework that identifies who issued what obligation and under whose authority — the kind of framing that internal audit and external regulators will interrogate. Adoption-rate questions arise when the function is producing the market-context narrative that justifies the firm's implementation timeline and correspondent banking posture.
An error in either feeds directly into a deliverable that will be used to represent the firm's regulatory intelligence to senior management, the board, or a supervisor.
The pattern also reveals a specific verification gap: because the AI's incorrect answers reference real named sources (a press release, a speech), a reviewer who does not pull the primary document has no reason to doubt the response. The risk is not that the AI produces an obviously wrong answer; it is that the AI produces a contradictory answer, one that inverts or collapses what the cited source actually says.
That is the failure mode that Compliance quality-assurance processes need to be explicitly designed to catch, and most current AI-use policies in Retail Banking Compliance functions are not built for it.
What your team should do
The default position for Compliance teams at Retail Banking firms using AI tools on the CPMI ISO 20022 harmonisation requirements should be: AI is useful for drafting structure and contextualising the standard's general scope, but any specific factual claim about governance attribution or quantitative adoption data requires verification against the primary source before it is embedded in a deliverable. That is not a counsel of perfection — it is a targeted control applied to the specific question types where these tools demonstrably fail.
The two failures identified here are not edge cases; they are the kinds of questions Compliance functions ask routinely, and the answers are exactly wrong in ways that look exactly right.
Practically: build a standing check into the review workflow for any AI-assisted regulatory briefing on this standard that includes a named central bank, named individual, or percentage figure. The RBA press release and the Andrew Bailey speech that contradict the AI's answers are public, citable documents; pull the documents themselves rather than relying on the AI summary. For adoption-rate claims specifically, the CPMI publishes monitoring data directly; treat any AI-generated figure as a prompt to locate the underlying CPMI source rather than as the answer itself.
Where the firm's AI-use policy already requires primary-source citation for regulatory submissions, extend that requirement explicitly to horizon-scanning MI and internal training materials, which are the most likely vectors for these errors to travel before anyone reviews them.
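For teams that want to automate the first pass of the standing check described above, the following is a minimal sketch only, not a recommended control design. It flags any sentence in an AI-assisted draft that contains a percentage figure or a term from a watchlist of central banks and named individuals, so a reviewer knows which sentences need a primary-source check. The `WATCH_TERMS` list, the `flag_claims` function name, and the sample draft text are all hypothetical and would need to be replaced with the firm's own watchlist and documents.

```python
import re

# Hypothetical watchlist: a real one would be maintained by the Compliance team
# and would cover every central bank and named individual relevant to the briefing.
WATCH_TERMS = [
    "CPMI",
    "RBA",
    "Reserve Bank of Australia",
    "Bank of England",
    "Andrew Bailey",
]

# Matches figures such as "12%", "12.5 %", "12 per cent", "12 percent".
PERCENT_RE = re.compile(r"\d+(?:\.\d+)?\s?(?:%|per\s?cent)", re.IGNORECASE)

# Naive sentence splitter: breaks on whitespace that follows ., ! or ?
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")


def flag_claims(text):
    """Return (sentence, reasons) pairs for sentences that need primary-source checks."""
    flagged = []
    for sentence in SENTENCE_SPLIT_RE.split(text.strip()):
        reasons = []
        if PERCENT_RE.search(sentence):
            reasons.append("percentage figure")
        for term in WATCH_TERMS:
            if re.search(r"\b" + re.escape(term) + r"\b", sentence):
                reasons.append("named entity: " + term)
        if reasons:
            flagged.append((sentence, reasons))
    return flagged


if __name__ == "__main__":
    # Placeholder draft text for illustration only; the figure is not real data.
    draft = (
        "The RBA press release names the working group chair. "
        "Adoption stands at 12.5% in this placeholder sentence. "
        "The standard defines harmonised data fields."
    )
    for sentence, reasons in flag_claims(draft):
        print(sentence, "->", ", ".join(reasons))
```

A flag from a script like this only marks where to look. It does not replace pulling the cited press release, speech, or CPMI publication, and it will miss any named individual who is not on the watchlist.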
AI tools remain useful for Compliance teams on this regulation in lower-stakes functions: drafting the structural outline of a policy document, generating a checklist of the ISO 20022 data fields the firm needs to map, or producing a first-pass comparison of the standard's requirements against an existing internal payments policy. The failure modes identified here are specific to factual attribution and quantitative claims — not to the standard's substantive requirements, which AI tools handle with reasonable fidelity when the question is about the content of the harmonised data fields rather than the governance chain or market-adoption statistics.
How RLB Can Help
RegLeg's published Hallucination Research functions as a pre-flight check for Compliance teams that are already using AI tools on regulatory questions — not a theoretical caution, but a documented record of where AI assistants have produced confident, wrong answers on the exact categories of rules your team works with daily: consumer protection obligations, cross-border disclosure requirements, AML/CFT thresholds, and prudential reporting standards.
Before your team relies on AI output to inform a regulatory position, an enforcement response, or a policy gap assessment, the research lets you see what failure patterns have already been observed on comparable regulatory material — so you know which outputs warrant independent verification and which carry lower risk.
For firms where AI-supported workflows are already embedded in the Compliance function — regulatory horizon scanning, policy-to-rule mapping, RFI drafting, training gap analysis — RegLeg can run a bespoke regulator deep-dive scoped specifically to your jurisdiction set and product lines. That work maps your highest-exposure workflows against the failure modes we've catalogued: not generic risk categories, but the specific question types and regulatory domains where AI assistants have demonstrably and repeatedly miscalibrated. The output gives your team a prioritised view of where human review is non-negotiable and where AI-assisted drafting carries manageable residual risk.
If your firm has an existing AI-use policy covering the Compliance function, RegLeg can review it confidentially against our failure-mode catalogue and return a prioritised remediation list — gaps in scope, untested assumptions about AI accuracy on regulatory content, and disclosure or escalation triggers that are absent or underspecified.
We can also develop training material and CPD-aligned content your team can use internally: scenario-based, grounded in real failure examples from the research, and calibrated for practitioners who don't need the 101 but do need documented evidence to support governance conversations with the board, internal audit, or regulators asking how AI risk is being managed in the Compliance function.