Practitioners — Accountants (CA/PA) · updated 2026-06-03 · methodology v2.3

AI on Harmonised ISO 20022 Data Requirements for Enhancing Cross-Border Payments for Accountants (CA/PA) in international jurisdictions

Executive Summary

Across two questions put to AI assistants on the CPMI's harmonised ISO 20022 data requirements for cross-border payments, every response produced a wrong deliverable — either a conflated adoption figure or a failure to surface a directly relevant official statistic. For accountants advising clients on cross-border payment infrastructure compliance or treasury modernisation, both errors carry the same downstream risk: advice built on numbers that do not appear anywhere in the authoritative record.

The adoption-rate conflation fused two materially different trajectories — faster payment systems and RTGS systems — into a single composite figure the regulator never published, while the second failure left the practitioner without the precise inquiry-rate and resolution-time reduction benchmarks that official CPMI/FSB speeches provide. Two findings, two failure modes, but a single consistent pattern: AI tools cannot be trusted to retrieve or disaggregate the specific quantitative claims embedded in CPMI's cross-border payments reform programme.

How AI gets this regulation wrong

The failures on this regulation split between two distinct modes: an AI that confidently produced a composite statistic — then, when pressed, admitted the figure was reconstructed from memory rather than sourced — and an AI that simply failed to locate a published official benchmark, attributing related figures to commercial intermediaries instead of the originating regulator. Both modes converge on the same practical outcome: the practitioner receives either a number that was never stated or no number at all, in a domain where precise official figures are exactly what the advice depends on.

AI's Failure Mode | Count | Affected findings
Exposed Fabrication | 1 | Finding #1
Inference Drift | 1 | Finding #2

What that means for your practice

Both failures land in the same risk category for accountants: a wrong deliverable, whether an opinion memo, a client briefing, or a board-level treasury update, that cites figures the regulator never published or omits benchmarks the regulator did publish. In a cross-border payments context, where clients are making infrastructure investment decisions and aligning internal compliance timelines to CPMI milestones, a corrupted or absent data point is not a cosmetic error; it misprices the urgency and scope of the remediation work being planned.

Risk Impact | Count | Affected findings
Wrong deliverable | 2 | Finding #1 · Finding #2

When this affects Accountants (CA/PA)

Accountants advising correspondent banks, payment service providers, or treasury functions on cross-border payment infrastructure upgrades routinely need two categories of data from the CPMI ISO 20022 harmonisation programme: adoption trajectory figures (to benchmark where a client stands relative to the market) and operational efficiency metrics (to quantify the business case for harmonisation investment). Both are precisely what this regulation's authoritative speeches provide, and both are precisely where AI tools failed.

The moment an accountant turns to an AI assistant to draft a client memo summarising the state of ISO 20022 adoption across payment system types, or to support a cost-benefit analysis of accelerating a client's migration timeline, they are in the zone of maximum exposure.

The RTGS versus faster payments distinction matters enormously in practice. A client operating a real-time gross settlement system faces a materially different compliance horizon and competitive pressure than one in the faster payments space — the regulator's own figures place RTGS adoption at "approaching half" while faster payment systems are already past three-quarters. An accountant who receives a blended "79% for both" figure from an AI assistant and builds a client briefing around it will systematically misrepresent the urgency and the peer-group baseline for RTGS clients.

This is the kind of error that only surfaces when the client pushes back with the actual BIS speech in hand.

The inquiry-resolution benchmarks from the Panetta speech are equally load-bearing for accountants scoping operational due diligence or advising on the ROI of harmonised messaging. A 1–3% inquiry rate across cross-border payment volumes, at 5–10 manual touchpoints per inquiry, is a concrete operational cost that can be modelled against a client's transaction volumes to justify migration investment. The up-to-80% resolution-time reduction is the kind of headline metric that appears in board papers and investment cases.

An accountant who cannot surface this figure — or who attributes it to a commercial bank rather than to an official FSB summit speech — is working with a corrupted evidentiary base.
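
The modelling described above can be sketched as follows. This is a minimal illustration rather than a method taken from the CPMI or the speeches: the function and class names, the 200,000-payment client volume, and the two-hour baseline resolution time are all hypothetical, and the rate and touchpoint ranges are copied from the text above and should be checked against the primary speech before they are reused.

```python
from dataclasses import dataclass

# Ranges quoted in the text above; verify against the primary speech before reuse.
INQUIRY_RATE = (0.01, 0.03)        # 1-3% of cross-border payments trigger an inquiry
TOUCHPOINTS_PER_INQUIRY = (5, 10)  # manual touchpoints per inquiry
MAX_TIME_REDUCTION = 0.80          # "up to 80%" is an upper bound, not an expected value


@dataclass(frozen=True)
class InquiryLoad:
    """Low and high bounds on yearly inquiries and manual touchpoints."""
    inquiries_low: float
    inquiries_high: float
    touchpoints_low: float
    touchpoints_high: float


def inquiry_load(annual_payments: int) -> InquiryLoad:
    """Bound the yearly inquiry count and manual touchpoints for a payment volume."""
    if annual_payments < 0:
        raise ValueError("annual_payments must be non-negative")
    inq_low = annual_payments * INQUIRY_RATE[0]
    inq_high = annual_payments * INQUIRY_RATE[1]
    return InquiryLoad(
        inquiries_low=inq_low,
        inquiries_high=inq_high,
        touchpoints_low=inq_low * TOUCHPOINTS_PER_INQUIRY[0],
        touchpoints_high=inq_high * TOUCHPOINTS_PER_INQUIRY[1],
    )


def max_hours_saved(inquiries: float, baseline_hours_per_inquiry: float) -> float:
    """Upper bound on hours saved per year.

    baseline_hours_per_inquiry is a client-supplied assumption; it does not
    come from the speeches.
    """
    return inquiries * baseline_hours_per_inquiry * MAX_TIME_REDUCTION


if __name__ == "__main__":
    load = inquiry_load(200_000)  # hypothetical client volume
    print(load)
    print(max_hours_saved(load.inquiries_low, baseline_hours_per_inquiry=2.0))
```

Keeping the low and high bounds separate, rather than averaging them, preserves the range the source figures actually state and avoids producing a single blended number that appears nowhere in the record.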

The findings at a glance

The two findings below cover the specific questions on CPMI's ISO 20022 harmonisation programme where AI assistants produced wrong or missing outputs — with the precise official text alongside what the AI actually said.

# | Finding title | Type | Citation ID
1 | ISO 20022 adoption rate conflation: RTGS vs faster payments | Hallucination | RLB-F-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q006
2 | Missing official inquiry-rate and resolution-time benchmarks | Hallucination | RLB-F-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q007

Aggregate impact

Both failures cluster on a single structural weakness: AI tools cannot reliably disaggregate or locate specific quantitative claims from CPMI's cross-border payments speeches when those claims require distinguishing between payment system types or surfacing precise figures from co-presented speeches at the same official event. The adoption-rate conflation is a calibration error — the AI knew the broad direction of travel (widespread ISO 20022 adoption) but collapsed two distinct rates into one, producing a number that flatters RTGS adoption relative to the actual regulatory record.

The missing Panetta figures represent a different but related failure: the AI's search surfaced one speech from the FSB summit but not the companion speech delivered the same day, leaving the practitioner with a "no data found" response on a question where official data exists and is directly accessible.

For accountants in international jurisdictions, the aggregate implication is that this regulation's quantitative backbone (the adoption rates, inquiry volumes, and efficiency gains that underpin every business case and compliance timeline analysis) cannot be reliably retrieved through AI tools without independent verification against the source BIS speeches. The two figures most likely to appear in client deliverables (the RTGS versus faster payments adoption split and the up-to-80% resolution-time reduction) are precisely the two figures the AI tools got wrong or missed entirely.

The systemic risk is one of compounding. An accountant who drafts an ISO 20022 advisory memo using AI-sourced figures, circulates it to a client's treasury team, and then has that memo feed into a board paper or investment committee submission has created a chain of reliance on corrupted data. The error is not self-correcting — clients rarely check BIS speeches independently, and the AI's confident presentation style provides no signal that the figures warrant verification.

What your team should do

The default position for any quantitative claim from this regulation: go to the source speeches on bis.org directly. The two failure modes here — a confidently fabricated composite figure and a missed co-presented speech — are not edge cases; they are precisely the failure modes that emerge when AI tools attempt to retrieve specific numerical claims from a corpus of regulatory speeches rather than from a single consolidated document.

BIS speeches from the FSB Cross-Border Payments Summit (March 2026) are indexed but not always comprehensively cross-linked, which means an AI that finds the Bailey speech may not surface the Panetta speech from the same event. Build the habit of pulling both speeches for any multi-speaker BIS summit before drafting quantitative claims.

For the specific numbers that matter most to client work on this regulation (the RTGS versus faster payments adoption split, and the inquiry-rate and resolution-time reduction benchmarks), treat any AI-sourced figure as a starting point for source verification, not a citable result. The adoption split is particularly prone to conflation because the "more than three-quarters / approaching half" formulation requires the AI to hold two distinct rates in context simultaneously; the failure documented here shows it cannot be relied on to do so.

When drafting client memos that reference these benchmarks, cite the speech directly (Bailey r260316d; Panetta r260316f) rather than relying on an AI-generated paraphrase.

AI tools remain useful for this regulation in lower-stakes areas: structuring the analytical framework for an ISO 20022 readiness assessment, drafting the narrative sections of a client briefing that do not depend on specific official figures, or summarising the general policy direction of the CPMI harmonisation programme at a conceptual level. The failure zone is narrow but consequential: any sentence in a client deliverable that contains a specific percentage, touchpoint count, or efficiency metric attributed to CPMI or FSB officials needs to be verified against the primary speech before it leaves the firm.
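
One way to make that final check routine is a pre-send pass that lists every sentence in a draft containing a figure. The sketch below is a rough heuristic under stated assumptions, not an RLB or CPMI tool: the function name, the regular expressions, and the sample draft are all hypothetical, and the sentence splitter is deliberately naive (it will mis-split abbreviations such as "e.g.").

```python
import re

# Standard identifiers such as "ISO 20022" contain digits but are not claims.
IGNORE = re.compile(r"\bISO\s*\d+\b", re.IGNORECASE)

# A digit run with an optional decimal part and percent sign, or a spelled-out
# fraction such as "three-quarters" or "half".
FIGURE = re.compile(
    r"\d+(?:\.\d+)?%?|\b(?:half|third|thirds|quarter|quarters)\b",
    re.IGNORECASE,
)

# Naive splitter: break on whitespace that follows ., ! or ?
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_to_verify(text: str) -> list[str]:
    """Return the sentences that contain a numeric or fractional claim."""
    sentences = SENTENCE_END.split(text.strip())
    return [s for s in sentences if FIGURE.search(IGNORE.sub("", s))]


if __name__ == "__main__":
    draft = (
        "ISO 20022 adoption is widespread. "
        "Faster payment adoption is past three-quarters. "
        "Inquiry rates run at 1-3% of volume. "
        "The programme continues."
    )
    for sentence in sentences_to_verify(draft):
        print("VERIFY:", sentence)
```

The output is only a checklist. Each flagged sentence still has to be compared by a person against the primary speech; the script does not confirm that any figure is correct.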

How RLB Can Help

RegLeg's published Hallucination Research is available as an open reference; use it as a pre-flight check before relying on AI output on regulatory questions that matter to your sign-off. The findings are organised by regulation and failure mode, so if you are working across IFRS application guidance, PCAOB standards, or cross-border group reporting obligations, you can pull the relevant regulation page and see, specifically, where AI tools have fabricated citations, misstated effective dates, or collapsed jurisdiction-specific carve-outs into a single incorrect answer. That is faster and more defensible than discovering the error after the advice has gone out.

For firms running multiple accountants on the same regulatory portfolio (group reporting, audit quality frameworks, independence requirements across jurisdictions), RegLeg offers bespoke deep-dives. We work through the specific regulations in scope, map the failure modes that surface most consistently in that regulatory space, and produce a structured briefing your team can use as a standing reference. This is not a one-size-fits-all engagement: the output is scoped to the regulations you are actually using AI tools against, and framed around the workflow decisions those findings affect, such as materiality judgements, disclosure drafting, and cross-border reconciliation.

We also produce training and CPD-aligned material built around the failure modes your team should be stress-testing in their own AI use. Not generic AI literacy content — specific failure patterns documented against the regulations accountants in international practice touch most, presented in a format that maps to the professional judgement calls your team makes daily. If your firm has an existing AI-use policy, we can review it confidentially against RegLeg's failure-mode catalogue and flag where the policy's assumptions about AI reliability are not supported by what the research actually shows.