Executive Summary
Financial Advisers who advise clients on cross-border payment infrastructure (whether structuring international treasury arrangements, evaluating fintech partnerships, or opining on API connectivity obligations) rely on accurate landscape data to calibrate risk and scope. The CPMI's API harmonisation framework sits at the intersection of fast payment system (FPS) interoperability and jurisdictional coordination, making quantitative benchmarks (how many FPS are live, how many are cross-border capable, who operates them) material inputs to advisory work. We tested one aggregated question set on this regulation, and AI tools failed it.
The failure mode was characterised by confident provision of incorrect figures, followed by partial or full retraction only under challenge, producing responses that looked authoritative until pressure-tested. For Financial Advisers building market briefings or client memos against this data, an unchallenged AI answer would have shipped a wrong deliverable.
How AI gets this regulation wrong
On this regulation, the dominant AI failure pattern is confident fabrication that collapses under scrutiny — not silence or hedging, but a definitive wrong answer delivered with the same register as a correct one. The table below maps where AI tools substituted survey-sample counts for global-universe figures, and where they falsely declared authoritative operational-split data unavailable despite its presence in primary CPMI sources.
| AI's Failure Mode | Count | Affected findings |
|---|---|---|
| Exposed Fabrication | 1 | Finding#1 |
What that means for your practice
For Financial Advisers in international jurisdictions, errors on this regulation land squarely in the category of wrong deliverable — a market briefing or advisory memo built on incorrect CPMI benchmarks reaches the client already contaminated. The table below breaks down where that risk materialises across the specific question types tested, and the direct workflow consequence at each point of failure.
| Risk Impact | Count | Affected findings |
|---|---|---|
| Wrong deliverable | 1 | Finding#1 |
When this affects Financial Advisers
Financial Advisers with an international client base encounter the CPMI API harmonisation framework most acutely when advising on cross-border payment readiness — whether a treasury operation has the right counterparty banking relationships in markets where FPS interoperability is live, where it is imminent, and where it remains years away. Getting the FPS landscape count right is not cosmetic: the difference between 57 systems (a survey sample) and 70+ (the global operational universe) materially changes the framing of jurisdictional coverage and the urgency of near-term linkage planning.
Any advisory engagement scoping API connectivity obligations or correspondent banking strategy against an artificially low universe count is scoped to the wrong baseline.
The public/private operational split is equally load-bearing. The 40% central-bank / 35% private-entity breakdown from CPMI sources determines whether a client's regulatory counterparts are primarily public-sector bodies (with sovereign-immunity and public-procurement implications) or private-sector operators (with commercial negotiation and credit-risk implications). When AI tools falsely declare this data unavailable — sourcing only a subset of the CPMI publication record rather than the full speech and report corpus — advisers who rely on that non-answer either leave the question open in their memo or go to the primary source themselves, defeating the purpose of the AI lookup.
The risk is not just that the client gets incomplete advice; it is that the adviser signs off a memo that cites CPMI data gaps that do not exist.
The workflow pressure point is client-facing deliverables produced under time constraint: market briefings prepared ahead of client steering committees, pitch-book sections on payment infrastructure maturity, or due-diligence memos scoping fintech partnership risk. These are the contexts where an adviser asks AI for the headline CPMI numbers, receives a confident-sounding response, and may not have the time or the immediate primary-source familiarity to verify each figure independently. The finding documented here shows AI tools producing the wrong number (57 vs 70+) and the wrong omission (claiming the central-bank/private split is unquantified) in exactly those circumstances.
The findings at a glance
The table below summarises the finding tested on this regulation, its AI failure type, and its citation ID. The risk consequence for Financial Advisers' work product is set out in the risk-impact table above.
| # | Finding title | Type | Citation ID |
|---|---|---|---|
| 1 | Global FPS count and operational-split data misrepresented | Hallucination | RLB-F-INT-BIS-CPMI-API-HARMONISATION-CROSS-BORDER-2024-Q010 |
Aggregate impact
With only one finding tested on this regulation, the aggregate picture is stark rather than dispersed: AI tools failed on the most numerically specific question a Financial Adviser is likely to bring to them about the global FPS landscape. The failure was not a matter of missing nuance — it was a substitution of a survey-response sample size (57) for an operationally-defined global count (70+), and a false assertion that a public-record operational-split figure (40% central-bank / 35% private) was unavailable. Both errors survived initial presentation; retraction required active challenge.
That pattern — confident delivery, collapse under pressure — is the most operationally dangerous form of AI error because it passes a cursory review.
The errors cluster on the same underlying cause: AI tools conflating the scope of a specific CPMI monitoring survey publication with the full universe of authoritative CPMI quantitative output. The 57-system figure comes from a 2025 monitoring survey sample; the 70+ figure and the operational-split percentages come from a 2023 CPMI speech by Tara Rice. AI tools that over-index on the more recent, more structured publication (the survey report) and under-weight the speech corpus produce plausible-sounding but wrong answers.
This is not a random error — it is a systematic bias toward formatted tabular sources over speech and working-paper material, which means any question that requires the fuller CPMI record is at elevated risk.
For Financial Advisers, the systemic implication is that AI tools are unreliable as a single-source lookup for CPMI quantitative benchmarks, particularly where the authoritative data spans report-plus-speech publication formats. A junior team member using AI to populate the landscape section of a market briefing, without a parallel primary-source check against the CPMI speech archive, is likely to underestimate system coverage and to report gaps in operational-breakdown data that do not exist. Given that these figures are precisely the ones clients use to make jurisdictional prioritisation decisions, the downstream commercial and advisory-liability exposure is real.
What your team should do
The default position for any Financial Advisers team producing CPMI FPS landscape data for a client deliverable should be: AI-generated figures are a starting point for source identification, not a citable output. For this regulation specifically, the quantitative landscape — system counts, cross-border linkage tiers, operational governance splits — spans CPMI monitoring surveys, working papers, and the CPMI speech archive. AI tools tested here pulled selectively from the survey corpus and missed the speech record. That is not a failure mode you can detect without already knowing the correct figure, which means junior verification against AI output alone is insufficient.
The practical safeguard is a two-step protocol: use AI to identify the relevant CPMI publication types (reports, briefs, speeches), then verify each figure against the BIS website directly. For the specific data points covered by this finding — global operational FPS count, cross-border-enabled subset, five-year linkage pipeline, and central-bank/private operational split — the primary source is the November 2023 Tara Rice CPMI speech, not Brief 10 or the 2025 monitoring survey. If your team cannot locate that speech on the BIS portal, the AI answer should be held unverified.
Any memo or briefing that cites CPMI data as definitive should cite the primary-source URL for each figure rather than rely on an AI paraphrase.
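For teams that track deliverable figures in a script or notebook, the two-step protocol above can be sketched as a simple verification record. This is a minimal illustration, not a prescribed tool: the `Benchmark` class, its field names, the `citable` helper, and the placeholder URL are all assumptions introduced here. Only the 57 and 70+ figures come from the finding itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Benchmark:
    """One quantitative figure destined for a client deliverable (hypothetical record)."""
    label: str
    ai_value: str                        # figure as reported by the AI tool
    primary_value: Optional[str] = None  # figure read directly from the primary source
    source_url: Optional[str] = None     # URL of that primary source

    def status(self) -> str:
        # No primary-source reading or no URL recorded: hold the figure.
        if self.primary_value is None or not self.source_url:
            return "unverified"
        # The primary source disagrees with the AI answer.
        if self.primary_value != self.ai_value:
            return "contradicted"
        return "verified"


def citable(benchmarks: List[Benchmark]) -> List[Benchmark]:
    """Return only the figures whose AI value matched a recorded primary source."""
    return [b for b in benchmarks if b.status() == "verified"]


# Step 1: the AI answer is logged, but nothing has been checked yet.
fps_count = Benchmark("Global operational FPS count", ai_value="57")
print(fps_count.status())

# Step 2: the adviser reads the primary source and records what it says.
# The URL below is a placeholder, not the real location of the speech.
fps_count.primary_value = "70+"
fps_count.source_url = "https://example.org/placeholder-primary-source"
print(fps_count.status())
```

In this sketch a figure reaches the deliverable only when a primary-source value and URL have been recorded and they agree with the AI answer. A contradicted figure is replaced with the primary-source value by the adviser, not by the script.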
Where AI tools are genuinely useful on this regulation is in structuring the analytical framework around the data: drafting the section architecture for a payment infrastructure readiness memo, identifying the regulatory questions that the CPMI framework is designed to answer, summarising what API harmonisation recommendations mean for a specific client's technology stack. These are tasks where precision of the landscape figures is not load-bearing. The failure risk concentrates on the narrow but high-stakes task of populating quantitative benchmarks — and that is the task where primary-source discipline is non-negotiable.
How RLB Can Help
RegLeg's published Hallucination Research functions as a pre-flight check before you rely on AI output for regulatory questions. The findings catalogue specific failure modes — wrong obligation scope, inverted position on disclosure thresholds, fabricated cross-border carve-outs — across the regulations your clients are actually subject to. Before you cite an AI-generated answer on suitability requirements, product disclosure, or cross-border distribution rules, the research tells you where that answer class has demonstrably broken down and what the failure looks like in practice.
It is not a product review; it is an empirical record of where AI tools get specific regulatory questions wrong.
For firms running a team of advisers against a shared regulatory portfolio — IOSCO standards, MiFID-equivalent regimes, local conduct-of-business rules — we offer bespoke regulation deep-dives scoped to your exact coverage. That means running the research against the specific instruments your practice relies on, not a generic cross-jurisdictional survey, and delivering findings in a form your compliance and supervisory functions can act on. The output maps failure modes to obligation categories, so your team knows which question types to treat as high-risk AI outputs requiring independent verification.
We also produce training and CPD-aligned material built from the failure-mode catalogue — structured around the question types that trip AI tools, the regulatory domains where hallucinations cluster, and the verification steps advisers should apply as a matter of practice. Separately, if your firm has drafted or deployed an AI-use policy, we can run a confidential review against our failure-mode catalogue to identify gaps between what the policy permits and where the research shows AI tools to be unreliable. Both engagements are collaborative: you bring the practice context; we bring the empirical record.