Executive Summary
Risk teams at Payment Institutions operating across international jurisdictions use the CPMI API Harmonisation Recommendations as a primary reference for mapping regulatory obligations by stakeholder category and for benchmarking the cross-border fast payment landscape. Both activities feed directly into risk appetite statements, product-launch assessments, and market-entry due diligence. Two aggregated questions were put to AI tools on this regulation; both returned incorrect answers, which the tools retracted or qualified only when challenged.
The failures are not edge-case ambiguity: one involves AI inventing specific stakeholder attribution for each recommendation category despite the recommendation-level detail sitting in an inaccessible PDF, and the other involves AI conflating a survey-sample count with a global-universe count and separately suppressing a verifiable CPMI data point about operator composition. In both cases the AI's initial response read as authoritative and fully sourced, which is precisely the condition under which a Risk team is most likely to route the output directly into a deliverable without secondary verification.
How AI gets this regulation wrong
Both failures on this regulation follow the same pattern: the AI tools committed to specific, structured answers and recanted only when directly pressed. This is confident fabrication followed by an admission of uncertainty, and it surfaces only if the reader pushes back. The fabrications are not vague hedges; they take the form of precise stakeholder attributions and official-sounding numerical data points that carry the surface markers of regulatory accuracy.
| AI's Failure Mode | Count | Affected findings |
|---|---|---|
| Exposed Fabrication | 2 | Finding#1 · Finding#2 |
What that means for your team
Every failure documented here produces a wrong deliverable — meaning the Risk team receives a structurally plausible but materially incorrect output that, if acted on, would corrupt the underlying analysis rather than simply leave a gap. For Payment Institutions operating across multiple jurisdictions, a wrong deliverable on stakeholder obligation mapping or market-landscape data does not stay local: it propagates into board papers, market-entry risk assessments, and regulatory engagement where correction is costly and visible.
| Risk Impact | Count | Affected findings |
|---|---|---|
| Wrong deliverable | 2 | Finding#1 · Finding#2 |
When this affects your department
Risk teams at Payment Institutions reach for AI on this regulation most heavily during three recurring workflows. The first is scoping the firm's direct obligations versus those sitting with payment system operators, central banks, or standards bodies: a stakeholder-cut analysis that determines whether a gap in the firm's API compliance posture is the firm's problem to fix or a systemic issue to escalate.
The second is constructing or updating the cross-border payments landscape section of a market-entry risk assessment, a new product approval memo, or a board risk report. In these contexts, CPMI data on how many fast payment systems are operational globally, how many are cross-border-enabled, and who operates them functions as the baseline against which the firm's exposure or opportunity is calibrated. The third is internal training or policy refresh cycles, where teams need a working map of which recommendation category applies to which class of institution.
The compounding risk is structural. Payment Institutions operating internationally typically interface with multiple domestic fast payment systems simultaneously, each at a different point on the cross-border linkage curve. A board paper or regulatory submission that cites the wrong global-universe count (57 survey respondents rather than 70+ operational systems) or misrepresents the operator composition (suppressing the 40% central bank / 35% private split) does not merely understate the landscape; it skews the firm's assessment of the systemic risk it is exposed to and the counterparties it needs controls around.
Similarly, a stakeholder-obligation matrix built from AI-fabricated recommendation-level attributions — where the AI assigns categories exclusively to standards bodies or central banks without that attribution being grounded in the accessible regulatory text — will systematically misdirect the firm's compliance scope.
Both failure types share a downstream consequence that matters specifically to the Risk function: incorrect inputs to a risk appetite statement, a product risk assessment, or a regulatory submission are qualitatively different from a knowledge gap. A gap triggers a research task; a wrong answer embedded in a completed deliverable becomes an undetected misstatement that may not surface until a regulator, auditor, or counterparty challenges it.
The findings at a glance
The two findings below cover the questions on this regulation where AI tools produced answers that were materially incorrect and subsequently retracted — the specific failure mode and the Risk team workflow it affects are noted for each.
| # | Finding title | Type | Citation ID |
|---|---|---|---|
| 1 | AI fabricates recommendation-level stakeholder attribution | Hallucination | RLB-F-INT-BIS-CPMI-API-HARMONISATION-CROSS-BORDER-2024-Q008 |
| 2 | AI misstates CPMI fast payment system landscape data | Hallucination | RLB-F-INT-BIS-CPMI-API-HARMONISATION-CROSS-BORDER-2024-Q010 |
Aggregate impact
Both failures on this regulation cluster around the same underlying condition: the full recommendation-level and data-level detail in this CPMI publication is not reliably accessible to AI tools working from public sources alone, and AI tools do not consistently signal that limitation — instead they construct plausible-sounding answers from what is accessible (executive summaries, press releases, coverage pieces) and present them as if they reflect the full document. The result is not a retrieval failure the team can see; it is a fabrication the team cannot easily detect without going to primary sources themselves.
For the Risk function at an internationally operating Payment Institution, the aggregate effect is that two of the most operationally significant questions on this regulation — who is this regulation directed at, recommendation by recommendation, and what does the global fast payment system landscape actually look like in CPMI's own data — are precisely the questions where AI tools are least reliable. These are not peripheral research questions; they are the anchors for regulatory obligation mapping and systemic landscape analysis.
A team using AI output unchecked for either would be working from invented stakeholder attributions or miscounted market-universe data embedded in otherwise well-structured prose.
The pattern also has a procedural implication: because both failures only surface when the AI is challenged and pressed to retract, a workflow that involves one person querying the AI and a second reviewing the output — without either running an independent source check — will not catch the error. The retraction behaviour is not a safety net unless the review process is specifically designed to probe AI responses on this regulation rather than simply read them.
What your team should do
The default position for Risk teams on this regulation should be: AI tools can help orient a team member who is new to the regulation's structure and general intent, but any output that involves specific stakeholder attributions at the recommendation level, or that cites CPMI quantitative data on the fast payment system landscape, must be verified directly against the CPMI source before it enters a deliverable. The full PDF is the authoritative document; accessible summaries and coverage pieces omit or compress the detail that matters for obligation mapping.
For practical safeguards: when a junior analyst brings an AI-generated stakeholder matrix for this regulation, the review gate should require the analyst to identify where in the accessible CPMI text each attribution is grounded — not just that the AI provided it. On landscape data, the specific figures that matter (global operational count, cross-border-enabled count, linkage pipeline count, operator composition) should be pulled directly from CPMI speeches and reports rather than from AI recollection; Tara Rice's November 2023 CPMI speech is the primary public source for the 70+ / 14 / 24 / 40%–35% data set.
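The review gate described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed control: the `Attribution` record, its field names, and the `review_gate` function are hypothetical, and the recommendation labels and locator strings in the usage example are placeholders rather than content taken from the CPMI text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attribution:
    """One row of an AI-generated stakeholder matrix (hypothetical structure)."""
    recommendation: str                    # placeholder label, e.g. "Recommendation A"
    stakeholder: str                       # e.g. "payment system operators"
    source_locator: Optional[str] = None   # where in the CPMI text the analyst found it


def ungrounded(matrix: List[Attribution]) -> List[Attribution]:
    """Return the rows that carry no usable locator in the CPMI source."""
    return [
        row for row in matrix
        if not (row.source_locator and row.source_locator.strip())
    ]


def review_gate(matrix: List[Attribution]) -> bool:
    """Pass only if the matrix is non-empty and every row cites its grounding."""
    return bool(matrix) and not ungrounded(matrix)


if __name__ == "__main__":
    # Placeholder rows for illustration only; not real CPMI attributions.
    draft = [
        Attribution("Recommendation A", "payment system operators", "placeholder locator"),
        Attribution("Recommendation B", "central banks"),  # AI-supplied, not yet verified
    ]
    for row in ungrounded(draft):
        print(f"Needs source check: {row.recommendation} -> {row.stakeholder}")
    print("Gate passed" if review_gate(draft) else "Gate failed")
```

The design choice is that the gate checks for the presence of an analyst-supplied locator, not for its correctness; a reviewer still has to open the CPMI source and confirm each one.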
When AI tools are used as a starting point for regulatory landscape research, the output should be treated as a draft hypothesis rather than a reliable answer — particularly for this regulation where the AI's accessible source base is materially thinner than what the full document contains.
AI tools are reasonably reliable for orientation-level work on this regulation: understanding the high-level structure of the 10 recommendations, the general cross-border payment harmonisation context, and the CPMI's role as the publishing body. They are also useful for flagging related CPMI publications and for summarising publicly accessible executive materials. The boundary where reliability breaks down is precisely where the Risk function needs precision — at the level of who owes what under which recommendation, and what the CPMI's own data says about the landscape the firm is operating in.
How RLB can help
RegLeg's published Hallucination Research gives Risk teams at Payment Institutions a ready-made pre-flight check before relying on AI-assisted output for regulatory questions. Each research entry documents, by regulation, the specific failure modes AI tools have exhibited — misquoted thresholds, fabricated cross-references, outdated prudential ratios — so your team can calibrate how much independent verification a given AI output warrants before it informs a risk decision, a capital model assumption, or a supervisory submission.
For firms that want analysis tailored to their own operating model, RegLeg offers bespoke regulator deep-dives that map which AI-supported workflows in a Payment Institution's Risk function carry the highest hallucination exposure. Licensing and own-funds calculations, transaction monitoring rule interpretation, incident reporting timelines, and cross-border passporting conditions each attract distinct failure patterns. A deep-dive produces a prioritised exposure map your team can use to set internal thresholds, review protocols, and escalation triggers — grounded in the same research base as the public site but scoped to your specific regulatory footprint.
RegLeg also offers a confidential review of a firm's existing AI-use policy, benchmarked against the failure-mode catalogue documented in the research programme and assessed against current supervisory expectations on model risk governance. The output is a prioritised remediation list rather than a gap report, with practical steps your team can action. Alongside this, RegLeg can supply training material and CPD-aligned content — covering hallucination mechanics, verification techniques, and risk-function-specific case examples — that equips practitioners to apply sound AI hygiene in their day-to-day work without requiring external support for every query.