AI on Promoting the Harmonisation of Application Programming Interfaces to Enhance Cross-Border Payments: Recommendations and Toolkit for Technology & Data teams at Payment Institution firms in international jurisdictions

Executive Summary

Two questions put to AI assistants about the CPMI October 2024 API harmonisation framework produced two hallucinations, both in areas that Technology & Data teams at Payment Institutions are most likely to act on directly: the structure of the self-assessment toolkit and the versioning history of the accompanying ISO 20022 data requirements update. In both cases, AI tools returned confident, technically formatted answers that had no basis in any accessible source.

The toolkit hallucination is particularly sharp: the PDF is not publicly extractable, no third-party documentation describes its contents, yet AI assistants produced detailed four-area breakdowns with named assessment dimensions — fabrications that are plausible enough to pass straight through a junior's desk and into a readiness assessment template or board-level gap analysis. The ISO 20022 versioning error compounds the risk: a wrong publication date in a technical annex scope document, paired with invented data entity breakdowns, is exactly the kind of subtle mis-specification that survives internal review and surfaces only when an implementation partner or auditor checks the primary source.

How AI gets this regulation wrong

Across the questions we tested on this regulation, AI tools failed in two distinct ways: inventing specific structural content about a document they could not access, and misstating the publication date and technical scope of a regulatory update while fabricating supporting detail to make the error look authoritative. What makes these failures particularly hard to catch is that the answers are internally consistent and technically formatted: they look like the product of someone who read the source, not someone who guessed.

AI's Failure Mode	Count	Affected findings
Exposed Fabrication	1	Finding#1
Misstated Rule	1	Finding#2

What that means for your team

Both failures on this regulation land in the same risk category: a wrong deliverable — a readiness assessment, a technical scoping document, or a versioned gap analysis built on fabricated regulatory content. For a Technology & Data team at a Payment Institution, the downstream exposure is not abstract: it runs from flawed API compliance frameworks presented to internal audit, through to misaligned integration specifications handed to counterparty banks or scheme operators who will check primary sources.

Risk Impact	Count	Affected findings
Wrong deliverable	2	Finding#1 · Finding#2

When this affects your department

Technology & Data teams at Payment Institutions are the primary internal consumers of the CPMI API harmonisation framework — not passively, but as the function that has to operationalise it. That means scoping API gateway uplift programmes against CPMI recommendations, mapping existing payment message implementations to the ISO 20022 data model requirements, and using the self-assessment toolkit to build readiness assessments that go upward to the board or outward to a central bank or scheme operator.

When a payment architect or a compliance-aligned developer reaches for AI to get a quick read on the toolkit structure — what dimensions it covers, how many assessment areas, what the recommended usage sequence is — there is a concrete downstream deliverable waiting for that answer.

The ISO 20022 versioning question is equally live. The updated data requirements document (d230) is a direct input for any team running a SWIFT migration, an ISO 20022 data enrichment project, or a cross-border corridor remediation. Getting the publication date wrong looks like a low-stakes error, but it carries real weight when it appears in a project charter, a vendor RFP, or a regulatory milestone plan, because counterparties and auditors will check.

Fabricated data entity breakdowns in the technical annex are worse: if a team scopes a mapping exercise against invented entity categories, they are building integration logic against fiction, and the gap will only surface at UAT or post-live.

AI tools are a plausible reach for both of these questions. The regulation is recent, the documents are publicly listed on the BIS portal, and the questions are precisely the kind of structured reference lookup that AI tools handle credibly on better-documented standards. That credibility is the hazard: the AI's answer format — area-by-area breakdown, dimension labels, step-by-step usage sequence — is indistinguishable from the output of someone who actually read the toolkit PDF.

The findings at a glance

The table below summarises the two findings tested on this regulation: the finding title, the failure type, and the citation ID. Both land in the wrong-deliverable risk category for a Technology & Data team at a Payment Institution.

#	Finding title	Type	Citation ID
1	Fabricated self-assessment toolkit structure	Hallucination	RLB-F-INT-BIS-CPMI-API-HARMONISATION-CROSS-BORDER-2024-Q005
2	Mis-stated ISO 20022 update date and fabricated technical annex scope	Hallucination	RLB-F-INT-BIS-CPMI-API-HARMONISATION-CROSS-BORDER-2024-Q009

Aggregate impact

Both failures on this regulation cluster on the same underlying dynamic: the AI cannot access the primary source material — the toolkit PDF is not publicly extractable, and the versioned update document was misidentified via a secondary aggregator — yet it generates answers that mimic the output of primary-source reading. The errors do not look like gaps or hedges; they look like structured regulatory knowledge. That is the systemic risk. A Technology & Data team that trusts these answers does not know it is working from fabricated content until it collides with a source that is authoritative.

The toolkit finding is the sharper exposure. Multiple AI tools produced the same class of error — fabricated area-by-area breakdowns with assessment dimensions and usage steps — and at least one retracted when challenged, which means the error is catchable, but only if the team thinks to challenge it. In practice, a payments architect building a readiness assessment template does not typically challenge a structured AI answer that looks like a tool specification — they use it.

The retraction behaviour is a signal that the AI itself has low confidence in the content, but that signal only surfaces under adversarial prompting, not on first use.

The ISO 20022 date and data entity error matters because the downstream uses are high-precision: project charters, vendor selection criteria, UAT scope documents. A two-month date error in a regulatory milestone plan is correctable but embarrassing; fabricated entity categories in a data mapping scope are a design error that propagates. For a Payment Institution operating across multiple cross-border corridors, where ISO 20022 enrichment requirements vary by rail, the cost of a re-scoping exercise triggered by a primary-source check is real — and the reputational cost with counterparty banks or scheme operators who catch the error first is harder to price.

What your team should do

The default position on this regulation is simple: do not use AI to retrieve the structural content of any CPMI toolkit or technical annex without first confirming that the primary source is accessible and that someone on the team has read it. The self-assessment toolkit is distributed as a PDF attachment to the October 2024 report; if your team cannot open and read that PDF directly, AI cannot substitute for it. Any AI-generated breakdown of what the toolkit covers, how many areas it has, or what the assessment dimensions are should be treated as fabricated until verified against the BIS publication at bis.org.

The same applies to any AI-generated characterisation of the ISO 20022 data requirements technical annex: check the publication date on the BIS page and compare the entity scope against the actual annex before using either in a scoping document.

Where AI is genuinely useful on this regulation is in the layer above document specifics: understanding the CPMI's stated rationale for API harmonisation, summarising the high-level recommendations (which are described in accessible landing-page text and press materials), or drafting internal explainers for business lines about why the framework matters for cross-border payment corridors. AI tools are also useful for drafting structured questionnaires or review templates where the team supplies the regulatory criteria from primary sources — the AI does the document formatting, the team supplies the substance.

The practical control is a one-step verification rule: any AI output on this regulation that contains specific document structure (areas, dimensions, entity categories, publication dates, step sequences) requires a named primary source and a team member who has opened that source. For the toolkit specifically, that means the PDF — not a press release, not an aggregator summary, not a BIS landing page. If the PDF is unavailable, the readiness assessment waits or is scoped to what the accessible text actually says.
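For teams that track AI-assisted inputs in a review workflow, the rule above can be expressed as a simple gate. The sketch below is illustrative only: the AIClaim record, its field names, the claim-type labels and the may_use function are all hypothetical names chosen for this example, not part of any CPMI or BIS specification. It assumes the team tags each AI-derived statement with a claim type and records which source, if any, a named person has actually opened.

```python
from dataclasses import dataclass
from typing import Optional

# Claim types the rule treats as "specific document structure".
# Labels are hypothetical; adapt them to your own tagging scheme.
STRUCTURAL_CLAIM_TYPES = {
    "areas", "dimensions", "entity_categories",
    "publication_dates", "step_sequences",
}

# Source kinds that do NOT count as a primary source under the rule.
NON_PRIMARY_SOURCE_KINDS = {"press_release", "aggregator_summary", "landing_page"}


@dataclass
class AIClaim:
    text: str
    claim_type: str                    # e.g. "areas" or "rationale"
    source_kind: Optional[str] = None  # e.g. "pdf", "press_release"
    source_name: Optional[str] = None  # the named primary source
    opened_by: Optional[str] = None    # team member who opened that source


def may_use(claim: AIClaim) -> bool:
    """Return True only if the claim passes the verification rule."""
    if claim.claim_type not in STRUCTURAL_CLAIM_TYPES:
        return True  # the rule only covers structural claims
    if claim.source_kind is None or claim.source_kind in NON_PRIMARY_SOURCE_KINDS:
        return False  # no source, or a secondary source, is not enough
    # A structural claim needs both a named source and a named reader.
    return bool(claim.source_name) and bool(claim.opened_by)


unverified = AIClaim(text="AI-generated toolkit breakdown", claim_type="areas")
verified = AIClaim(
    text="Toolkit breakdown transcribed from the PDF",
    claim_type="areas",
    source_kind="pdf",
    source_name="CPMI self-assessment toolkit PDF",
    opened_by="j.doe",  # placeholder reviewer name
)
print(may_use(unverified))  # False
print(may_use(verified))    # True
```

The gate deliberately fails closed: a structural claim with no recorded source is blocked, not flagged, which mirrors the position that the readiness assessment waits if the PDF is unavailable.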

How RLB Can Help

RegLeg's published Hallucination Research gives Technology & Data teams at payment institutions a practical pre-flight check before placing reliance on AI-assisted output for regulatory questions. The research maps the specific ways AI tools misstate regulatory obligations — citing superseded rules, conflating jurisdictions, or fabricating supervisory guidance — so that teams can calibrate their review processes and governance controls accordingly, rather than discovering failure modes after a compliance decision has already been made.

Where a firm's Technology & Data function is deploying or evaluating AI tools to support activities such as data governance, cyber resilience reporting, change management, or third-party technology due diligence, RLB can undertake bespoke regulator deep-dives that identify which of those workflows carry the highest hallucination exposure. That work produces a prioritised map of risk points specific to the payment institution context — informing both the firm's AI-use controls and its engagement with regulators on technology risk.

RLB also works with Technology & Data teams on a confidential review of their existing AI-use policies, assessing them against RegLeg's failure-mode catalogue and producing a structured, prioritised remediation plan. Alongside that, RLB can develop training materials and CPD-aligned content that the team can use internally — equipping engineers, data leads, and compliance-facing technologists with a shared working understanding of where AI tools are reliable, where they are not, and how to document that judgement in a way that stands up to regulatory scrutiny.