AI Labs white papers — AI Hallucination Research

API Harmonisation for Cross-Border Payments: Model Failure Patterns on CPMI's October 2024 Framework

BIS-CPMI · INT · substrate v1

When both Claude Opus 4.7 with web search and Claude Sonnet 4.6 with web search encounter content locked inside an inaccessible PDF — the CPMI October 2024 report on harmonising APIs for cross-border payments — they...

Published 2026-06-04

Amendment-Layer Failures on CFTC Swap Dealer Business Conduct Rules

CFTC · US · substrate v1

Fabrication and scope-conflation are the dominant failure shapes on the CFTC's December 2025 swap dealer business conduct and documentation rulemaking, with Claude Opus 4.7 with web search producing an invented...

Published 2026-06-03

CFTC Digital Asset Collateral Staff Guidance 2025: Hallucination Patterns in Claude Opus 4.7 and Claude Sonnet 4.6

CFTC · US · substrate v1

Condition-sunset misclassification and fabricated amendment provenance are the dominant failure surfaces across both Claude Opus 4.7 and Claude Sonnet 4.6 on the CFTC's Digital Asset Collateral No-Action Relief and...

Published 2026-06-03

PFMI Principle 15 Failures: Conditional-Structure Fabrication and Carve-Out Denial in Claude Opus 4.7 and Claude Sonnet 4.6

BIS-CPMI · INT · substrate v1

Both Claude Opus 4.7 with web search and Claude Sonnet 4.6 with web search produced failures on CPMI-IOSCO's Implementation Monitoring of the PFMI: Level 3 Assessment on General Business Risks (Bank for International...

Published 2026-06-03

Deontic Register Failure on the CPMI-IOSCO Initial Margin Consultation 2026

BIS-CPMI · INT · substrate v1

The dominant failure observed in Claude Sonnet 4.6 on the CPMI-IOSCO Consultation on Updated Guidance and Public Disclosures to Implement Initial Margin Proposals is deontic register substitution — the model hardened...

Published 2026-06-03

ISO 20022 Harmonisation: Numeric Conflation and Attribution Failures in Cross-Border Payment Regulation

BIS-CPMI · INT · substrate v1

Numeric conflation across disaggregated adoption-rate subcategories — collapsing distinct faster-payment-system and RTGS figures into a single blended claim — is the primary failure surface for Claude Opus 4.7 with...

Published 2026-06-03

BBNJ High Seas Biodiversity Agreement: Model Hallucination Findings

UNTC · INT · substrate v1

This paper presents findings from RegLeg's hallucination research on the Agreement under the United Nations Convention on the Law of the Sea on the Conservation and Sustainable Use of Marine Biological Diversity of...

Published 2026-05-31

AI Model Hallucination Patterns on CPMI-IOSCO PFMI: A RegLeg Research Report

BIS-CPMI · INT · substrate v1

RegLeg tested two frontier AI models against the Principles for Financial Market Infrastructures (PFMI), the global standard for payment systems, central counterparties, and securities settlement systems published...

Published 2026-05-29

MAS Notice 637 Capital Adequacy: AI Model Accuracy Evaluation

MAS · SG · substrate v1

This paper presents findings from RegLeg's evaluation of AI model responses to questions about MAS Notice 637 — the Monetary Authority of Singapore's risk-based capital adequacy framework for banks — covering both...

Published 2026-05-28

Hallucination in Regulatory AI: CPMI-IOSCO Cyber Resilience Guidance (2016) — Findings for AI Labs

BIS-CPMI · INT · substrate v1

This report documents hallucinations produced by frontier AI models when asked questions about the Guidance on Cyber Resilience for Financial Market Infrastructures, published in June 2016 by CPMI and IOSCO under the...

Published 2026-05-26

Consumer Duty Hallucination Report: Claude Opus 4.7 and Claude Sonnet 4.6

FCA · GB · substrate v1

This paper presents findings from a structured evaluation of two frontier AI models — Claude Opus 4.7 with web search and Claude Sonnet 4.6 with web search — against the Financial Conduct Authority's Consumer Duty...

Published 2026-05-26