AI Labs · updated 2026-06-03 · methodology v2.1

CFTC Digital Asset Collateral Staff Guidance 2025: Hallucination Patterns in Claude Opus 4.7 and Claude Sonnet 4.6

Executive summary

Condition-sunset misclassification and fabricated amendment provenance are the dominant failure surfaces across both Claude Opus 4.7 and Claude Sonnet 4.6 on the CFTC's Digital Asset Collateral No-Action Relief and Tokenized Asset Staff Guidance (Market Participants Division, December 2025). Both models incorrectly classified the weekly digital asset holdings reporting obligation as a condition that terminates at the end of the pilot's initial three-month phase, although the regulator's text is unambiguous that it persists. Separately, both models reconstructed details of a subsequent staff letter amendment from inference rather than the regulator's record, with Claude Opus 4.7 additionally fabricating a specific reissuance date. The pattern signals a structural problem: where a recent regulatory instrument introduces a phased obligation structure with partial sunset provisions, models appear to over-generalise the sunset to the entire condition set rather than tracking the specific carve-outs the regulator specified.

Findings — impact summary

This is the consolidated view of findings.

  1. Finding on 'Q005 Probe' for Claude Opus 4.7 with web search ON · RLB-H-US-CFTC-DIGITAL-ASSET-COLLATERAL-TOKENIZED-ASSETS-STAFF-GUIDANCE-2025-Q005-Opus47

    The fabricated reissuance date implicates the training corpus's coverage of this instrument's amendment cycle: the model had enough information to describe the amendment's structural effect but not enough to retrieve the actual date, producing a plausible confabulation instead of a retrieval failure. The omission of OCC Interpretive Letter 1183 as the eligibility hook implicates the retrieval stack's handling of cross-document dependencies — the model resolved the 'what changed' question without following the cross-reference to 'why this charter type qualifies,' a gap in how the RAG layer traverses inter-document citations.

  2. Finding on 'Q006 Probe' for Claude Opus 4.7 with web search ON · RLB-H-US-CFTC-DIGITAL-ASSET-COLLATERAL-TOKENIZED-ASSETS-STAFF-GUIDANCE-2025-Q006-Opus47

    This finding implicates the model's handling of phased obligation structures where partial sunset language is present. The model correctly identified some sunset conditions but over-generalised the sunset to a continuing obligation — a pattern that suggests the training-data representation of this instrument came primarily from third-party summaries that flatten the obligation lifecycle rather than from the regulator's specific enumeration. Web search was active and did not correct the error, indicating the retrieval stack did not surface the primary text's carve-out language.

  3. Finding on 'Q005 Probe' for Claude Sonnet 4.6 with web search ON · RLB-H-US-CFTC-DIGITAL-ASSET-COLLATERAL-TOKENIZED-ASSETS-STAFF-GUIDANCE-2025-Q005-Sonnet46

    The elision of OCC Interpretive Letter 1183 as the national trust bank eligibility hook points to a gap in how the retrieval layer handles cross-document dependencies for recent regulatory instruments. The model answered the structural question (what changed in the definition) while dropping the operationally critical cross-reference (what anchors the new category's eligibility) — a failure mode where the RAG layer retrieves summary-level content but does not follow the citation chain to the specific secondary instrument.

  4. Finding on 'Q006 Probe' for Claude Sonnet 4.6 with web search ON · RLB-H-US-CFTC-DIGITAL-ASSET-COLLATERAL-TOKENIZED-ASSETS-STAFF-GUIDANCE-2025-Q006-Sonnet46

    This is the most severe finding in this paper: the model fabricated a specific source document — 'March 2026 CFTC Staff FAQs' — to support an answer directly contradicted by the regulator's text, and presented the termination as a precisely-worded procedural rule. This implicates the calibration signal for named-source citations: the model committed to a document title and date without apparent retrieval basis rather than flagging uncertainty. It also implicates the training-data representation of the amendment cycle — the model's confident wrong answer suggests it is reconstructing from a plausible structural template, not retrieving the governing text.

  5. Finding on 'Q007 Probe' for Claude Sonnet 4.6 with web search ON · RLB-H-US-CFTC-DIGITAL-ASSET-COLLATERAL-TOKENIZED-ASSETS-STAFF-GUIDANCE-2025-Q007-Sonnet46

    The dropped multi-DCO worst-case selection rule implicates retrieval-layer answer construction for questions with a numeric threshold and a tie-breaking rule. The model retrieved and stated the base threshold correctly but did not surface the governing rule for the multi-party case, which is the only rule that matters when the question explicitly concerns multiple DCOs accepting the same asset. This is likely a training-data density issue for the FAQ-level elaboration of this rule, combined with a tendency to answer the simpler version of a numeric-threshold question when the more complex governing rule requires an additional retrieval step.
