Fabrication and scope conflation are the dominant failure shapes on the CFTC's December 2025 swap dealer business conduct and documentation rulemaking. Claude Opus 4.7 with web search invented a Federal Register document identifier, misattributed the scope of a CFTC staff letter to a generic venue framing, and overextended 'eliminated in its entirety' to mean obligation extinguishment rather than paragraph-level restructuring. On the same correction-notice question for which Opus 4.7 hallucinated a citation, Claude Sonnet 4.6 with web search suppressed the specific appendix identity. Failures concentrate on the amendment layer (the January 2026 correction, the no-action letter governing Eligible UK Trading Venues, and the PTMMM paragraph reorganisation), where the gap between secondary commentary availability and primary document retrievability is widest. With web search active, both models failed to retrieve the regulator's primary text for the correction notice and substituted law-firm commentary that was structurally accurate but substantively incomplete.
This is the consolidated view of findings.
This failure implicates two subsystems simultaneously: the citation-generation path produced a fabricated Federal Register document number rather than declining to specify, indicating the training corpus has secondary commentary on the correction without the primary document's structured metadata; and the retrieval layer did not surface the primary correction notice as a top result despite web search being active. The fabricated identifier (Doc. 2026-01712) would return no result at federalregister.gov — a model deployed in a compliance co-pilot context would deliver a dead citation with high confidence.
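One way to catch a dead citation of this kind before it reaches a user is to resolve every Federal Register document number in a draft answer against the public Federal Register API. The sketch below is illustrative only and is not part of the evaluated system: the function names are hypothetical, the regular expression assumes the modern `YYYY-NNNNN` number shape (older or differently formatted numbers are ignored), and the lookup relies on the API returning HTTP 404 for an unknown document number.

```python
import re
from typing import Callable
from urllib.error import HTTPError
from urllib.request import urlopen

# Assumption: modern Federal Register document numbers look like "2026-01712"
# (four-digit year, hyphen, five digits). Other shapes are not matched.
DOC_NUMBER = re.compile(r"\b(\d{4}-\d{5})\b")

# Public Federal Register API endpoint for a single document.
API_TEMPLATE = "https://www.federalregister.gov/api/v1/documents/{}.json"


def document_exists(doc_number: str) -> bool:
    """Return True if the Federal Register API resolves this document number."""
    try:
        with urlopen(API_TEMPLATE.format(doc_number), timeout=10) as resp:
            return resp.status == 200
    except HTTPError as err:
        if err.code == 404:
            # Unknown document number: the citation is dead.
            return False
        # Any other HTTP error is a lookup failure, not evidence either way.
        raise


def unverified_citations(
    answer: str, exists: Callable[[str], bool] = document_exists
) -> list[str]:
    """Return document numbers cited in `answer` that the lookup did not confirm."""
    return [num for num in DOC_NUMBER.findall(answer) if not exists(num)]
```

A caller could run `unverified_citations(draft_answer)` on each draft and strip or flag any number it returns, so the model declines to specify an identifier instead of asserting one that does not resolve. The `exists` parameter is injectable so the check can be exercised offline with a stub lookup.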
This failure points to a scope-restriction suppression pattern in how the model handles CFTC staff letters: the training corpus appears to weight the interpretive general statement (ITBC swap treatment) more heavily than the bounding clause (Eligible UK Trading Venues only), so the model reproduces the general statement as the letter's full coverage. The retrieval layer's top result, a law-firm summary, did not contain the precise venue restriction, and the model did not flag the gap between its training-derived framing and the retrieved source.
This failure is a post-training calibration gap: the model interpreted 'eliminated in its entirety' as obligation extinguishment when the rule's operation was a paragraph-level reorganisation, in which the disclosure and compensation requirements moved subsections rather than disappearing. The model generated a product-agnostic exemption claim unsupported by the rule text, suggesting the calibration signal for 'eliminate/delete' in a rulemaking context does not adequately distinguish structural reorganisation from substantive removal. No cited sources were produced, indicating this was a training-reconstruction failure rather than a retrieval failure.
This failure is a qualifier-suppression pattern: the model characterised the correction's structure correctly (a drafting error accidentally removed an appendix) but omitted the specific content the question required — the appendix name, subpart, and the guidance function it serves for recommendations to Special Entities. The suppression is not a retrieval failure but a response-generation choice to stay at the structural level rather than commit to specific content, likely because the primary correction notice was not retrieved and the law-firm commentary cited did not name the appendix specifically.