Executive Summary
AI assistants we tested produced hallucinations on three distinct questions about the CFTC's December 2025 final rule revising business conduct and swap documentation requirements for swap dealers and major swap participants — a rule that touches daily operational obligations for Compliance teams at U.S. investment banking firms. In every case the AI either invented or mischaracterised specific, verifiable facts: which appendix was inadvertently removed and restored by a January 2026 correction, which trading venues CFTC Staff Letter 25-49 actually covers, and what "eliminated in its entirety" means for the pre-trade mid-market mark requirement under §23.431(a)(3).
None of these failures is cosmetic: each concerns a discrete legal or operational fact that a Compliance team would cite directly in policy documentation, training materials, or regulatory correspondence. When the AI tools were challenged on their incorrect answers, the tool admitted in two of the three cases that its original response was wrong, which makes the initial confident delivery the operative risk.
How AI gets this regulation wrong
The failures here split into two patterns: AI tools that delivered a confident but factually wrong answer and then retracted it under challenge, and AI tools that overstated the scope of what the rule actually changed. Both patterns are dangerous for a Compliance team — the first because the retraction only surfaces if someone pushes back, the second because overbroad characterisations of regulatory requirements get embedded in policies and training without challenge.
| AI's Failure Mode | Count | Affected findings |
|---|---|---|
| Exposed Fabrication | 2 | Finding#2 · Finding#3 |
| Inference Drift | 1 | Finding#1 |
What that means for your team
All three failures map to regulatory enforcement risk — and specifically to the kind of enforcement exposure that originates internally: a Compliance sign-off on a policy or a training deck that is materially wrong about what the CFTC's rule actually says. For U.S. swap dealer Compliance teams, the December 2025 rule revision and its associated corrections and staff letters are exactly the documents a business line will cite when they need a quick answer about disclosure obligations or venue treatment, making accurate AI output a precondition of safe reliance.
| Risk Impact | Count | Affected findings |
|---|---|---|
| Regulatory enforcement | 3 | Finding#1 · Finding#2 · Finding#3 |
When this affects your department
Compliance teams at U.S. investment banking swap dealers are most likely to query AI tools on this rule during three operational moments: updating the firm's written supervisory procedures and compliance manuals to reflect the December 2025 amendments, supporting front-office questions about disclosure obligations when the trading desk wants a quick read on what changed, and preparing training materials for relationship managers and structured-products personnel who deal with special entity counterparties under §§23.434 and 23.440.
The January 2026 correction, a correction notice that restored Appendix A to Subpart H, is precisely the kind of late-breaking, low-profile document that AI tools handle poorly but that Compliance teams cite when defending their control framework to internal audit or CFTC examination staff.
Staff letters and no-action relief carry similar risk. A Compliance officer checking the scope of CFTC Staff Letter 25-49 to determine whether the firm's UK execution venue activity falls inside or outside the relief is relying on the AI to characterise a specific, bounded piece of guidance correctly. If the AI maps that relief onto U.S. SEFs and DCMs (the familiar domestic template) rather than the Eligible UK Trading Venues actually covered, the firm is either applying relief it does not qualify for or failing to apply relief it does qualify for.
Both errors are live compliance failures under the broader business conduct framework.
The pre-trade mid-market mark (PTMMM) finding compounds this risk at the policy-documentation layer. "Eliminated in its entirety" sounds categorical: a junior Compliance analyst drafting updated disclosure procedures would reasonably interpret that language to mean nothing survives. In practice, the provision's deletion from §23.431(a)(3) only removed the structural container; price and compensation disclosure obligations migrated into adjacent paragraphs, and the scope was always limited to uncleared swaps, FX forwards, and FX swaps.
An investment banking swap desk that runs cleared CDS alongside uncleared products cannot afford ambiguity here: a policy that overclaims the elimination will create systemic gaps in disclosure controls for the uncleared book.
The findings at a glance
The table below summarises each finding — the specific question asked, what the AI got wrong, and what the correct regulatory position actually is.
Aggregate impact
The three findings cluster around the edges of the December 2025 rule revision — the correction notice, associated staff guidance, and the precise mechanics of how a key disclosure requirement was restructured. This is a consistent pattern: AI tools handle the headline rule change reasonably well but fail on the specific technical details that sit one document layer below the primary rulemaking. For a Compliance team, those details are exactly what the job requires getting right.
Two of the three failures share a structural characteristic that makes them harder to catch: the AI initially delivered a wrong answer with apparent confidence, then retracted it only when pressed. In a typical Compliance workflow — a senior officer asking a junior analyst to do a quick regulatory scan, with the result flowing into a policy memo or a training deck — no one pushes back. The first answer is the answer. That dynamic means the retraction-on-challenge mechanism, which might seem like a safety feature, is operationally inert unless the team already knows what to challenge.
The aggregate enforcement exposure for a U.S. investment banking swap dealer is not theoretical. The CFTC's examination program for swap dealers covers business conduct obligations directly, and Compliance manuals, training logs, and disclosure procedures are standard examination deliverables. An AI-generated error in any of those documents — wrong appendix cited as guidance authority, wrong venue type for a staff letter's relief, wrong product scope for a restructured disclosure requirement — is an examiner's finding waiting to happen.
At this regulation's maturity stage, with the correction notice less than six months old, there is very limited tolerance for documentation errors that a reasonable control framework should have caught.
What your team should do
The default position for this regulation is that AI tools should not be the terminal source on any specific factual claim — appendix identities, staff letter scope, or the precise mechanics of how a provision was restructured. Use AI to get oriented quickly on the landscape of a rule revision, then verify every specific against the Federal Register text, the correction notice, and the relevant staff letters directly. For the December 2025 final rule and the January 2026 correction, those primary documents are short enough that primary-source verification is not a meaningful burden.
For staff letters specifically, with CFTC Staff Letter 25-49 as a live example, the key safeguard is treating the "venue type" question as a mandatory verification item. AI tools have a strong pull toward the domestic SEF/DCM template when the topic is intended-to-be-cleared (ITBC) swaps, because that is the dominant context in the codified rule. Any staff letter that departs from that template by covering a foreign venue type is therefore likely to be systematically mis-summarised.
A Compliance team supporting the equities or derivatives desk on cross-border execution venue questions should build a standing practice of reading the staff letter's scope paragraph directly rather than relying on AI characterisation.
AI tools are reasonably safe for drafting early-stage summary language about the overall structure of the rule revision — what categories of change the December 2025 final rule addressed, which subparts were touched, the general direction of travel on disclosure burdens. They are also useful for generating a checklist of provisions that warrant primary-source review. Where they are not safe is in the translation from "what changed" to "what the specific operational requirement now is" — that last step is where the failures we documented occur, and it is exactly the step that Compliance documentation requires getting right.
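For teams that track AI-assisted drafting in a tool or script, the verification practice described in this section can be sketched as a simple release gate. The sketch below is illustrative only and is not part of the research method: the names `Claim`, `ClaimType`, `MUST_VERIFY`, `unverified_claims`, and `ready_for_release` are hypothetical, as are the example reviewer and source labels. It assumes each AI-sourced statement is logged with a claim type, and that the three claim types named above (appendix identities, staff letter scope, provision mechanics) must carry a named primary source and a named reviewer before the draft moves on.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ClaimType(Enum):
    """Hypothetical categories for statements taken from AI output."""
    GENERAL_ORIENTATION = "general orientation"
    APPENDIX_IDENTITY = "appendix identity"
    STAFF_LETTER_SCOPE = "staff letter scope"
    PROVISION_MECHANICS = "provision mechanics"


# Assumption: these are the claim types that must never rest on AI output alone.
MUST_VERIFY = {
    ClaimType.APPENDIX_IDENTITY,
    ClaimType.STAFF_LETTER_SCOPE,
    ClaimType.PROVISION_MECHANICS,
}


@dataclass
class Claim:
    text: str
    claim_type: ClaimType
    primary_source: Optional[str] = None  # document the reviewer actually read
    verified_by: Optional[str] = None     # person who performed the check


def unverified_claims(claims: List[Claim]) -> List[Claim]:
    """Return must-verify claims that lack a primary source or a reviewer."""
    return [
        c for c in claims
        if c.claim_type in MUST_VERIFY
        and not (c.primary_source and c.verified_by)
    ]


def ready_for_release(claims: List[Claim]) -> bool:
    """A draft is releasable only when no must-verify claim is outstanding."""
    return not unverified_claims(claims)


# Illustrative draft: one orientation claim, one verified, one outstanding.
draft = [
    Claim("The December 2025 final rule revised business conduct requirements.",
          ClaimType.GENERAL_ORIENTATION),
    Claim("Staff Letter 25-49 covers Eligible UK Trading Venues.",
          ClaimType.STAFF_LETTER_SCOPE,
          primary_source="CFTC Staff Letter 25-49, scope paragraph",
          verified_by="Analyst A"),
    Claim("The January 2026 correction restored Appendix A to Subpart H.",
          ClaimType.APPENDIX_IDENTITY),
]

for claim in unverified_claims(draft):
    print("Needs primary-source check:", claim.text)
```

The design choice worth noting is that the gate records who read which primary document, so the first AI answer cannot pass into a policy memo or training deck simply because nobody challenged it.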
How RLB Can Help
RegLeg's published Hallucination Research gives your team a concrete pre-flight check before relying on AI-generated output on regulatory questions. If your analysts or legal colleagues are using AI tools to interpret SEC or FINRA requirements, assess capital treatment under Basel III, or draft policy justifications, the research identifies exactly where those tools have produced confidently wrong answers on the same regulatory texts — wrong entities, inverted obligations, fabricated thresholds.
That published record is free to access and specific enough to be operationally useful: you can cross-reference it against the regulations your team actually works with before the output reaches a submission, a trade approval memo, or a board paper.
Beyond the public findings, we run bespoke regulator deep-dives scoped to the Compliance workflows that carry the highest hallucination exposure in investment banking specifically. That means mapping AI failure patterns against the places where your team's reliance on AI output creates the sharpest consequence: regulatory capital calculations, trade reporting obligations under CFTC and SEC rules, conflicts governance, and cross-border rule applicability questions where the gap between what an AI tool asserts and what the regulation actually requires can be both large and invisible.
The output is a prioritised exposure map your team can use to set guardrails, not a generic risk register.
Where you have an existing AI-use policy, we can run a confidential review of it against RegLeg's failure-mode catalogue — the categories of errors the research has documented across regulatory domains — and return a prioritised remediation brief: which policy provisions are underspecified relative to known failure patterns, where human-review checkpoints are missing, and where the policy's assumptions about AI reliability are contradicted by documented evidence.
We can also develop training material and CPD-aligned content your Compliance team can use internally — grounded in real findings from the research, framed for practitioners who already know the regulatory landscape and need to calibrate when and how much to trust AI-assisted work product.