Executive Summary
AI assistants tested against the CFTC's 2024 amendments to Regulation 1.25 produced materially wrong answers for Risk teams at US investment banking firms operating as or through FCMs — specifically on the tiered concentration limits that determine how customer segregated funds may be invested in government money market funds and Treasury ETFs. Across the question set, AI tools confidently asserted uniform, flat percentage limits and denied the existence of any asset-size-based structure, directly contradicting the two-tier framework the rule establishes based on fund size and management company AUM.
When pressed, the AI retracted — admitting it had synthesised secondary commentary rather than the regulatory text itself. For a Risk function responsible for FCM segregation compliance, an AI that provides wrong answers with false confidence and corrects only under challenge is precisely the failure mode that travels undetected into policy documents, concentration-limit spreadsheets, and internal audit responses.
How AI gets this regulation wrong
The dominant failure pattern on this regulation is confident fabrication of a simpler rule — AI tools flattened the tiered concentration structure into a uniform limit and stated categorically that no size-based differentiation exists. The failure is compounded by the AI's willingness to self-correct only when directly challenged, meaning a team that accepts the first answer without stress-testing it will never encounter the retraction.
| AI's Failure Mode | Count | Affected findings |
|---|---|---|
| Exposed Fabrication | 1 | Finding#1 |
What that means for your team
The downstream risk for Risk teams at investment banks lands squarely in regulatory enforcement exposure: miscalibrated concentration limits fed into investment policy frameworks or segregation compliance controls create the precise conditions for a CFTC examination finding or a Part 30 / Regulation 1.25 violation notice. The table below maps the finding to its risk impact for the firm.
| Risk Impact | Count | Affected findings |
|---|---|---|
| Regulatory enforcement | 1 | Finding#1 |
When this affects your department
Risk teams at investment banks with FCM registration — or with prime brokerage or clearing businesses that route customer funds through affiliated FCMs — regularly use AI tools to pressure-test their segregated fund investment policies against current CFTC requirements. The practical trigger points are: annual policy refresh cycles following a rule amendment, internal audit preparation for segregation compliance, onboarding new money market fund counterparties into the approved investment universe, and responding to business line queries about whether a proposed investment allocation is permissible.
In all of these scenarios, a junior risk analyst or compliance associate querying an AI tool about what concentration limits apply to government MMFs will receive a confident, numerically specific answer — and have no immediate reason to distrust it.
The specific failure documented here, AI asserting that limits apply uniformly regardless of fund size or management company AUM, maps directly onto a practical error in investment policy design. If the policy document or the approved-investment-list spreadsheet encodes a flat 10% per-fund cap rather than the 50% ceiling available for large-fund/large-manager combinations meeting the ≥$1B / ≥$25B thresholds, the firm unnecessarily constrains its segregated fund investment flexibility. If a separate team instead works from a different incorrect reading and applies a higher limit to funds that do not qualify for it, the firm runs above the limit that actually applies. Either direction creates compliance exposure.
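The tier logic described above can be sketched in a few lines of Python. This is a minimal illustration, not a statement of the rule: the thresholds and percentages are the figures quoted in this report, the 10% lower tier is an assumption drawn from those figures, and the function and constant names are hypothetical. Every value should be confirmed against the CFTC's final rule text before it is used anywhere.

```python
from decimal import Decimal

# Figures as quoted in this report; verify each against the CFTC final rule text.
FUND_ASSETS_THRESHOLD = Decimal("1000000000")    # $1B fund size
MANAGER_AUM_THRESHOLD = Decimal("25000000000")   # $25B management company AUM
UPPER_TIER_LIMIT = Decimal("0.50")               # qualifying fund and manager
LOWER_TIER_LIMIT = Decimal("0.10")               # assumed limit for all other funds


def applicable_limit(fund_assets, manager_aum):
    """Return the concentration limit under the two-tier reading.

    The upper tier applies only when BOTH thresholds are met.
    """
    qualifies = (fund_assets >= FUND_ASSETS_THRESHOLD
                 and manager_aum >= MANAGER_AUM_THRESHOLD)
    return UPPER_TIER_LIMIT if qualifies else LOWER_TIER_LIMIT


def classify_policy_limit(policy_limit, fund_assets, manager_aum):
    """Compare the limit a policy encodes with the tiered limit for a fund."""
    rule_limit = applicable_limit(fund_assets, manager_aum)
    if policy_limit > rule_limit:
        return "over-permissive"   # policy allows more than the tier permits
    if policy_limit < rule_limit:
        return "over-restrictive"  # policy constrains more than the tier requires
    return "aligned"
```

Running `classify_policy_limit` over each row of an approved fund list would separate the two error directions discussed above: a flat 10% cap applied to a qualifying fund comes back as "over-restrictive", while a 50% cap applied to a fund below either threshold comes back as "over-permissive".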
For an investment bank whose segregation programme spans multiple FCM affiliates or clearing business lines, the amplification risk is material: a single wrong AI output used as a reference in a template policy circulates across entities before anyone checks the source text. When the CFTC examines segregation records, the concentration limit calculation is one of the first things reviewers pull — and a policy that cannot map cleanly back to the tiered structure in the 2024 amendments will generate questions that are expensive to answer and harder to dismiss as inadvertent.
The findings at a glance
The table below summarises the single aggregated finding from AI testing on the 2024 Regulation 1.25 amendments for Risk teams at US investment banking firms, listing its title, failure type, and citation ID.
| # | Finding title | Type | Citation ID |
|---|---|---|---|
| 1 | Tiered concentration limits for government MMFs and Treasury ETFs | Hallucination | RLB-F-US-CFTC-FCM-DCO-CUSTOMER-FUNDS-INVESTMENTS-REG-1-25-2024-Q001 |
Aggregate impact
The finding on this regulation clusters on a single structural feature of the 2024 amendments that AI tools consistently flattened: the tiered concentration framework that differentiates permissible investment limits based on the size of the fund and the assets under management of the fund's management company. Both AI tools tested independently arrived at the same wrong answer — asserting uniform flat caps and explicitly denying that any size-based structure exists — before retracting under challenge.
That convergence matters: it suggests the error is not idiosyncratic to one model's training, but reflects a systematic gap in how AI tools have absorbed the 2024 amendments' structural detail relative to the simpler issuer-based caps that preceded them.
For Risk functions at investment banks, the systemic risk is that the AI's confident, numerically precise wrong answer is exactly the kind of output that gets copy-pasted into a draft policy or a compliance memo without further verification. The retraction behaviour — where the AI corrects itself only when directly challenged with follow-up questions — means the error is self-concealing in standard use: a team that asks once and acts on the answer never discovers the AI's uncertainty.
The firm's investment policy, its approved fund lists, and its segregation monitoring dashboards can all encode the wrong limit structure while the team believes it is compliant.
The enforcement dimension is unambiguous. The CFTC's segregation rules for FCMs are among the most actively examined areas in derivatives broker examinations, and concentration limit violations are quantifiable and documentable. A policy that caps large-fund investment allocations at 10% when the rule permits 50% for qualifying funds is an operational inefficiency; a policy that inadvertently applies the 50% ceiling to funds that do not meet the ≥$1B / ≥$25B thresholds is a Regulation 1.25 violation. Either outcome traces back to the same source: an AI answer that presented a structurally incomplete reading of the rule as definitive.
What your team should do
The default position for this regulation is straightforward: do not use AI output as the reference source for concentration limit calculations in segregated fund investment policies. The 2024 amendments introduced a specific tiered structure that AI tools have demonstrably misread — and the failure mode is self-concealing because the AI does not flag uncertainty in its first response. Any work product that encodes investment limits (policy documents, approved fund lists, monitoring spreadsheets, internal audit response packs) must trace its numbers directly to the rule text, not to an AI summary of it.
Where AI tools remain useful in this workflow is at the earlier, less precise stages: drafting the overall policy narrative, building a comparison of the pre-2024 and post-2024 frameworks for internal training materials, or identifying which business lines and affiliates fall within FCM registration scope for a given clearing arrangement. These are tasks where a structural error in AI output is more easily caught before it enters a compliance-grade document.
Similarly, AI can assist in drafting questions for outside counsel review or preparing the framing for a regulatory mapping exercise — provided the team treats that output as a starting draft, not a technical answer.
The practical safeguard for concentration limit questions specifically is to require that any AI-assisted policy output on Regulation 1.25 be checked against the CFTC's published final rule text. Secondary commentary, trade association summaries, and law firm client alerts do not satisfy that check, because they are the sources the AI demonstrably drew on when producing wrong answers. Build the requirement into the team's AI-use policy explicitly: for Regulation 1.25 investment limit questions, primary source verification is mandatory before any number enters a compliance-grade document. That single control removes the main route by which this error reaches policy documents, spreadsheets, and audit responses.
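One way to make that requirement checkable is to record, next to every limit, where the number came from, and to block any entry that lacks a primary-source reference. The sketch below assumes a hypothetical record layout and hypothetical source labels; it illustrates the control and does not describe any existing system or prescribed format.

```python
from dataclasses import dataclass

# Hypothetical label for the CFTC's published final rule text.
PRIMARY_SOURCE = "cftc_final_rule"


@dataclass(frozen=True)
class LimitEntry:
    description: str   # what the limit governs
    value: str         # the number as it appears in the policy
    source_type: str   # e.g. "cftc_final_rule", "law_firm_alert", "ai_summary"
    citation: str      # pinpoint reference into the rule text, empty if none


def unverified_entries(entries):
    """Return entries that should not yet enter a compliance-grade document.

    An entry is flagged when its source is not the primary rule text,
    or when it claims the primary source but carries no pinpoint citation.
    """
    return [
        entry for entry in entries
        if entry.source_type != PRIMARY_SOURCE or not entry.citation.strip()
    ]
```

A policy draft would pass the gate only when `unverified_entries` returns an empty list. The check confirms that a citation is recorded, not that the citation is correct, so a reviewer still has to read the cited rule text.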
How RLB Can Help
RegLeg's published Hallucination Research gives your team a concrete pre-flight check before placing weight on AI-generated output in regulatory analysis. For a Risk function at a US investment bank, that means stress-testing the AI tools your analysts, quant risk, and compliance-adjacent teams are already using against a documented catalogue of failure modes — not hypothetical edge cases, but patterns observed across real regulatory texts including capital, margin, derivatives, and conduct frameworks that your desk is operating under.
Before a model-generated interpretation of a Fed or SEC rule lands in a stress test assumption, a credit risk framework, or a counterparty exposure memo, you can verify whether that regulatory scope is one where AI assistants have already been shown to hallucinate in material ways.
Beyond the published research, RLB can run a bespoke regulator deep-dive scoped to your specific AI-supported workflows — mapping which regulatory questions your Risk team is actually asking AI tools to answer, and where in that workflow the hallucination exposure is highest. For an investment bank, that typically surfaces around capital adequacy interpretation, cross-border margin rules, large-exposure thresholds, and model-risk overlays where the regulatory text is dense, frequently amended, and carries significant asymmetry between a correct and an incorrect read.
The output is a prioritised exposure map, not a generic AI risk framework — calibrated to your firm's jurisdictional footprint and the actual regulatory questions your function depends on getting right.
RLB also works directly with Risk teams on two further workstreams: a confidential review of your firm's existing AI-use policy against the failure-mode catalogue, identifying where current controls are under-specified for the hallucination patterns we've documented, with a prioritised remediation roadmap; and the development of training and CPD-aligned material your team can use internally — content written at the right technical register for senior risk professionals, grounding AI governance obligations in specific, documented failure patterns rather than abstract model-safety concepts. Both workstreams are built collaboratively with your team, with findings staying inside the firm.