Practitioners — Lawyers · updated 2026-05-26 · methodology v2.1

AI Hallucinations Affecting Lawyers in the United Kingdom

This case study examines how AI tools perform when UK lawyers ask questions about the Financial Conduct Authority's Consumer Duty, one of the most significant regulatory frameworks now affecting every financial services lawyer practising in the United Kingdom. Ten distinct research questions on a single regulation (Consumer Duty, PS22/9 and PRIN 2A) were aggregated, covering the Duty's legal basis, its cross-cutting rules, scope definitions, guidance status, and the FCA's supervisory posture since implementation. AI assistants produced materially incorrect, incomplete, or misleading answers on every one of those ten questions.

The errors identified range from subtle qualifier-drops that shift the legal meaning of a binding rule, to fabricated regulatory events complete with invented dates and publication references — each capable of causing a lawyer to give incorrect advice or to fail to identify a compliance gap in a client's business.

When this affects Practitioners — Lawyers

UK lawyers working in financial services, regulatory, or commercial practices regularly turn to AI tools to accelerate research, draft client memos, or prepare briefing notes on topics such as the Consumer Duty's scope, its cross-cutting rules, and how the FCA expects regulated firms to demonstrate compliance.

These are precisely the situations where the findings in this case study carry the greatest risk: an AI-generated answer, plausible in tone and confidently worded, may contain a dropped qualifier, a fabricated date, or a reversed exclusion that transforms a compliant position into a non-compliant one — or vice versa — without any signal in the output that anything is wrong.

In normal workflow, a lawyer might use an AI tool to check whether a particular client category falls inside the Consumer Duty's scope, to assess whether a specific methodology such as consumer testing of communications is mandated or merely recommended in guidance, or to quickly establish the FCA's current supervisory posture following implementation. The findings in this case study show that AI tools get each of these questions materially wrong in ways that would not be visible without independent verification against the FCA Handbook or published policy statements.

The errors are not in obscure edge cases: they appear in the core rules, the defined terms, and the FCA's most prominent published outputs.

The stakes are significant at both levels. For the lawyer personally, giving advice based on an unverified AI answer carries professional indemnity exposure and, in the most serious cases, the risk of regulatory action by the Solicitors Regulation Authority or Bar Standards Board. For clients — who are often themselves regulated firms — acting on incorrect Consumer Duty advice could mean a flawed compliance programme, a mischaracterised scope assessment, or enforcement action that correctly framed legal advice could have prevented.

The FCA has broad supervisory and enforcement powers under FSMA 2000, and errors in Consumer Duty advice translate directly into client risk at a regulated level.

Aggregate impact

Across all ten aggregated questions on the Consumer Duty, AI tools produced materially incorrect answers in every case. The pattern of errors is not random. AI tools consistently added conditions that the rules do not contain, dropped qualifiers that determine whether a rule is triggered at all, and — in several instances — fabricated specific regulatory events, including dates, publication references, and counts of withdrawn supervisory documents that cannot be traced to anything the FCA has published. These are not peripheral details.

Consumer Duty advice turns precisely on these distinctions: which legal threshold applies, whether a methodology is mandated by rule or recommended in guidance, and which supervisory expectations currently remain live.

All ten findings relate to a single regulatory framework — Consumer Duty (PS22/9 and PRIN 2A) — administered by the FCA. This concentration means that any UK lawyer who regularly advises on Consumer Duty compliance faces a systemic risk of receiving incorrect AI-generated answers across the full range of their work on this topic, from initial scoping engagements through to detailed compliance advice.

The errors identified include wrong legal tests (substituting objective customer understanding for the firm's reasonable belief), incorrect exclusion characterisations (claiming group insurance distribution is within scope when the rules explicitly exclude it), fabricated binding rule references (inventing a specific Handbook paragraph requiring consumer testing), and overstated regulatory expectations (converting the FCA's permissive position on non-monetary benefit quantification into an active expectation).

The practical consequence for a UK lawyer is that relying on AI tools for Consumer Duty research without independent verification makes it near-certain that at least one material error will be carried forward in any given engagement. The FCA's supervisory posture on Consumer Duty is also evolving rapidly: in March 2025 it published FS25/2, withdrawing more than 90 Dear CEO letters and over 100 multi-firm reports as part of a systematic review. The AI tools tested here were either unaware of this published document or fabricated its timeline and content.

A lawyer whose advice about live supervisory expectations rests on outdated or fabricated information faces professional exposure and delivers a poorer service to regulated firm clients.

Findings

10 findings in this case study.

  1. Legal basis of the Consumer Duty and the role of FSMA 2023
  2. Foreseeable harm and customer-accepted risk under the Consumer Duty
  3. Scope of "retail customer": micro-enterprises and small charities under PRIN 2A
  4. Consumer testing of communications: mandatory rule or non-binding guidance?
  5. Quantifying non-monetary benefits in Consumer Duty fair value assessments
  6. FCA withdrawal of pre-Consumer Duty Dear CEO letters: scale and timing
  7. FCA public commentary on first-year Consumer Duty compliance
  8. Differences between CP21/36 consultation and PS22/9 final rules
  9. Scope exclusions: reinsurance, large-risk contracts, and group insurance distribution
  10. Which FCA Dear CEO letters remain in force after Consumer Duty implementation

What your team should do

AI tools should be treated as a drafting aid and a starting point for research — not as a primary or verified source — when working on Consumer Duty questions or any other regulatory topic covered by the FCA Handbook. The findings in this case study demonstrate that AI tools produce confident, well-structured answers that can contain wrong legal tests, incorrect thresholds, fabricated regulatory events, and reversed exclusions. A response that reads fluently and cites real-looking URLs is not a verified one.

The default professional position for UK lawyers should be: assume the AI is wrong on any regulatory detail until you have checked it against the FCA's published text, whether that is the Handbook, a policy statement, or published finalised guidance.

Practical safeguards for your workflow include the following. First, independently verify every regulatory citation before it appears in a client deliverable — check the FCA Handbook and the relevant policy statement against what the AI has said, not the other way round. Second, if AI tools have contributed to your research on a matter, maintain an audit trail showing which claims were verified and against which source, both for your own professional records and to demonstrate compliance with your firm's AI-use policy.

Third, never include AI-generated regulatory references — rule numbers, guidance paragraph numbers, or descriptions of FCA publications — in a document for a client without having read the underlying text yourself. The findings here show that AI tools invent specific Handbook rule references that look authoritative but do not exist, and fabricate publication names, dates, and figures that have no basis in the FCA's record.
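For teams that keep matter records in software, the audit trail described above could be sketched roughly as follows. This is an illustrative outline only, not a prescribed tool: the names `ClaimRecord`, `AuditTrail`, `mark_verified`, and `ready_for_client` are hypothetical, and the example claim and reference strings are placeholders, not real Handbook citations.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class ClaimRecord:
    """One AI-supplied regulatory claim and whether it has been checked."""
    claim: str                               # the statement as the AI tool gave it
    cited_reference: str                     # rule or publication the AI tool cited
    verified_against: Optional[str] = None   # primary source a lawyer actually read
    verified_by: Optional[str] = None
    verified_on: Optional[date] = None

    @property
    def is_verified(self) -> bool:
        # A claim counts as verified only once a source and a date are recorded.
        return self.verified_against is not None and self.verified_on is not None


@dataclass
class AuditTrail:
    """Hypothetical per-matter log of AI-derived claims (an assumption, not a standard)."""
    matter: str
    records: list = field(default_factory=list)

    def add(self, claim: str, cited_reference: str) -> ClaimRecord:
        record = ClaimRecord(claim=claim, cited_reference=cited_reference)
        self.records.append(record)
        return record

    def mark_verified(self, record: ClaimRecord, source: str,
                      reviewer: str, on: date) -> None:
        record.verified_against = source
        record.verified_by = reviewer
        record.verified_on = on

    def unverified(self) -> list:
        return [r for r in self.records if not r.is_verified]

    def ready_for_client(self) -> bool:
        # Nothing AI-derived should reach a deliverable while unverified.
        return not self.unverified()


if __name__ == "__main__":
    trail = AuditTrail(matter="Example Consumer Duty scoping memo")
    item = trail.add(
        claim="<regulatory claim as worded by the AI tool>",
        cited_reference="<rule reference as cited by the AI tool>",
    )
    print(trail.ready_for_client())  # False until the claim is checked
    trail.mark_verified(item, source="FCA Handbook, PRIN 2A",
                        reviewer="<reviewing lawyer>", on=date.today())
    print(trail.ready_for_client())  # True once every claim has a source
```

The design choice worth noting is that the default state of every claim is unverified, which mirrors the default professional position set out above: the record has to be actively changed by someone who has read the primary source.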

There are areas where AI tools remain genuinely useful in a lawyer's workflow, including on regulatory topics. Generating a first-draft list of questions to put to a compliance team, summarising a long document that you will then read and verify, or producing initial drafts of non-regulatory narrative sections of a client memo are all tasks where AI's output quality is high and the verification cost is low. The risk profile is fundamentally different when the AI is helping you draft a question to be answered by a human expert, compared to when it is itself supplying the regulatory answer.

Use AI tools accordingly: value them for the first kind of work, and verify before relying on them for the second.

How RLB can help

RegLeg Benchmark (RLB) publishes ongoing Hallucination Research examining how AI tools perform when tested against specific regulatory texts. The Consumer Duty findings in this case study are part of a broader programme covering FCA regulations and other UK financial services instruments. UK lawyers can access the published research as a free reference resource before acting on an AI-generated answer in any covered regulatory area — checking whether the question you have just asked an AI tool falls within a category where AI assistants are known to produce incorrect, incomplete, or fabricated outputs.

Using the research as a pre-verification prompt costs very little; acting on an unverified AI answer in a regulated context can cost considerably more.

For firms that employ multiple lawyers working on the same regulatory portfolio — financial services practices with Consumer Duty mandates across a client base, compliance teams advising regulated firms at scale, or in-house legal teams within authorised firms — RLB offers bespoke regulation deep-dives tailored to the specific regulatory perimeter your lawyers work within. These are not generic AI-risk briefings. They identify the specific questions, qualifiers, and rule boundaries where AI tools have been demonstrated to fail on your regulatory ground, and can be mapped directly to the client or matter types your team handles.

RLB also produces training materials and CPD-aligned content showing the concrete failure modes UK lawyers should watch for when using AI tools on regulatory research tasks, along with confidential reviews of a firm's existing AI-use policy against RLB's failure-mode catalogue. If your firm has already adopted AI tools for regulatory work and wants to know whether its verification controls are calibrated to the actual risk profile — rather than a generic technology or data-protection risk framework — that review is a practical starting point.

The goal is not to discourage AI use but to ensure that where AI tools are used, they are used in ways that do not expose the firm or its clients to the consequences of an error that a brief verification step would have caught.
