Retail Banking × Legal — United Kingdom · updated 2026-05-26 · methodology v2.1

AI Hallucinations Affecting Legal at Retail Banking Firms in the United Kingdom

This case study examines how AI tools respond to questions about Consumer Duty regulation, specifically the FCA's PS22/9 and the Principle 12 and PRIN 2A framework, when consulted by Legal teams at Retail Banking firms in the United Kingdom. Across three aggregated questions drawn from this regulatory area, AI assistants produced incorrect or materially incomplete answers in every instance tested.

The errors fall into three types: omitting the disclaimer that FSMA 2023 did not create the Duty, mischaracterising the qualifying threshold for charities under the retail customer definition, and presenting unverifiable specific claims about differences between the consultation draft and the final rules as if they were established fact. Legal teams in UK retail banking are among the most frequent internal users of AI for regulatory research, making these failure patterns directly relevant to day-to-day practice.

When this affects Retail Banking × Legal — United Kingdom

Legal teams at Retail Banking firms routinely turn to AI tools when scoping the Consumer Duty's reach across the business — whether drafting internal policy frameworks, producing training materials for frontline staff, advising product teams on whether a new offering brings the firm within scope, or mapping regulatory obligations as part of onboarding new distribution partners.

Questions about the legal basis of Principle 12, the definition of "retail customer" under PRIN 2A, and how the final rules differ from the consultation draft are precisely the kind of foundational questions that Legal teams answer early in any Consumer Duty project — and whose answers then travel downstream into operational guidance, board papers, and supplier contracts.

The corporate use-cases that sit on top of these topics are substantial. A Legal team's view on the scope of the "retail customer" definition shapes whether compliance monitoring covers small charities and micro-enterprises. An incorrect account of the legal basis for the Duty could affect how the firm characterises its obligations in external reporting or in responses to FCA information requests. An inaccurate summary of what changed between the consultation and the final rules could cause the firm to rely on draft-stage provisions that were subsequently modified, building internal processes around rules that no longer apply.

When the AI's answer is wrong and the Legal team carries that answer forward without verification, the consequences fall on the firm rather than the individual practitioner. The FCA can impose financial penalties, require remediation programmes, issue public censures, or — under Consumer Duty — scrutinise the firm's governance and oversight arrangements. Operational harm compounds the regulatory risk: processes built on incorrect rule-readings require costly re-engineering, and clients who receive advice or products structured around a misreading of the Duty may have grounds for redress. Reputational damage to the firm and, in high-profile cases, to the wider sector follows.

Aggregate impact

All three findings in this case study relate to a single regulation — the FCA's Consumer Duty framework under PS22/9 and PRIN 2A — and collectively they reveal a consistent pattern: AI tools produce answers that are partially correct but materially incomplete or quietly wrong in ways that are difficult to detect without specialist knowledge. In one case the AI omitted a targeted disclaimer that FSMA 2023 did not create the Duty — an omission that would mislead anyone trying to understand the legislative architecture.

In another, the AI substituted the wrong accounting measure (income for turnover) in the charity threshold, a subtle distinction that matters when determining whether a specific counterparty is a retail customer. In the third, AI tools presented detailed, confident accounts of differences between the consultation draft and the final rules in an area where those specifics cannot reliably be verified — a pattern of confabulation dressed as expertise.

These errors cluster entirely within the Consumer Duty perimeter, which is significant because Consumer Duty is the dominant regulatory reform agenda for UK retail banking Legal teams. The Duty affects product design, distribution, customer communications, oversight and governance arrangements, and the treatment of vulnerable customers — meaning Legal teams are routinely producing work-product that depends on accurate Consumer Duty analysis. A Legal function that uses AI to accelerate Consumer Duty research is therefore exposed to these failure patterns at precisely the highest-stakes moments in its workflow.

The systemic risk compounds quickly. A single incorrect AI answer about the scope of "retail customer" can propagate into the firm's Consumer Duty implementation plan, its board attestations, its supplier questionnaires, and its monitoring framework — all before any human expert has reviewed the underlying premise. When several downstream work-products rest on the same unverified AI response, the cost of correcting the error multiplies: not just re-drafting the original output, but tracing and correcting every document, process, and decision that relied on it.

For a Legal team operating under time pressure, the apparent efficiency gain from AI assistance is rapidly erased when a foundational error surfaces late in the project cycle.

Findings

3 findings in this case study.

  1. Legal basis of the Consumer Duty: FSMA 2023 disclaimer
  2. Retail customer definition: charity threshold and sourcebook variation
  3. Changes from CP21/36 consultation to PS22/9 final rules

What your team should do

The default position for Legal teams at Retail Banking firms should be that AI tools are a starting point for orientation, not a primary source, when the question concerns specific regulatory rules, definitions, or legislative history. The findings in this case study illustrate why: AI tools produce answers that look authoritative and are partially correct, but contain errors — a substituted accounting term, a missing statutory qualifier, an unverifiable claim about rule changes — that are exactly the kind of detail that matters in a regulatory context and that a non-specialist reader would not know to question.

Legal teams should treat any AI response on Consumer Duty scope, definitions, or rulemaking history as a draft hypothesis to be verified against the FCA Handbook, the relevant policy statements, and, where necessary, primary legislation.

At the firm level, practical safeguards reduce exposure without eliminating the productivity benefits of AI assistance. A regulatory-verification policy that explicitly names Consumer Duty, PRIN 2A, and related FCA rules as areas where AI output must be checked against primary sources before it enters any work-product provides a clear organisational standard. Audit trails for AI-assisted research — noting what was asked, what the AI said, and how it was verified — create a record that supports good governance and demonstrates due diligence if a regulatory question later arises.

Sign-off requirements before AI-drafted regulatory analysis is circulated firm-wide, or included in board papers or supplier-facing documents, ensure that a qualified reviewer has seen the output before it propagates. Where AI output is used in regulatory-facing material, distinguishing clearly between "AI-drafted, expert-verified" and "AI-generated, unverified" content protects both the firm and the individuals involved.
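One way to make the audit-trail and sign-off safeguards concrete is a small record type that captures what was asked, what the AI said, and how it was verified, and that derives the "AI-drafted, expert-verified" or "AI-generated, unverified" label from that evidence. The following is a minimal sketch only: the class name ResearchRecord, its field names, and the may_circulate gate are hypothetical illustrations, not part of any FCA requirement or RegLeg tooling, and a real firm would adapt them to its own document-management system.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Labels taken from the two statuses named in the text above.
UNVERIFIED = "AI-generated, unverified"
VERIFIED = "AI-drafted, expert-verified"


@dataclass
class ResearchRecord:
    """Hypothetical audit-trail entry for one AI-assisted research query."""
    question: str                      # what was asked
    ai_answer: str                     # what the AI said
    asked_on: date
    sources_checked: List[str] = field(default_factory=list)  # how it was verified
    reviewer: Optional[str] = None     # qualified reviewer who signed off

    def verify(self, reviewer: str, sources: List[str]) -> None:
        """Record sign-off; at least one primary source must be cited."""
        if not sources:
            raise ValueError("verification needs at least one primary source")
        self.reviewer = reviewer
        self.sources_checked = list(sources)

    @property
    def label(self) -> str:
        """Derive the content label from the recorded evidence."""
        if self.reviewer and self.sources_checked:
            return VERIFIED
        return UNVERIFIED

    def may_circulate(self) -> bool:
        """Gate for board papers, firm-wide notes, and supplier-facing documents."""
        return self.label == VERIFIED


# Example usage with placeholder values (assumed, for illustration only).
record = ResearchRecord(
    question="Which charities fall within the PRIN 2A retail customer definition?",
    ai_answer="(AI response text)",
    asked_on=date(2026, 5, 26),
)
blocked = record.may_circulate()   # False until a reviewer signs off
record.verify(reviewer="Reviewer name", sources=["FCA Handbook, PRIN 2A", "PS22/9"])
```

Deriving the label from the reviewer and source fields, rather than storing it separately, means a document cannot be marked as expert-verified without the evidence that supports that claim.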

There are areas where AI tools add genuine value in Legal workflows with lower risk. Drafting non-regulatory copy — introductory text, headings, communication scaffolding — is a safe use. Summarising long documents that the team can verify against the original is another: the AI saves reading time, and the team checks the summary against the source. Generating first-draft questions for further research, or producing a list of topics to cover in a policy document, draws on AI's breadth without relying on its precision.

The key discipline is keeping AI in the role of drafting assistant and research prompt, rather than regulatory authority.

How RegLeg can help

RegLeg's published hallucination research gives Legal teams at Retail Banking firms a free, ready-to-use reference before relying on any AI answer in Consumer Duty and related FCA rule areas. The research identifies, at the question level, where AI tools have produced incorrect or materially incomplete answers on specific regulatory topics — so that a team member can check whether the question they just asked an AI tool falls within a known failure zone before acting on the response. This is a practical pre-step that costs nothing and can be integrated into existing research workflows.

For firms that want a deeper picture of their own exposure, RegLeg offers bespoke regulatory deep-dives that map which AI-supported workflows in a Retail Banking Legal function carry the highest hallucination risk. Consumer Duty is an obvious focus given the findings in this case study, but the same analysis applies across the FCA's broader retail banking regulatory perimeter. The output is a prioritised risk map — not a generic warning, but a specific account of which questions, in which workflows, are most likely to produce an unreliable AI response and what the consequences are if that response is used unchecked.

RegLeg can also conduct a confidential review of a firm's existing AI-use policy against RegLeg's failure-mode catalogue, identifying gaps between the policy as written and the actual behaviour of AI tools in the regulatory domains the firm's Legal team covers, with prioritised remediation steps. For teams looking to build internal capability, RegLeg provides training materials and CPD-aligned content that Legal professionals can use to develop a practical understanding of how and why AI tools fail on regulatory content — moving beyond generic "check your sources" guidance to a specific, evidence-based account that Legal teams can act on immediately.
