Executive Summary
The BBNJ High Seas Biodiversity Agreement — the first binding international treaty to govern marine biodiversity in international waters — introduces a new environmental impact assessment regime, a benefit-sharing framework for marine genetic resources and their digital sequence information, and a Conference of the Parties empowered to designate area-based management tools including marine protected areas.
For legal teams at law firms advising clients on ocean-economy activities, international biodiversity compliance, or marine genetic resource commercialisation, accuracy on this treaty's operative provisions is not a secondary concern: it defines both the scope of mandatory due-diligence obligations and the limits of client liability exposure across international jurisdictions.
Across four questions testing the Agreement's most consequential provisions, AI assistants we tested produced wrong answers on every one. The errors were not peripheral: they misidentified the threshold that triggers a mandatory environmental impact assessment, completely inverted the Agreement's retroactivity rule for marine genetic resources collected before entry into force, misattributed the article governing digital sequence information benefit-sharing obligations, and cited the wrong provision to support the non-undermining constraint on Conference of the Parties decisions.
In each case the AI assistants presented their answers with confidence, and three of the four errors were exposed only when the AI was directly challenged, at which point it conceded it had not been certain of the position it had just stated. For a legal team at a law firm relying on AI-assisted research to support transaction due diligence, regulatory opinions, or client briefings, that pattern of confident error followed by quiet retraction represents a structural reliability problem, not a one-off hallucination.
How AI gets this regulation wrong
AI tools we tested most frequently gave confident but incorrect answers to questions about the BBNJ Agreement, then retracted or qualified those answers only when pressed — a pattern that is particularly hazardous in legal research where the initial response is often the one that gets recorded and relied upon.
Across the findings below, the failures range from citing a real provision under the wrong article number, to completely reversing the operative default of a treaty clause, to narrowing a legal threshold in a way that materially changes which activities are caught by a mandatory obligation.
| AI's Failure Mode | Count | Affected findings |
|---|---|---|
| Exposed Fabrication | 3 | Finding#1 · Finding#2 · Finding#3 |
| Misstated Rule | 1 | Finding#4 |
What that means for your team
Every finding in this cell carries liability or professional indemnity exposure for law firms whose legal teams use AI-assisted research on BBNJ matters, because the errors sit directly on provisions that determine client obligations, transaction risk, and the scope of mandatory regulatory compliance. The risk concentrates in two areas: transactions and advisory work involving marine genetic resources or digital sequence information (where incorrect retroactivity or benefit-sharing analysis can expose clients to unforeseen obligations), and environmental due diligence for high-seas activities (where a misidentified EIA threshold can lead a client to proceed without required authorisation).
| Risk Impact | Count | Affected findings |
|---|---|---|
| Liability / PI exposure | 4 | Finding#1 · Finding#2 · Finding#3 · Finding#4 |
When this affects your department
A legal team at a law firm handling international mandates is most likely to consult AI tools on the BBNJ Agreement when advising on the legal framework for a client's planned high-seas activities: ocean research cruises, deep-sea mining scoping, bioprospecting expeditions, or fisheries-adjacent operations that could trigger the Agreement's environmental impact assessment regime.
Equally common is supporting a client in the marine biotech, pharmaceuticals, or food ingredient sectors where marine genetic resources or the digital sequence information derived from them may now fall within the Agreement's benefit-sharing obligations, and the question of whether legacy sample collections are caught is a live commercial issue.
The Agreement also features in regulatory mapping exercises: law firms building treaty-compliance frameworks for clients entering the blue economy space need to understand how the Conference of the Parties' area-based management tool decisions interact with existing international shipping lanes and IMO competencies — a question directly tested in this cell. In all of these contexts, an AI-assisted first-pass on the relevant provisions is a tempting efficiency gain, and the risk is not that the AI will produce an obviously nonsensical answer.
The risk is that it will produce a plausible, well-formatted answer that cites a real provision, at the wrong article number, with the operative threshold subtly shifted, and that answer will travel into a memo, a due-diligence matrix, or a client briefing before anyone checks the treaty text.
If the error is incorporated into a client opinion and that client subsequently acts in reliance on it (proceeds without an EIA it was required to undertake, structures a transaction around an inverted retroactivity position on pre-entry-into-force MGR collections, or takes a position on COP competence based on the wrong article), the firm's professional indemnity exposure is direct. In cross-border mandates with multiple signatories to the Agreement, regulatory penalties, licence revocations, or missed benefit-sharing obligations can compound across jurisdictions.
The findings at a glance
The table below summarises each finding in this cell — the question area, how AI tools responded, and the regulatory text that contradicts that response. Each row links to a detailed per-finding card with the verbatim excerpt and citation classification.
Aggregate impact
Looking across the four findings together, a clear structural pattern emerges: the errors are not random noise but cluster on the Agreement's most operationally consequential provisions, namely the EIA trigger, the MGR retroactivity rule, the DSI benefit-sharing article, and the COP competence constraint. These are exactly the provisions a legal team at a law firm would prioritise in a first-pass regulatory mapping exercise, because they define what clients must do, when they must do it, and what international body has authority to constrain their activities.
AI tools we tested got all four wrong, and in three cases did so with an apparent confidence that only gave way under challenge.
The retroactivity finding (Finding 2) is the most materially dangerous for a law firm, because the error completely inverts the operative default rather than simply citing the wrong article. The BBNJ Agreement is non-retroactive: its MGR and digital sequence information provisions apply only to resources collected after entry into force.
AI assistants we tested stated the opposite — that the regime is retroactive by default, with an opt-out available — which, if relied upon, would lead a firm to advise clients that pre-existing collections require benefit-sharing compliance action when they do not, or conversely could create confusion that causes clients to miss a genuine obligation. Two separate AI tools produced this inversion independently, suggesting it is a systematic failure on this provision rather than an isolated aberration.
The article misattribution errors (Findings 1, 3, and 4) carry a different but compounding risk for legal work: they undermine the defensibility of any opinion that relies on the AI's cross-reference. An advice letter citing Article 30 instead of Article 27 for the EIA screening threshold, or Article 15.5 instead of Article 14(1) for DSI benefit-sharing, or Article 8 instead of Article 22(2) for the COP non-undermining constraint, will not survive regulatory scrutiny or a peer review process — and in a contested matter, incorrect citations become a focus of challenge.
For international-jurisdiction law firms where treaty-text precision is not a secondary matter but the foundation of the opinion, the cumulative effect of these errors across a single research exercise is significant.
What your team should do
The default position for a legal team at a law firm should be to treat AI-generated analysis of BBNJ Agreement provisions as a starting point for orientation only: useful for building a conceptual map of the Agreement's structure, identifying which parts of the treaty may be relevant to a client matter, and drafting initial question lists. No AI-generated provision summary, article reference, or default-rule characterisation should leave the team's hands without verification against the treaty text itself, which is publicly available through the United Nations Treaty Collection at treaties.un.org.
The Agreement was adopted in 2023 and has not yet attracted the volume of settled commentary that older treaties have, which means AI tools are more likely to confuse draft language, negotiating-text variants, and secondary analysis with the final treaty text.
For the specific provisions that failed in this cell, the team should build a short verification checklist into any BBNJ research workflow: confirm the EIA screening threshold is the "may have more than a minor or transitory effect" precautionary standard at Article 27 (not a higher probability bar); confirm that MGR and DSI provisions are non-retroactive by default under Article 10(1) (applying only to post-entry-into-force collections); confirm that DSI benefit-sharing obligations are anchored in Article 14(1); and confirm that the COP's non-undermining obligation for area-based management tool decisions is at Article 22(2).
These are not time-consuming checks against a lengthy text — they are targeted lookups that take minutes and eliminate the most material risk this cell documents.
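One way a team might embed that checklist in a research workflow is as a small script that scans a draft memo for the article references discussed above. The following is a minimal sketch, not a validated tool. The names `CheckItem`, `CHECKLIST`, `cites`, and `review_draft` are hypothetical, the article numbers are copied from this report's checklist and should themselves be confirmed against the treaty text, and the matching only recognises citations written in the form "Article N".

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckItem:
    topic: str
    expected_article: str  # article this report's checklist says to confirm
    misattributed: str     # article the tested AI tools cited instead ("" if none reported)
    reminder: str          # what to confirm in the treaty text


# Assumption: values are transcribed from this report, not from the treaty itself.
CHECKLIST = [
    CheckItem("EIA screening threshold", "27", "30",
              'Standard is "may have more than a minor or transitory effect".'),
    CheckItem("MGR/DSI retroactivity", "10(1)", "",
              "Confirm the default rule for pre-entry-into-force collections."),
    CheckItem("DSI benefit-sharing", "14(1)", "15.5",
              "Confirm where DSI benefit-sharing obligations are anchored."),
    CheckItem("COP non-undermining constraint", "22(2)", "8",
              "Confirm the constraint on area-based management tool decisions."),
]


def cites(text: str, article: str) -> bool:
    """True if text cites 'Article <article>' exactly (so '8' does not match '80' or '8.5')."""
    pattern = r"\bArticle\s+" + re.escape(article) + r"(?!\d|\.\d)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def review_draft(text: str) -> list[str]:
    """Return one note per checklist item that needs a manual treaty-text lookup."""
    notes = []
    for item in CHECKLIST:
        if item.misattributed and cites(text, item.misattributed):
            notes.append(
                f"{item.topic}: draft cites Article {item.misattributed}; "
                f"checklist expects Article {item.expected_article}. {item.reminder}"
            )
        elif not cites(text, item.expected_article):
            notes.append(
                f"{item.topic}: no reference to Article {item.expected_article} found. "
                f"{item.reminder}"
            )
    return notes


if __name__ == "__main__":
    draft = ("The EIA screening threshold is set out in Article 30. "
             "DSI benefit-sharing falls under Article 14(1).")
    for note in review_draft(draft):
        print("-", note)
```

A script like this can only prompt a lookup; it cannot confirm that a citation is correct. A flagged article may be cited legitimately for another purpose, and abbreviated forms such as "Art. 30" are not detected, so each note is a cue to open the treaty text rather than a finding in its own right.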
AI tools are reasonably safe to use for tasks that do not require precise article-level accuracy on the BBNJ Agreement: drafting background sections that describe the Agreement's general purpose and scope, producing comparison matrices across multiple treaty regimes where the firm will verify each entry, and generating client FAQ templates that the legal team will review and correct before use. The risk profile is sharpest when AI output travels directly into a client opinion, a due-diligence report, or a regulatory submission — and in those contexts, treaty text verification is not optional.
How RLB Can Help
RegLeg's published Hallucination Research gives Legal teams at law firms a ready pre-flight check before placing weight on AI-assisted output in regulatory matters. Each research entry documents a confirmed failure mode against a specific instrument — the type of provision involved, how the AI went wrong, and the risk consequence — so lawyers can run a quick cross-reference against the regulation they are working with before finalising advice, drafting submissions, or briefing clients. The research is freely available and requires no engagement to access.
For firms that want to go further, RLB offers bespoke regulator deep-dives scoped to the specific bodies and instruments your Legal function works with most. These engagements map which AI-supported workflows — regulatory research, precedent checking, cross-border compliance comparison, client advice drafting — carry the highest hallucination exposure in your practice context, and produce a ranked risk register the team can act on immediately. The output is confidential and is tailored to the jurisdictions and regulatory perimeters your firm operates across.
RLB also conducts confidential reviews of existing AI-use policies against its failure-mode catalogue, identifying gaps between the controls a firm has documented and the classes of error its AI tools are most likely to produce on regulatory questions. Each review closes with a prioritised remediation plan. Alongside policy work, RLB can supply training materials and CPD-aligned content — structured around real failure cases — that Legal teams can deploy internally to build consistent, defensible AI literacy across practice groups and seniority levels.