AI on BBNJ High Seas Biodiversity Agreement for Legal teams at Oil & Gas firms in international jurisdictions

Executive summary

The BBNJ Agreement (the UN treaty governing biodiversity of areas beyond national jurisdiction, including environmental impact assessment obligations and area-based management tools on the high seas) creates direct compliance obligations for oil and gas firms conducting or planning offshore activities in international waters. Legal teams at those firms are increasingly consulting AI tools to interpret the treaty's screening thresholds, jurisdictional limits, and institutional competence rules. In both findings in this cell, the AI assistant tested gave an answer that was factually incorrect on a critical technical point.

The failures split between an AI that confidently stated the wrong legal threshold for triggering an EIA screening — and then walked back its answer when pressed — and an AI that cited the right principle but attributed it to the wrong article, with material consequences for how a legal team would locate and verify the rule. For a legal function advising on high-seas operations, both errors would produce flawed internal work product before the mistake was likely to be caught.

How AI gets this regulation wrong

AI tools tested on the BBNJ Agreement made two distinct types of error: in one case, the AI gave a wrong answer with apparent confidence and only admitted uncertainty when challenged; in the other, it invented an incorrect article citation for a real legal principle. The table below sets out how each failure mode appeared in practice for this regulation.

AI's Failure Mode	Count	Affected findings
Exposed Fabrication	1	Finding#1
Misstated Rule	1	Finding#2

What that means for your team

Both findings in this cell carry the same practical risk category for a legal team: a wrong deliverable. Whether the AI misstates the EIA screening threshold or misattributes the non-undermining principle to the wrong article, the output is a legal memorandum, briefing, or compliance checklist built on an incorrect foundation. The table below maps each finding to the downstream work-product risk it creates for an oil and gas legal function advising on high-seas operations.

Risk Impact	Count	Affected findings
Wrong deliverable	2	Finding#1 · Finding#2

When this affects your department

Legal teams at oil and gas firms in international jurisdictions encounter the BBNJ Agreement across several pressure points: scoping new deepwater exploration blocks or pipeline routes that cross the high seas; drafting internal compliance frameworks ahead of ratification deadlines; advising the business on whether a proposed activity falls inside or outside the treaty's EIA screening trigger; and responding to investor or lender due-diligence requests that ask the firm to map its international-waters exposure against emerging treaty obligations.

In each of these contexts, a legal team member may turn to an AI tool to get a fast orientation on what the treaty actually requires before commissioning more expensive specialist advice.

The risk is that an AI answer used in the scoping phase becomes the invisible baseline for subsequent work. If an AI raises the EIA trigger threshold, describing it as applying only to activities "likely to" cause harm when the treaty actually refers to activities that "may have" more than a minor or transitory effect, the legal team may conclude that a borderline activity falls below the screening trigger and proceed without commissioning an EIA. That conclusion then flows into project-approval documents, board papers, and lender reporting before anyone re-reads the primary text.

Similarly, if the AI attributes the non-undermining principle governing the COP's area-based management authority to the wrong article, internal compliance documents will cite the wrong provision, potentially causing embarrassment in regulatory correspondence or weakening the firm's position in any dispute about the legality of a COP measure affecting a shipping or pipeline route.

At stake for the firm is not merely reputational exposure but concrete operational and financial risk. An oil and gas company that fails to commission a required EIA for a high-seas activity may face challenges from treaty parties, flag-state objections, or complications with port-state access, any of which can interrupt or delay a project with significant capital already committed. Lenders and insurers increasingly require treaty-compliance confirmations for offshore assets; a legal opinion based on a misread EIA threshold may not survive a challenge, exposing the firm to indemnity claims.

Getting the regulatory mapping right at the outset, with AI used only for orientation and not as the source of the final legal position, is the minimum safeguard.

The findings at a glance

The table below lists each finding in this cell by title, finding type, and citation ID.

#	Finding title	Type	Citation ID
1	EIA screening threshold — wrong standard cited	Hallucination	RLB-F-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q001
2	COP ABMT authority — correct principle, wrong article	Hallucination	RLB-F-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q005

Aggregate impact

Both findings in this cell concern the structural architecture of the BBNJ Agreement's regulatory obligations rather than peripheral detail: the EIA screening threshold and the limits on the COP's area-based management authority are two of the treaty's most operationally significant provisions for the extractive industries. That both failures appeared on these central, consequential questions, rather than on obscure transitional provisions, is the significant pattern. It suggests that AI tools are failing not on edge cases but on the core treaty text that a legal team would consult first.

The error shape also follows a consistent logic. In the EIA threshold finding, the AI substituted a higher-bar standard ("likely to have") for the treaty's precautionary standard ("may have"), reducing the apparent scope of the EIA obligation. In the article-attribution finding, the AI correctly identified the non-undermining principle but placed it in the wrong article, meaning a legal team following the AI's citation would locate the general UNCLOS-relationship clause rather than the specific ABMT provision that actually constrains COP decisions.

Both errors produce answers that are plausible enough to pass casual review — they contain the right concepts in the wrong configuration — which makes them more, not less, dangerous than a clearly absurd hallucination.

For an oil and gas legal function advising across multiple international jurisdictions, the systemic risk is that these two types of error — threshold narrowing and article misattribution — can each silently corrupt a larger body of compliance work. A flawed EIA-threshold memo circulates to the assets team and becomes the reference point for multiple project decisions; a wrong article citation appears in correspondence with counterparties or regulators and cannot easily be corrected once sent.

The combination of confident presentation and plausible-but-wrong content means that the usual internal review process — reading the AI output for logical consistency rather than checking against primary sources — is insufficient to catch these errors before they cause harm.

What your team should do

The default position for a legal team working on BBNJ-related matters should be that AI tools are unsuitable as the primary source for any advice that turns on a specific threshold, article number, or jurisdictional limit within the treaty. The two findings in this cell are both examples of AI producing an answer that looked authoritative but was wrong on the operative legal detail — exactly the kind of output that would not be caught by a routine document review unless a team member checked the primary text.

For work product that will be relied upon externally — legal opinions, regulatory submissions, lender compliance confirmations — the AI output should be treated as background context only, with the final position verified against the treaty text directly.

Practically, the legal team can establish a two-step discipline for BBNJ questions. First, use AI tools to orient quickly on the overall structure of the treaty: what the Part IV EIA provisions cover, how the COP's ABMT authority sits alongside UNCLOS and IMO competences, and what the general ratification timeline looks like. These structural questions are lower-risk because errors are easier to spot when the team already has a map of the treaty.

Second, for any specific threshold, article reference, or institutional-authority question — particularly any question that will determine whether an activity requires an EIA or whether a COP measure can validly restrict vessel transit — go directly to the treaty text and, where interpretation is live, to specialist international environmental law counsel. This two-step approach does not eliminate AI from the workflow; it puts it where it adds value without creating risk.
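For teams that log AI-assisted queries through an intake form or matter-management tool, the two-step discipline above could be approximated by a simple routing rule. The Python sketch below is illustrative only: the pattern list, the `Triage` type, and the `triage_question` function are hypothetical names introduced here, and keyword matching is a crude stand-in for a lawyer's judgment about whether a question turns on a specific threshold, article reference, or institutional-authority point.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical trigger patterns (an assumption for illustration, not a
# validated or exhaustive list). Any match routes the question to
# primary-source verification rather than AI orientation.
VERIFY_PATTERNS = [
    r"\bthreshold\b",
    r"\barticle\s+\d+\b",
    r"\bwhich article\b",
    r"\brequires? an eia\b",
    r"\bscreening\b",
    r"\bjurisdictional limit\b",
    r"\bcop\b.*\b(restrict|authority|competence)\b",
]


@dataclass
class Triage:
    route: str          # "orientation" or "primary_source"
    matched: List[str]  # the patterns that fired, kept for audit


def triage_question(question: str) -> Triage:
    """Route a BBNJ question under the two-step discipline."""
    q = question.lower()
    matched = [p for p in VERIFY_PATTERNS if re.search(p, q)]
    route = "primary_source" if matched else "orientation"
    return Triage(route, matched)


# Example usage
background = triage_question(
    "Summarise the negotiation history of the treaty for a board briefing"
)
specific = triage_question("Can the COP restrict vessel transit?")
print(background.route)  # orientation
print(specific.route)    # primary_source
```

A rule of this kind errs toward verification: a false positive costs a check against the treaty text, whereas a false negative leaves an unverified threshold or citation in work product. It supplements, and does not replace, the team's own judgment on which questions go to the primary text or to specialist counsel.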

AI tools are genuinely useful for BBNJ background work: summarising the treaty's negotiation history for a board briefing, drafting a plain-language explainer for non-legal colleagues, mapping the treaty's relationship to other international frameworks such as UNCLOS, ISA, or IMO instruments at a high level, or generating a first-pass checklist of questions for counsel. None of these uses depend on the AI getting a specific threshold or article number right.

The discipline the team needs is knowing which questions require primary-source verification before the answer is relied upon, and the two findings in this cell provide a concrete illustration of where that line sits for the BBNJ Agreement.

How RLB can help

RegLeg's published Hallucination Research gives Legal teams at Oil & Gas firms a practical pre-flight check before relying on AI-assisted output on regulatory questions. Because the research is drawn from live regulatory texts across multiple jurisdictions, it surfaces the specific failure modes — misquoted obligations, fabricated cross-references, outdated compliance thresholds — that carry the greatest consequence in a legal context. Reviewing the relevant findings before deploying AI tools on a regulatory matter takes minutes and can prevent the kind of error that reaches a regulator or counterparty.

Beyond the published research, RLB works with Legal functions directly to map which AI-supported workflows in an Oil & Gas firm carry the highest hallucination exposure. Licence compliance, environmental permitting, cross-border trade obligations, and sanctions screening each present a different risk profile, and the right mitigation for a contract review workflow differs from the right mitigation for a regulatory horizon-scanning task. RLB's bespoke regulator deep-dives identify where the failure modes documented in the research are most likely to materialise in your firm's specific operating context, so effort is directed where it matters most.

RLB also offers a confidential review of an existing AI-use policy against its failure-mode catalogue, with prioritised remediation guidance framed around the Legal team's day-to-day work rather than generic AI governance principles. Alongside that, RLB can develop training material and CPD-aligned content the team can use internally — giving lawyers the working knowledge to interrogate AI output critically, brief non-legal colleagues on appropriate reliance limits, and meet any continuing professional development obligations around technology competence.