AI-Bio Safeguards

Ensure that AI systems capable of substantially accelerating biological misuse incorporate robust safeguards.

Last updated: May 11, 2026 · Public updates are batched quarterly, with urgent corrections as needed.

5-year budget target: $75M

Pillar: Prevent

Current Status of the Field

Technology

LLM bio-risk evaluations are emerging, but bio-foundation-model evaluations, agentic workflow tests, and autonomous-lab risk tooling remain early and fragmented.

Policy

Frontier AI commitments increasingly mention biological risk, but requirements rarely distinguish LLMs, specialized biological models, and AI systems linked to laboratory execution.

Bottleneck

The bottleneck is separating stakeholder-specific tracks while preserving a common threshold for biological capabilities of concern and a common expectation of independent evaluation.

The Problem

AI is rapidly lowering the expertise barrier for biological weapons design, and safeguards have not kept pace. Frontier LLMs, biological foundation models, agentic systems, and autonomous-lab workflows share some risks but have different developers, evaluation tools, users, and failure modes. Not all frontier models undergo biological risk evaluation before deployment, bio-specific model evaluation tooling remains immature, and there are no widely adopted norms governing model release, high-risk biological training data, monitored access, or AI systems that can move from biological design toward physical creation.

The Solution

While further research is needed, a meaningful set of technical and governance interventions already exists, including evaluation standards, managed access, user verification, safeguard deployment, and independent red-teaming. The immediate bottleneck is that many AI developers do not conduct appropriate biorisk evaluation in the first place, and biological foundation models often need different evaluation tooling than general-purpose LLMs. Securing separate commitments for frontier LLMs, biological foundation models, and autonomous-lab risk pilots is therefore the urgent first move. The longer-term goal is broader: consensus definitions of biological capabilities of concern, full evaluation coverage for relevant models, secure handling of high-risk biological training data, no open-source proliferation of catastrophic biological capability, and carefully monitored access pathways for legitimate defensive uses.

Objectives

  • Consensus definitions of biological capabilities of concern are adopted to inform evaluation and safeguards globally
  • High-risk biological training data is generated and shared in secure environments, and no open-source model proliferates the capability to design catastrophic pandemic pathogens
  • 100% of frontier LLMs with biological capabilities of concern undergo pre-deployment biorisk evaluation and deploy effective and proportionate safeguards
  • 100% of biological foundation models and other specialized biological AI systems with capabilities of concern undergo dedicated evaluation using bio-specific tooling, not only general-purpose LLM tests
  • Autonomous AI-lab workflows that connect design tools to physical creation pathways are evaluated and governed before they can operate at scale
  • Defensive use is accelerated through secure early access to monitored models for developing medical countermeasures against pandemic pathogens, under a two-tier system: general frontier models whose biological capabilities are limited and kept limited by safeguards, and monitored bio-helpful models available only with adequate know-your-customer (KYC) checks and oversight
  • A Guardian AI system operates as a continuous, real-time monitoring layer across frontier AI deployments with biological capabilities — detecting anomalous query patterns, novel jailbreak attempts, and coordinated misuse campaigns, with the authority to throttle or suspend access autonomously when risk thresholds are exceeded
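
The last two objectives describe decision logic that can be illustrated with a minimal sketch. The Python below is an assumption-laden illustration, not a specification of any Guardian system: the names `Action`, `Thresholds`, `decide_action`, and `select_tier`, the tier labels, and the threshold values 0.6 and 0.9 are all hypothetical. It shows how a risk score might map to throttling or suspension, and how KYC and oversight status might route a user to one of the two access tiers.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    THROTTLE = "throttle"
    SUSPEND = "suspend"


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values; real thresholds would come from agreed standards.
    throttle_at: float = 0.6
    suspend_at: float = 0.9


def decide_action(risk_score: float, thresholds: Thresholds = Thresholds()) -> Action:
    """Map a risk score in [0, 1] to an access decision."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be in [0, 1]")
    if risk_score >= thresholds.suspend_at:
        return Action.SUSPEND
    if risk_score >= thresholds.throttle_at:
        return Action.THROTTLE
    return Action.ALLOW


def select_tier(kyc_verified: bool, under_oversight: bool) -> str:
    """Route a user to one of two tiers (labels are illustrative).

    Only users who pass KYC and are under oversight reach the monitored
    bio-helpful tier; everyone else gets the general, limited tier.
    """
    if kyc_verified and under_oversight:
        return "monitored_bio_helpful"
    return "general_limited_bio"
```

In this sketch the default is always the more restrictive outcome: an unverified user falls back to the general tier, and any score at or above a threshold triggers the stricter action.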

Urgent 2026 Milestone

Secure binding biorisk evaluation and safeguard commitments for both frontier LLMs and biological foundation models, and launch a dedicated pilot on the risks of AI-driven autonomous labs.

Long-term Targets

Evaluation coverage: 100% by end of 2026

Safeguard coverage: 100% by end of 2028

Data security coverage: 100% by end of 2028
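
One way to make these coverage targets measurable is to compute each as the share of in-scope models that meet the requirement. The sketch below assumes a hypothetical registry of models (`ModelRecord` and its fields are illustrative, not an existing dataset or schema) and treats "in scope" as models flagged with biological capabilities of concern.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ModelRecord:
    # Hypothetical registry entry; field names are illustrative only.
    name: str
    has_capability_of_concern: bool
    evaluated: bool
    safeguarded: bool


def coverage(records: Iterable[ModelRecord], attribute: str) -> Optional[float]:
    """Percentage of in-scope models for which `attribute` is True.

    Returns None when no model is in scope, so an empty registry is not
    misreported as 100% coverage.
    """
    in_scope = [r for r in records if r.has_capability_of_concern]
    if not in_scope:
        return None
    covered = sum(1 for r in in_scope if getattr(r, attribute))
    return 100.0 * covered / len(in_scope)


registry = [
    ModelRecord("model-a", True, True, True),
    ModelRecord("model-b", True, True, False),
    ModelRecord("model-c", True, True, False),
    ModelRecord("model-d", True, False, False),
    ModelRecord("model-e", False, False, False),  # out of scope, ignored
]

print(coverage(registry, "evaluated"))    # 75.0
print(coverage(registry, "safeguarded"))  # 25.0
```

The denominator matters here: counting only models with capabilities of concern means the metric depends on the consensus definitions listed under Objectives.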

Year-by-Year

Philanthropy

  • Convene international processes to define biological capabilities of concern for frontier AI and biological models
  • Fund separate evaluation standards for frontier LLM biological risk and biological foundation model risk
  • Fund pilot evaluation of autonomous AI-lab workflows that connect biological design to physical creation pathways
  • Fund pilot safeguards, including managed access and monitoring, for agentic systems and high-risk biological AI models
  • Fund independent AI-bio red-teaming

Private Sector

  • Conduct pre-deployment biorisk evaluations for relevant frontier LLMs and submit them to independent red-teaming
  • Develop and test bio-specific evaluation tooling for biological foundation models
  • Identify autonomous-lab integrations that require evaluation before broader deployment
  • Implement proven safeguards for models that exhibit capabilities of concern
  • Support information sharing and norm development among frontier developers

Government

  • Adopt or begin adopting policy definitions of biological capabilities of concern for AI systems
  • Support standards development for pre-deployment AI biorisk evaluation across separate LLM, biological foundation model, and autonomous-lab tracks
  • Use formal agreements, procurement conditions, or other mechanisms to encourage safeguard adoption
  • Begin developing policy approaches for secure handling of high-risk biological training data and responsible model release

Philanthropy

  • Continue funding deployment of screening software and independent audit capacity across providers, including smaller and international firms
  • Support implementation-focused KYC tools and verification infrastructure
  • Continue funding separate AI-bio evaluation tracks for frontier LLMs, biological foundation models, and autonomous-lab workflows
  • Support broader bio-audit adoption, institutional awareness, and development of publication-risk-management norms
  • Continue support for misuse-detection tooling and cross-provider threat-sharing systems

Private Sector

  • Implement standardized synthesis screening and customer verification at scale
  • Participate in split-order detection systems and submit to independent screening verification where possible
  • Expand KYC adoption across high-risk vendors and service providers
  • Make AI-bio evaluation and safeguard deployment routine across separate LLM, biological foundation model, and autonomous-lab tracks
  • Expand participation in bio-audits, accident reporting, pre-publication review, and institutional adoption pilots
  • Routinely share threat signals and maintain misuse monitoring

Government

  • Move from legislative or rulemaking activity to initial compliance and oversight in synthesis screening and KYC
  • Support independent verification of AI biorisk evaluation and safeguard adoption through track-specific formal agreements or requirements
  • Expand implementation of accident-reporting requirements, research-risk oversight, and institutional participation pathways
  • Support law-enforcement pathways for escalating provider-generated threat signals

2028 — Scale and enforce

Prevent-level actions

Philanthropy

  • Support more advanced evasion-resistant and function-based screening R&D
  • Support scaling of KYC tools and red-teaming across a broader set of high-risk service categories
  • Support secure data infrastructure for high-risk biological AI training data and continued safeguard research across LLM, bio-foundation-model, and autonomous-lab tracks
  • Support global governance reform, institutional adoption, and continued capacity-building for bio-audits and risk-based research oversight
  • Support public communication efforts that reinforce deterrence posture and accountability

Private Sector

  • Submit to independent screening verification and deploy updated screening algorithms
  • Extend KYC and more extensive customer vetting to newly covered categories such as CROs, cloud labs, repositories, and AI/bio tools
  • Continue safeguard deployment for high-risk AI systems and participate in track-specific red-teaming
  • Expand compliance with institutional risk-governance requirements, adoption programs, and misuse monitoring mandates

Government

  • Enforce synthesis-screening compliance and audit requirements
  • Regulate the secondary market for benchtop synthesis devices and broaden KYC requirements to additional high-risk categories
  • Mandate safeguards for high-risk AI systems with biological capabilities across separate LLM, biological foundation model, and autonomous-lab pathways
  • Institutionalize accountable public oversight for high-consequence biological research, including licensing, institutional participation, and bio-audit frameworks
  • Institute requirements for misuse monitoring and threat sharing for high-risk biological services

Philanthropy

  • Focus remaining philanthropic capital on gap-closing, deterrence signaling, and ensuring high-risk systems actually perform under stress
  • Support maturation of attribution capabilities and public communication around accountability
  • Continue targeted support for jurisdictions, sectors, or institutions lagging in compliance, AI-bio safeguards, or research-governance adoption

Private Sector

  • Achieve broad compliance with screening, KYC, safeguard, and misuse-monitoring requirements
  • Maintain regular red-teaming, audit participation, and rapid remediation processes
  • Support stronger bioforensic attribution capability through data-sharing and exercise participation

Government

  • Ensure broad compliance and meaningful enforcement across screening, KYC, AI-bio safeguards, and research governance
  • Verify that high-risk LLM, biological foundation model, and autonomous-lab safeguards perform under stress
  • Establish or strengthen attribution and accountability doctrine for biological attacks
  • Exercise bioforensic and deterrence capabilities routinely

Philanthropy

  • Focus on closing residual gaps, supporting lagging geographies or sectors, and maintaining pressure for continuous improvement
  • Continue support for frontier-risk research, AI-bio track-specific evaluation, institutional research-governance adoption, residual-risk assessment, and measurement refinement

Private Sector

  • Maintain high compliance and continuous improvement in screening, KYC, safeguards, red-teaming, and misuse monitoring

Government

  • Sustain enforcement, update standards to keep pace with new technologies, and ensure risk-governance and AI-bio safeguard systems remain current

Who's Working on This

Implementers

What's Still Needed

coalition
Secure Biological Data Stewardship Organization (nonprofit): trusted infrastructure for hosting high-risk biological training data
Guardian model development (startup): continuous real-time AI monitoring layer for frontier biological AI deployments