source · blog

Why do we take LLMs seriously as a potential source of biorisk?

src_anthropic_biorisk_post

https://red.anthropic.com/2025/biorisk/

reliability: 0.55

authors: Anthropic

published: 2025-09-05

accessed: 2026-04-19

Notes

Rated above the blog prior (0.35) because this is a first-party post from a frontier lab that reports on its own internal evaluations and safety actions and describes its methodology and results directly. The lab is still self-interested, so the score is capped below the press/paper priors.

Intake provenance

method: httpx
tool: afls-ingest/0.0.1
git sha: 4d098737f648
at: 2026-04-19T23:13:04.438402Z
sha256: e803f08021cf…

Evidence from this source (5)

Anthropic's 2024 preliminary wet-lab uplift pilot (n=8) on basic biology lab pr… support

weight: 0.70

method: direct_measurement · locator: 'Ok, but we've been talking about information...' section

“We did not observe any evidence of uplift in this study. However, we noted that all participants, including the internet-only group, did surprisingly well on all tasks in the real lab.”

Anthropic is co-sponsoring a larger wet-lab uplift study through the Frontier M… support

weight: 0.90

method: primary_testimony · locator: 'Ok, but we've been talking about information...' section

“we are co-sponsoring a larger study through the Frontier Model Forum (FMF) to further investigate the ability of AI to aid people engaged in real tasks in a laboratory. In conjunction with the pandemic prevention non-profit Sentinel Bio”

Material barriers that previously served as passive biodefense --- including th… support

weight: 0.65

method: expert_estimate · locator: 'Why focus on biorisk?' section

“the decreasing cost of nucleic acid synthesis, standardization of reagent kits, and easy access to standard molecular biology equipment (such as PCR machines), are making material acquisition less of a bottleneck.”

In Anthropic's controlled bioweapons-acquisition-planning uplift trials, partic… support

weight: 0.85

method: direct_measurement · locator: Figure 2 and 'Is an LLM's knowledge useful in an applied scenario?' section

“Participants with access to Claude 4 models---especially Claude Opus 4---received much higher scores and developed plans with substantially fewer critical failures compared to the internet-only control group.”

Within approximately one year, Claude's performance on the Virology Capabilitie… support

weight: 0.80

method: direct_measurement · locator: Figure 1 and surrounding text, 'Don't experts know more about biology than AI models?' section

“Within a year, Claude went from underperforming world-class experts on an evaluation designed to test virology troubleshooting scenarios in a lab setting, to comfortably exceeding that baseline”