ai-for-less-suffering.com

descriptive claim

Within approximately one year, Claude's performance on the Virology Capabilities Test (VCT), an evaluation of virology lab troubleshooting designed by SecureBio, progressed from underperforming world-class expert virologists answering questions within their own specialty to comfortably exceeding that expert baseline.

desc_claude_vct_expert_baseline_crossed

confidence

0.80

Evidence (1)

supports (1)

Why do we take LLMs seriously as a potential source of biorisk?

direct_measurement

weight

0.80

locator: Figure 1 and surrounding text, 'Don't experts know more about biology than AI models?' section

“Within a year, Claude went from underperforming world-class experts on an evaluation designed to test virology troubleshooting scenarios in a lab setting, to comfortably exceeding that baseline”

Camps holding this claim (5)