descriptive claim
Within approximately one year, Claude's performance on the Virology Capabilities Test (VCT), an evaluation of virology lab troubleshooting designed by SecureBio, progressed from underperforming world-class expert virologists answering questions within their own specialty to comfortably exceeding that expert baseline.
desc_claude_vct_expert_baseline_crossed
confidence 0.80
Evidence (1)
supports (1)
- Why do we take LLMs seriously as a potential source of biorisk? direct_measurement weight 0.80
locator: Figure 1 and surrounding text, 'Don't experts know more about biology than AI models?' section
“Within a year, Claude went from underperforming world-class experts on an evaluation designed to test virology troubleshooting scenarios in a lab setting, to comfortably exceeding that baseline”