source · primary doc

Activating AI Safety Level 3 protections

src_anthropic_asl3_activation

https://www.anthropic.com/news/activating-asl3-protections

reliability

0.85

authors: Anthropic

published: 2025-05-22

accessed: 2026-04-19

Notes

First-party policy announcement from Anthropic describing its own ASL-3 activation decision and controls. Reliability is set slightly below the 0.90 primary_doc prior because Anthropic is self-reporting on its own risk posture and has implicit marketing incentives. The underlying factual assertions (the activation decision and the specific controls implemented) are nonetheless verifiable first-party statements.

Intake provenance

method: httpx
tool: afls-ingest/0.0.1
git sha: 4d098737f648
at: 2026-04-19T23:12:34.616682Z
sha256: 81f2b2fa37b4…

Evidence from this source (6)

Anthropic has implemented preliminary egress bandwidth controls as part of ASL-… support

weight

0.90

method: primary_testimony · locator: Security section

“By limiting the rate of outbound network traffic, these controls can leverage model weight size to create a security advantage... we expect to get to the point where rate limits are low enough that exfiltrating model weights before being detected is very difficult—even if an attacker has otherwise significantly compromised our systems.”

On May 22, 2025, Anthropic activated ASL-3 Deployment and Security Standards in… support

weight

0.95

method: primary_testimony · locator: Opening paragraph and Rationale section

“We are deploying Claude Opus 4 with our ASL-3 measures as a precautionary and provisional action. To be clear, we have not yet determined whether Claude Opus 4 has definitively passed the Capabilities Threshold that requires ASL-3 protections.”

Anthropic's ASL-3 deployment measures are narrowly scoped to preventing model a… support

weight

0.85

method: primary_testimony · locator: Footnote 3

“Initially they are focused exclusively on biological weapons as we believe these account for the vast majority of the risk”

Anthropic's ASL-3 Security Standard is designed to defend against sophisticated… support

weight

0.95

method: primary_testimony · locator: Footnote 6

“Nation-state threats (other than those using non-novel attack chains) and sophisticated insider risk are out of the scope of the ASL-3 Standard.”

Anthropic's ASL-3 deployment measures are narrowly scoped to preventing model a… support

weight

0.90

method: primary_testimony · locator: Deployment Measures section

“ASL-3 deployment measures do not aim to address issues unrelated to CBRN, to defend against non-universal jailbreaks, or to prevent the extraction of commonly available single pieces of information”

Anthropic's ASL-3 deployment architecture uses Constitutional Classifiers --- r… support

weight

0.90

method: primary_testimony · locator: Deployment Measures section

“We have implemented Constitutional Classifiers—a system where real-time classifier guards, trained on synthetic data representing harmful and harmless CBRN-related prompts and completions, monitor model inputs and outputs”