source · primary doc
Anthropic's Responsible Scaling Policy
src_anthropic_rsp_updates
https://www.anthropic.com/responsible-scaling-policy
reliability 0.88
authors: Anthropic
published: 2026-04-02
accessed: 2026-04-19
Notes
First-party policy document from Anthropic describing its own RSP; highly reliable for what Anthropic commits to, less so for whether commitments bind in practice.
Intake provenance
- method
- httpx
- tool
- afls-ingest/0.0.1
- git sha
- 604c9dfd252a
- at
- 2026-04-19T18:42:46.657051Z
- sha256
- b9282a279421…
Evidence from this source (5)
- weight0.95
method: primary_testimony · locator: March 31, 2025 update
“we have added a new capability threshold related to CBRN development... we have disaggregated our existing AI R&D capability thresholds, separating them into two distinct levels (the ability to fully automate entry-level AI research work, and the ability to cause dramatic acceleration in the rate of effective scaling)”
- weight0.90
method: primary_testimony · locator: Learning from Experience
“we reviewed how well we adhered to the framework and identified a small number of instances where we fell short of meeting the full letter of its requirements”
- weight0.95
method: primary_testimony · locator: Planned ASL-3 Safeguards > Deployment Safeguards
“The four layers will be: Access controls... Real-time prompt and completion classifiers... Asynchronous monitoring classifiers... Post-hoc jailbreak detection with rapid response procedures”
- weight0.95
method: primary_testimony · locator: April 2, 2026 update
“our language around AI doubling the rate of progress ("compress two years of 2018 – 2024 AI progress into a single year") could have been read as... "doubling the productivity of researchers". In v3.1, we are clear that we mean the former and not the latter.”
- weight0.90
method: primary_testimony · locator: Security Safeguards > Access control for model weights
“Implement multi-party authorization and mandatory code review on production code to remove persistent, high-privilege access to model weights... Require hardware authentication device prompt, justification and employee approval to grant access.”