descriptive claim
In its first year operating under RSP v1, Anthropic self-identified four instances of falling short of the letter of the policy: a 3-day-late evaluation, an unauthorized autonomy-evaluation update, missing elicitation techniques (best-of-N, chain-of-thought), and a failure to explicitly design evaluations to establish a 6x scaling buffer.
desc_rsp_self_reported_noncompliance
confidence 0.90
Evidence (1)
supports (1)
- Anthropic's Responsible Scaling Policy primary_testimony weight 0.90
locator: Learning from Experience
“we reviewed how well we adhered to the framework and identified a small number of instances where we fell short of meeting the full letter of its requirements”