ai-for-less-suffering.com

descriptive claim

In its first year operating under RSP v1, Anthropic self-identified four instances of falling short of the policy's letter, including a 3-day-late evaluation, an unauthorized autonomy-evaluation update, missing elicitation techniques (best-of-N, chain-of-thought), and a failure to explicitly design evaluations to establish a 6x scaling buffer.

desc_rsp_self_reported_noncompliance

confidence
0.90

Evidence (1)

supports (1)

  • weight
    0.90

    locator: Learning from Experience

    “we reviewed how well we adhered to the framework and identified a small number of instances where we fell short of meeting the full letter of its requirements”

Camps holding this claim (3)