ai-for-less-suffering.com

descriptive claim

DeepMind's Frontier Safety Framework (FSF) v3 replaces the earlier exploratory approach built on instrumental-reasoning (deceptive-reasoning) Critical Capability Levels (CCLs) with expanded protocols attached to the machine learning R&D CCLs. Misalignment risk is thereby addressed through models that could accelerate AI R&D to destabilizing levels, rather than through detected deceptive reasoning per se.

desc_fsf_v3_drops_instrumental_reasoning_ccl

confidence
0.85

Evidence (1)

supports (1)

  • weight
    0.90

    locator: Section: Adapting our approach to misalignment risks

    “While our previous version of the Framework included an exploratory approach centered on instrumental reasoning CCLs ... with this update we now provide further protocols for our machine learning research and development CCLs”

Camps holding this claim (3)