descriptive claim
DeepMind's FSF v3 replaces the prior exploratory instrumental-reasoning (deceptive-reasoning) CCL approach with expanded protocols attached to the machine learning R&D CCLs. The misalignment risk it addresses is models that could accelerate AI R&D to destabilizing levels, rather than detected deceptive reasoning per se.
desc_fsf_v3_drops_instrumental_reasoning_ccl
confidence 0.85
Evidence (1)
supports (1)
- Strengthening our Frontier Safety Framework (primary_testimony, weight 0.90)
locator: Section: Adapting our approach to misalignment risks
“While our previous version of the Framework included an exploratory approach centered on instrumental reasoning CCLs ... with this update we now provide further protocols for our machine learning research and development CCLs”