source · primary doc

Strengthening our Frontier Safety Framework

src_deepmind_fsf_v3_update

https://deepmind.google/blog/strengthening-our-frontier-safety-framework/

reliability: 0.88

authors: Four Flynn, Helen King, Anca Dragan

published: 2025-09-22

accessed: 2026-04-19

Notes

First-party DeepMind announcement describing its own safety framework. The primary_doc prior (0.90) is lightly discounted to 0.88 because the post is a self-descriptive governance statement, not a binding artifact.

Intake provenance

method: httpx
tool: afls-ingest/0.0.1
git sha: 4d098737f648
at: 2026-04-19T20:23:29.971598Z
sha256: 1cd3676dbe86…

Evidence from this source (5)

DeepMind's Frontier Safety Framework v3 adds a Critical Capability Level for ha… support

weight: 0.95

method: primary_testimony · locator: Section: Addressing the risks of harmful manipulation

“we're introducing a Critical Capability Level (CCL) focused on harmful manipulation --- specifically, AI models with powerful manipulative capabilities that could be misused to systematically and substantially change beliefs and behaviors in identified high stakes contexts”

FSF v3 extends DeepMind's risk assessment process beyond early-warning capabili… support

weight: 0.85

method: primary_testimony · locator: Section: Sharpening our risk assessment process

“Building on our core early-warning evaluations, we describe how we conduct holistic assessments that include systematic risk identification, comprehensive analyses of model capabilities and explicit determinations of risk acceptability.”

Under FSF v3, DeepMind extends its pre-launch safety case review process to cov… support

weight: 0.90

method: primary_testimony · locator: Section: Adapting our approach to misalignment risks

“For advanced machine learning research and development CCLs, large-scale internal deployments can also pose risk, so we are now expanding this approach to include such deployments.”

DeepMind's FSF v3 replaces the prior exploratory instrumental-reasoning (decept… support

weight: 0.90

method: primary_testimony · locator: Section: Adapting our approach to misalignment risks

“While our previous version of the Framework included an exploratory approach centered on instrumental reasoning CCLs ... with this update we now provide further protocols for our machine learning research and development CCLs”

In FSF 3.1 (April 2026 update), DeepMind introduces Tracked Capability Levels (… support

weight: 0.95

method: primary_testimony · locator: Section: FSF 3.1: Introducing tracked capability levels

“As of April 17, 2026, we are adding Tracked Capability Levels (TCLs) in certain domains to our Frontier Safety Framework, introducing a new capability level to help us spot and evaluate potential less extreme risks sooner.”