source · primary doc
Strengthening our Frontier Safety Framework
src_deepmind_fsf_v3_update
https://deepmind.google/blog/strengthening-our-frontier-safety-framework/
authors: Four Flynn, Helen King, Anca Dragan
published: 2025-09-22
accessed: 2026-04-19
Notes
First-party DeepMind announcement describing its own safety framework. The primary_doc prior (0.90) is lightly discounted because this is a self-descriptive governance statement, not a binding artifact.
Intake provenance
- method: httpx
- tool: afls-ingest/0.0.1
- git sha: 4d098737f648
- at: 2026-04-19T20:23:29.971598Z
- sha256: 1cd3676dbe86…
Evidence from this source (5)
- weight: 0.95
method: primary_testimony · locator: Section: Addressing the risks of harmful manipulation
“we're introducing a Critical Capability Level (CCL) focused on harmful manipulation --- specifically, AI models with powerful manipulative capabilities that could be misused to systematically and substantially change beliefs and behaviors in identified high stakes contexts”
- weight: 0.85
method: primary_testimony · locator: Section: Sharpening our risk assessment process
“Building on our core early-warning evaluations, we describe how we conduct holistic assessments that include systematic risk identification, comprehensive analyses of model capabilities and explicit determinations of risk acceptability.”
- weight: 0.90
method: primary_testimony · locator: Section: Adapting our approach to misalignment risks
“For advanced machine learning research and development CCLs, large-scale internal deployments can also pose risk, so we are now expanding this approach to include such deployments.”
- weight: 0.90
method: primary_testimony · locator: Section: Adapting our approach to misalignment risks
“While our previous version of the Framework included an exploratory approach centered on instrumental reasoning CCLs ... with this update we now provide further protocols for our machine learning research and development CCLs”
- weight: 0.95
method: primary_testimony · locator: Section: FSF 3.1: Introducing tracked capability levels
“As of April 17, 2026, we are adding Tracked Capability Levels (TCLs) in certain domains to our Frontier Safety Framework, introducing a new capability level to help us spot and evaluate potential less extreme risks sooner.”