descriptive claim
Modifications to a deployed LLM's system prompt can materially shift the chatbot's output distribution toward the described disposition without retraining, as evidenced by Grok's behavioral shift following the 'politically incorrect' system-prompt update and reversion after removal.
desc_system_prompt_behavioral_leverage
confidence 0.75
Evidence (1)
supports (1)
- weight 0.60
locator: Section 'Not shy'
"He said the changes to Grok appeared to have encouraged the bot to reproduce toxic content."