Reasoning models struggle to control their chains of thought, and that’s good
OpenAI research introduces CoT-Control and finds that reasoning models struggle to control their chain-of-thought behavior, which reinforces monitorability as an AI safety safeguard.
Read at source: https://openai.com/index/reasoning-models-chain-of-thought-controllability