Continuously hardening ChatGPT Atlas against prompt injection

OpenAI Blog · Dec 22, 2025

OpenAI uses RL-trained automated red teaming to proactively discover and patch prompt injection vulnerabilities in ChatGPT Atlas.

Categories: Research

Excerpt

OpenAI is strengthening ChatGPT Atlas against prompt injection attacks using automated red teaming trained with reinforcement learning. This proactive discover-and-patch loop helps identify novel exploits early and harden the browser agent’s defenses as AI becomes more agentic.

Read at source: https://openai.com/index/hardening-atlas-against-prompt-injection