Continuously hardening ChatGPT Atlas against prompt injection
OpenAI uses RL-trained automated red teaming to proactively discover and patch prompt injection vulnerabilities in ChatGPT Atlas.
Excerpt
OpenAI is strengthening ChatGPT Atlas against prompt injection attacks using automated red teaming trained with reinforcement learning. This proactive discover-and-patch loop helps identify novel exploits early and harden the browser agent’s defenses as AI becomes more agentic.
Read at source: https://openai.com/index/hardening-atlas-against-prompt-injection