Building an early warning system for LLM-aided biological threat creation
OpenAI describes a blueprint for evaluating whether an LLM could aid biological threat creation. In a study with biology experts and students, GPT-4 provided at most a mild uplift in accuracy, which was not large enough to be conclusive.
Excerpt
We’re developing a blueprint for evaluating the risk that a large language model (LLM) could aid someone in creating a biological threat. In an evaluation involving both biology experts and students, we found that GPT-4 provides at most a mild uplift in biological threat creation accuracy. While this uplift is not large enough to be conclusive, our finding is a starting point for continued research and community deliberation.
Read at source: https://openai.com/index/building-an-early-warning-system-for-llm-aided-biological-threat-creation