Extracting Concepts from GPT-4
OpenAI scales sparse autoencoders on GPT-4 to automatically extract 16 million patterns, many of them interpretable, advancing mechanistic interpretability research.
Excerpt
Using new techniques for scaling sparse autoencoders, we automatically identified 16 million patterns in GPT-4's computations.
Read at source: https://openai.com/index/extracting-concepts-from-gpt-4