Introducing Activation Atlases
Activation atlases are a technique for visualizing interactions between neurons in neural networks. A better understanding of a network's internal decision-making helps identify weaknesses and investigate failures.
Excerpt
We’ve created activation atlases (in collaboration with Google researchers), a new technique for visualizing what interactions between neurons can represent. As AI systems are deployed in increasingly sensitive contexts, having a better understanding of their internal decision-making processes will let us identify weaknesses and investigate failures.
Read at source: https://openai.com/index/introducing-activation-atlases