Introducing Activation Atlases

OpenAI Blog

Activation atlases are a technique for visualizing how neurons in a neural network interact. They give a clearer view of a network's internal decision-making, which helps in identifying its weaknesses.

Categories: Research

Excerpt

We’ve created activation atlases (in collaboration with Google researchers), a new technique for visualizing what interactions between neurons can represent. As AI systems are deployed in increasingly sensitive contexts, having a better understanding of their internal decision-making processes will let us identify weaknesses and investigate failures.