Unlocking large scale AI training networks with MRC (Multipath Reliable Connection)

OpenAI Blog ·

OpenAI released MRC, a new networking protocol for AI training clusters that enables multipath reliability, improving resilience and performance at scale through the Open Compute Project.

Categories: Research

Excerpt

OpenAI introduces MRC (Multipath Reliable Connection), a new supercomputer networking protocol released via OCP to improve resilience and performance in large-scale AI training clusters.