Hierarchical text-conditional image generation with CLIP latents

OpenAI Blog

OpenAI published the unCLIP paper, which describes the two-stage architecture behind DALL-E 2: a prior (autoregressive or diffusion) generates a CLIP image embedding from a text caption, and a diffusion decoder generates an image conditioned on that embedding.

Categories: Research