Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation (Talk)
- Slava Elizarov (Senior Research Scientist)
- Unity
In this talk, I will present Geometry Image Diffusion (GIMDiffusion), a novel method for efficiently generating 3D objects from text prompts. GIMDiffusion uses geometry images, a 2D representation of 3D shapes, which allows the use of existing image-based architectures instead of complex 3D-aware models. This approach reduces computational costs and simplifies the model design. By incorporating Collaborative Control, the method exploits the rich priors of pretrained text-to-image models such as Stable Diffusion, enabling strong generalization even with limited 3D training data. GIMDiffusion produces 3D objects with semantically meaningful, separable parts and internal structures, which makes them easier to manipulate and edit.
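To make the idea of a geometry image concrete, here is a minimal sketch of how a single image whose three channels store XYZ coordinates can be turned back into a triangle mesh. It assumes one regular H x W chart with every pixel valid; the function name `geometry_image_to_mesh` and the toy input are illustrative only and are not taken from the talk. How GIMDiffusion itself handles charts, seams, or invalid pixels is not covered here.

```python
import numpy as np


def geometry_image_to_mesh(gim):
    """Convert an (H, W, 3) geometry image into vertices and triangle faces.

    Assumption: each pixel stores one XYZ surface point, and neighbouring
    pixels are neighbours on the surface (single chart, no invalid pixels).
    """
    h, w, c = gim.shape
    if c != 3:
        raise ValueError("expected an (H, W, 3) array of XYZ coordinates")

    # Every pixel becomes one vertex, in row-major order.
    vertices = gim.reshape(-1, 3)

    # Vertex index of each pixel, laid out on the image grid.
    idx = np.arange(h * w).reshape(h, w)
    tl = idx[:-1, :-1].ravel()  # top-left corner of each grid cell
    tr = idx[:-1, 1:].ravel()   # top-right
    bl = idx[1:, :-1].ravel()   # bottom-left
    br = idx[1:, 1:].ravel()    # bottom-right

    # Split every grid cell into two triangles.
    faces = np.concatenate(
        [np.stack([tl, bl, tr], axis=1), np.stack([tr, bl, br], axis=1)],
        axis=0,
    )
    return vertices, faces


# Toy input: a flat 4 x 4 patch in the z = 0 plane.
u, v = np.meshgrid(np.linspace(0, 1, 4), np.linspace(0, 1, 4), indexing="ij")
toy = np.stack([u, v, np.zeros_like(u)], axis=-1)
vertices, faces = geometry_image_to_mesh(toy)
print(vertices.shape, faces.shape)  # (16, 3) (18, 3)
```

Because the connectivity is implied by the pixel grid, only the per-pixel coordinates need to be generated, which is what lets an ordinary image model stand in for a 3D-aware one.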
Biography: Slava Elizarov is a Senior Research Scientist at Unity, specializing in generative models, computer vision, and 3D graphics. His focus is on applying generative models to 3D graphics, and Text-to-3D in particular. Previously, he worked on diffusion models for virtual try-on, 3D reconstruction, and cloth dynamics simulation. Before joining Unity, he co-founded Limner.ai and developed deep learning models at WANNA (a subdivision of Farfetch). He holds a Master’s degree in Applied Mathematics and Computer Science from the Higher School of Economics (Moscow), and has extensive experience teaching deep learning.
3D Generative models
Details
- 26 September 2024 • 14:00 - 15:00
- Virtual, live stream at Max-Planck-Ring 4, N3, Aquarium
- Perceiving Systems