[CVPR'22] EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks

paper: https://nvlabs.github.io/eg3d/media/eg3d.pdf
project: EG3D: Efficient Geometry-aware 3D GANs
code: GitHub - NVlabs/eg3d

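Summary:

This paper proposes a hybrid explicit-implicit 3D representation, the tri-plane representation. It is more expressive while also being faster and using less memory.
To address multi-view inconsistency, the camera parameter matrix is introduced as a conditioning input to the StyleGAN2 generator, the super-resolution module, and volume rendering.
Finally, to address the information loss caused by the super-resolution module, the paper proposes a dual discrimination strategy that keeps the images before and after super-resolution consistent.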
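
To make the tri-plane idea concrete, here is a minimal sketch of a feature lookup, assuming three axis-aligned feature planes stored as nested lists of shape [H][W][C] and coordinates normalized to [-1, 1]. The function names, the plane keys, and the choice to sum the three features are assumptions for illustration, not the repository's API. In a full pipeline a small decoder would then map the aggregated feature to color and density, which is omitted here.

```python
import math

def bilinear(plane, u, v):
    """Bilinearly sample a plane[H][W][C] feature grid at (u, v) in [-1, 1]."""
    h, w = len(plane), len(plane[0])
    # Clamp to the valid range, then map [-1, 1] to continuous pixel coordinates.
    u = min(max(u, -1.0), 1.0)
    v = min(max(v, -1.0), 1.0)
    x = (u + 1.0) * 0.5 * (w - 1)
    y = (v + 1.0) * 0.5 * (h - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    out = []
    for c in range(len(plane[0][0])):
        top = plane[y0][x0][c] * (1 - fx) + plane[y0][x1][c] * fx
        bottom = plane[y1][x0][c] * (1 - fx) + plane[y1][x1][c] * fx
        out.append(top * (1 - fy) + bottom * fy)
    return out

def sample_triplane(planes, point):
    """Project a 3D point onto the xy, xz, and yz planes and sum the features."""
    x, y, z = point
    f_xy = bilinear(planes["xy"], x, y)
    f_xz = bilinear(planes["xz"], x, z)
    f_yz = bilinear(planes["yz"], y, z)
    # Assumption: aggregate by summation; other reductions (e.g. mean) also work.
    return [a + b + c for a, b, c in zip(f_xy, f_xz, f_yz)]
```

Each 3D query costs three 2D lookups plus the decoder, which is where the speed and memory savings over a dense 3D grid come from.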

skirt the computational constraints

inherit their efficiency and expressiveness

xx has started to gain momentum as well

Abstract

Introduction

Contributions

Related work

Neural scene representation and rendering.

Generative 3D-aware image synthesis.

Tri-plane hybrid 3D representation

3D GAN framework

CNN generator backbone and rendering

Dual discrimination

Modeling pose-correlated attributes

Experiments and results

Ablation study

Application

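Abstract

The paper studies how to generate high-quality, multi-view-consistent 3D shapes from single-view 2D images, without supervision;
Existing 3D GANs have two problems: 1) high computational cost; 2) a lack of 3D consistency;
The paper proposes: 1) an expressive hybrid explicit-implicit network architecture, which speeds up computation and reduces its cost; 2) decoupling feature generation from neural rendering, so the framework can build on SOTA 2D GANs such as StyleGAN2.
It achieves SOTA on 3D-aware synthesis with FFHQ and AFHQ Cats.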
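
The efficiency claim can be checked with simple arithmetic. The sketch below counts stored feature values for three N x N planes versus a dense N x N x N voxel grid with the same channel count. The resolution (N = 256) and channel count (C = 32) are illustrative assumptions, not figures taken from these notes.

```python
def triplane_values(n, c):
    """Number of feature values in three axis-aligned n x n planes with c channels."""
    return 3 * n * n * c

def voxel_values(n, c):
    """Number of feature values in a dense n x n x n voxel grid with c channels."""
    return n ** 3 * c

# Illustrative sizes (assumed): N = 256, C = 32.
n, c = 256, 32
tri = triplane_values(n, c)    # 6,291,456 values
vox = voxel_values(n, c)       # 536,870,912 values
ratio = vox / tri              # equals n / 3, so the gap widens with resolution
```

Tri-plane storage grows as O(N^2) while the voxel grid grows as O(N^3), so the ratio between them is N / 3 regardless of the channel count.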

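Introduction

Existing 2D GANs cannot explicitly model the underlying 3D scene;
Recent 3D GANs have begun to tackle: 1) multi-view-consistent image generation; 2) extracting 3D shapes without multi-view images or geometry supervision. However, the image quality and resolution of 3D GANs still lag far behind those of 2D GANs. Another problem is that current 3D GANs and neural rendering methods are computationally expensive.
A 3D GAN usually consists of two parts: 1) a 3D-structure-aware inductive bias in the generator network; 2) a neural rendering engine that provides view-consistent results. The inductive bias can be modeled as an explicit voxel grid or an implicit neural representation. Limited by computational cost, however, neither representation is suitable for training high-resolution 3D GANs. The common workaround is super-resolution, but that sacrifices view consistency and 3D shape quality.
The paper proposes: 1) a hybrid explicit-implicit 3D representation, to speed up computation and reduce its cost; 2) a dual discrimination strategy, to keep the final output consistent with the neural rendering; 3) pose-based conditioning of the generator, which decouples pose-correlated attributes such as facial expression; 4) a framework that decouples feature generation from neural rendering, so it can benefit from SOTA 2D GANs such as StyleGAN2.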
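
One way to picture dual discrimination is as a change to what the discriminator sees: the low-resolution neural rendering is upsampled and stacked with the super-resolved image, so a mismatch between the two becomes visible to the discriminator. The sketch below assumes images are nested lists of shape [C][H][W] and uses nearest-neighbor upsampling for brevity (the paper's upsampling filter may differ); all function names are hypothetical.

```python
def upsample_nearest(image, factor):
    """Upsample a [C][H][W] nested-list image by an integer factor."""
    out = []
    for channel in image:
        rows = []
        for row in channel:
            # Repeat every pixel `factor` times along the width.
            wide = [value for value in row for _ in range(factor)]
            # Repeat the widened row `factor` times along the height (copies, not aliases).
            rows.extend(list(wide) for _ in range(factor))
        out.append(rows)
    return out

def dual_discriminator_input(raw_rgb, sr_rgb):
    """Stack the super-resolved image with the upsampled raw render (3 + 3 channels)."""
    factor = len(sr_rgb[0]) // len(raw_rgb[0])  # ratio of image heights
    upsampled = upsample_nearest(raw_rgb, factor)
    return list(sr_rgb) + upsampled

# Usage: a 1x1 raw render paired with a 2x2 super-resolved image.
raw = [[[0.1]], [[0.2]], [[0.3]]]
sr = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(3)]
pair = dual_discriminator_input(raw, sr)  # 6 channels, each 2x2
```

Because both versions sit in one input, the super-resolution module cannot drift away from the rendering without the discriminator noticing.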

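Contributions

A tri-plane-based 3D GAN framework that is markedly faster while maintaining quality;
A 3D GAN training strategy, dual discrimination, that maintains multi-view consistency;
Generator pose conditioning, which models pose-correlated attributes such as expression.
SOTA results on 3D-aware image generation with FFHQ and AFHQ Cats.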
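
Generator pose conditioning can be sketched as building a camera vector and feeding it to the generator alongside the latent code. The example below assumes the camera is described by a 4x4 extrinsics matrix and a 3x3 intrinsics matrix, flattened to 16 + 9 = 25 values, and simply concatenates that vector with the latent. How the actual mapping network consumes the condition is not described in these notes, so the concatenation and the function names are assumptions.

```python
def camera_condition(extrinsics, intrinsics):
    """Flatten a 4x4 extrinsics matrix and a 3x3 intrinsics matrix into 25 values."""
    flat = [value for row in extrinsics for value in row]
    flat += [value for row in intrinsics for value in row]
    return flat

def conditioned_latent(z, extrinsics, intrinsics):
    """Append the camera condition to the latent code (assumed injection scheme)."""
    return list(z) + camera_condition(extrinsics, intrinsics)

# Usage: identity pose, a simple pinhole intrinsics matrix, and a 4-dim latent.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
pinhole = [[4.0, 0.0, 0.5], [0.0, 4.0, 0.5], [0.0, 0.0, 1.0]]
w_in = conditioned_latent([0.1, -0.2, 0.3, 0.0], identity, pinhole)  # 4 + 25 values
```

Letting the generator see the pose gives it a way to absorb dataset biases that correlate with pose, such as expression, instead of baking them into the geometry.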