Improving Inversion and Generation Diversity in StyleGAN using a Gaussianized Latent Space

Paper title

Improving Inversion and Generation Diversity in StyleGAN using a Gaussianized Latent Space

Authors

Jonas Wulff, Antonio Torralba

Abstract

Modern Generative Adversarial Networks are capable of creating artificial, photorealistic images from latent vectors living in a low-dimensional learned latent space. It has been shown that a wide range of images can be projected into this space, including images outside of the domain that the generator was trained on. However, while in this case the generator reproduces the pixels and textures of the images, the reconstructed latent vectors are unstable and small perturbations result in significant image distortions. In this work, we propose to explicitly model the data distribution in latent space. We show that, under a simple nonlinear operation, the data distribution can be modeled as Gaussian and therefore expressed using sufficient statistics. This yields a simple Gaussian prior, which we use to regularize the projection of images into the latent space. The resulting projections lie in smoother and better behaved regions of the latent space, as shown using interpolation performance for both real and generated images. Furthermore, the Gaussian model of the distribution in latent space allows us to investigate the origins of artifacts in the generator output, and provides a method for reducing these artifacts while maintaining diversity of the generated images.
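The idea of a Gaussian prior expressed through sufficient statistics can be illustrated with a minimal NumPy sketch. The abstract does not name the nonlinear operation, so the leaky-ReLU-style map with slope 5.0 used here is an assumption for illustration only, as are the function names, the 8-dimensional latent size, and the synthetic samples that stand in for real StyleGAN latent vectors. The sketch fits a mean and covariance to the transformed samples and scores a latent vector by its squared Mahalanobis distance, which is one common way to turn a Gaussian model into a regularization term.

```python
import numpy as np


def gaussianize(w, slope=5.0):
    """Hypothetical pointwise nonlinearity (assumed, not given in the abstract):
    scale negative entries by `slope`, leave non-negative entries unchanged."""
    return np.where(w >= 0, w, slope * w)


def fit_gaussian(v):
    """Sufficient statistics of a Gaussian: sample mean and covariance."""
    mean = v.mean(axis=0)
    cov = np.cov(v, rowvar=False)
    return mean, cov


def prior_penalty(v, mean, cov_inv):
    """Squared Mahalanobis distance of v from the fitted Gaussian."""
    d = v - mean
    return np.einsum("...i,ij,...j->...", d, cov_inv, d)


rng = np.random.default_rng(0)

# Synthetic stand-in for latent samples (assumption): Gaussian noise passed
# through a leaky ReLU with slope 0.2, which the map above roughly undoes.
z = rng.normal(size=(10000, 8))
w_samples = np.where(z >= 0, z, 0.2 * z)

v = gaussianize(w_samples)
mean, cov = fit_gaussian(v)
cov_inv = np.linalg.inv(cov)

# The penalty is zero at the mean and grows for vectors far from it.
penalty_at_mean = prior_penalty(mean, mean, cov_inv)
penalty_far = prior_penalty(mean + 3.0, mean, cov_inv)
```

In a projection setting, a term like `prior_penalty` would typically be added, with some weight, to the image reconstruction loss so that the optimized latent vector stays in a high-density region of the fitted distribution.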
