Paper title

When do GANs replicate? On the choice of dataset size

Authors

Qianli Feng, Chenqi Guo, Fabian Benitez-Quiroz, Aleix Martinez

Abstract

Do GANs replicate training images? Previous studies have shown that GANs do not seem to replicate training data without significant changes to the training procedure. This has led to a line of research on the exact conditions needed for GANs to overfit to the training data. Although a number of factors have been identified theoretically or empirically, the effect of dataset size and complexity on GAN replication is still unknown. With empirical evidence from BigGAN and StyleGAN2 on the CelebA, Flower, and LSUN-bedroom datasets, we show that dataset size and complexity play an important role in GAN replication and in the perceptual quality of the generated images. We further quantify this relationship, finding that the replication percentage decays exponentially with respect to dataset size and complexity, with a shared decay factor across GAN-dataset combinations. Meanwhile, perceptual image quality follows a U-shaped trend with respect to dataset size. This finding leads to a practical tool for one-shot estimation of the minimal dataset size needed to prevent GAN replication, which can be used to guide dataset construction and selection.
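The one-shot estimation idea in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that replication percentage follows p(N) = a * exp(-k * N) in the dataset size N, with a decay rate k that is already known and shared across GAN-dataset combinations. The exact functional form, the treatment of dataset complexity, and the fitted constants are not given in the abstract, so the function name, parameters, and numbers below are hypothetical.

```python
import math


def min_dataset_size(n_observed, replication_observed, decay_rate, target_replication):
    """Estimate the dataset size at which replication drops to a target level.

    Assumed model (not taken from the paper): p(N) = a * exp(-decay_rate * N).
    A single measurement (n_observed, replication_observed) fixes `a`, so
    solving p(N) = target_replication gives
        N = n_observed + ln(replication_observed / target_replication) / decay_rate.
    Percentages are expressed on a 0-100 scale.
    """
    if decay_rate <= 0:
        raise ValueError("decay_rate must be positive")
    if not (0 < replication_observed <= 100 and 0 < target_replication <= 100):
        raise ValueError("percentages must lie in (0, 100]")
    # Replication is already at or below the target: no extra data is needed.
    if replication_observed <= target_replication:
        return float(n_observed)
    extra = math.log(replication_observed / target_replication) / decay_rate
    return n_observed + extra


# Hypothetical numbers: replication halves for every 1,000 additional images.
shared_decay = math.log(2) / 1000
estimate = min_dataset_size(1000, 50.0, shared_decay, 12.5)
print(f"Estimated minimal dataset size: {estimate:.0f} images")
```

Under this assumed model, one replication measurement at a single dataset size is enough to extrapolate, which is what makes the estimate "one-shot". A real application would take the decay factor and any complexity term from the paper's fitted values rather than from the placeholder used here.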
