Paper title
Boomerang: Local sampling on image manifolds using diffusion models
Paper authors
Paper abstract
The inference stage of diffusion models can be seen as running a reverse-time diffusion stochastic differential equation, where samples from a Gaussian latent distribution are transformed into samples from a target distribution that usually resides on a low-dimensional manifold, e.g., an image manifold. The intermediate values between the initial latent space and the image manifold can be interpreted as noisy images, with the amount of noise determined by the noise schedule of the forward diffusion process. We use this interpretation to present Boomerang, an approach for local sampling of image manifolds. As its name implies, Boomerang local sampling involves adding noise to an input image, which moves it closer to the latent space, and then mapping it back to the image manifold through a partial reverse diffusion process. Thus, Boomerang generates images on the manifold that are "similar," but not identical, to the original input image. We can control the proximity of the generated images to the original by adjusting the amount of noise added. Furthermore, because the reverse diffusion process in Boomerang is stochastic, repeated runs on the same input yield different outputs, allowing us to obtain local samples from the manifold without encountering any duplicates. Boomerang works with any pretrained diffusion model, such as Stable Diffusion, without requiring any adjustments to the reverse diffusion process. We present three applications for Boomerang. First, we provide a framework for constructing privacy-preserving datasets with controllable degrees of anonymity. Second, we show that using Boomerang for data augmentation increases generalization performance and outperforms state-of-the-art synthetic data augmentation. Lastly, we introduce a perceptual image enhancement framework, which enables resolution enhancement.
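The two-step procedure described in the abstract (partial forward noising, then partial reverse diffusion) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a DDPM-style discrete noise schedule and ancestral sampler, and it replaces the pretrained network with a toy analytic noise predictor that is exact only when the data distribution is an isotropic Gaussian. All names (`make_schedule`, `make_gaussian_eps_model`, `boomerang`, `t_boom`) are hypothetical.

```python
import numpy as np


def make_schedule(num_steps=1000, beta_min=1e-4, beta_max=0.02):
    """Linear DDPM-style noise schedule (an assumption, not from the paper)."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def make_gaussian_eps_model(mu, s, alpha_bars):
    """Toy stand-in for a pretrained denoiser.

    If clean data follows N(mu, s^2 I), the expected noise given x_t has a
    closed form, so no neural network is needed for this illustration.
    """
    def eps_model(x_t, t):
        ab = alpha_bars[t]
        var_t = ab * s**2 + (1.0 - ab)  # marginal variance of x_t
        return np.sqrt(1.0 - ab) * (x_t - np.sqrt(ab) * mu) / var_t
    return eps_model


def boomerang(x0, t_boom, eps_model, schedule, rng):
    """Noise x0 up to step t_boom, then run the reverse process back to step 0."""
    betas, alphas, alpha_bars = schedule
    if not 0 <= t_boom <= len(betas):
        raise ValueError("t_boom must lie in [0, num_steps]")
    if t_boom == 0:
        return x0.copy()  # no noise added, so the input is returned unchanged

    # Forward step: jump directly to noise level t_boom in closed form.
    ab = alpha_bars[t_boom - 1]
    x = np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * rng.standard_normal(x0.shape)

    # Partial reverse diffusion: DDPM ancestral sampling from t_boom down to 0.
    for t in range(t_boom - 1, -1, -1):
        eps = eps_model(x, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


# Usage: a 4096-dimensional "image" drawn from the toy data distribution.
rng = np.random.default_rng(0)
schedule = make_schedule()
mu = np.full(4096, 0.5)
model = make_gaussian_eps_model(mu, 0.1, schedule[2])
x0 = mu + 0.1 * rng.standard_normal(4096)
near = boomerang(x0, 5, model, schedule, rng)    # little noise: stays close to x0
far = boomerang(x0, 800, model, schedule, rng)   # much noise: drifts far from x0
```

In this sketch, `t_boom` plays the role of the "amount of noise added": a small value keeps the output near the input, while a large value approaches unconditional sampling. With a real pretrained model, `eps_model` would be the network's noise prediction; the surrounding loop would stay structurally the same under the DDPM assumption made here.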