Diffusion-GAN: Training GANs with Diffusion (Paper Notes)
Diffusion-GAN: training GANs jointly with a diffusion process
Paper: https://arxiv.org/abs/2206.02262
Code: https://github.com/Zhendong-Wang/Diffusion-GAN (official PyTorch implementation)
Read from left to right, the first row shows the forward diffusion process: the real image is progressively diffused with noise. Read from right to left, the third row shows noise gradually turning back into the fake image (equivalently, left to right it is the forward diffusion of the generated fake image). The second row is the discriminator D, which discriminates at every timestep.
Figure 1: Flowchart for Diffusion-GAN. The top-row images represent the forward diffusion process of a real image, while the bottom-row images represent the forward diffusion process of a generated fake image. The discriminator learns to distinguish a diffused real image from a diffused fake image at all diffusion steps.
The overall pipeline is depicted in Figure 1. In Diffusion-GAN, the input to the diffusion process is either a real or a generated image, and the diffusion process consists of a series of steps that gradually add noise to the image. The number of diffusion steps is not fixed, but depends on the data and the generator. We also design the diffusion process to be differentiable, which means that we can compute the derivative of the output with respect to the input. This allows us to propagate the gradient from the discriminator to the generator through the diffusion process, and update the generator accordingly. Unlike vanilla GANs, which compare the real and generated images directly, Diffusion-GAN compares the noisy versions of them, which are obtained by sampling from the Gaussian mixture distribution over the diffusion steps, with the help of our timestep-dependent discriminator. This distribution has the property that its components have different noise-to-data ratios, which means that some components add more noise than others. By sampling from this distribution, we can achieve two benefits: first, we can stabilize the training by easing the problem of vanishing gradient, which occurs when the data and generator distributions are too different; second, we can augment the data by creating different noisy versions of the same image, which can improve the data efficiency and the diversity of the generator. We provide a theoretical analysis to support our method, and show that the min-max objective function of Diffusion-GAN, which measures the difference between the data and generator distributions, is continuous and differentiable everywhere. This means that the generator in theory can always receive a useful gradient from the discriminator, and improve its performance. [In other words, G can always receive useful gradients from D, which improves G's performance.]
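To make this concrete, below is a minimal PyTorch sketch of the diffusion-based augmentation described above. The linear beta schedule, the uniform timestep sampling, and the helper names (`diffuse`, `sample_t`) are illustrative assumptions, not the paper's exact configuration (the paper samples t from an adjustable prior and conditions the discriminator on t).

```python
import torch

# Illustrative linear beta schedule (an assumption, not the paper's exact choice).
T_MAX = 1000
betas = torch.linspace(1e-4, 2e-2, T_MAX)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)  # \bar{alpha}_t = prod_{s<=t} (1 - beta_s)

def sample_t(batch_size, T):
    """Stand-in for the paper's timestep prior: uniform over {0, ..., T-1}."""
    return torch.randint(0, T, (batch_size,))

def diffuse(x, t):
    """Differentiable forward diffusion:
    y = sqrt(abar_t) * x + sqrt(1 - abar_t) * eps, with eps ~ N(0, I).

    The mapping is differentiable in x, so generator gradients flow through it.
    """
    abar = alpha_bars[t].view(-1, *([1] * (x.dim() - 1)))  # broadcast over x's shape
    eps = torch.randn_like(x)
    return abar.sqrt() * x + (1.0 - abar).sqrt() * eps
```

The timestep-dependent discriminator then receives (y, t) pairs built this way from both real and generated images.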
Main contributions:
1) We show both theoretically and empirically how the diffusion process can be utilized to provide a model- and domain-agnostic differentiable augmentation, enabling data-efficient and leaking-free stable GAN training. [This stabilizes GAN training.]
2) Extensive experiments show that Diffusion-GAN boosts the stability and generation performance of strong baselines, including StyleGAN2, Projected GAN, and InsGen, achieving state-of-the-art results in synthesizing photo-realistic images, as measured by both the Fréchet Inception Distance (FID) and Recall score. [Adding diffusion improves frameworks built purely from GANs, such as StyleGAN2 and Projected GAN.]
Figure 2: The toy example inherited from Arjovsky et al. [2017]. The first row plots the distributions of the data with diffusion noise injected at increasing timesteps t. The second row shows the JS divergence and the optimal discriminator value with and without our noise injection.
Figure 4: Plot of adaptively adjusted maximum diffusion steps T and discriminator outputs of Diffusion-GANs.
To investigate how the adaptive diffusion process works during training, we illustrate in Figure 4 the convergence of the maximum timestep T in our adaptive diffusion and the discriminator outputs.
We see that T is adaptively adjusted: the T for Diffusion StyleGAN2 increases as training proceeds, while the T for Diffusion ProjectedGAN first goes up and then goes down. Note that T is adjusted according to the overfitting status of the discriminator. The second panel shows that, trained with the diffusion-based mixture distribution, the discriminator is always well-behaved and provides useful learning signals for the generator, which validates our analysis in Section 3.4 and Theorem 1.
As shown in the left panel of Figure 4, the maximum diffusion timestep T adapts as training progresses (T is adjusted according to how much the discriminator D overfits).
As shown in the right panel, the discriminator trained on the diffusion-based mixture distribution remains well-behaved throughout and keeps providing useful learning signals to the generator G.
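The self-paced adjustment of T can be written as a small update rule. This sketch assumes the ADA-style overfitting heuristic r_d (the mean sign of the discriminator's centered outputs on diffused real samples, with D assumed to output probabilities in (0, 1)); `d_target`, `C`, and the clipping bounds are hypothetical hyperparameters.

```python
import torch

def update_T(T, d_real_outputs, d_target=0.6, C=16, T_min=8, T_max=1000):
    """Adjust the maximum diffusion step T from discriminator overfitting.

    r_d in [-1, 1] rises when D grows confident on (diffused) real samples,
    i.e. when it starts to overfit; in that case more noise is injected by
    raising T, otherwise T is lowered.
    """
    r_d = torch.sign(d_real_outputs - 0.5).mean().item()
    T = T + C if r_d > d_target else T - C
    return max(T_min, min(T, T_max))
```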
Effectiveness of Diffusion-GAN for domain-agnostic augmentation
25-Gaussians Example.
We conduct experiments on the popular 25-Gaussians generation task. The 25-Gaussians dataset is a 2-D toy dataset generated from a mixture of 25 two-dimensional Gaussian distributions; each data point is a 2-dimensional feature vector. We train a small GAN model whose generator and discriminator are both parameterized by multilayer perceptrons (MLPs), with two 128-unit hidden layers and LeakyReLU nonlinearities.
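A sketch of this setup follows. The 5x5 grid of mode centers, the per-mode standard deviation, and the 2-D latent dimension are assumptions (the text only specifies the MLP sizes and nonlinearity), and the timestep conditioning of Diffusion-GAN's discriminator is omitted for brevity.

```python
import torch
import torch.nn as nn

def sample_25_gaussians(n, std=0.05):
    """Mixture of 25 2-D Gaussians with means on a 5x5 grid."""
    centers = torch.tensor([(x, y) for x in range(-2, 3) for y in range(-2, 3)],
                           dtype=torch.float32)
    idx = torch.randint(0, 25, (n,))
    return centers[idx] + std * torch.randn(n, 2)

def mlp(in_dim, out_dim):
    """Two 128-unit hidden layers with LeakyReLU, as described in the text."""
    return nn.Sequential(
        nn.Linear(in_dim, 128), nn.LeakyReLU(0.2),
        nn.Linear(128, 128), nn.LeakyReLU(0.2),
        nn.Linear(128, out_dim),
    )

G = mlp(in_dim=2, out_dim=2)  # 2-D noise -> 2-D sample
D = mlp(in_dim=2, out_dim=1)  # real/fake score
```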
Figure 5: The 25-Gaussians example. We show the true data samples, the generated samples from vanilla GANs, the discriminator outputs of the vanilla GANs, the generated samples from our Diffusion-GAN, and the discriminator outputs of Diffusion-GAN.
(1) The ground-truth data distribution: samples spread evenly over all 25 Gaussian modes.
(2) The vanilla GAN suffers from mode collapse, generating data on only a few of the modes.
(3) The vanilla GAN's discriminator outputs on real and fake samples quickly separate from each other. This indicates severe discriminator overfitting: the discriminator stops providing useful learning signals to the generator.
(4) Diffusion-GAN's samples spread evenly over all 25 modes, meaning it has learned to sample from every mode.
(5) Diffusion-GAN's discriminator outputs: D continuously provides useful learning signals to G.
We interpret this improvement from two perspectives: first, the non-leaking augmentation helps provide more information about the data space; second, with the adaptively adjusted diffusion-based noise injection, the discriminator remains well-behaved.
On differentiable augmentation
As Diffusion-GAN transforms both the data and generated samples before sending them to the discriminator, we can also relate it to the differentiable augmentation proposed for data-efficient GAN training. Karras et al. [2020a] introduce a stochastic augmentation pipeline with 18 transformations and develop an adaptive mechanism for controlling the augmentation probability. Zhao et al. [2020] propose to use Color + Translation + Cutout as differentiable augmentations for both generated and real images.
While providing good empirical results on some datasets, these augmentation methods are developed with domain-specific knowledge and carry the risk of leaking augmentations into generation [Karras et al., 2020a]. As observed in our experiments, they sometimes worsen the results when applied to a new dataset, likely because the risk of augmentation leakage overpowers the benefits of enlarging the training set, which can happen especially when the training set is already sufficiently large. (When the dataset is large enough, the negative effects of augmentation can outweigh the positive ones.)
By contrast, Diffusion-GAN uses a differentiable forward diffusion process to stochastically transform the data and can be considered both a domain-agnostic and a model-agnostic augmentation method. In other words, Diffusion-GAN can be applied to non-image data or even latent features, for which appropriate data augmentation is difficult to define, and can easily be plugged into an existing GAN to improve its generation performance. Moreover, we prove in theory and show in experiments that augmentation leakage is not a concern for Diffusion-GAN. Tran et al. [2021] provide a theoretical analysis for deterministic non-leaking transformations with differentiable and invertible mapping functions. Bora et al. [2018] show theorems similar to ours for specific stochastic transformations, such as Gaussian Projection, Convolve+Noise, and stochastic Block-Pixels, while our Theorem 2 covers more possibilities, as discussed in Appendix B.
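As an illustration of how little the training loop changes when Diffusion-GAN is "plugged into an existing GAN", here is a sketch of one update step, reusing the hypothetical `diffuse` and `sample_t` helpers from the first sketch. The non-saturating logistic loss on raw discriminator logits is a common choice assumed here (and D is again shown without its timestep conditioning); it is not necessarily the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def diffusion_gan_step(G, D, x_real, opt_g, opt_d, T, z_dim=2):
    """One update step; the only change vs. vanilla GAN training is `diffuse`."""
    n = x_real.size(0)

    # Discriminator: distinguish diffused real from diffused fake samples.
    z = torch.randn(n, z_dim)
    x_fake = G(z).detach()
    t = sample_t(n, T)
    d_loss = (F.softplus(-D(diffuse(x_real, t))) +       # push real logits up
              F.softplus(D(diffuse(x_fake, t)))).mean()  # push fake logits down
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: the gradient flows back through the differentiable diffusion.
    z = torch.randn(n, z_dim)
    t = sample_t(n, T)
    g_loss = F.softplus(-D(diffuse(G(z), t))).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```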
Summary
Diffusion-GAN uses an adaptive, differentiable forward diffusion process as a domain- and model-agnostic, non-leaking augmentation: the discriminator compares diffused real and fake samples at every timestep, which keeps its learning signal useful, stabilizes training, and improves strong baselines such as StyleGAN2 and Projected GAN on FID and Recall.