Abstract
A large number of experimental results reveal that FRAME [1] tends to generate inferior synthesized images and often struggles to converge during training. The synthesized images of FRAME deteriorate seriously along with the model energy. This phenomenon is caused by KL-vanishing in the stepwise parameter estimation of the model, which arises from the large disparity in filter responses between the model distribution and the real data distribution. Moreover, in the discrete-time setting of the actual iterative training process, the dissipation of the model energy may become considerably unstable, and the stepwise minimization scheme may suffer from a serious KL-vanishing issue during the communicative parameter estimation.
To tackle the above shortcomings, we first investigate the model from a particle perspective by regarding all the observed signals as Brownian particles (a precondition of the KL discrete flow), which helps explain why the FRAME model collapses. We then examine the model in the discrete-time setting and translate its learning mechanism from the KL discrete flow into the Jordan-Kinderlehrer-Otto (JKO) discrete flow [2], a procedure for finding time-discrete approximations to solutions of diffusion equations in Wasserstein space.
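For orientation, the JKO discrete flow mentioned above can be written in its standard form from [2]. This is only a sketch of the general scheme, not the specific objective used for FRAME: the step size \tau, the potential \Psi, the inverse temperature \beta, and the free energy F are notation assumed here and do not appear in the text above.

```latex
% One JKO step: given the previous density \rho_k and a step size \tau > 0,
% the next density trades off closeness in Wasserstein distance against free energy.
\rho_{k+1} \in \operatorname*{arg\,min}_{\rho}
  \left\{ \frac{1}{2\tau}\, W_2^2(\rho, \rho_k) + F(\rho) \right\},
\qquad
F(\rho) = \int \Psi(x)\,\rho(x)\,dx + \beta^{-1} \int \rho(x)\log\rho(x)\,dx .

% As \tau \to 0, the interpolated iterates converge to the solution of the
% Fokker-Planck (diffusion) equation
\frac{\partial \rho}{\partial t}
  = \nabla\cdot\bigl(\rho\,\nabla\Psi\bigr) + \beta^{-1}\Delta\rho .
```

Here W_2 denotes the 2-Wasserstein distance, so each step is a proximal (implicit Euler) step of the free energy in Wasserstein space rather than in KL divergence.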
Model Collapse Identification
The existence of model collapse is confirmed on a subset of SUN [5].
Quantitative and Qualitative Experimental Results
On the large datasets CelebA [3] and LSUN-Bedroom [4]
As for the quantitative results on the three datasets above, the table reports the Inception scores of the compared generative models on CIFAR-10.
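As a reminder of what the reported metric measures, the Inception score is exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's class distribution for a generated image and p(y) is the average of those distributions over all generated images. The following is a minimal sketch under the assumption that the class probabilities have already been computed by a pretrained classifier. The function name `inception_score` and the `eps` smoothing constant are illustrative choices, not part of the original evaluation code.

```python
import math


def inception_score(probs, eps=1e-12):
    """Compute exp(mean_x KL(p(y|x) || p(y))) from class probabilities.

    probs: list of rows, one per generated image; each row is a list of
           class probabilities p(y|x) that sums to 1.
    eps:   small constant (an assumption of this sketch) to avoid log(0).
    """
    n = len(probs)
    num_classes = len(probs[0])

    # Marginal class distribution p(y): the average of p(y|x) over all images.
    marginal = [sum(row[c] for row in probs) / n for c in range(num_classes)]

    # Average the per-image KL divergence KL(p(y|x) || p(y)).
    total_kl = 0.0
    for row in probs:
        total_kl += sum(
            p * (math.log(p + eps) - math.log(q + eps))
            for p, q in zip(row, marginal)
        )

    return math.exp(total_kl / n)
```

If every image receives the same uniform prediction, the score is 1 (the minimum). If predictions are confident and spread evenly over C classes, the score approaches C, which is why higher values are read as better quality and diversity.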
References
[1] Zhu, S. C.; Wu, Y. N.; and Mumford, D. 1997. Minimax entropy principle and its application to texture modeling. Neural Computation 9(8):1627–1660.
[2] Jordan, R.; Kinderlehrer, D.; and Otto, F. 1998. The variational formulation of the Fokker–Planck equation. SIAM Journal on Mathematical Analysis 29(1):1–17.
[3] Liu, Z.; Luo, P.; Wang, X.; and Tang, X. 2015. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision, 3730–3738.
[4] Yu, F.; Zhang, Y.; Song, S.; Seff, A.; and Xiao, J. 2015. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365.
[5] Xiao, J.; Hays, J.; Ehinger, K. A.; Oliva, A.; and Torralba, A. 2010. SUN database: Large-scale scene recognition from abbey to zoo. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, 3485–3492. IEEE.