Shichang Tang
Paper download is intended for registered attendees only and is
subject to the IEEE Copyright Policy. Any other use is strictly forbidden.
Papers from this author
Disentangled Representation Learning for Controllable Image Synthesis: An Information-Theoretic Perspective
Shichang Tang, Xu Zhou, Xuming He, Yi Ma
Auto-TLDR; Controllable Image Synthesis in Deep Generative Models using Variational Auto-Encoder
In this paper, we study the problem of disentangled representation learning and controllable image synthesis in a deep generative model. We develop an encoder-decoder architecture for a variant of the Variational Auto-Encoder (VAE) with two latent codes $z_1$ and $z_2$. Our framework uses $z_2$ to capture specified factors of variation, while $z_1$ captures the complementary factors of variation. To this end, we analyze the learning problem from the perspective of multivariate mutual information, derive optimizable lower bounds on the conditional mutual information in the image synthesis process, and incorporate them into the training objective. We validate our method empirically on the Color MNIST and CelebA datasets by demonstrating controllable image synthesis. Our proposed paradigm is simple yet effective and applies to many situations, including those where no explicit factorization of features is available or where the features are non-categorical.
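
As an illustration of what an "optimizable lower bound" on a conditional mutual information can look like, the following is the standard variational (Barber-Agakov style) bound, written for an image $x$ and the two codes $z_1$ and $z_2$. The abstract does not state which conditional mutual information terms are bounded or how, so the choice of $I(x; z_2 \mid z_1)$ and the auxiliary distribution $q(z_2 \mid x, z_1)$ are assumptions made for this sketch and not necessarily the paper's formulation.

$$
\begin{aligned}
I(x; z_2 \mid z_1)
&= H(z_2 \mid z_1) - H(z_2 \mid x, z_1) \\
&= H(z_2 \mid z_1) + \mathbb{E}_{p(x, z_1, z_2)}\big[\log p(z_2 \mid x, z_1)\big] \\
&= H(z_2 \mid z_1) + \mathbb{E}_{p(x, z_1, z_2)}\big[\log q(z_2 \mid x, z_1)\big]
 + \mathbb{E}_{p(x, z_1)}\Big[\mathrm{KL}\big(p(z_2 \mid x, z_1) \,\|\, q(z_2 \mid x, z_1)\big)\Big] \\
&\ge H(z_2 \mid z_1) + \mathbb{E}_{p(x, z_1, z_2)}\big[\log q(z_2 \mid x, z_1)\big].
\end{aligned}
$$

The inequality holds because the KL divergence is non-negative, and it is tight when $q(z_2 \mid x, z_1) = p(z_2 \mid x, z_1)$. If $q$ is parameterized by a network, the expectation on the right can be estimated from samples and maximized by gradient ascent, which is what makes a bound of this kind usable as a term in a training objective.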