ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Disentangle, Assemble, and Synthesize: Unsupervised Learning to Disentangle Appearance and Location

Hiroaki Aizawa, Hirokatsu Kataoka, Yutaka Satoh, Kunihito Kato

Auto-TLDR; Generative Adversarial Networks with Structural Constraint for controllability of latent space


The next step for generative adversarial networks (GANs) is to learn representations that allow us to explicitly control a single factor in the image. Because such a representation is independent of the other factors, the resulting controllability leads to interpretability, by identifying the variation in the synthesized image, and to transferability to downstream tasks through inference. However, latent factors are difficult to identify and strictly define, so annotation is laborious. Moreover, learning such representations with a GAN is challenging because of the complex generation process. We therefore address these limitations with a novel generative model that disentangles the latent space into the appearance, the x-axis location, and the y-axis location of the object, and reassembles these components in an unsupervised manner. Specifically, based on the concept of packing the appearance and location into each position of the feature map, we introduce a novel structural constraint technique that prevents these representations from interacting with each other. The proposed structural constraint promotes the disentanglement of these factors. In experiments, we found that the proposed method is simple but effective for controllability, and that it allows us to control the appearance and location via the latent space without supervision, as compared with the conditional GAN.
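The abstract does not specify the architecture, so the following is only an illustrative sketch of what "packing the appearance and location in each position of the feature map" could look like, not the authors' method. It assumes three separate latent parts: a hypothetical appearance vector and two hypothetical logit vectors for the x-axis and y-axis. All names (`softmax`, `assemble`, `peak`) and the outer-product assembly are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def assemble(appearance, x_logits, y_logits):
    """Build a C x H x W feature map from three separate latent parts.

    Assumption: location is a separable distribution over columns (x) and
    rows (y), and appearance is a per-channel vector placed at that location.
    Appearance never touches the location weights, and vice versa.
    """
    px = softmax(x_logits)  # weights over W columns
    py = softmax(y_logits)  # weights over H rows
    return [
        [[a * py[i] * px[j] for j in range(len(px))] for i in range(len(py))]
        for a in appearance
    ]


def peak(channel):
    """Return (row, col) of the largest-magnitude entry in one channel."""
    best = max(
        ((abs(v), i, j) for i, row in enumerate(channel) for j, v in enumerate(row)),
        key=lambda t: t[0],
    )
    return best[1], best[2]


appearance = [1.0, -0.5]          # hypothetical 2-channel appearance code
x_logits = [0.0, 0.0, 8.0, 0.0]   # favors column 2
y_logits = [8.0, 0.0, 0.0]        # favors row 0

fmap = assemble(appearance, x_logits, y_logits)
moved = assemble(appearance, [8.0, 0.0, 0.0, 0.0], y_logits)  # change x only

print(peak(fmap[0]), peak(moved[0]))
```

In this toy setup, changing only the x-axis logits moves the peak to a different column while the row and the total per-channel activation stay the same, which is the kind of independent control the abstract describes. A real generator would learn these parts and decode the feature map into an image.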

Similar papers

Mask-Based Style-Controlled Image Synthesis Using a Mask Style Encoder

Jaehyeong Cho, Wataru Shimoda, Keiji Yanai

Auto-TLDR; Style-controlled Image Synthesis from Semantic Segmentation masks using GANs

Local Facial Attribute Transfer through Inpainting

Semantics-Guided Representation Learning with Applications to Visual Synthesis

Disentangled Representation Learning for Controllable Image Synthesis: An Information-Theoretic Perspective

Dual-MTGAN: Stochastic and Deterministic Motion Transfer for Image-To-Video Synthesis

Phase Retrieval Using Conditional Generative Adversarial Networks

GarmentGAN: Photo-Realistic Adversarial Fashion Transfer

Learning Interpretable Representation for 3D Point Clouds

Mutual Information Based Method for Unsupervised Disentanglement of Video Representation

Reducing the Variance of Variational Estimates of Mutual Information by Limiting the Critic's Hypothesis Space to RKHS

Interpreting the Latent Space of GANs Via Correlation Analysis for Controllable Concept Manipulation

Learning Low-Shot Generative Networks for Cross-Domain Data

Attributes Aware Face Generation with Generative Adversarial Networks

GAN-Based Gaussian Mixture Model Responsibility Learning

On the Evaluation of Generative Adversarial Networks by Discriminative Models

Augmented Cyclic Consistency Regularization for Unpaired Image-To-Image Translation

High Resolution Face Age Editing

Ω-GAN: Object Manifold Embedding GAN for Image Generation by Disentangling Parameters into Pose and Shape Manifolds

Semantic-Guided Inpainting Network for Complex Urban Scenes Manipulation

Future Urban Scenes Generation through Vehicles Synthesis

Unsupervised Face Manipulation Via Hallucination

Continuous Learning of Face Attribute Synthesis

Cascade Attention Guided Residue Learning GAN for Cross-Modal Translation

Learning Disentangled Representations for Identity Preserving Surveillance Face Camouflage

Controllable Face Aging

AVAE: Adversarial Variational Auto Encoder

The Role of Cycle Consistency for Generating Better Human Action Videos from a Single Frame

Learning to Take Directions One Step at a Time

Unsupervised Contrastive Photo-To-Caricature Translation Based on Auto-Distortion

Variational Capsule Encoder

JUMPS: Joints Upsampling Method for Pose Sequences

Adversarial Knowledge Distillation for a Compact Generator

Shape Consistent 2D Keypoint Estimation under Domain Shift

GAP: Quantifying the Generative Adversarial Set and Class Feature Applicability of Deep Neural Networks

Semi-Supervised Outdoor Image Generation Conditioned on Weather Signals

CardioGAN: An Attention-Based Generative Adversarial Network for Generation of Electrocardiograms

Coherence and Identity Learning for Arbitrary-Length Face Video Generation

Galaxy Image Translation with Semi-Supervised Noise-Reconstructed Generative Adversarial Networks

S2I-Bird: Sound-To-Image Generation of Bird Species Using Generative Adversarial Networks

Image Representation Learning by Transformation Regression

Multi-Domain Image-To-Image Translation with Adaptive Inference Graph

Pixel-based Facial Expression Synthesis

Exemplar Guided Cross-Spectral Face Hallucination Via Mutual Information Disentanglement

Boundary Guided Image Translation for Pose Estimation from Ultra-Low Resolution Thermal Sensor

Quantifying the Use of Domain Randomization

Combining GANs and AutoEncoders for Efficient Anomaly Detection

Detail Fusion GAN: High-Quality Translation for Unpaired Images with GAN-Based Data Augmentation

IDA-GAN: A Novel Imbalanced Data Augmentation GAN