ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Future Urban Scenes Generation through Vehicles Synthesis

Alessandro Simoni, Luca Bergamini, Andrea Palazzi, Simone Calderara, Rita Cucchiara

Auto-TLDR; Predicting the Future of an Urban Scene with a Novel View Synthesis Paradigm

In this work we propose a deep learning pipeline to predict the future visual appearance of an urban scene. Despite recent advances, generating the entire scene in an end-to-end fashion is still far from being achieved. Instead, we follow a two-stage approach, where interpretable information is included in the loop and each actor is modelled independently. We leverage a per-object novel view synthesis paradigm, i.e., generating a synthetic representation of an object undergoing a geometrical roto-translation in 3D space. Our model can be easily conditioned on constraints (e.g., input trajectories) provided by state-of-the-art tracking methods or by the user. This allows us to generate a set of diverse, realistic futures starting from the same input in a multi-modal fashion. We show visually and quantitatively the superiority of this approach over traditional end-to-end scene-generation methods on CityFlow, a challenging real-world dataset.

Similar papers

Shape Consistent 2D Keypoint Estimation under Domain Shift

Levi Vasconcelos, Massimiliano Mancini, Davide Boscaini, Barbara Caputo, Elisa Ricci

Auto-TLDR; Deep Adaptation for Keypoint Prediction under Domain Shift

Dual-MTGAN: Stochastic and Deterministic Motion Transfer for Image-To-Video Synthesis

Semantic-Guided Inpainting Network for Complex Urban Scenes Manipulation

Learning to Take Directions One Step at a Time

Multiple Future Prediction Leveraging Synthetic Trajectories

GarmentGAN: Photo-Realistic Adversarial Fashion Transfer

The Role of Cycle Consistency for Generating Better Human Action Videos from a Single Frame

DeepBEV: A Conditional Adversarial Network for Bird’s Eye View Generation

Local Facial Attribute Transfer through Inpainting

Let's Play Music: Audio-Driven Performance Video Generation

SECI-GAN: Semantic and Edge Completion for Dynamic Objects Removal

VITON-GT: An Image-Based Virtual Try-On Model with Geometric Transformations

Mutual Information Based Method for Unsupervised Disentanglement of Video Representation

Reducing the Variance of Variational Estimates of Mutual Information by Limiting the Critic's Hypothesis Space to RKHS

Motion-Supervised Co-Part Segmentation

Boundary Guided Image Translation for Pose Estimation from Ultra-Low Resolution Thermal Sensor

AG-GAN: An Attentive Group-Aware GAN for Pedestrian Trajectory Prediction

Learning Disentangled Representations for Identity Preserving Surveillance Face Camouflage

Free-Form Image Inpainting Via Contrastive Attention Network

UCCTGAN: Unsupervised Clothing Color Transformation Generative Adversarial Network

A Grid-Based Representation for Human Action Recognition

What and How? Jointly Forecasting Human Action and Pose

Transformer Networks for Trajectory Forecasting

Residual Learning of Video Frame Interpolation Using Convolutional LSTM

Extending Single Beam Lidar to Full Resolution by Fusing with Single Image Depth Estimation

Talking Face Generation Via Learning Semantic and Temporal Synchronous Landmarks

Light3DPose: Real-Time Multi-Person 3D Pose Estimation from Multiple Views

Unsupervised Face Manipulation Via Hallucination

Anomaly Detection, Localization and Classification for Railway Inspection

Novel View Synthesis from a 6-DoF Pose by Two-Stage Networks

Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution

Cycle-Consistent Adversarial Networks and Fast Adaptive Bi-Dimensional Empirical Mode Decomposition for Style Transfer

Cascade Attention Guided Residue Learning GAN for Cross-Modal Translation

Deep Photo Relighting by Integrating Both 2D and 3D Lighting Information

Deep Realistic Novel View Generation for City-Scale Aerial Images

Self-Supervised Learning of Dynamic Representations for Static Images

A GAN-Based Blind Inpainting Method for Masonry Wall Images

JUMPS: Joints Upsampling Method for Pose Sequences

DAG-Net: Double Attentive Graph Neural Network for Trajectory Forecasting

Towards Efficient 3D Point Cloud Scene Completion Via Novel Depth View Synthesis

Quantifying the Use of Domain Randomization

Disentangle, Assemble, and Synthesize: Unsupervised Learning to Disentangle Appearance and Location

Robust Pedestrian Detection in Thermal Imagery Using Synthesized Images

An Unsupervised Approach towards Varying Human Skin Tone Using Generative Adversarial Networks

A Quantitative Evaluation Framework of Video De-Identification Methods

Unsupervised 3D Human Pose Estimation in Multi-view-multi-pose Video

AerialMPTNet: Multi-Pedestrian Tracking in Aerial Imagery Using Temporal and Graphical Features

Effective Deployment of CNNs for 3DoF Pose Estimation and Grasping in Industrial Settings