ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Learning to Implicitly Represent 3D Human Body from Multi-Scale Features and Multi-View Images

Zhongguo Li, Magnus Oskarsson, Anders Heyden

Auto-TLDR; Reconstruction of 3D human bodies from multi-view images using multi-stage end-to-end neural networks

Abstract

Reconstructing 3D human bodies from images is challenging because it is generally an ill-posed problem. In this paper we present a method to reconstruct 3D human bodies from multi-view images by learning an implicit function that represents 3D shape, based on multi-scale features extracted by multi-stage end-to-end neural networks. Our model consists of several end-to-end hourglass networks that extract multi-scale features from multi-view images, and a fully connected network that classifies these features to evaluate the implicit function. Given a 3D point, it is projected onto the multi-view images, and these images are fed into our model to extract multi-scale features. The scale of the features extracted by the hourglass networks decreases with the depth of our model, so the features represent information from the local to the global scale. The multi-scale features and the depth of the 3D point are then combined into a new feature vector, which the fully connected network classifies to predict whether the point lies inside or outside the 3D mesh. The advantage of our method is that the fully connected network uses both local and global features and that the 3D mesh is represented by an implicit function, which is more memory-efficient. Experiments on public datasets demonstrate that our method surpasses previous approaches in the accuracy of 3D reconstruction of human bodies from images.
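The query pipeline described in the abstract (project a 3D point into each view, sample features at several scales, append the point's depth, classify inside/outside) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the random feature maps stand in for hourglass network outputs, the MLP weights are untrained, the pinhole cameras are made up, and averaging the per-view vectors is an assumed fusion rule that the abstract does not specify. All function names are hypothetical.

```python
import numpy as np

def project(point, K, R, t):
    """Pinhole projection: return pixel (u, v) and camera-space depth z."""
    cam = R @ point + t
    uv = (K @ cam)[:2] / cam[2]
    return uv, cam[2]

def sample_bilinear(fmap, uv, image_size):
    """Bilinearly sample a (C, H, W) feature map at an image-pixel location."""
    _, H, W = fmap.shape
    # Rescale image-pixel coordinates to this feature map's resolution.
    x = np.clip(uv[0] * (W - 1) / (image_size - 1), 0, W - 1)
    y = np.clip(uv[1] * (H - 1) / (image_size - 1), 0, H - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * fmap[:, y0, x0] + wx * fmap[:, y0, x1]
    bottom = (1 - wx) * fmap[:, y1, x0] + wx * fmap[:, y1, x1]
    return (1 - wy) * top + wy * bottom

def point_feature(point, cameras, pyramids, image_size):
    """Concatenate multi-scale features and depth per view, then fuse views."""
    per_view = []
    for (K, R, t), pyramid in zip(cameras, pyramids):
        uv, depth = project(point, K, R, t)
        feats = [sample_bilinear(fmap, uv, image_size) for fmap in pyramid]
        per_view.append(np.concatenate(feats + [np.array([depth])]))
    # ASSUMPTION: views are fused by a simple mean; the abstract does not say how.
    return np.mean(per_view, axis=0)

def mlp_occupancy(x, weights):
    """Fully connected classifier: ReLU hidden layers, sigmoid output."""
    for W, b in weights[:-1]:
        x = np.maximum(W @ x + b, 0.0)
    W, b = weights[-1]
    logit = (W @ x + b)[0]
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(0)
image_size, channels, scales = 64, 4, [64, 32, 16]

# Two hypothetical cameras: a front view and a view rotated 90 degrees about y.
K = np.array([[60.0, 0.0, 32.0], [0.0, 60.0, 32.0], [0.0, 0.0, 1.0]])
R_front = np.eye(3)
R_side = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])
t = np.array([0.0, 0.0, 3.0])
cameras = [(K, R_front, t), (K, R_side, t)]

# Stand-ins for per-view feature maps, ordered from fine (local) to coarse (global).
pyramids = [[rng.standard_normal((channels, s, s)) for s in scales] for _ in cameras]

# Untrained classifier weights, for shape illustration only.
in_dim = channels * len(scales) + 1
weights = [
    (0.1 * rng.standard_normal((16, in_dim)), np.zeros(16)),
    (0.1 * rng.standard_normal((1, 16)), np.zeros(1)),
]

point = np.array([0.1, -0.2, 0.3])
feature = point_feature(point, cameras, pyramids, image_size)
prob_inside = mlp_occupancy(feature, weights)
is_inside = prob_inside > 0.5  # threshold the occupancy probability
```

Because the surface is defined only through this per-point classifier, a mesh would be recovered by evaluating many query points on a grid and extracting the 0.5 level set, rather than by storing a dense voxel volume inside the network.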

Similar papers

Orthographic Projection Linear Regression for Single Image 3D Human Pose Estimation

Yahui Zhang, Shaodi You, Theo Gevers

Auto-TLDR; A Deep Neural Network for 3D Human Pose Estimation from a Single 2D Image in the Wild

Hybrid Approach for 3D Head Reconstruction: Using Neural Networks and Visual Geometry

Silhouette Body Measurement Benchmarks

Towards Efficient 3D Point Cloud Scene Completion Via Novel Depth View Synthesis

Light3DPose: Real-Time Multi-Person 3D Pose Estimation from Multiple Views

DmifNet: 3D Shape Reconstruction Based on Dynamic Multi-Branch Information Fusion

PEAN: 3D Hand Pose Estimation Adversarial Network

RefiNet: 3D Human Pose Refinement with Depth Maps

A Multi-Task Neural Network for Action Recognition with 3D Key-Points

Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution

Better Prior Knowledge Improves Human-Pose-Based Extrinsic Camera Calibration

Unsupervised 3D Human Pose Estimation in Multi-view-multi-pose Video

Occlusion-Tolerant and Personalized 3D Human Pose Estimation in RGB Images

Learning Semantic Representations Via Joint 3D Face Reconstruction and Facial Attribute Estimation

Rotational Adjoint Methods for Learning-Free 3D Human Pose Estimation from IMU Data

On the Robustness of 3D Human Pose Estimation

Novel View Synthesis from a 6-DoF Pose by Two-Stage Networks

Multi-Scale Residual Pyramid Attention Network for Monocular Depth Estimation

HPERL: 3D Human Pose Estimation from RGB and LiDAR

Facetwise Mesh Refinement for Multi-View Stereo

Cross-Regional Attention Network for Point Cloud Completion

Partially Supervised Multi-Task Network for Single-View Dietary Assessment

VITON-GT: An Image-Based Virtual Try-On Model with Geometric Transformations

Joint Face Alignment and 3D Face Reconstruction with Efficient Convolution Neural Networks

Deep Space Probing for Point Cloud Analysis

EdgeNet: Semantic Scene Completion from a Single RGB-D Image

MixedFusion: 6D Object Pose Estimation from Decoupled RGB-Depth Features

PC-Net: A Deep Network for 3D Point Clouds Analysis

DeepPear: Deep Pose Estimation and Action Recognition

Deep Realistic Novel View Generation for City-Scale Aerial Images

Efficient High-Resolution High-Level-Semantic Representation Learning for Human Pose Estimation

Learning Non-Rigid Surface Reconstruction from Spatio-Temporal Image Patches

PointSpherical: Deep Shape Context for Point Cloud Learning in Spherical Coordinates

Extending Single Beam Lidar to Full Resolution by Fusing with Single Image Depth Estimation

Weakly Supervised Body Part Segmentation with Pose Based Part Priors

Learning Interpretable Representation for 3D Point Clouds

Ordinal Depth Classification Using Region-Based Self-Attention

Two-Stage Adaptive Object Scene Flow Using Hybrid CNN-CRF Model

Distinctive 3D Local Deep Descriptors

Multi-Attribute Regression Network for Face Reconstruction

StrongPose: Bottom-up and Strong Keypoint Heat Map Based Pose Estimation

Total Estimation from RGB Video: On-Line Camera Self-Calibration, Non-Rigid Shape and Motion

JUMPS: Joints Upsampling Method for Pose Sequences

NetCalib: A Novel Approach for LiDAR-Camera Auto-Calibration Based on Deep Learning

Joint Supervised and Self-Supervised Learning for 3D Real World Challenges

Real-Time Monocular Depth Estimation with Extremely Light-Weight Neural Network

End-To-End Hierarchical Relation Extraction for Generic Form Understanding

3D Semantic Labeling of Photogrammetry Meshes Based on Active Learning