ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strongly forbidden.

LFIR2Pose: Pose Estimation from an Extremely Low-Resolution FIR Image Sequence

Saki Iwata, Yasutomo Kawanishi, Daisuke Deguchi, Ichiro Ide, Hiroshi Murase, Tomoyoshi Aizawa

Auto-TLDR; LFIR2Pose: Human Pose Estimation from a Low-Resolution Far-InfraRed Image Sequence

In this paper, we propose a method for human pose estimation from a Low-resolution Far-InfraRed (LFIR) image sequence captured by a 16 × 16 FIR sensor array. Estimating the human body pose from a single such LFIR image is a hard task. Annotating the images with human poses for training the estimation model is also difficult for humans. Thus, we propose the LFIR2Pose model, which accepts a sequence of LFIR images and outputs the human pose of the last frame, and we also propose an automatic annotation system for model training. Additionally, considering that the scale of human body motion differs largely among body parts, we propose a loss function that focuses on this difference. Through an experiment, we evaluated the human pose estimation accuracy on an original dataset and confirmed that human pose can be estimated accurately from an LFIR image sequence.

Similar papers

RefiNet: 3D Human Pose Refinement with Depth Maps

Andrea D'Eusanio, Stefano Pini, Guido Borghi, Roberto Vezzani, Rita Cucchiara

Auto-TLDR; RefiNet: A Multi-stage Framework for 3D Human Pose Estimation

Human Pose Estimation is a fundamental task for many applications in the Computer Vision community and it has been widely investigated in the 2D domain, i.e. intensity images. Therefore, most of the available methods for this task are based on 2D Convolutional Neural Networks and huge manually annotated RGB datasets, achieving stunning results. In this paper, we propose RefiNet, a multi-stage framework that regresses an extremely precise 3D human pose estimate from a given 2D pose and a depth map. The framework consists of three different modules, each one specialized in a particular refinement and data representation, i.e. depth patches, 3D skeleton and point clouds. Moreover, we collect a new dataset, namely Baracca, acquired with RGB, depth and thermal cameras and specifically created for the automotive context. Experimental results confirm the quality of the refinement procedure, which largely improves the human pose estimations of off-the-shelf 2D methods.

Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution

Renshu Gu, Gaoang Wang, Jenq-Neng Hwang

Auto-TLDR; 3D Human Pose Estimation for Multi-Human Videos with Occlusion

3D human pose estimation (HPE) is crucial in human behavior analysis, augmented reality/virtual reality (AR/VR) applications, and the self-driving industry. Videos that contain multiple potentially occluded people captured from freely moving monocular cameras are very common in real-world scenarios, while 3D HPE for such scenarios is quite challenging, partly because existing datasets lack such data with accurate 3D ground truth labels. In this paper, we propose a temporal regression network with a gated convolution module to transform 2D joints to 3D and, at the same time, recover the missing occluded joints. A simple yet effective localization approach is then applied to transform the normalized pose to the global trajectory. To verify the effectiveness of our approach, we also collect a new moving camera multi-human (MMHuman) dataset that includes multiple people with heavy occlusion captured by moving cameras. The 3D ground truth joints are provided by an accurate motion capture (MoCap) system. From the experiments on static-camera-based Human3.6M data and our own collected moving-camera-based data, we show that our proposed method outperforms most state-of-the-art 2D-to-3D pose estimation methods, especially for scenarios with heavy occlusions.

What and How? Jointly Forecasting Human Action and Pose

Yanjun Zhu, Yanxia Zhang, Qiong Liu, Andreas Girgensohn

Auto-TLDR; Forecasting Human Actions and Motion Trajectories with Joint Action Classification and Pose Regression

StrongPose: Bottom-up and Strong Keypoint Heat Map Based Pose Estimation

A Multi-Task Neural Network for Action Recognition with 3D Key-Points

Rotational Adjoint Methods for Learning-Free 3D Human Pose Estimation from IMU Data

A Grid-Based Representation for Human Action Recognition

Video Analytics Gait Trend Measurement for Fall Prevention and Health Monitoring

Better Prior Knowledge Improves Human-Pose-Based Extrinsic Camera Calibration

Pose-Based Body Language Recognition for Emotion and Psychiatric Symptom Interpretation

Orthographic Projection Linear Regression for Single Image 3D Human Pose Estimation

Space-Time Domain Tensor Neural Networks: An Application on Human Pose Classification

Inner Eye Canthus Localization for Human Body Temperature Screening

Boundary Guided Image Translation for Pose Estimation from Ultra-Low Resolution Thermal Sensor

Weight Estimation from an RGB-D Camera in Top-View Configuration

Light3DPose: Real-Time Multi-Person 3D Pose Estimation from Multiple Views

DeepPear: Deep Pose Estimation and Action Recognition

HPERL: 3D Human Pose Estimation from RGB and LiDAR

Attention-Oriented Action Recognition for Real-Time Human-Robot Interaction

Deep Gait Relative Attribute Using a Signed Quadratic Contrastive Loss

JT-MGCN: Joint-Temporal Motion Graph Convolutional Network for Skeleton-Based Action Recognition

Audio-Video Detection of the Active Speaker in Meetings

P2 Net: Augmented Parallel-Pyramid Net for Attention Guided Pose Estimation

Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition

IPT: A Dataset for Identity Preserved Tracking in Closed Domains

Simple Multi-Resolution Representation Learning for Human Pose Estimation

Occlusion-Tolerant and Personalized 3D Human Pose Estimation in RGB Images

Toward Building a Data-Driven System for Detecting Mounting Actions of Black Beef Cattle

Unsupervised 3D Human Pose Estimation in Multi-view-multi-pose Video

Learning Group Activities from Skeletons without Individual Action Labels

Temporal Extension Module for Skeleton-Based Action Recognition

PEAN: 3D Hand Pose Estimation Adversarial Network

Attention-Driven Body Pose Encoding for Human Activity Recognition

Median-Shape Representation Learning for Category-Level Object Pose Estimation in Cluttered Environments

JUMPS: Joints Upsampling Method for Pose Sequences

Online Object Recognition Using CNN-Based Algorithm on High-Speed Camera Imaging

On the Robustness of 3D Human Pose Estimation

Efficient High-Resolution High-Level-Semantic Representation Learning for Human Pose Estimation

Activity Recognition Using First-Person-View Cameras Based on Sparse Optical Flows

Modeling Long-Term Interactions to Enhance Action Recognition

From Human Pose to On-Body Devices for Human-Activity Recognition

Tilting at Windmills: Data Augmentation for Deep Pose Estimation Does Not Help with Occlusions

Channel-Wise Dense Connection Graph Convolutional Network for Skeleton-Based Action Recognition

PHNet: Parasite-Host Network for Video Crowd Counting

Late Fusion of Bayesian and Convolutional Models for Action Recognition

Learning to Implicitly Represent 3D Human Body from Multi-Scale Features and Multi-View Images

Real-Time Monocular Depth Estimation with Extremely Light-Weight Neural Network

RWF-2000: An Open Large Scale Video Database for Violence Detection