ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Semantic Segmentation for Pedestrian Detection from Motion in Temporal Domain

Guo Cheng, Jiang Yu Zheng

Auto-TLDR; Motion Profile: Recognizing Pedestrians along with their Motion Directions in a Temporal Way

In autonomous driving, state-of-the-art methods detect pedestrians by their appearance in 2-D spatial images. However, these approaches are typically time-consuming because of the complexity of the algorithms needed to cope with large variations in shape, pose, action, and illumination. They also fall short of capturing the temporal continuity of motion traces. Taking a completely different approach, this work recognizes pedestrians along with their motion directions in a temporal way. By projecting a driving video to a 2-D temporal image called the Motion Profile (MP), we can robustly distinguish pedestrians, whether moving or standing still, from the smooth background motion. To further ensure that the deep network processes the compact motion profile without redundancy, a novel temporal-shift memory (TSM) model is developed to perform deep learning on sequential input in linear processing time. In experiments covering various pedestrian motions captured by sensors such as video and LiDAR, we demonstrate that, with a data size of around 3/720 of the video volume, this motion-based method reaches a pedestrian detection rate of 90% at near and mid range on the road. With its very fast processing speed and good accuracy, this method is promising for intelligent vehicles.
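The projection of a video to a 2-D temporal image can be illustrated with a small sketch. The abstract does not spell out how the Motion Profile is computed, so the code below assumes one plausible scheme: a horizontal belt of rows in each frame is averaged vertically into a single line, and the lines are stacked over time. The function names, the belt parameters, and the toy video are all hypothetical and are not taken from the paper.

```python
def motion_profile(frames, belt_top, belt_bottom):
    """Collapse the rows [belt_top, belt_bottom) of each frame into one line.

    frames: list of frames; each frame is a list of rows of gray values.
    Returns a 2-D list: one row per frame (time axis), one column per
    image column (horizontal axis). This averaging scheme is an assumption.
    """
    profile = []
    for frame in frames:
        belt = frame[belt_top:belt_bottom]
        if not belt:
            raise ValueError("belt is empty for this frame size")
        width = len(belt[0])
        # Average the belt vertically so each frame contributes a single line.
        line = [sum(row[x] for row in belt) / len(belt) for x in range(width)]
        profile.append(line)
    return profile


def make_toy_video(num_frames=4, height=6, width=8):
    """Hypothetical test video: a bright vertical bar moving one pixel right per frame."""
    video = []
    for t in range(num_frames):
        frame = [[0.0] * width for _ in range(height)]
        for y in range(height):
            frame[y][t + 1] = 1.0  # the bar sits at column t + 1 in frame t
        video.append(frame)
    return video


video = make_toy_video()
mp = motion_profile(video, belt_top=2, belt_bottom=5)

# Each printed line is one time step; the bright bar traces a slanted path.
for line in mp:
    print("".join("#" if v > 0.5 else "." for v in line))
```

In this toy case the moving bar leaves a diagonal trace in the resulting image, and the direction of the slant reflects the direction of horizontal motion, while a static object would leave a vertical trace. Under the assumed scheme each frame shrinks from its full height to a single line, which is the kind of reduction that makes the temporal image compact.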

Similar papers

Human Segmentation with Dynamic LiDAR Data

Tao Zhong, Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi

Auto-TLDR; Spatiotemporal Neural Network for Human Segmentation with Dynamic Point Clouds

Consecutive LiDAR scans and depth images compose dynamic 3D sequences, which contain richer spatiotemporal information than a single frame. As in the history of image and video perception, dynamic 3D sequence perception is beginning to attract attention following inspiring research on static 3D data perception. This work proposes a spatiotemporal neural network for human segmentation with dynamic LiDAR point clouds. It takes a sequence of depth images as input and has a two-branch structure: a spatial segmentation branch and a temporal velocity estimation branch. The velocity estimation branch is designed to capture motion cues from the input sequence and propagate them to the other branch, so that the segmentation branch segments humans according to both spatial and temporal features. The two branches are jointly learned on a generated dynamic point cloud data set for human recognition. Our work fills a gap in dynamic point cloud perception by using the spherical representation of point clouds, and it achieves high accuracy. The experiments indicate that introducing temporal features benefits the segmentation of dynamic point clouds.

Sensor-Independent Pedestrian Detection for Personal Mobility Vehicles in Walking Space Using Dataset Generated by Simulation

Takahiro Shimizu, Kenji Koide, Shuji Oishi, Masashi Yokozuka, Atsuhiko Banno, Motoki Shino

Auto-TLDR; CosPointPillars: A 3D Object Detection Method for Pedestrian Detection in Walking Spaces

Yolo+FPN: 2D and 3D Fused Object Detection with an RGB-D Camera

Holistic Grid Fusion Based Stop Line Estimation

Multiple Future Prediction Leveraging Synthetic Trajectories

Motion U-Net: Multi-Cue Encoder-Decoder Network for Motion Segmentation

Temporal Pulses Driven Spiking Neural Network for Time and Power Efficient Object Recognition in Autonomous Driving

Ground-truthing Large Human Behavior Monitoring Datasets

HPERL: 3D Human Pose Estimation from RGB and LiDAR

AerialMPTNet: Multi-Pedestrian Tracking in Aerial Imagery Using Temporal and Graphical Features

Vehicle Lane Merge Visual Benchmark

NetCalib: A Novel Approach for LiDAR-Camera Auto-Calibration Based on Deep Learning

Construction Worker Hardhat-Wearing Detection Based on an Improved BiFPN

CARRADA Dataset: Camera and Automotive Radar with Range-Angle-Doppler Annotations

Early Wildfire Smoke Detection in Videos

Attention Based Coupled Framework for Road and Pothole Segmentation

MagnifierNet: Learning Efficient Small-Scale Pedestrian Detector towards Multiple Dense Regions

Dynamic Resource-Aware Corner Detection for Bio-Inspired Vision Sensors

A Fine-Grained Dataset and Its Efficient Semantic Segmentation for Unstructured Driving Scenarios

Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos

Image Sequence Based Cyclist Action Recognition Using Multi-Stream 3D Convolution

PRF-Ped: Multi-Scale Pedestrian Detector with Prior-Based Receptive Field

RWF-2000: An Open Large Scale Video Database for Violence Detection

Attention-Oriented Action Recognition for Real-Time Human-Robot Interaction

Street-Map Based Validation of Semantic Segmentation in Autonomous Driving

Enhancing Depth Quality of Stereo Vision Using Deep Learning-Based Prior Information of the Driving Environment

Temporal Feature Enhancement Network with External Memory for Object Detection in Surveillance Video

Enhancing Semantic Segmentation of Aerial Images with Inhibitory Neurons

IPT: A Dataset for Identity Preserved Tracking in Closed Domains

Future Urban Scenes Generation through Vehicles Synthesis

Real-Time Monocular Depth Estimation with Extremely Light-Weight Neural Network

An Adaptive Fusion Model Based on Kalman Filtering and LSTM for Fast Tracking of Road Signs

Extending Single Beam Lidar to Full Resolution by Fusing with Single Image Depth Estimation

Nighttime Pedestrian Detection Based on Feature Attention and Transformation

PHNet: Parasite-Host Network for Video Crowd Counting

Video Semantic Segmentation Using Deep Multi-View Representation Learning

Utilising Visual Attention Cues for Vehicle Detection and Tracking

Real-time Pedestrian Lane Detection for Assistive Navigation using Neural Architecture Search

Tracking Fast Moving Objects by Segmentation Network

SynDHN: Multi-Object Fish Tracker Trained on Synthetic Underwater Videos

What and How? Jointly Forecasting Human Action and Pose

Learning to Take Directions One Step at a Time

Learning Defects in Old Movies from Manually Assisted Restoration

Two-Stage Adaptive Object Scene Flow Using Hybrid CNN-CRF Model

Real-Time Drone Detection and Tracking with Visible, Thermal and Acoustic Sensors

ResFPN: Residual Skip Connections in Multi-Resolution Feature Pyramid Networks for Accurate Dense Pixel Matching

Foreground-Guided Vehicle Perception Framework

AG-GAN: An Attentive Group-Aware GAN for Pedestrian Trajectory Prediction