ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

EdgeNet: Semantic Scene Completion from a Single RGB-D Image

Aloisio Dourado, Teofilo De Campos, Adrian Hilton, Hansung Kim

Auto-TLDR; Semantic Scene Completion using 3D Depth and RGB Information

Semantic scene completion is the task of predicting a complete 3D representation of volumetric occupancy with corresponding semantic labels for a scene from a single point of view. In this paper, we present EdgeNet, a new end-to-end neural network architecture that fuses information from depth and RGB, explicitly representing RGB edges in 3D space. Previous work on this task used either depth only, or depth with colour by projecting 2D semantic labels generated by a 2D segmentation network into the 3D volume, which requires a two-step training process. Our EdgeNet representation encodes colour information in 3D space using edge detection and flipped truncated signed distance, which improves semantic completion scores, especially in hard-to-detect classes. We achieved state-of-the-art scores on both synthetic and real datasets with a simpler and more computationally efficient training pipeline than competing approaches.
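
The abstract mentions a flipped truncated signed distance encoding without giving its formula. The following is a minimal sketch of one common normalised form, sign(d) * (1 - |d| / d_max), which is strongest near a surface and falls to zero at the truncation distance. The function name, the normalisation, and the 0.24 truncation value are assumptions made for illustration and are not taken from the paper.

```python
def flipped_tsdf(signed_distance: float, truncation: float) -> float:
    """Map a signed distance to a flipped, truncated value in [-1, 1].

    Assumed form: sign(d) * (1 - |d| / truncation), with d clipped to
    [-truncation, truncation]. Magnitude is 1 at the surface and 0 at or
    beyond the truncation distance. A distance of exactly 0 is treated
    as positive.
    """
    if truncation <= 0:
        raise ValueError("truncation must be positive")
    # Clip the distance so far-away voxels all map to zero.
    clipped = max(-truncation, min(truncation, signed_distance))
    sign = 1.0 if clipped >= 0 else -1.0
    return sign * (1.0 - abs(clipped) / truncation)


if __name__ == "__main__":
    # Hypothetical voxel distances in metres, truncation chosen arbitrarily.
    for d in (0.0, 0.06, -0.12, 0.5):
        print(d, flipped_tsdf(d, truncation=0.24))
```

In a volumetric pipeline this function would be applied per voxel, once to distances from the depth surface and once to distances from the 3D-projected edge voxels; how the two volumes are fused is left to the network and is not shown here.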

Similar papers

Towards Efficient 3D Point Cloud Scene Completion Via Novel Depth View Synthesis

Haiyan Wang, Liang Yang, Xuejian Rong, Ying-Li Tian

Auto-TLDR; 3D Point Cloud Completion with Novel Depth View Synthesis

3D point cloud completion has been a long-standing challenge at scale, and the corresponding per-point supervised training strategies suffer from cumbersome annotation. 2D supervision has recently emerged as a promising alternative for 3D tasks, but specific approaches for 3D point cloud completion remain to be explored. To overcome these limitations, we propose an end-to-end method that directly lifts a single depth map to a completed point cloud. With one depth map as input, a multi-way novel depth view synthesis network (NDVNet) is designed to infer coarsely completed depth maps under various viewpoints. Meanwhile, a geometric depth perspective rendering module uses the raw input depth map to generate a re-projected depth map for each view. The two depth maps generated in parallel for each view are then concatenated and refined by a depth completion network (DCNet). The final completed point cloud is fused from all refined depth views. Experimental results demonstrate that the proposed approach, composed of the aforementioned components, produces high-quality, state-of-the-art results on the popular SUNCG benchmark.
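
The geometric step underlying "lifting a depth map to a point cloud" can be illustrated with standard pinhole back-projection. This is a minimal sketch of that basic operation only, not of NDVNet, DCNet, or the rendering module described in the abstract. The function name and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth map into camera-space 3D points.

    Assumes a pinhole camera: pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy. Pixels with
    non-positive depth are treated as missing and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid measurement at this pixel
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Tiny hypothetical 2x2 depth map with one missing pixel.
    toy_depth = [[1.0, 0.0], [2.0, 2.0]]
    print(depth_to_points(toy_depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0))
```

Re-projecting to another viewpoint would additionally apply a rigid transform to these points and project them back through the camera model; that step is omitted here.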

Enhancing Deep Semantic Segmentation of RGB-D Data with Entangled Forests

Matteo Terreran, Elia Bonetto, Stefano Ghidoni

Auto-TLDR; FuseNet: A Lighter Deep Learning Model for Semantic Segmentation

Improving Visual Relation Detection Using Depth Maps

Fast and Accurate Real-Time Semantic Segmentation with Dilated Asymmetric Convolutions

Global-Local Attention Network for Semantic Segmentation in Aerial Images

In Depth Semantic Scene Completion

Enhancing Semantic Segmentation of Aerial Images with Inhibitory Neurons

Improving Robotic Grasping on Monocular Images Via Multi-Task Learning and Positional Loss

FatNet: A Feature-Attentive Network for 3D Point Cloud Processing

Multi-Scale Residual Pyramid Attention Network for Monocular Depth Estimation

Delivering Meaningful Representation for Monocular Depth Estimation

Domain Siamese CNNs for Sparse Multispectral Disparity Estimation

SECI-GAN: Semantic and Edge Completion for Dynamic Objects Removal

Multi-Direction Convolution for Semantic Segmentation

Dynamic Guided Network for Monocular Depth Estimation

3D Semantic Labeling of Photogrammetry Meshes Based on Active Learning

FastCompletion: A Cascade Network with Multiscale Group-Fused Inputs for Real-Time Depth Completion

Real-Time Monocular Depth Estimation with Extremely Light-Weight Neural Network

DmifNet: 3D Shape Reconstruction Based on Dynamic Multi-Branch Information Fusion

DEN: Disentangling and Exchanging Network for Depth Completion

BP-Net: Deep Learning-Based Superpixel Segmentation for RGB-D Image

Planar 3D Transfer Learning for End to End Unimodal MRI Unbalanced Data Segmentation

Incorporating Depth Information into Few-Shot Semantic Segmentation

Directional Graph Networks with Hard Weight Assignments

Extending Single Beam Lidar to Full Resolution by Fusing with Single Image Depth Estimation

Weight Estimation from an RGB-D Camera in Top-View Configuration

A GAN-Based Blind Inpainting Method for Masonry Wall Images

Automatic Semantic Segmentation of Structural Elements related to the Spinal Cord in the Lumbar Region by Using Convolutional Neural Networks

Semantic Object Segmentation in Cultural Sites Using Real and Synthetic Data

Light3DPose: Real-Time Multi-Person 3D Pose Estimation from Multiple Views

Partially Supervised Multi-Task Network for Single-View Dietary Assessment

Multiple Document Datasets Pre-Training Improves Text Line Detection with Deep Neural Networks

Walk the Lines: Object Contour Tracing CNN for Contour Completion of Ships

A Fine-Grained Dataset and Its Efficient Semantic Segmentation for Unstructured Driving Scenarios

Polarimetric Image Augmentation

RescueNet: Joint Building Segmentation and Damage Assessment from Satellite Imagery

IPT: A Dataset for Identity Preserved Tracking in Closed Domains

Semantic Segmentation Refinement Using Entropy and Boundary-guided Monte Carlo Sampling and Directed Regional Search

MixedFusion: 6D Object Pose Estimation from Decoupled RGB-Depth Features

Revisiting Sequence-To-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

Semi-Supervised Deep Learning Techniques for Spectrum Reconstruction

Yolo+FPN: 2D and 3D Fused Object Detection with an RGB-D Camera

Multiscale Attention-Based Prototypical Network for Few-Shot Semantic Segmentation

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

Joint Supervised and Self-Supervised Learning for 3D Real World Challenges

Progressive Scene Segmentation Based on Self-Attention Mechanism

P2D: A Self-Supervised Method for Depth Estimation from Polarimetry

Surface IR Reflectance Estimation and Material Recognition Using ToF Camera