ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Multi-Scale Residual Pyramid Attention Network for Monocular Depth Estimation

Jing Liu, Xiaona Zhang, Zhaoxin Li, Tianlu Mao

Auto-TLDR; Multi-scale Residual Pyramid Attention Network for Monocular Depth Estimation

Monocular depth estimation is a challenging problem in computer vision and is crucial for understanding 3D scene geometry. Recently, methods based on deep convolutional neural networks (DCNNs) have improved estimation accuracy significantly. However, existing methods fail to consider complex textures and geometries in scenes, resulting in loss of local details, distorted object boundaries, and blurry reconstruction. In this paper, we propose an end-to-end Multi-scale Residual Pyramid Attention Network (MRPAN) to mitigate these problems. First, we propose a Multi-scale Attention Context Aggregation (MACA) module, which consists of a Spatial Attention Module (SAM) and a Global Attention Module (GAM). By considering the position and scale correlation of pixels from spatial and global perspectives, the proposed module can adaptively learn the similarity between pixels so as to obtain more global context information from the image and recover the complex structure in the scene. Then, we propose an improved Residual Refinement Module (RRM) to further refine the scene structure, yielding deeper semantic information and retaining more local details. Experimental results show that our method achieves more promising performance on object boundaries and local details compared with other state-of-the-art methods.
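
The abstract does not specify how the attention modules are computed, so the following is only a minimal sketch of the general idea of weighting pixels by learned similarity, using plain dot-product self-attention with a residual connection. It is not the authors' SAM or GAM; the function names `softmax` and `pixel_self_attention` and the toy input are hypothetical and chosen purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_self_attention(pixels):
    """Generic dot-product self-attention over flattened pixel features.

    This is an illustrative stand-in, NOT the module from the paper.

    pixels: list of N feature vectors (one per pixel), each a list of C floats.
    Returns (weights, outputs), where weights[i][j] is how strongly pixel i
    attends to pixel j, and outputs[i] is the input feature plus the
    attention-weighted sum of all pixel features (a residual connection).
    """
    weights = []
    outputs = []
    for query in pixels:
        # Similarity of this pixel to every pixel (including itself).
        scores = [sum(q * k for q, k in zip(query, key)) for key in pixels]
        row = softmax(scores)
        # Aggregate context from all pixels, weighted by similarity.
        context = [
            sum(w * p[c] for w, p in zip(row, pixels))
            for c in range(len(query))
        ]
        weights.append(row)
        outputs.append([x + ctx for x, ctx in zip(query, context)])
    return weights, outputs


# Toy 2x2 "image" flattened to 4 pixels with 2 channels each (made-up data).
pixels = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
weights, outputs = pixel_self_attention(pixels)
```

In this toy input, pixels with identical features receive larger mutual attention weights than pixels with orthogonal features, which is the sense in which such a mechanism gathers context from similar regions anywhere in the image.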

Similar papers

Dynamic Guided Network for Monocular Depth Estimation

Xiaoxia Xing, Yinghao Cai, Yiping Yang, Dayong Wen

Auto-TLDR; DGNet: Dynamic Guidance Upsampling for Self-attention-Decoding for Monocular Depth Estimation

Ordinal Depth Classification Using Region-Based Self-Attention

Delivering Meaningful Representation for Monocular Depth Estimation

Attention Stereo Matching Network

Global-Local Attention Network for Semantic Segmentation in Aerial Images

Real-Time Monocular Depth Estimation with Extremely Light-Weight Neural Network

Real-Time Semantic Segmentation Via Region and Pixel Context Network

Extending Single Beam Lidar to Full Resolution by Fusing with Single Image Depth Estimation

Deeply-Fused Attentive Network for Stereo Matching

DARN: Deep Attentive Refinement Network for Liver Tumor Segmentation from 3D CT Volume

Enhanced Feature Pyramid Network for Semantic Segmentation

PSDNet: A Balanced Architecture of Accuracy and Parameters for Semantic Segmentation

Single Image Deblurring Using Bi-Attention Network

Spatial-Related and Scale-Aware Network for Crowd Counting

Learning Stereo Matchability in Disparity Regression Networks

Progressive Scene Segmentation Based on Self-Attention Mechanism

Partially Supervised Multi-Task Network for Single-View Dietary Assessment

Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting

Multi-Direction Convolution for Semantic Segmentation

Towards Efficient 3D Point Cloud Scene Completion Via Novel Depth View Synthesis

Attention Pyramid Module for Scene Recognition

P2D: A Self-Supervised Method for Depth Estimation from Polarimetry

Transitional Asymmetric Non-Local Neural Networks for Real-World Dirt Road Segmentation

GSTO: Gated Scale-Transfer Operation for Multi-Scale Feature Learning in Semantic Segmentation

Boundary-Aware Graph Convolution for Semantic Segmentation

Context-Aware Residual Module for Image Classification

FastCompletion: A Cascade Network with Multiscale Group-Fused Inputs for Real-Time Depth Completion

Efficient-Receptive Field Block with Group Spatial Attention Mechanism for Object Detection

Leveraging a Weakly Adversarial Paradigm for Joint Learning of Disparity and Confidence Estimation

DE-Net: Dilated Encoder Network for Automated Tongue Segmentation

Accurate Cell Segmentation in Digital Pathology Images Via Attention Enforced Networks

DEN: Disentangling and Exchanging Network for Depth Completion

Selective Kernel and Motion-Emphasized Loss Based Attention-Guided Network for HDR Imaging of Dynamic Scenes

Object Detection Model Based on Scene-Level Region Proposal Self-Attention

Free-Form Image Inpainting Via Contrastive Attention Network

Object Detection on Monocular Images with Two-Dimensional Canonical Correlation Analysis

TSMSAN: A Three-Stream Multi-Scale Attentive Network for Video Saliency Detection

Semantic Segmentation Refinement Using Entropy and Boundary-guided Monte Carlo Sampling and Directed Regional Search

Attention Based Coupled Framework for Road and Pothole Segmentation

6D Pose Estimation with Correlation Fusion

Enhanced Vote Network for 3D Object Detection in Point Clouds

Dual-Attention Guided Dropblock Module for Weakly Supervised Object Localization

Do Not Treat Boundaries and Regions Differently: An Example on Heart Left Atrial Segmentation

CSpA-DN: Channel and Spatial Attention Dense Network for Fusing PET and MRI Images

SFPN: Semantic Feature Pyramid Network for Object Detection

Aggregating Object Features Based on Attention Weights for Fine-Grained Image Retrieval

DA-RefineNet: Dual-Inputs Attention RefineNet for Whole Slide Image Segmentation

CT-UNet: An Improved Neural Network Based on U-Net for Building Segmentation in Remote Sensing Images