ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

GSTO: Gated Scale-Transfer Operation for Multi-Scale Feature Learning in Semantic Segmentation

Zhuoying Wang, Yongtao Wang, Zhi Tang, Yangyan Li, Ying Chen, Haibin Ling, Weisi Lin

Auto-TLDR; Gated Scale-Transfer Operation for Semantic Segmentation

Existing CNN-based methods for semantic segmentation depend heavily on multi-scale features to meet the requirements of both semantic comprehension and detail preservation. State-of-the-art segmentation networks widely use conventional scale-transfer operations, i.e., up-sampling and down-sampling, to learn multi-scale features. In this work, we find that these operations lead to scale-confused features and suboptimal performance because they are spatially invariant and directly transfer all feature information across scales without spatial selection. To address this issue, we propose the Gated Scale-Transfer Operation (GSTO), which transfers spatially filtered features to another scale. Specifically, GSTO can work either with or without extra supervision. Unsupervised GSTO is learned from the feature itself, while the supervised one is guided by the supervised probability matrix. Both forms of GSTO are lightweight and plug-and-play, and can be flexibly integrated into networks or modules to learn better multi-scale features. In particular, by plugging GSTO into HRNet, we obtain a more powerful backbone (namely GSTO-HRNet) for pixel labeling, which achieves new state-of-the-art results on multiple semantic segmentation benchmarks, including Cityscapes, LIP and Pascal Context, with negligible extra computational cost. Moreover, experimental results demonstrate that GSTO can also significantly boost the performance of multi-scale feature aggregation modules such as PPM and ASPP.
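
The idea of gating a feature map spatially before moving it to another scale can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract only says the unsupervised gate is "learned from the feature itself", so the gate used here (a 1x1 projection over channels followed by a sigmoid), the nearest-neighbour and mean-pooling resamplers, and all function and parameter names are assumptions chosen for illustration. The supervised variant is not covered.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def unsupervised_gate(feature, weight, bias=0.0):
    """Hypothetical gate: one value in (0, 1) per spatial position.

    feature: array of shape (C, H, W)
    weight:  array of shape (C,), acting like a 1x1 convolution to one channel
    """
    logits = np.tensordot(weight, feature, axes=([0], [0])) + bias  # (H, W)
    return sigmoid(logits)


def upsample_nearest(feature, factor):
    """Nearest-neighbour up-sampling of a (C, H, W) array."""
    return feature.repeat(factor, axis=1).repeat(factor, axis=2)


def downsample_mean(feature, factor):
    """Mean pooling of a (C, H, W) array; H and W must be divisible by factor."""
    c, h, w = feature.shape
    blocks = feature.reshape(c, h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(2, 4))


def gated_scale_transfer(feature, weight, bias=0.0, factor=2, mode="up"):
    """Apply the spatial gate first, then resample to the target scale."""
    gate = unsupervised_gate(feature, weight, bias)  # (H, W)
    gated = feature * gate[None, :, :]               # same gate for every channel
    if mode == "up":
        return upsample_nearest(gated, factor)
    return downsample_mean(gated, factor)


# Usage: a 3-channel 4x4 feature map transferred to 8x8 and to 2x2.
x = np.ones((3, 4, 4))
w = np.zeros(3)  # zero weights give a constant gate of 0.5
print(gated_scale_transfer(x, w, mode="up").shape)
print(gated_scale_transfer(x, w, mode="down").shape)
```

In contrast to plain up-sampling or down-sampling, which treat every position identically, the gate here lets each spatial position decide how much of its feature is passed to the other scale. In a trained network the weights would be learned rather than fixed as in this example.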

Similar papers

Enhanced Feature Pyramid Network for Semantic Segmentation

Mucong Ye, Ouyang Jinpeng, Ge Chen, Jing Zhang, Xiaogang Yu

Auto-TLDR; EFPN: Enhanced Feature Pyramid Network for Semantic Segmentation

Global-Local Attention Network for Semantic Segmentation in Aerial Images

Boundary-Aware Graph Convolution for Semantic Segmentation

PSDNet: A Balanced Architecture of Accuracy and Parameters for Semantic Segmentation

Multi-Direction Convolution for Semantic Segmentation

Fast and Accurate Real-Time Semantic Segmentation with Dilated Asymmetric Convolutions

Semantic Segmentation Refinement Using Entropy and Boundary-guided Monte Carlo Sampling and Directed Regional Search

Real-Time Semantic Segmentation Via Region and Pixel Context Network

Efficient High-Resolution High-Level-Semantic Representation Learning for Human Pose Estimation

Transitional Asymmetric Non-Local Neural Networks for Real-World Dirt Road Segmentation

Attention Pyramid Module for Scene Recognition

Efficient-Receptive Field Block with Group Spatial Attention Mechanism for Object Detection

Learning a Dynamic High-Resolution Network for Multi-Scale Pedestrian Detection

Single Image Deblurring Using Bi-Attention Network

Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting

SFPN: Semantic Feature Pyramid Network for Object Detection

Deeply-Fused Attentive Network for Stereo Matching

A Boundary-Aware Distillation Network for Compressed Video Semantic Segmentation

CASNet: Common Attribute Support Network for Image Instance and Panoptic Segmentation

Enhancing Semantic Segmentation of Aerial Images with Inhibitory Neurons

Multi-Scale Residual Pyramid Attention Network for Monocular Depth Estimation

Spatial-Related and Scale-Aware Network for Crowd Counting

Progressive Scene Segmentation Based on Self-Attention Mechanism

Context-Aware Residual Module for Image Classification

DARN: Deep Attentive Refinement Network for Liver Tumor Segmentation from 3D CT Volume

Attention Stereo Matching Network

Accurate Cell Segmentation in Digital Pathology Images Via Attention Enforced Networks

CT-UNet: An Improved Neural Network Based on U-Net for Building Segmentation in Remote Sensing Images

P2 Net: Augmented Parallel-Pyramid Net for Attention Guided Pose Estimation

UHRSNet: A Semantic Segmentation Network Specifically for Ultra-High-Resolution Images

Multiscale Attention-Based Prototypical Network for Few-Shot Semantic Segmentation

Do Not Treat Boundaries and Regions Differently: An Example on Heart Left Atrial Segmentation

Bidirectional Matrix Feature Pyramid Network for Object Detection

Cross-Domain Semantic Segmentation of Urban Scenes Via Multi-Level Feature Alignment

A Fine-Grained Dataset and Its Efficient Semantic Segmentation for Unstructured Driving Scenarios

Weakly Supervised Body Part Segmentation with Pose Based Part Priors

Joint Semantic-Instance Segmentation of 3D Point Clouds: Instance Separation and Semantic Fusion

Dynamic Guided Network for Monocular Depth Estimation

Ordinal Depth Classification Using Region-Based Self-Attention

PRF-Ped: Multi-Scale Pedestrian Detector with Prior-Based Receptive Field

Hierarchically Aggregated Residual Transformation for Single Image Super Resolution

Nighttime Pedestrian Detection Based on Feature Attention and Transformation

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

Attention Based Coupled Framework for Road and Pothole Segmentation

Dual-Attention Guided Dropblock Module for Weakly Supervised Object Localization

PCANet: Pyramid Context-Aware Network for Retinal Vessel Segmentation

Incorporating Depth Information into Few-Shot Semantic Segmentation

Object Detection Model Based on Scene-Level Region Proposal Self-Attention