ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Second-Order Attention Guided Convolutional Activations for Visual Recognition

Shannan Chen, Qian Wang, Qiule Sun, Bin Liu, Jianxin Zhang, Qiang Zhang

Auto-TLDR; Second-order Attention Guided Network (SoAG-Net) for Visual Recognition with Convolutional Neural Networks

Recently, modeling deep convolutional activations with global second-order pooling has shown great progress on visual recognition tasks. However, most existing deep second-order statistical models mainly compute second-order statistics of the activations of the last convolutional layer as image representations, and they seldom introduce second-order statistics into earlier layers to better fit the network topology, which limits their representational ability to a certain extent. Motivated by the flexibility of attention blocks, which are commonly plugged into intermediate layers of deep convolutional networks (ConvNets), this work attempts to combine deep second-order statistics with attention mechanisms in ConvNets and proposes a novel Second-order Attention Guided Network (SoAG-Net) for visual recognition. More specifically, SoAG-Net involves several SoAG modules seamlessly inserted into intermediate layers of the network. Each SoAG module collects second-order statistics of convolutional activations by polynomial kernel approximation to predict channel-wise attention maps, which guide the learning of convolutional activations through tensor scaling along the channel dimension. SoAG improves the nonlinearity of ConvNets and enables them to fit more complicated distributions of convolutional activations. Experimental results on three commonly used datasets demonstrate that SoAG-Net outperforms its counterparts and achieves performance competitive with state-of-the-art models under the same backbone.
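
The general idea in the abstract (second-order channel statistics predicting a channel-wise attention vector that rescales the activations) can be illustrated with a minimal NumPy sketch. This is not the authors' SoAG module: it swaps the polynomial kernel approximation for an explicit channel covariance matrix, and the function name, the row-wise mean pooling, and the single linear layer with parameters `weight` and `bias` are all assumptions made for illustration.

```python
import numpy as np

def second_order_channel_attention(x, weight, bias):
    """Rescale channels of x using attention predicted from channel covariance.

    x:      activations of shape (C, H, W)
    weight: hypothetical learnable matrix of shape (C, C)
    bias:   hypothetical learnable vector of shape (C,)
    """
    channels, height, width = x.shape
    feats = x.reshape(channels, height * width)

    # Second-order statistics: covariance between channels over spatial positions.
    centered = feats - feats.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / (height * width)      # shape (C, C)

    # Assumption: pool each covariance row to one scalar per channel.
    stats = cov.mean(axis=1)                            # shape (C,)

    # Assumption: one linear layer plus sigmoid predicts the attention vector.
    logits = weight @ stats + bias
    attn = 1.0 / (1.0 + np.exp(-logits))                # values in (0, 1)

    # Tensor scaling along the channel dimension.
    return x * attn[:, None, None], attn

# Usage with random stand-in activations and parameters.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
weight = rng.standard_normal((8, 8)) * 0.1
bias = np.zeros(8)
y, attn = second_order_channel_attention(x, weight, bias)
```

The output `y` has the same shape as `x`, so a block of this kind could in principle be placed between intermediate layers without changing the surrounding architecture, which is the property the abstract attributes to attention blocks.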

Similar papers

Attention Pyramid Module for Scene Recognition

Zhinan Qiao, Xiaohui Yuan, Chengyuan Zhuang, Abolfazl Meyarian

Auto-TLDR; Attention Pyramid Module for Multi-Scale Scene Recognition

Attention As Activation

Multi-Order Feature Statistical Model for Fine-Grained Visual Categorization

Dual-Attention Guided Dropblock Module for Weakly Supervised Object Localization

Improved Residual Networks for Image and Video Recognition

PSDNet: A Balanced Architecture of Accuracy and Parameters for Semantic Segmentation

HANet: Hybrid Attention-Aware Network for Crowd Counting

Region-Based Non-Local Operation for Video Classification

Efficient-Receptive Field Block with Group Spatial Attention Mechanism for Object Detection

More Correlations Better Performance: Fully Associative Networks for Multi-Label Image Classification

SCA Net: Sparse Channel Attention Module for Action Recognition

An Improved Bilinear Pooling Method for Image-Based Action Recognition

Self and Channel Attention Network for Person Re-Identification

Semantic Bilinear Pooling for Fine-Grained Recognition

Context-Aware Residual Module for Image Classification

Dynamic Multi-Path Neural Network

WeightAlign: Normalizing Activations by Weight Alignment

Transitional Asymmetric Non-Local Neural Networks for Real-World Dirt Road Segmentation

Cross-Layer Information Refining Network for Single Image Super-Resolution

ACRM: Attention Cascade R-CNN with Mix-NMS for Metallic Surface Defect Detection

Global-Local Attention Network for Semantic Segmentation in Aerial Images

Progressive Scene Segmentation Based on Self-Attention Mechanism

Arbitrary Style Transfer with Parallel Self-Attention

Feature-Dependent Cross-Connections in Multi-Path Neural Networks

Attention Stereo Matching Network

Learnable Higher-Order Representation for Action Recognition

CQNN: Convolutional Quadratic Neural Networks

Dynamic Guided Network for Monocular Depth Estimation

Real-Time Semantic Segmentation Via Region and Pixel Context Network

GSTO: Gated Scale-Transfer Operation for Multi-Scale Feature Learning in Semantic Segmentation

RSAN: Residual Subtraction and Attention Network for Single Image Super-Resolution

Multi-Scale Residual Pyramid Attention Network for Monocular Depth Estimation

3D Attention Mechanism for Fine-Grained Classification of Table Tennis Strokes Using a Twin Spatio-Temporal Convolutional Neural Networks

Aggregating Object Features Based on Attention Weights for Fine-Grained Image Retrieval

Boundary-Aware Graph Convolution for Semantic Segmentation

DARN: Deep Attentive Refinement Network for Liver Tumor Segmentation from 3D CT Volume

Improving Batch Normalization with Skewness Reduction for Deep Neural Networks

Learning Recurrent High-Order Statistics for Skeleton-Based Hand Gesture Recognition

Deeply-Fused Attentive Network for Stereo Matching

Channel-Wise Dense Connection Graph Convolutional Network for Skeleton-Based Action Recognition

Learn to Segment Retinal Lesions and Beyond

MFI: Multi-Range Feature Interchange for Video Action Recognition

ScarfNet: Multi-Scale Features with Deeply Fused and Redistributed Semantics for Enhanced Object Detection

Collaborative Human Machine Attention Module for Character Recognition

Ordinal Depth Classification Using Region-Based Self-Attention

Fine-Tuning DARTS for Image Classification

FatNet: A Feature-Attentive Network for 3D Point Cloud Processing

Image Representation Learning by Transformation Regression