Guoxi Huang
Paper download is intended for registered attendees only, and is
subject to the IEEE Copyright Policy. Any other use is strictly forbidden.
Papers from this author
Region-Based Non-Local Operation for Video Classification
Auto-TLDR: Region-Based Non-Local Operation for Deep Self-Attention in Convolutional Neural Networks
Convolutional Neural Networks (CNNs) model long-range dependencies by deeply stacking convolution operations with small window sizes, which makes optimization difficult. This paper presents the region-based non-local (RNL) operation, a family of self-attention mechanisms that can directly capture long-range dependencies without a deep stack of local operations. Given an intermediate feature map, our method recalibrates the feature at a position by aggregating information from the neighboring regions of all positions. By combining a channel attention module with the proposed RNL, we design an attention chain that can be integrated into off-the-shelf CNNs for end-to-end training. We evaluate our method on two video classification benchmarks. In these experiments, our method outperforms other attention mechanisms, and we achieve state-of-the-art performance on Something-Something V1.
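The recalibration step described in the abstract can be illustrated with a small NumPy sketch. This is not the paper's formulation: the function names `region_mean` and `region_non_local`, the use of a k x k average as the "neighboring region" summary, the scaled dot-product affinity with softmax, and the residual addition are all assumptions chosen for illustration, and the sketch works on a single 2D feature map rather than video features.

```python
import numpy as np

def region_mean(x, k=3):
    """Summarize each position by the mean of its k x k neighbourhood.

    x has shape (H, W, C). Borders are handled by edge replication
    (an assumption made for this sketch).
    """
    h, w, _ = x.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(x)
    for dy in range(k):
        for dx in range(k):
            out += xp[dy:dy + h, dx:dx + w]
    return out / (k * k)

def region_non_local(x, k=3):
    """Recalibrate every position using region summaries of all positions."""
    x = np.asarray(x, dtype=float)
    h, w, c = x.shape
    queries = x.reshape(h * w, c)                   # feature at each position
    regions = region_mean(x, k).reshape(h * w, c)   # region summary per position

    # Affinity between each position and every region (assumed form).
    scores = queries @ regions.T / np.sqrt(c)
    scores -= scores.max(axis=1, keepdims=True)     # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)

    aggregated = weights @ regions                  # weighted sum over all regions
    return x + aggregated.reshape(h, w, c)          # residual connection (assumed)

features = np.random.default_rng(0).normal(size=(4, 5, 8))
recalibrated = region_non_local(features, k=3)
```

The output has the same shape as the input, which is what would allow such an operation to be dropped between existing layers of a CNN. Setting `k=1` reduces the region summary to the position's own feature, so the sketch then behaves like a plain position-to-position non-local operation.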