Exploring the Ability of CNNs to Generalise to Previously Unseen Scales Over Wide Scale Ranges

Ylva Jansson, Tony Lindeberg

Auto-TLDR; A theoretical analysis of invariance and covariance properties of scale channel networks

The ability to handle large scale variations is crucial for many real-world visual tasks. A straightforward approach to handling scale in a deep neural network is to process multiple rescaled image copies in a set of scale channels (subnetworks). Scale invariance can then, in principle, be achieved by using weight sharing between the scale channels together with max or average pooling over the outputs from the scale channels. The ability of such scale channel networks to generalise over significant scale ranges to scales not present in the training set has, however, not previously been explored. We therefore present a theoretical analysis of the invariance and covariance properties of scale channel networks and perform an experimental evaluation of the ability of different types of scale channel networks to generalise to previously unseen scales. We identify limitations of previous approaches and propose a new type of foveated scale channel architecture, where the scale channels process increasingly larger parts of the image with decreasing resolution. Our proposed FovMax and FovAvg networks perform almost identically over a scale range of 8, even when trained on single-scale data, and also give improvements in the small-sample regime.
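
As a rough illustration of the scale-channel construction, the following PyTorch sketch applies a shared backbone to several rescaled copies of the input and pools over the scale channels; the class name, scale set and pooling choice are illustrative assumptions, and the foveated processing of FovMax/FovAvg (larger regions at lower resolution) is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleChannelNet(nn.Module):
    """Shared-weight scale channels with max/average pooling over scales."""
    def __init__(self, backbone: nn.Module, scales=(0.5, 1.0, 2.0), pool="max"):
        super().__init__()
        self.backbone = backbone   # one set of weights, shared by all channels
        self.scales = scales
        self.pool = pool

    def forward(self, x):
        outs = []
        for s in self.scales:
            # Each rescaled copy of the image is one scale channel.
            xs = F.interpolate(x, scale_factor=s, mode="bilinear",
                               align_corners=False)
            outs.append(self.backbone(xs))        # (B, num_classes)
        stacked = torch.stack(outs, dim=0)        # (S, B, num_classes)
        if self.pool == "max":
            return stacked.max(dim=0).values      # max pooling over scales
        return stacked.mean(dim=0)                # average pooling over scales
```

For this to work across scale factors, the backbone must accept variable input sizes, e.g. by ending in global average pooling.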

Similar papers

Understanding When Spatial Transformer Networks Do Not Support Invariance, and What to Do about It

Lukas Finnveden, Ylva Jansson, Tony Lindeberg

Auto-TLDR; Spatial Transformer Networks are unable to support invariance when transforming CNN feature maps

Spatial transformer networks (STNs) were designed to enable convolutional neural networks (CNNs) to learn invariance to image transformations. STNs were originally proposed to transform CNN feature maps as well as input images. This enables the use of more complex features when predicting transformation parameters. However, since STNs perform a purely spatial transformation, they do not, in the general case, have the ability to align the feature maps of a transformed image with those of its original. STNs are therefore unable to support invariance when transforming CNN feature maps. We present a simple proof for this and study the practical implications, showing that this inability is coupled with decreased classification accuracy. We therefore investigate alternative STN architectures that make use of complex features. We find that while deeper localization networks are difficult to train, localization networks that share parameters with the classification network remain stable as they grow deeper, which allows for higher classification accuracy on difficult datasets. Finally, we explore the interaction between localization network complexity and iterative image alignment.
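
The central claim, that spatially transforming the feature maps of an image is in general not the same as computing the feature maps of the transformed image, can be checked numerically. A hypothetical snippet with a 90-degree rotation and a randomly initialized convolution:

```python
import torch

torch.manual_seed(0)
conv = torch.nn.Conv2d(1, 4, kernel_size=3, padding=1)
x = torch.randn(1, 1, 32, 32)

def rotate90(t):
    # A purely spatial transformation (here a 90-degree rotation).
    return torch.rot90(t, 1, dims=(2, 3))

# Feature maps of the transformed image ...
a = conv(rotate90(x))
# ... versus spatially transforming the feature maps of the original.
b = rotate90(conv(x))

# The two differ unless the filters themselves are rotation-equivariant,
# which is the core of the argument against feature-map STNs.
print((a - b).abs().max())  # clearly non-zero for a generic conv
```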

Attention Pyramid Module for Scene Recognition

Zhinan Qiao, Xiaohui Yuan, Chengyuan Zhuang, Abolfazl Meyarian

Auto-TLDR; Attention Pyramid Module for Multi-Scale Scene Recognition

The unrestricted open vocabulary and diverse content of scenery images pose significant challenges for scene recognition. However, most deep learning architectures and attention methods are developed on general-purpose datasets and overlook the characteristics of scene data. In this paper, we exploit the attention pyramid module (APM) to tackle the challenges of scene recognition. Our method streamlines the multi-scale scene recognition pipeline, learns comprehensive scene features at various scales and locations, addresses the interdependency among scales, and further assists the feature re-calibration and aggregation processes. APM is extremely lightweight and can be easily plugged into existing network architectures in a parameter-efficient manner. By simply integrating APM into ResNet-50, we obtain a 3.54% boost in top-1 accuracy on the benchmark scene dataset. Comprehensive experiments show that APM achieves better performance than state-of-the-art attention methods while using a significantly smaller computation budget. Code and pre-trained models will be made publicly available.

Locality-Promoting Representation Learning

Johannes Schneider

Auto-TLDR; Locality-promoting Regularization for Neural Networks

This work investigates questions related to feature learning in convolutional neural networks (CNNs). Empirical findings across multiple architectures, such as VGG, ResNet, Inception and MobileNet, indicate that weights near the center of a filter are larger than weights on the outside. Current regularization schemes violate this principle. Thus, we introduce Locality-promoting Regularization, which yields accuracy gains across multiple architectures and datasets. We also show theoretically that the empirical finding could be explained by maximizing feature cohesion under the assumption of spatial locality.
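
A regularizer in this spirit can be sketched as an L2 penalty whose strength grows with a weight's squared distance from the filter center; the quadratic distance profile, the function name and the strength value are assumptions, not the paper's exact formulation.

```python
import torch

def locality_regularizer(conv_weight: torch.Tensor, strength: float = 1e-4):
    """Penalize filter weights proportionally to their squared distance
    from the filter center, encouraging larger weights near the center.
    conv_weight: (out_ch, in_ch, kH, kW)
    """
    _, _, kh, kw = conv_weight.shape
    ys = torch.arange(kh, dtype=conv_weight.dtype,
                      device=conv_weight.device) - (kh - 1) / 2
    xs = torch.arange(kw, dtype=conv_weight.dtype,
                      device=conv_weight.device) - (kw - 1) / 2
    dist2 = ys[:, None] ** 2 + xs[None, :] ** 2       # (kH, kW)
    return strength * (conv_weight ** 2 * dist2).sum()
```

The returned term is simply added to the task loss, e.g. loss = ce_loss + locality_regularizer(model.conv1.weight).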

Trainable Spectrally Initializable Matrix Transformations in Convolutional Neural Networks

Michele Alberti, Angela Botros, Narayan Schuetz, Rolf Ingold, Marcus Liwicki, Mathias Seuret

Auto-TLDR; Trainable and Spectrally Initializable Matrix Transformations for Neural Networks

In this work, we introduce a new architectural component for neural networks (NNs): trainable and spectrally initializable matrix transformations on feature maps. While previous literature has already demonstrated the possibility of adding static spectral transformations as feature processors, our focus is on more general trainable transforms. We study the transforms in various architectural configurations on four datasets of different nature: from medical (ColorectalHist, HAM10000) and natural (Flowers) images to historical documents (CB55). With rigorous experiments that control for the number of parameters and randomness, we show that networks utilizing the introduced matrix transformations outperform vanilla neural networks. The observed accuracy increases appreciably across all datasets. In addition, we show that spectral initialization leads to significantly faster convergence compared to randomly initialized matrix transformations. The transformations are implemented as auto-differentiable PyTorch modules that can be incorporated into any neural network architecture. The entire code base is open-source.
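
A minimal sketch of such a component in PyTorch: a trainable linear transform over the channel dimension of a feature map, initialized with an orthonormal DCT-II basis. The DCT choice, the channel-wise placement and the class name are illustrative assumptions; the paper studies several spectral transforms and configurations.

```python
import math
import torch
import torch.nn as nn

def dct_matrix(n: int) -> torch.Tensor:
    """Orthonormal DCT-II basis, used here as a spectral initialization."""
    k = torch.arange(n).unsqueeze(1).float()
    i = torch.arange(n).unsqueeze(0).float()
    m = torch.cos(math.pi * (2 * i + 1) * k / (2 * n))
    m[0] *= 1 / math.sqrt(2)
    return m * math.sqrt(2 / n)

class TrainableSpectralTransform(nn.Module):
    """A linear transform over the channel dimension of a feature map,
    initialized with a DCT basis but trained end-to-end."""
    def __init__(self, channels: int):
        super().__init__()
        self.weight = nn.Parameter(dct_matrix(channels))  # trainable

    def forward(self, x):  # x: (B, C, H, W)
        return torch.einsum("oc,bchw->bohw", self.weight, x)
```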

FatNet: A Feature-Attentive Network for 3D Point Cloud Processing

Chaitanya Kaul, Nick Pears, Suresh Manandhar

Auto-TLDR; Feature-Attentive Neural Networks for Point Cloud Classification and Segmentation

The application of deep learning to 3D point clouds is challenging due to their lack of order. Inspired by the point embeddings of PointNet and the edge embeddings of DGCNNs, we propose three improvements to the task of point cloud analysis. First, we introduce a novel feature-attentive neural network layer, the FAT layer, that combines both global point-based features and local edge-based features to generate better embeddings. Second, we find that applying the same attention mechanism across two different forms of feature map aggregation, max pooling and average pooling, gives better performance than either alone. Third, we observe that residual feature reuse in this setting propagates information more effectively between the layers and makes the network easier to train. Our architecture achieves state-of-the-art results on the task of point cloud classification, as demonstrated on the ModelNet40 dataset, and extremely competitive performance on the ShapeNet part segmentation challenge.

Image Representation Learning by Transformation Regression

Xifeng Guo, Jiyuan Liu, Sihang Zhou, En Zhu, Shihao Dong

Auto-TLDR; Self-supervised Image Representation Learning using Continuous Parameter Prediction

Self-supervised learning is a thriving research direction since it can relieve the burden of human labeling for machine learning by seeking supervision from data instead of human annotation. Although demonstrating promising performance in various applications, we observe that the existing methods usually model the auxiliary learning tasks as classification tasks with finite discrete labels, leading to insufficient supervisory signals, which in turn restricts the representation quality. In this paper, to solve this problem and make full use of the supervision from data, we design a regression model to predict the continuous parameters of a group of transformations, i.e., image rotation, translation, and scaling. Surprisingly, this simple modification unlocks tremendous potential from the data, and the resulting supervisory signal largely improves the performance of image representation learning. Extensive experiments on four image datasets, including CIFAR10, CIFAR100, STL10, and SVHN, indicate that our proposed algorithm outperforms state-of-the-art unsupervised learning methods by a large margin in terms of classification accuracy. Crucially, we find that with our proposed training mechanism as an initialization, the performance of existing state-of-the-art classification architectures can be further improved.
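
A hedged sketch of the pretext-task construction using torchvision: sample continuous rotation/translation/scaling parameters, apply them to each image, and regress the normalized parameters. The parameter ranges, normalizations and function name are illustrative assumptions.

```python
import random
import torch
import torchvision.transforms.functional as TF

def make_regression_batch(images):
    """Apply a random rotation/translation/scaling to each image and
    return the transformation parameters as continuous regression targets."""
    xs, ys = [], []
    for img in images:                       # img: (C, H, W) tensor
        angle = random.uniform(-180.0, 180.0)
        scale = random.uniform(0.8, 1.2)
        tx, ty = random.randint(-8, 8), random.randint(-8, 8)
        out = TF.affine(img, angle=angle, translate=[tx, ty],
                        scale=scale, shear=[0.0])
        xs.append(out)
        # Normalize parameters to comparable ranges for the regressor.
        ys.append(torch.tensor([angle / 180.0, tx / 8.0, ty / 8.0, scale]))
    return torch.stack(xs), torch.stack(ys)

# The encoder is trained with a regression loss on the parameters, e.g.
# loss = torch.nn.functional.mse_loss(regressor(encoder(x)), y)
```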

Efficient-Receptive Field Block with Group Spatial Attention Mechanism for Object Detection

Jiacheng Zhang, Zhicheng Zhao, Fei Su

Auto-TLDR; E-RFB: Efficient-Receptive Field Block for Deep Neural Network for Object Detection

Object detection has received increasing attention in the computer vision field. Convolutional neural networks (CNNs) extract high-level semantic features of images, which directly determine the performance of object detection. As a common solution, embedding integration modules into CNNs can enrich the extracted features and thereby improve performance. However, these modules suffer from instability and inconsistency across their internal branches. To address this problem, we propose a novel multi-branch module called the Efficient-Receptive Field Block (E-RFB), in which multiple levels of features are combined for network optimization. Specifically, by downsampling and increasing depth, the E-RFB provides a sufficient receptive field. Second, to eliminate the inconsistency across different branches, a novel spatial attention mechanism, the Group Spatial Attention Module (GSAM), is proposed. The GSAM gradually narrows a feature map by channel grouping; thus it encodes the information between spatial and channel dimensions into the final attention heat map. Third, the proposed module can easily be integrated into various CNNs as a plug-and-play component to enhance feature representation. With SSD-style detectors, our method halves the parameters of the original detection head and achieves high accuracy on the PASCAL VOC and MS COCO datasets. Moreover, the proposed method achieves superior performance compared with state-of-the-art methods based on similar frameworks.

Convolutional STN for Weakly Supervised Object Localization

Akhil Meethal, Marco Pedersoli, Soufiane Belharbi, Eric Granger

Auto-TLDR; Spatial Localization for Weakly Supervised Object Localization

Weakly supervised object localization is a challenging task in which the object of interest should be localized while learning its appearance. State-of-the-art methods recycle the architecture of a standard CNN by using the activation maps of the last layer for localizing the object. While this approach is simple and works relatively well, object localization relies on different features than classification; thus, a specialized localization mechanism is required during training to improve performance. In this paper, we propose a convolutional, multi-scale spatial localization network that provides accurate localization for the object of interest. Experimental results on the CUB-200-2011 and ImageNet datasets show competitive performance of our proposed approach on weakly supervised localization.

ESResNet: Environmental Sound Classification Based on Visual Domain Models

Andrey Guzhov, Federico Raue, Jörn Hees, Andreas Dengel

Auto-TLDR; Environmental Sound Classification with Short-Time Fourier Transform Spectrograms

Environmental Sound Classification (ESC) is an active research area in the audio domain and has seen a lot of progress in the past years. However, many of the existing approaches achieve high accuracy by relying on domain-specific features and architectures, making it harder to benefit from advances in other fields (e.g., the image domain). Additionally, some of the past successes have been attributed to a discrepancy in how results are evaluated (i.e., on unofficial splits of the UrbanSound8K (US8K) dataset), distorting the overall progression of the field. The contribution of this paper is twofold. First, we present a model that is inherently compatible with mono and stereo sound inputs. Our model is based on simple log-power Short-Time Fourier Transform (STFT) spectrograms and combines them with several well-known approaches from the image domain (i.e., ResNet, Siamese-like networks and attention). We investigate the influence of cross-domain pre-training and architectural changes, and evaluate our model on standard datasets. We find that our model outperforms all previously known approaches in a fair comparison by achieving accuracies of 97.0% (ESC-10), 91.5% (ESC-50) and 84.2% / 85.4% (US8K mono / stereo). Second, we provide a comprehensive overview of the actual state of the field by differentiating several previously reported results on the US8K dataset between official and unofficial splits. For better reproducibility, our code (including any re-implementations) is made available.
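
The input representation is straightforward to reproduce in PyTorch; in this sketch the FFT size, hop length and floor constant are assumed values, not necessarily those used in the paper.

```python
import torch

def log_power_stft(waveform: torch.Tensor, n_fft: int = 1024,
                   hop: int = 256, eps: float = 1e-10) -> torch.Tensor:
    """Log-power STFT spectrogram, the kind of image-like input that lets
    visual-domain backbones such as ResNet process audio.
    waveform: (channels, samples) -- mono or stereo."""
    window = torch.hann_window(n_fft)
    spec = torch.stft(waveform, n_fft=n_fft, hop_length=hop,
                      window=window, return_complex=True)
    power = spec.abs() ** 2                    # (channels, freq, frames)
    return torch.log(power + eps)              # one "image" per audio channel
```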

Verifying the Causes of Adversarial Examples

Honglin Li, Yifei Fan, Frieder Ganz, Tony Yezzi, Payam Barnaghi

Auto-TLDR; Exploring the Causes of Adversarial Examples in Neural Networks

The robustness of neural networks is challenged by adversarial examples that contain almost imperceptible perturbations to inputs, which mislead a classifier to incorrect outputs with high confidence. Limited by the extreme difficulty of examining a high-dimensional image space thoroughly, research on explaining and justifying the causes of adversarial examples falls behind studies on attacks and defenses. In this paper, we present a collection of potential causes of adversarial examples and verify (or partially verify) them through carefully designed controlled experiments. The major causes of adversarial examples include model linearity, the one-sum constraint, and the geometry of the categories. To control the effect of those causes, multiple techniques are applied, such as L2 normalization, replacement of loss functions, construction of reference datasets, and novel models using multi-layer perceptron probabilistic neural networks (MLP-PNN) and density estimation (DE). Our experimental results show that geometric factors tend to be more direct causes and statistical factors magnify the phenomenon, especially for assigning high prediction confidence. We hope this paper will inspire more studies to rigorously investigate the root causes of adversarial examples, which in turn can provide useful guidance on designing more robust models.

ResFPN: Residual Skip Connections in Multi-Resolution Feature Pyramid Networks for Accurate Dense Pixel Matching

Rishav, René Schuster, Ramy Battrawy, Oliver Wasenmüller, Didier Stricker

Auto-TLDR; Multi-Resolution Feature Pyramid Networks for Dense Pixel Matching

Dense pixel matching is required for many computer vision algorithms such as disparity, optical flow or scene flow estimation. Feature Pyramid Networks (FPN) have proven to be a suitable feature extractor for CNN-based dense matching tasks. FPN generates well localized and semantically strong features at multiple scales. However, the generic FPN is not utilizing its full potential, due to its reasonable but limited localization accuracy. Thus, we present ResFPN – a multiresolution feature pyramid network with multiple residual skip connections, where at any scale, we leverage the information from higher resolution maps for stronger and better localized features. In our ablation study we demonstrate the effectiveness of our novel architecture with clearly higher accuracy than FPN. In addition, we verify the superior accuracy of ResFPN in many different pixel matching applications on established datasets like KITTI, Sintel, and FlyingThings3D.

Kernel-based Graph Convolutional Networks

Hichem Sahbi

Auto-TLDR; Spatial Graph Convolutional Networks in Reproducing Kernel Hilbert Space

Learning graph convolutional networks (GCNs) is an emerging field which aims at generalizing deep learning to arbitrary non-regular domains. Most of the existing GCNs follow a neighborhood aggregation scheme, where the representation of a node is recursively obtained by aggregating its neighboring node representations using averaging or sorting operations. However, these operations are either ill-posed or too weak to be discriminant, or they increase the number of training parameters and thereby the computational complexity and the risk of overfitting. In this paper, we introduce a novel GCN framework that achieves spatial graph convolution in a reproducing kernel Hilbert space. The latter makes it possible to design, via implicit kernel representations, convolutional graph filters in a high-dimensional and more discriminating space without increasing the number of training parameters. The particularity of our GCN model also resides in its ability to achieve convolutions without explicitly realigning nodes in the receptive fields of the learned graph filters with those of the input graphs, thereby making convolutions permutation agnostic and well defined. Experiments conducted on the challenging task of skeleton-based action recognition show the superiority of the proposed method against different baselines as well as the related work.

InsideBias: Measuring Bias in Deep Networks and Application to Face Gender Biometrics

Ignacio Serna, Alejandro Peña Almansa, Aythami Morales, Julian Fierrez

Auto-TLDR; InsideBias: Detecting Bias in Deep Neural Networks from Face Images

This work explores the biases in learning processes based on deep neural network architectures. We analyze how bias affects deep learning processes through a toy example using the MNIST database and a case study in gender detection from face images. We employ two gender detection models based on popular deep neural networks and present a comprehensive analysis of how an unbalanced training dataset affects the features learned by the models. We show how bias impacts the activations of gender detection models based on face images. Finally, we propose InsideBias, a novel method to detect biased models. InsideBias is based on how models represent information instead of how they perform, which is the normal practice in other existing bias detection methods. Our strategy with InsideBias allows us to detect biased models with very few samples (only 15 images in our case study). Our experiments include 72K face images from 24K identities and 3 ethnic groups.

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

Jiaqi Luo, Zhicheng Zhao, Fei Su, Limei Guo

Auto-TLDR; Triplet-path Network for One-Stage Object Detection and Segmentation in Pathological Images

Deep learning has been widely applied in the field of medical image processing. However, compared with the flourishing visual tasks in natural images, the progress achieved in pathological images is not remarkable, and detection and segmentation, which are among the basic tasks of computer vision, are regarded as two independent tasks. In this paper, we make full use of existing datasets and construct a triplet-path network using dilated convolutions to cooperatively accomplish one-stage object detection and nuclei segmentation for general pathological images. First, to meet the requirements of detection and segmentation, a novel structure called triplet feature generation (TFG) is designed to extract high-resolution and multi-scale features, where features from different layers can be properly integrated. Second, considering that pathological datasets are usually small, a location-aware and partially truncated loss function is proposed to improve the classification accuracy of datasets with few images and widely varying targets. We compare the performance of both object detection and instance segmentation with state-of-the-art methods. Experimental results demonstrate the effectiveness and efficiency of the proposed network on two datasets collected from multiple organs.

A Novel Region of Interest Extraction Layer for Instance Segmentation

Leonardo Rossi, Akbar Karimi, Andrea Prati

Auto-TLDR; Generic RoI Extractor for Two-Stage Neural Network for Instance Segmentation

Given the wide diffusion of deep neural network architectures for computer vision tasks, several new applications are nowadays more and more feasible. Among them, particular attention has recently been given to instance segmentation, exploiting the results achievable by two-stage networks (such as Mask R-CNN or Faster R-CNN), derived from R-CNN. In these complex architectures, a crucial role is played by the Region of Interest (RoI) extraction layer, devoted to extracting a coherent subset of features from a single Feature Pyramid Network (FPN) layer attached on top of a backbone. This paper is motivated by the need to overcome the limitations of existing RoI extractors, which select only one (the best) layer from the FPN. Our intuition is that all the layers of the FPN retain useful information. Therefore, the proposed layer (called Generic RoI Extractor - GRoIE) introduces non-local building blocks and attention mechanisms to boost performance. A comprehensive ablation study at the component level is conducted to find the best set of algorithms and parameters for the GRoIE layer. Moreover, GRoIE can be integrated seamlessly with every two-stage architecture for both object detection and instance segmentation tasks. Therefore, the improvements brought by the use of GRoIE in different state-of-the-art architectures are also evaluated. The proposed layer yields gains of up to 1.1% AP on bounding box detection and 1.7% AP on instance segmentation. The code is publicly available on GitHub at https://github.com/IMPLabUniPr/mmdetection-groie

On the Minimal Recognizable Image Patch

Mark Fonaryov, Michael Lindenbaum

Auto-TLDR; MIRC: A Deep Neural Network for Minimal Recognition on Partially Occluded Images

In contrast to human vision, common recognition algorithms often fail on partially occluded images. We propose characterizing, empirically, the algorithmic limits by finding a minimal recognizable patch (MRP) that is by itself sufficient to recognize the image. A specialized deep network allows us to find the most informative patches of a given size, and serves as an experimental tool. A human vision study recently characterized related (but different) minimally recognizable configurations (MIRCs) [1], for which we specify computational analogues (denoted cMIRCs). The drop in human decision accuracy associated with size reduction of these MIRCs is substantial and sharp. Interestingly, such sharp reductions were also found for the computational versions we specified.

A Close Look at Deep Learning with Small Data

Lorenzo Brigato, Luca Iocchi

Auto-TLDR; Low-Complex Neural Networks for Small Data Conditions

In this work, we perform a wide variety of experiments with different deep learning architectures in small data conditions. We show that model complexity is a critical factor when only a few samples per class are available. In contrast to the literature, we improve the state of the art using low-complexity models. We show that standard convolutional neural networks with relatively few parameters are effective in this scenario. In many of our experiments, low-complexity models outperform state-of-the-art architectures. Moreover, we propose a novel network that uses an unsupervised loss to regularize its training. Such an architecture either improves the results or performs comparably to low-capacity networks. Surprisingly, experiments show that a dynamic data augmentation pipeline is not beneficial in this particular domain. Statically augmenting the dataset might be a promising research direction, while dropout maintains its role as a good regularizer.

Quaternion Capsule Networks

Barış Özcan, Furkan Kınlı, Mustafa Furkan Kirac

Auto-TLDR; Quaternion Capsule Networks for Object Recognition

Capsules are groupings of neurons that represent sophisticated information about a visual entity, such as pose and features. In view of this property, Capsule Networks outperform CNNs in challenging tasks like object recognition from unseen viewpoints, which is achieved by learning the transformations between an object and its parts with the help of a high-dimensional representation of pose information. In this paper, we present Quaternion Capsules (QCN), where the pose information of capsules and their transformations are represented by quaternions. Quaternions are immune to gimbal lock, have a straightforward regularization of the rotation representation for capsules, and require fewer parameters than matrices. The experimental results show that QCNs generalize better to novel viewpoints with fewer parameters, and also achieve on-par or better performance than the state-of-the-art Capsule architectures on well-known benchmark datasets.

Generalization Comparison of Deep Neural Networks Via Output Sensitivity

Mahsa Forouzesh, Farnood Salehi, Patrick Thiran

Auto-TLDR; Generalization of Deep Neural Networks using Sensitivity

Although recent works have brought some insights into the performance improvement of techniques used in state-of-the-art deep-learning models, more work is needed to understand their generalization properties. We shed light on this matter by linking the loss function to the output's sensitivity to its input. We find a rather strong empirical relation between the output sensitivity and the variance in the bias-variance decomposition of the loss function, which hints at using sensitivity as a metric for comparing the generalization performance of networks, without requiring labeled data. We find that sensitivity is decreased by applying popular methods that improve the generalization performance of the model, such as (1) using a deep network rather than a wide one, (2) adding convolutional layers to baseline classifiers instead of adding fully-connected layers, (3) using batch normalization, dropout and max-pooling, and (4) applying parameter initialization techniques.
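
As a rough illustration, a label-free sensitivity score can be computed from input gradients, as sketched below; the exact quantity analysed in the paper may be defined differently (e.g., via per-class Jacobians), so treat the function as an assumption.

```python
import torch

def output_sensitivity(model, x: torch.Tensor) -> torch.Tensor:
    """Average norm of the output's gradient w.r.t. the input -- a
    label-free proxy for output sensitivity."""
    x = x.clone().requires_grad_(True)
    out = model(x)                               # (B, num_classes)
    # Sum over outputs so a single backward pass yields d(out)/d(x).
    grad, = torch.autograd.grad(out.sum(), x)
    return grad.flatten(1).norm(dim=1).mean()
```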

Combined Invariants to Gaussian Blur and Affine Transformation

Jitka Kostkova, Jan Flusser, Matteo Pedone

Auto-TLDR; A new theory of combined moment invariants to Gaussian blur and spatial affine transformation

The paper presents a new theory of combined moment invariants to Gaussian blur and spatial affine transformation. The blur kernel may be arbitrarily oriented, scaled and elongated. No prior information about the kernel parameters or about the underlying affine transform is required. The main idea, expressed by the Substitution Theorem, is to substitute pure blur invariants into traditional affine moment invariants. Potential applications of the new descriptors are in blur-invariant image recognition and in robust template matching.

SiamMT: Real-Time Arbitrary Multi-Object Tracking

Lorenzo Vaquero, Manuel Mucientes, Victor Brea

Auto-TLDR; SiamMT: A Deep-Learning-based Arbitrary Multi-Object Tracking System for Video

Visual object tracking is of great interest in many applications, as it preserves the identity of an object throughout a video. However, while real applications demand systems capable of tracking multiple objects in real time, multi-object tracking solutions usually follow the tracking-by-detection paradigm; thus they depend on running a costly detector in each frame, and they do not allow the tracking of arbitrary objects, i.e., they require training for specific classes. In response to this need, this work presents the architecture of SiamMT, a system capable of efficiently applying individual visual tracking techniques to multiple objects in real time. This makes it the first deep-learning-based arbitrary multi-object tracker. To achieve this, we propose extracting global frame features with a fully-convolutional neural network, followed by cropping and resizing the different object search areas. The final similarity operation between these search areas and the target exemplars is carried out with an optimized pairwise cross-correlation. These novelties allow the system to track multiple targets in a scalable manner, achieving 25 fps with 60 simultaneous objects for VGA videos and 40 objects for HD720 videos, all with a tracking quality similar to SiamFC.
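
The pairwise cross-correlation can be expressed compactly as a grouped convolution in which each object's exemplar features act as the kernel for its own search area; this is a sketch of the general SiamFC-style operation, and SiamMT's actual optimization may differ.

```python
import torch
import torch.nn.functional as F

def pairwise_xcorr(search_feats: torch.Tensor,
                   exemplar_feats: torch.Tensor) -> torch.Tensor:
    """Cross-correlate each object's search-area features with its own
    exemplar, producing one response map per tracked object.
    search_feats:   (N, C, Hs, Ws)  -- one search area per object
    exemplar_feats: (N, C, He, We)  -- one exemplar per object
    returns:        (N, 1, Ho, Wo)  response maps
    """
    n, c, hs, ws = search_feats.shape
    # Grouped convolution computes N independent correlations in one call.
    out = F.conv2d(search_feats.reshape(1, n * c, hs, ws),
                   exemplar_feats, groups=n)
    return out.reshape(n, 1, out.shape[2], out.shape[3])
```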

Improved Residual Networks for Image and Video Recognition

Ionut Cosmin Duta, Li Liu, Fan Zhu, Ling Shao

Auto-TLDR; Residual Networks for Deep Learning

Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture, widely adopted and used in various tasks. In this work we propose an improved version of ResNets. Our proposed improvements address all three main components of a ResNet: the flow of information through the network layers, the residual building block, and the projection shortcut. We are able to show consistent improvements in accuracy and learning convergence over the baseline. For instance, on ImageNet dataset, using the ResNet with 50 layers, for top-1 accuracy we can report a 1.19% improvement over the baseline in one setting and around 2% boost in another. Importantly, these improvements are obtained without increasing the model complexity. Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues. We report results on three tasks over six datasets: image classification (ImageNet, CIFAR-10 and CIFAR-100), object detection (COCO) and video action recognition (Kinetics-400 and Something-Something-v2). In the deep learning era, we establish a new milestone for the depth of a CNN. We successfully train a 404-layer deep CNN on the ImageNet dataset and a 3002-layer network on CIFAR-10 and CIFAR-100, while the baseline is not able to converge at such extreme depths. Code is available at: https://github.com/iduta/iresnet

Light3DPose: Real-Time Multi-Person 3D Pose Estimation from Multiple Views

Alessio Elmi, Davide Mazzini, Pietro Tortella

Auto-TLDR; 3D Pose Estimation of Multiple People from a Few Calibrated Camera Views using Deep Learning

We present an approach to perform 3D pose estimation of multiple people from a few calibrated camera views. Our architecture, leveraging the recently proposed unprojection layer, aggregates feature-maps from a 2D pose estimator backbone into a comprehensive representation of the 3D scene. Such intermediate representation is then elaborated by a fully-convolutional volumetric network and a decoding stage to extract 3D skeletons with sub-voxel accuracy. Our method achieves state of the art MPJPE on the CMU Panoptic dataset using a few unseen views and obtains competitive results even with a single input view. We also assess the transfer learning capabilities of the model by testing it against the publicly available Shelf dataset obtaining good performance metrics. The proposed method is inherently efficient: as a pure bottom-up approach, it is computationally independent of the number of people in the scene. Furthermore, even though the computational burden of the 2D part scales linearly with the number of input views, the overall architecture is able to exploit a very lightweight 2D backbone which is orders of magnitude faster than the volumetric counterpart, resulting in fast inference time. The system can run at 6 FPS, processing up to 10 camera views on a single 1080Ti GPU.

PS^2-Net: A Locally and Globally Aware Network for Point-Based Semantic Segmentation

Na Zhao, Tat Seng Chua, Gim Hee Lee

Auto-TLDR; PS^2-Net: A Locally and Globally Aware Deep Learning Framework for Semantic Segmentation on 3D Point Clouds

In this paper, we present PS^2-Net, a locally and globally aware deep learning framework for semantic segmentation on 3D scene-level point clouds. In order to deeply incorporate local structures and global context to support 3D scene segmentation, our network is built on four repeatedly stacked encoders, where each encoder has two basic components: EdgeConv, which captures local structures, and NetVLAD, which models global context. Different from existing state-of-the-art methods for point-based scene semantic segmentation that either violate or do not achieve permutation invariance, our PS^2-Net is designed to be permutation invariant, which is an essential property of any deep network used to process unordered point clouds. We further provide a theoretical proof to guarantee the permutation invariance property of our network. We perform extensive experiments on two large-scale 3D indoor scene datasets and demonstrate that our PS^2-Net is able to achieve state-of-the-art performance as compared to existing approaches.

Transferable Model for Shape Optimization subject to Physical Constraints

Lukas Harsch, Johannes Burgbacher, Stefan Riedelbauch

Auto-TLDR; U-Net with Spatial Transformer Network for Flow Simulations

The interaction of neural networks with physical equations offers a wide range of applications. We provide a method which enables a neural network to transform objects subject to given physical constraints. To this end, a U-Net architecture is used to learn the underlying physical behaviour of fluid flows. The network is used to infer the solution of flow simulations, which is shown for a wide range of generic channel flow simulations. Physically meaningful quantities can be computed on the obtained solution, e.g., the total pressure difference or the forces on the objects. A Spatial Transformer Network with thin-plate splines is used for the interaction between the physical constraints and the geometric representation of the objects. Thus, a transformation from an initial to a target geometry is performed such that the object fulfils the given constraints. This method is fully differentiable, i.e., gradient information can be used for the transformation. This can be seen as an inverse design process. The advantage of this method over many other proposed methods is that the physical constraints are based on the inferred flow field solution. Thus, we can apply a transferable model to varying problem setups, which is not limited to a given set of geometry parameters or physical quantities.

Uncertainty Guided Recognition of Tiny Craters on the Moon

Thorsten Wilhelm, Christian Wöhler

Auto-TLDR; Accurately Detecting Tiny Craters in Remotely Sensed Images Using Deep Neural Networks

Accurately detecting craters in remotely sensed images is an important task when analysing the properties of planetary bodies. Commonly, only large craters in the range of several kilometres are detected. In this work we provide the first example of automatically detecting tiny craters in the range of several metres with the help of a deep neural network, using only a small set of annotated craters. Additionally, we propose a novel way to group overlapping detections and replace the commonly used non-maximum suppression with a probabilistic treatment. As a result, we obtain valuable uncertainty estimates of the detections, and the aggregated detections are shown to be vastly superior.

PSDNet: A Balanced Architecture of Accuracy and Parameters for Semantic Segmentation

Yue Liu, Zhichao Lian

Auto-TLDR; Pyramid Pooling Module with SE1Cblock and D2SUpsample Network (PSDNet)

In this paper, we present our Pyramid Pooling Module (PPM) with SE1Cblock and D2SUpsample Network (PSDNet), a novel architecture for accurate semantic segmentation. Starting from the well-known Pyramid Scene Parsing Network (PSPNet), PSDNet takes advantage of the pyramid pooling structure with a channel attention module and a feature transform module in the Pyramid Pooling Module (PPM). The PPM enhanced with these two components can strengthen the context information flowing in the network instead of damaging it. The channel attention module is an improved "Squeeze and Excitation with 1D Convolution" (SE1C) block which can explicitly model the interrelationship between channels with fewer parameters. We propose a feature transform module named "Depth to Space Upsampling" (D2SUpsample) in the PPM, which keeps the integrity of features by transforming features while interpolating them, at the same time reducing parameters. In addition, we introduce a joint strategy in the SE1C block which combines two variants of global pooling without increasing parameters. Compared with PSPNet, our work achieves higher accuracy on public datasets, with 73.97% mIoU and 82.89% mAcc on the Cityscapes dataset based on a ResNet50 backbone.
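
The channel-attention idea, squeeze-and-excitation with a 1D convolution across the channel descriptor, can be sketched as follows; the kernel size and the use of a single global average pooling are assumptions (the paper's joint strategy combines two global pooling variants).

```python
import torch
import torch.nn as nn

class SE1C(nn.Module):
    """Squeeze-and-excitation with a 1D convolution across channels,
    a sketch of the SE1C idea with an assumed kernel size."""
    def __init__(self, k: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                      # x: (B, C, H, W)
        # Squeeze: global average pooling to a per-channel descriptor.
        s = x.mean(dim=(2, 3))                 # (B, C)
        # Excite: model cross-channel interaction with a cheap 1D conv.
        w = torch.sigmoid(self.conv(s.unsqueeze(1))).squeeze(1)  # (B, C)
        return x * w[:, :, None, None]         # recalibrate channels
```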

Hierarchically Aggregated Residual Transformation for Single Image Super Resolution

Zejiang Hou, Sy Kung

Auto-TLDR; HARTnet: Hierarchically Aggregated Residual Transformation for Multi-Scale Super-resolution

Visual patterns usually appear at different scales/sizes in natural images. Multi-scale feature representation is of great importance for the single-image super-resolution (SISR) task to reconstruct image objects at different scales. However, this characteristic has rarely been considered by CNN-based SISR methods. In this work, we propose a novel building block, i.e., the hierarchically aggregated residual transformation (HART), to achieve multi-scale feature representation in each layer of the network. Within each HART block, we connect multiple convolutions in a hierarchical residual-like manner, which greatly expands the range of effective receptive fields and helps to detect image features at different scales. To theoretically understand the proposed HART block, we recast SISR as an optimal control problem and show that HART effectively approximates the classical 4th-order Runge-Kutta method, which has the merit of a small local truncation error for solving numerical ordinary differential equations. By cascading the proposed HART blocks, we establish our high-performing HARTnet. Compared with existing SR state-of-the-art methods (including those on the NTIRE 2019 SR Challenge leaderboard), the proposed HARTnet demonstrates consistent PSNR/SSIM performance improvements on various benchmark datasets under different degradation models. Moreover, HARTnet can efficiently restore more faithful high-resolution images than comparative SR methods (cf. Figure 1).

Interpolation in Auto Encoders with Bridge Processes

Carl Ringqvist, Henrik Hult, Judith Butepage, Hedvig Kjellstrom

Auto-TLDR; Stochastic interpolations from auto encoders trained on flattened sequences

Auto encoding models have been extensively studied in recent years. They provide an efficient framework for sample generation, as well as for analysing feature learning. Furthermore, they are efficient in performing interpolations between data points in semantically meaningful ways. In this paper, we introduce a method for generating sequence samples from auto encoders trained on flattened sequences (e.g., video samples from auto encoders trained to generate a single video frame), as well as a canonical, dimension-independent method for generating stochastic interpolations. The distribution of interpolation paths is represented as the distribution of a bridge process constructed from an artificial random data-generating process in the latent space, having the prior distribution as its invariant distribution.
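
In the simplest case, where the latent prior process is a Brownian motion, the stochastic interpolation path is a Brownian bridge pinned at the two latent codes. A sketch under that assumption (the paper's construction handles more general processes):

```python
import torch

def brownian_bridge(z0: torch.Tensor, z1: torch.Tensor,
                    steps: int = 10, sigma: float = 0.1) -> torch.Tensor:
    """Sample one stochastic interpolation path between two latent codes
    as a Brownian bridge pinned at z0 and z1."""
    path = [z0]
    z = z0
    for i in range(1, steps + 1):
        remaining = steps - i
        # Conditional mean of the bridge given the current point and z1.
        mean = z + (z1 - z) / (remaining + 1)
        # Noise shrinks to zero as the path approaches the endpoint.
        std = sigma * (remaining / (remaining + 1)) ** 0.5
        z = mean + std * torch.randn_like(z)
        path.append(z)
    return torch.stack(path)   # (steps + 1, latent_dim)
```

Each point on the sampled path is then pushed through the decoder to obtain the interpolated samples.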

Contextual Classification Using Self-Supervised Auxiliary Models for Deep Neural Networks

Sebastian Palacio, Philipp Engler, Jörn Hees, Andreas Dengel

Auto-TLDR; Self-Supervised Autogenous Learning for Deep Neural Networks

Classification problems solved with deep neural networks (DNNs) typically rely on a closed-world paradigm and optimize over a single objective (e.g., minimization of the cross-entropy loss). This setup dismisses all kinds of supporting signals that can be used to reinforce the existence or absence of particular patterns. The increasing need for models that are interpretable by design makes the inclusion of such contextual signals a crucial necessity. To this end, we introduce the notion of Self-Supervised Autogenous Learning (SSAL). An SSAL objective is realized through one or more additional targets that are derived from the original supervised classification task, following architectural principles found in multi-task learning. SSAL branches impose low-level priors into the optimization process (e.g., grouping). The ability to use SSAL branches during inference allows models to converge faster, focusing on a richer set of class-relevant features. We equip state-of-the-art DNNs with SSAL objectives and report consistent improvements for all of them on CIFAR100 and ImageNet. We show that SSAL models outperform similar state-of-the-art methods focused on contextual loss functions, auxiliary branches and hierarchical priors.

ResNet-Like Architecture with Low Hardware Requirements

Elena Limonova, Daniil Alfonso, Dmitry Nikolaev, Vladimir V. Arlazarov

Auto-TLDR; BM-ResNet: Bipolar Morphological ResNet for Image Classification

One of the most computationally intensive parts of modern recognition systems is the inference of deep neural networks that are used for image classification, segmentation, enhancement, and recognition. The growing popularity of edge computing makes us look for ways to reduce inference time on mobile and embedded devices. One way to decrease neural network inference time is to modify the neuron model to make it more efficient for computation on a specific device. An example of such a model is the bipolar morphological neuron, which is based on the idea of replacing multiplication with addition and maximum operations. This model has been demonstrated for simple image classification with LeNet-like architectures [1]. In this paper, we introduce a bipolar morphological ResNet (BM-ResNet) model obtained from a much more complex ResNet architecture by converting its layers to bipolar morphological ones. We apply BM-ResNet to image classification on the MNIST and CIFAR-10 datasets with only a moderate accuracy decrease, from 99.3% to 99.1% and from 85.3% to 85.1%, respectively. We also estimate the computational complexity of the resulting model. We show that for the majority of ResNet layers, the considered model requires 2.1-2.9 times fewer logic gates for implementation and 15-30% lower latency.
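
The core max-plus operation behind the bipolar morphological neuron can be sketched as a layer that replaces multiply-accumulate with add-maximum; the full BM neuron combines separate branches for the positive and negative parts, which this simplified sketch omits, and the initialization scale is an assumption.

```python
import torch
import torch.nn as nn

class MaxPlusLinear(nn.Module):
    """Morphological (max-plus) layer: multiplications are replaced by
    additions and sums by maxima, the multiplication-free operation at
    the heart of the bipolar morphological neuron."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)

    def forward(self, x):                       # x: (B, in_features)
        # Broadcast to (B, out_features, in_features), then max-reduce.
        return (x.unsqueeze(1) + self.weight.unsqueeze(0)).max(dim=2).values
```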

Rotation Invariant Aerial Image Retrieval with Group Convolutional Metric Learning

Hyunseung Chung, Woo-Jeoung Nam, Seong-Whan Lee

Auto-TLDR; Robust Remote Sensing Image Retrieval Using Group Convolution with Attention Mechanism and Metric Learning

Remote sensing image retrieval (RSIR) is the process of ranking database images by their degree of similarity to a query image. As the complexity of RSIR increases due to the diversity in shooting range, angle, and location of remote sensors, there is an increasing demand for methods to address these issues and improve retrieval performance. In this work, we introduce a novel method for retrieving aerial images by merging group convolution with an attention mechanism and metric learning, resulting in robustness to rotational variations. For refinement and emphasis on important features, we apply channel attention in each group convolution stage. By utilizing the characteristics of group convolution and channel-wise attention, it is possible to acknowledge the equality among rotated but identically located images. The training procedure has two main steps: (i) training the network with the Aerial Image Dataset (AID) for classification, and (ii) fine-tuning the network with a triplet loss for retrieval on the Google Earth South Korea and NWPU-RESISC45 datasets. Results show that the proposed method outperforms other state-of-the-art retrieval methods in both rotated and original environments. Furthermore, we utilize class activation maps (CAM) to visualize the distinct differences in main features between our method and the baseline, resulting in better adaptability in rotated environments.

Force Banner for the Recognition of Spatial Relations

Robin Deléarde, Camille Kurtz, Laurent Wendling, Philippe Dejean

Auto-TLDR; Spatial Relation Recognition using Force Banners

Studying the spatial organization of objects in images is fundamental to increase both the understanding of the sensed scene and the accuracy of the perceived similarity between images. This often leads to the problem of spatial relation recognition: given two objects depicted in an image, what is their spatial relation? In this article, we consider this as a classification problem. Instead of considering directly the original image space (or imaging features) to predict the spatial relation, we propose a novel intermediate representation (called Force Banner) modeling rich spatial information between pairs of objects composing a scene. Such a representation captures the relative position between objects using a panel of forces (attraction and repulsion), that take into account the structural shapes of the objects and their distance in a directional fashion. Force Banners are used to feed a classical 2D Convolutional Neural Network (CNN) for the recognition of spatial relations, benefiting from pre-trained models and fine-tuning. Experimental results obtained on a dataset of images with various shapes highlight the interest of this approach, and in particular its benefit to describe spatial information.

Dimensionality Reduction for Data Visualization and Linear Classification, and the Trade-Off between Robustness and Classification Accuracy

Martin Becker, Jens Lippel, Thomas Zielke

Auto-TLDR; Robustness Assessment of Deep Autoencoder for Data Visualization using Scatter Plots

This paper has three intertwined goals. The first is to introduce a new similarity measure for scatter plots. It uses Delaunay triangulations to compare two scatter plots regarding their relative positioning of clusters. The second is to apply this measure for the robustness assessment of a recent deep neural network (DNN) approach to dimensionality reduction (DR) for data visualization. It uses a nonlinear generalization of Fisher's linear discriminant analysis (LDA) as the encoder network of a deep autoencoder (DAE). The DAE's decoder network acts as a regularizer. The third goal is to look at different variants of the DNN: ones that promise robustness and ones that promise high classification accuracies. This is to study the trade-off between these two objectives -- our results support the recent claim that robustness may be at odds with accuracy; however, results that are balanced regarding both objectives are achievable. We see a restricted Boltzmann machine (RBM) pretraining and the DAE based regularization as important building blocks for achieving balanced results. As a means of assessing the robustness of DR methods, we propose a measure that is based on our similarity measure for scatter plots. The robustness measure comes with a superimposition view of Delaunay triangulations, which allows a fast comparison of results from multiple DR methods.

Learning a Dynamic High-Resolution Network for Multi-Scale Pedestrian Detection

Mengyuan Ding, Shanshan Zhang, Jian Yang

Auto-TLDR; Learnable Dynamic HRNet for Pedestrian Detection

Pedestrian detection is a canonical instance of object detection in computer vision. In practice, scale variation is one of the key challenges, resulting in unbalanced performance across different scales. Recently, the High-Resolution Network (HRNet) has become popular because high-resolution feature representations are more friendly to small objects. However, when we apply HRNet for pedestrian detection, we observe that it improves for small pedestrians on one hand, but hurts the performance for larger ones on the other hand. To overcome this problem, we propose a learnable Dynamic HRNet (DHRNet) aiming to generate different network paths adaptive to different scales. Specifically, we construct a parallel multi-branch architecture and add a soft conditional gate module allowing for dynamic feature fusion. Both branches share all the same parameters except the soft gate module. Experimental results on CityPersons and Caltech benchmarks indicate that our proposed dynamic HRNet is more capable of dealing with pedestrians of various scales, and thus improves the performance across different scales consistently.

Feature-Dependent Cross-Connections in Multi-Path Neural Networks

Dumindu Tissera, Kasun Vithanage, Rukshan Wijesinghe, Kumara Kahatapitiya, Subha Fernando, Ranga Rodrigo

Auto-TLDR; Multi-path Networks for Adaptive Feature Extraction

Learning a particular task from a dataset whose samples originate from diverse contexts is challenging, and is usually addressed by deepening or widening standard neural networks. As opposed to conventional network widening, multi-path architectures restrict the quadratic increase in complexity to a linear scale. However, existing multi-column/path networks or model ensembling methods do not consider any feature-dependent allocation of parallel resources, and therefore tend to learn redundant features. Given a layer in a multi-path network, if we restrict each path to learn a context-specific set of features and introduce a mechanism to intelligently allocate incoming feature maps to such paths, each path can specialize in a certain context, reducing the redundancy and improving the quality of the extracted features. This eventually leads to better-optimized usage of parallel resources. To do this, we propose inserting feature-dependent cross-connections between parallel sets of feature maps in successive layers. The weights of these cross-connections are learned based on the input features of the particular layer. Our multi-path networks show improved image recognition accuracy at a similar complexity compared to conventional and state-of-the-art methods for deepening, widening and adaptive feature extraction, on both small and large scale datasets.

Aggregating Object Features Based on Attention Weights for Fine-Grained Image Retrieval

Hongli Lin, Yongqi Song, Zixuan Zeng, Weisheng Wang

Auto-TLDR; DSAW: Unsupervised Dual-selection for Fine-Grained Image Retrieval

Object localization and local feature representation are key issues in fine-grained image retrieval. However, the existing unsupervised methods still need to be improved in these two aspects. To address these issues in a unified framework, a novel unsupervised scheme, named DSAW for short, is presented in this paper. Firstly, we propose a dual-selection (DS) method, which achieves more accurate object localization by using an adaptive threshold method to perform feature selection on the local and global activation maps in turn. Secondly, a novel and faster self-attention weights (AW) method is developed to weight local features by measuring their importance in the global context. Finally, we evaluated the performance of the proposed method on five fine-grained image datasets, and the results show that our DSAW outperforms the existing best methods.

On Resource-Efficient Bayesian Network Classifiers and Deep Neural Networks

Wolfgang Roth, Günther Schindler, Holger Fröning, Franz Pernkopf

Auto-TLDR; Quantization-Aware Bayesian Network Classifiers for Small-Scale Scenarios

We present two methods to reduce the complexity of Bayesian network (BN) classifiers. First, we introduce quantization-aware training using the straight-through gradient estimator to quantize the parameters of BNs to few bits. Second, we extend a recently proposed differentiable tree-augmented naive Bayes (TAN) structure learning approach to also consider the model size. Both methods are motivated by recent developments in the deep learning community, and they provide effective means to trade off between model size and prediction accuracy, which is demonstrated in extensive experiments. Furthermore, we contrast quantized BN classifiers with quantized deep neural networks (DNNs) for small-scale scenarios which have hardly been investigated in the literature. We show Pareto optimal models with respect to model size, number of operations, and test error and find that both model classes are viable options.
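
The first ingredient, quantization-aware training with the straight-through gradient estimator, follows a standard pattern sketched below; the bit width and the [-1, 1] value range are assumptions, and the paper applies the idea to Bayesian network parameters rather than the generic tensor shown here.

```python
import torch

class STEQuantize(torch.autograd.Function):
    """Quantize to a fixed number of levels in the forward pass, but let
    gradients pass through unchanged (straight-through estimator)."""
    @staticmethod
    def forward(ctx, w, num_bits: int = 4):
        levels = 2 ** num_bits - 1
        w_clamped = w.clamp(-1.0, 1.0)
        # Map [-1, 1] onto integer levels and back.
        return torch.round((w_clamped + 1) / 2 * levels) / levels * 2 - 1

    @staticmethod
    def backward(ctx, grad_output):
        # Identity gradient: pretend quantization was differentiable.
        return grad_output, None

# During training, use STEQuantize.apply(params) wherever the quantized
# parameters enter the computation; the full-precision parameters receive
# the gradients and are updated as usual.
```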

Context-Aware Residual Module for Image Classification

Jing Bai, Ran Chen

Auto-TLDR; Context-Aware Residual Module for Image Classification

Attention modules have achieved great success in numerous vision tasks. However, existing visual attention modules generally consider the features of a single scale and cannot make full use of their multi-scale contextual information. Meanwhile, multi-scale spatial feature representation has demonstrated outstanding performance in a wide range of applications. However, multi-scale features are always represented in a layer-wise manner, i.e., it is impossible to know their contextual information at a granular level. Focusing on the above issue, a context-aware residual module for image classification is proposed in this paper. It consists of a novel multi-scale channel attention module, MSCAM, to learn refined channel weights by considering the visual features of its own scale and its surrounding fields, and a multi-scale spatial aware module, MSSAM, to further capture more spatial information. Either or both of the two modules can be plugged into any CNN-based backbone image classification architecture with a short residual connection to obtain context-aware enhanced features. Experiments on public image recognition datasets, including CIFAR10, CIFAR100, Tiny-ImageNet and ImageNet, consistently demonstrate that our proposed modules significantly outperform widely used state-of-the-art methods, e.g., ResNet and the lightweight networks MobileNet and SqueezeNet.

CQNN: Convolutional Quadratic Neural Networks

Pranav Mantini, Shishir Shah

Auto-TLDR; Quadratic Neural Network for Image Classification

Image classification is a fundamental task in computer vision. A variety of deep learning models based on the Convolutional Neural Network (CNN) architecture have proven to be an efficient solution. Numerous improvements have been proposed over the years, where broader, deeper, and denser networks have been constructed. However, the atomic operation for these models has remained a linear unit (single neuron). In this work, we pursue an alternative dimension by hypothesizing the atomic operation to be performed by a quadratic unit. We construct convolutional layers using quadratic neurons for feature extraction and subsequently use dense layers for classification. We perform an analysis to quantify the implications of replacing linear neurons with quadratic units. Results show a clear improvement in classification accuracy with quadratic neurons over linear neurons.
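
One common parameterization of a quadratic unit, lifted to a convolutional layer, multiplies two linear filter responses and adds a third linear term; the paper's exact quadratic form may differ, so the sketch below is an assumption-laden illustration.

```python
import torch.nn as nn

class QuadraticConv2d(nn.Module):
    """Convolutional layer built from quadratic units: the response is a
    product of two linear filterings plus a linear term."""
    def __init__(self, in_ch, out_ch, k=3, padding=1):
        super().__init__()
        self.conv_a = nn.Conv2d(in_ch, out_ch, k, padding=padding)
        self.conv_b = nn.Conv2d(in_ch, out_ch, k, padding=padding)
        self.conv_c = nn.Conv2d(in_ch, out_ch, k, padding=padding)

    def forward(self, x):
        # (w_a * x + b_a) . (w_b * x + b_b) + (w_c * x + b_c)
        return self.conv_a(x) * self.conv_b(x) + self.conv_c(x)
```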

Recursive Convolutional Neural Networks for Epigenomics

Aikaterini Symeonidi, Anguelos Nicolaou, Frank Johannes, Vincent Christlein

Auto-TLDR; Recursive Convolutional Neural Networks for Epigenomic Data Analysis

Deep learning has proven very promising for the analysis of genomic and epigenomic data. In this paper we introduce Recursive Convolutional Neural Networks (RCNNs) as a tool for epigenomic data analysis. We focus on the task of predicting gene expression from the intensity of histone modifications. The proposed RCNN architecture can be applied to data of arbitrary size and has a single meta-parameter that quantifies the model's capacity, making it flexible to experiment with. The proposed architecture outperforms state-of-the-art systems while having several orders of magnitude fewer parameters.
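One way to realize such a weight-shared recursive architecture is sketched below: a single 1-D convolutional block is applied repeatedly, halving the sequence length at each step, so inputs of arbitrary size collapse to a fixed-size representation. The channel width plays the role of the single capacity meta-parameter; the paper's actual block may differ, and all names are illustrative.

import torch
import torch.nn as nn

class RecursiveConvNet(nn.Module):
    # The same block (shared weights) is applied until the sequence collapses,
    # so inputs of any length are handled; `channels` controls the capacity.
    def __init__(self, channels, n_classes):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool1d(2),
        )
        self.head = nn.Linear(channels, n_classes)

    def forward(self, x):                 # x: (batch, channels, length)
        while x.shape[-1] > 1:
            x = self.block(x)             # identical parameters at every depth
        return self.head(x.squeeze(-1))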

Can Data Placement Be Effective for Neural Networks Classification Tasks? Introducing the Orthogonal Loss

Brais Cancela, Veronica Bolon-Canedo, Amparo Alonso-Betanzos

Auto-TLDR; Spatial Placement for Neural Network Training Loss Functions

Traditionally, Neural Network classification training loss functions follow the same principle: minimizing the distance between samples that belong to the same class, while maximizing the distance to the other classes, with no restrictions on the spatial placement of the deep features (the last layer's input). This paper addresses this issue by providing a set of loss functions that train a classifier while forcing the deep features to be projected onto a predefined orthogonal basis. Experimental results show that these `data placement' functions can surpass the training accuracy obtained with the classic cross-entropy loss function.
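One plausible instantiation of such a `data placement' loss is sketched below: each class is assigned a fixed axis of an orthonormal basis, and the normalized deep features are pulled toward their class's axis. This is only a reading of the abstract, not the paper's exact formulation, and it assumes the feature dimension is at least the number of classes.

import torch
import torch.nn.functional as F

def orthogonal_placement_loss(features, labels):
    # Fixed orthonormal targets: class k is placed on the k-th coordinate axis.
    basis = torch.eye(features.shape[1], device=features.device)
    targets = basis[labels]                       # (batch, feature_dim)
    return F.mse_loss(F.normalize(features, dim=1), targets)

feats = torch.randn(8, 64, requires_grad=True)    # deep features, dim >= #classes
labels = torch.randint(0, 10, (8,))
orthogonal_placement_loss(feats, labels).backward()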

Ordinal Depth Classification Using Region-Based Self-Attention

Minh Hieu Phan, Son Lam Phung, Abdesselam Bouzerdoum

Auto-TLDR; Region-based Self-Attention for Multi-scale Depth Estimation from a Single 2D Image

Depth estimation from a single 2D image has been widely applied in 3D understanding, 3D modelling and robotics. It is challenging because reliable cues (e.g. stereo correspondences and motions) are not available. Most modern approaches exploit multi-scale feature extraction to provide more powerful representations for deep networks. However, these studies have not focused on how to effectively fuse the learned multi-scale features. This paper proposes a novel region-based self-attention (rSA) module. The rSA recalibrates the multi-scale responses by explicitly modelling the interdependency between channels in separate image regions. We discretize continuous depths to pose an ordinal depth classification problem in which the relative order between categories is significant. We contribute a dataset of 4410 RGB-D images captured in outdoor environments at the University of Wollongong's campus. In our experiments, the proposed module improves lightweight models on small-sized datasets by 22%-40%.
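The ordinal-classification part can be made concrete with the standard encoding in which bit k of the target is 1 iff the depth exceeds the k-th bin edge, so the relative order of bins is preserved by the loss. The sketch below uses illustrative bin edges; the paper's discretization and the rSA module itself are not reproduced here.

import torch

def ordinal_targets(depth, bin_edges):
    # Bit k = 1 iff depth > bin_edges[k]; misclassifying a far bin costs
    # more than a near one, respecting the ordinal structure.
    return (depth.unsqueeze(1) > bin_edges.unsqueeze(0)).float()

edges = torch.tensor([1.0, 2.0, 4.0, 8.0])            # metres, illustrative
print(ordinal_targets(torch.tensor([0.5, 3.0, 9.0]), edges))
# tensor([[0., 0., 0., 0.],
#         [1., 1., 0., 0.],
#         [1., 1., 1., 1.]])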

ScarfNet: Multi-Scale Features with Deeply Fused and Redistributed Semantics for Enhanced Object Detection

Jin Hyeok Yoo, Dongsuk Kum, Jun Won Choi

Auto-TLDR; Semantic Fusion of Multi-scale Feature Maps for Object Detection

Convolutional neural networks (CNNs) have driven significant progress in object detection research. To detect objects of various sizes, object detectors often exploit the hierarchy of multiscale feature maps called feature pyramids, which are readily obtained from the CNN architecture. However, the performance of these detectors is limited because the bottom-level feature maps, which pass through fewer convolutional layers, lack the semantic information needed to capture the characteristics of small objects. To address this problem, various methods have been proposed to increase the depth of the bottom-level features used for object detection. While most approaches generate additional features through a top-down pathway with lateral connections, our approach directly fuses multi-scale feature maps using a bidirectional long short-term memory (biLSTM), leveraging its gating functions and parameter sharing to generate deeply fused semantics. The resulting semantic information is redistributed to the individual pyramidal features at each scale through a channel-wise attention model. We integrate our semantic combining and attentive redistribution feature network (ScarfNet) with the baseline object detectors, i.e., Faster R-CNN, the single-shot multibox detector (SSD), and RetinaNet. Experimental results show that our method offers a significant performance gain over the baseline detectors and outperforms competing multiscale fusion methods on the PASCAL VOC and COCO detection benchmarks.
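A minimal sketch of the biLSTM fusion idea, under the assumption that the pyramid levels are first resampled to a common resolution and the scale axis is treated as the biLSTM's sequence dimension. The channel-wise attention redistribution is omitted, the output channel count doubles (a 1x1 convolution could map it back), and all names are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleBiLSTMFusion(nn.Module):
    # Shared biLSTM weights and gating fuse information across scales at
    # every spatial location; each level receives a deeply fused map back.
    def __init__(self, channels):
        super().__init__()
        self.lstm = nn.LSTM(channels, channels, bidirectional=True, batch_first=True)

    def forward(self, pyramid):               # list of (b, c, h_i, w_i) maps
        b, c, h, w = pyramid[0].shape         # align every level to level 0
        seq = torch.stack(
            [F.interpolate(p, size=(h, w), mode='nearest') for p in pyramid], dim=1
        )                                     # (b, n_scales, c, h, w)
        n = seq.shape[1]
        seq = seq.permute(0, 3, 4, 1, 2).reshape(b * h * w, n, c)
        fused, _ = self.lstm(seq)             # (b*h*w, n_scales, 2c)
        fused = fused.reshape(b, h, w, n, 2 * c).permute(0, 3, 4, 1, 2)
        return [fused[:, i] for i in range(n)]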

On the Information of Feature Maps and Pruning of Deep Neural Networks

Mohammadreza Soltani, Suya Wu, Jie Ding, Robert Ravier, Vahid Tarokh

Auto-TLDR; Compressing Deep Neural Models Using Mutual Information

A technique for compressing deep neural models that achieves performance competitive with state-of-the-art methods is proposed. The approach uses the mutual information between the feature maps and the output of the model to prune redundant layers of the network. Extensive numerical experiments on the CIFAR-10, CIFAR-100, and Tiny ImageNet data sets demonstrate that the proposed method is effective in compressing deep models, both in terms of the number of parameters and the number of operations. For instance, applying the proposed approach to a DenseNet model with 0.77 million parameters and 293 million operations for classification of CIFAR-10 yields reductions of 62.66% in the number of parameters and 41.00% in the number of operations, while increasing the test error by less than 1%.
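The ranking signal can be approximated with a simple histogram estimate of the mutual information between a per-sample feature summary and the labels, as sketched below in NumPy. The paper's estimator and pruning rule are likely more refined, so this is only an assumption-labelled stand-in.

import numpy as np

def mutual_information(feat, labels, n_bins=16):
    # Histogram estimate of I(feature summary; label). `feat` is one scalar
    # per sample, e.g. the mean activation of a layer's feature map.
    edges = np.histogram_bin_edges(feat, bins=n_bins)
    f = np.digitize(feat, edges)
    joint = np.zeros((n_bins + 2, labels.max() + 1))
    for fi, yi in zip(f, labels):
        joint[fi, yi] += 1
    joint /= joint.sum()
    pf = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pf @ py)[nz])).sum())

# Layers whose feature maps carry little information about the model's
# output are candidates for pruning.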

Filtered Batch Normalization

András Horváth, Jalal Al-Afandi

Auto-TLDR; Batch Normalization with Out-of-Distribution Activations in Deep Neural Networks

It is a common assumption that the activations of different layers in neural networks follow a Gaussian distribution. This distribution can be transformed using normalization techniques, such as batch normalization, increasing convergence speed and improving accuracy. In this paper we demonstrate that activations do not necessarily follow a Gaussian distribution in all layers: neurons in deeper layers are more and more specific, which can result in extremely large, out-of-distribution activations. We demonstrate that filtering out these activations yields more consistent mean and variance estimates for batch normalization during training, which can further improve convergence speed and yield higher validation accuracy.
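A minimal sketch of the filtering step, assuming out-of-distribution activations are those lying more than a fixed number of standard deviations from the per-channel batch mean; the paper's filtering criterion may differ, and the threshold value is illustrative.

import torch

def filtered_batch_stats(x, threshold=3.0):
    # Per-channel batch statistics with extreme activations masked out,
    # yielding more consistent means and variances for normalization.
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    std = x.std(dim=(0, 2, 3), keepdim=True)
    mask = (x - mean).abs() <= threshold * std
    count = mask.sum(dim=(0, 2, 3), keepdim=True).clamp(min=1)
    f_mean = torch.where(mask, x, torch.zeros_like(x)).sum(dim=(0, 2, 3), keepdim=True) / count
    f_var = torch.where(mask, (x - f_mean) ** 2, torch.zeros_like(x)).sum(dim=(0, 2, 3), keepdim=True) / count
    return f_mean, f_var

x = torch.randn(32, 16, 8, 8)                 # (batch, channels, h, w)
mean, var = filtered_batch_stats(x)
x_norm = (x - mean) / (var + 1e-5).sqrt()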

Revisiting the Training of Very Deep Neural Networks without Skip Connections

Oyebade Kayode Oyedotun, Abd El Rahman Shabayek, Djamila Aouada, Bjorn Ottersten

Auto-TLDR; Optimization of Very Deep PlainNets without shortcut connections with 'vanishing and exploding units' activations'

Deep neural networks (DNNs) with many layers of feature representations yield state-of-the-art results on several difficult learning tasks. However, optimizing very deep DNNs without shortcut connections, known as PlainNets, is a notoriously hard problem. Considering the growing interest in this area, this paper holistically investigates two scenarios that plague the training of very deep PlainNets: (1) the relatively well-known challenge of vanishing and exploding unit activations, and (2) the less investigated 'singularity' problem, which is studied in detail in this paper. In contrast to earlier works that study only the saturation and explosion of unit activations in isolation, this paper addresses the inconspicuous coexistence of these problems in very deep PlainNets. In particular, we argue that the problems must be tackled simultaneously for the successful training of very deep PlainNets. Finally, different techniques for tackling this optimization problem are discussed, and a specific combination of simple techniques that allows the successful training of PlainNets with up to 100 layers is demonstrated.

A CNN-RNN Framework for Image Annotation from Visual Cues and Social Network Metadata

Tobia Tesan, Pasquale Coscia, Lamberto Ballan

Auto-TLDR; Context-Based Image Annotation with Multiple Semantic Embeddings and Recurrent Neural Networks

Images are a commonly used form of visual communication among people. Nevertheless, image classification may be challenging when dealing with unclear or uncommon images that need more context to be correctly annotated. Metadata accompanying images on social media are an ideal source of additional information for retrieving proper neighborhoods that ease the image annotation task. To this end, we blend visual features extracted from neighbors with their metadata to jointly leverage context and visual cues. Our models use multiple semantic embeddings to achieve the dual objective of being robust to vocabulary changes between train and test sets and of decoupling the architecture from the low-level metadata representation. Convolutional and recurrent neural networks (CNNs-RNNs) are jointly adopted to infer similarity among neighbors and query images. We perform comprehensive experiments on the NUS-WIDE dataset showing that our models outperform state-of-the-art architectures based on images and metadata, and decrease both the sensory and the semantic gap to better annotate images.