ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Generalized Local Attention Pooling for Deep Metric Learning

Carlos Roig Mari, David Varas, Issey Masuda, Juan Carlos Riveiro, Elisenda Bou-Balust

Deep metric learning has been key to recent advances in face verification and image retrieval, among other tasks. These systems consist of a feature extraction block (which extracts feature maps from images), followed by a spatial dimensionality reduction block (which generates compact image representations from the feature maps) and an embedding generation module (which projects the image representation into the embedding space). While research on deep metric learning has focused on improving the losses for the embedding generation module, the dimensionality reduction block has been overlooked. In this work, we propose a novel method for generating compact image representations, named Generalized Local Attention Pooling (GLAP), which uses local spatial information through an attention mechanism. Instead of being placed at the final layer of the backbone, this method is connected at an intermediate level, resulting in lower memory requirements. We assess its performance by comparing it with multiple dimensionality reduction techniques, demonstrating the importance of using attention weights to generate robust compact image representations. Moreover, we compare the performance of multiple state-of-the-art losses using the standard deep metric learning system against the same experiments with GLAP. Experiments show that the proposed GLAP mechanism outperforms other pooling methods when evaluated with current state-of-the-art losses for deep metric learning.
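
The pipeline described in the abstract (feature maps, then spatial pooling, then an embedding) can be illustrated with a toy example. The sketch below is not the authors' GLAP implementation, whose details are not given on this page. It is a generic attention-weighted spatial pooling, set against plain average pooling, under stated assumptions: the names `attention_pool`, `average_pool`, `l2_normalize`, the scoring vector `w`, and the toy feature values are all hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, w):
    """Pool N local descriptors (each a list of C floats) into one C-dim vector.

    `w` is a hypothetical scoring vector (assumed to be learned elsewhere);
    each location's score is its dot product with `w`.
    """
    scores = [sum(f_c * w_c for f_c, w_c in zip(f, w)) for f in features]
    weights = softmax(scores)  # one attention weight per spatial location
    channels = len(features[0])
    pooled = [sum(a * f[c] for a, f in zip(weights, features))
              for c in range(channels)]
    return pooled, weights

def average_pool(features):
    # Baseline: every spatial location contributes equally.
    n = len(features)
    return [sum(f[c] for f in features) / n for c in range(len(features[0]))]

def l2_normalize(v):
    # Common final step before comparing embeddings by distance.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

# Toy 2x2 feature map flattened to 4 locations with 3 channels each.
features = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [2.0, 2.0, 0.0],
]
w = [1.0, 1.0, -1.0]  # illustrative values only

pooled, weights = attention_pool(features, w)
embedding = l2_normalize(pooled)
```

With an all-zero scoring vector the attention weights become uniform and the result reduces to average pooling, so in this sketch average pooling is a special case of the attention-weighted form.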

Similar papers

Multi-Level Deep Learning Vehicle Re-Identification Using Ranked-Based Loss Functions

Eleni Kamenou, Jesus Martinez-Del-Rincon, Paul Miller, Patricia Devlin-Hill

Auto-TLDR; Multi-Level Re-identification Network for Vehicle Re-Identification

Aggregating Object Features Based on Attention Weights for Fine-Grained Image Retrieval

Rotation Invariant Aerial Image Retrieval with Group Convolutional Metric Learning

Building Computationally Efficient and Well-Generalizing Person Re-Identification Models with Metric Learning

Nonlinear Ranking Loss on Riemannian Potato Embedding

Deep Top-Rank Counter Metric for Person Re-Identification

Loop-closure detection by LiDAR scan re-identification

Adaptive L2 Regularization in Person Re-Identification

Not 3D Re-ID: Simple Single Stream 2D Convolution for Robust Video Re-Identification

Progressive Learning Algorithm for Efficient Person Re-Identification

Semantic Bilinear Pooling for Fine-Grained Recognition

Attention-Based Deep Metric Learning for Near-Duplicate Video Retrieval

SL-DML: Signal Level Deep Metric Learning for Multimodal One-Shot Action Recognition

Self and Channel Attention Network for Person Re-Identification

Multi-Order Feature Statistical Model for Fine-Grained Visual Categorization

FatNet: A Feature-Attentive Network for 3D Point Cloud Processing

Dual-Attention Guided Dropblock Module for Weakly Supervised Object Localization

Learning Embeddings for Image Clustering: An Empirical Study of Triplet Loss Approaches

Local Propagation for Few-Shot Learning

3D Facial Matching by Spiral Convolutional Metric Learning and a Biometric Fusion-Net of Demographic Properties

Attentive Part-Aware Networks for Partial Person Re-Identification

Multi-Label Contrastive Focal Loss for Pedestrian Attribute Recognition

Audio-Based Near-Duplicate Video Retrieval with Audio Similarity Learning

Equation Attention Relationship Network (EARN): A Geometric Deep Metric Framework for Learning Similar Math Expression Embedding

Temporally Coherent Embeddings for Self-Supervised Video Representation Learning

Rethinking ReID: Multi-Feature Fusion Person Re-Identification Based on Orientation Constraints

Top-DB-Net: Top DropBlock for Activation Enhancement in Person Re-Identification

One-Shot Representational Learning for Joint Biometric and Device Authentication

TAAN: Task-Aware Attention Network for Few-Shot Classification

Contextual Classification Using Self-Supervised Auxiliary Models for Deep Neural Networks

Augmented Bi-Path Network for Few-Shot Learning

Multi-Scale Cascading Network with Compact Feature Learning for RGB-Infrared Person Re-Identification

Multi-Attribute Learning with Highly Imbalanced Data

Recognizing Bengali Word Images - A Zero-Shot Learning Perspective

3D Attention Mechanism for Fine-Grained Classification of Table Tennis Strokes Using a Twin Spatio-Temporal Convolutional Neural Networks

Attention As Activation

RGB-Infrared Person Re-Identification Via Image Modality Conversion

DFH-GAN: A Deep Face Hashing with Generative Adversarial Network

SSDL: Self-Supervised Domain Learning for Improved Face Recognition

Attention Pyramid Module for Scene Recognition

DAIL: Dataset-Aware and Invariant Learning for Face Recognition

Batch-Incremental Triplet Sampling for Training Triplet Networks Using Bayesian Updating Theorem

Efficient-Receptive Field Block with Group Spatial Attention Mechanism for Object Detection

Few-Shot Few-Shot Learning and the Role of Spatial Attention

Learnable Higher-Order Representation for Action Recognition

Automated Whiteboard Lecture Video Summarization by Content Region Detection and Representation

Self-Supervised Joint Encoding of Motion and Appearance for First Person Action Recognition

Exploiting Knowledge Embedded Soft Labels for Image Recognition