ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Two-Level Attention-Based Fusion Learning for RGB-D Face Recognition

Hardik Uppal, Alireza Sepas-Moghaddam, Michael Greenspan, Ali Etemad

Auto-TLDR; Fused RGB-D Facial Recognition using Attention-Aware Feature Fusion

With recent advances in RGB-D sensing technologies as well as improvements in machine learning and fusion techniques, RGB-D facial recognition has become an active area of research. A novel attention-aware method is proposed to fuse two image modalities, RGB and depth, for enhanced RGB-D facial recognition. The proposed method first extracts features from both modalities using a convolutional feature extractor. These features are then fused using a two-layer attention mechanism. The first layer focuses on the fused feature maps generated by the feature extractor, exploiting the relationship between feature maps using LSTM recurrent learning. The second layer focuses on the spatial features of those maps using convolution. The training database is preprocessed and augmented through a set of geometric transformations, and the learning process is further aided using transfer learning from a pure 2D RGB image training process. Comparative evaluations demonstrate that the proposed method outperforms other state-of-the-art approaches, including both traditional and deep neural network-based methods, on the challenging CurtinFaces and IIIT-D RGB-D benchmark databases, achieving classification accuracies of over 98.2% and 99.3%, respectively. The proposed attention mechanism is also compared with other attention mechanisms, demonstrating more accurate results.
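
The two attention levels described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give layer sizes, pooling, or the form of the convolution, so everything below is an assumption made for illustration. In particular, the feature maps are random stand-ins for the output of a convolutional extractor, fusion is assumed to be channel-wise concatenation, the LSTM reads one flattened feature map per step, and the spatial level uses a single 1x1 convolution. The names init_params, lstm_hidden_states, and two_level_attention are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_maps, map_size, hidden=8, seed=0):
    """Random, untrained weights (illustrative only)."""
    rng = np.random.default_rng(seed)
    s = 0.1
    return {
        # LSTM weights for the four gates stacked as [input, forget, cell, output].
        "lstm": (s * rng.standard_normal((4 * hidden, map_size)),
                 s * rng.standard_normal((4 * hidden, hidden)),
                 np.zeros(4 * hidden)),
        "v": s * rng.standard_normal(hidden),     # hidden state -> per-map score
        "conv": s * rng.standard_normal(n_maps),  # 1x1 convolution kernel
    }

def lstm_hidden_states(seq, lstm_params):
    """Run a single-layer LSTM over seq (T, D); return all hidden states (T, Hd)."""
    W, U, b = lstm_params
    hd = U.shape[1]
    h = np.zeros(hd)
    c = np.zeros(hd)
    states = []
    for x in seq:
        z = W @ x + U @ h + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    return np.stack(states)

def two_level_attention(rgb_feat, depth_feat, params):
    """rgb_feat, depth_feat: (C, H, W) feature maps from a (hypothetical) extractor."""
    # Assumed fusion: concatenate the two modalities along the channel axis.
    fused = np.concatenate([rgb_feat, depth_feat], axis=0)   # (2C, H, W)
    n_maps, h, w = fused.shape

    # Level 1: feature-map attention. The LSTM treats the maps as a sequence,
    # and each hidden state is projected to a weight in (0, 1) for that map.
    seq = fused.reshape(n_maps, h * w)
    hidden = lstm_hidden_states(seq, params["lstm"])          # (2C, Hd)
    map_weights = sigmoid(hidden @ params["v"])               # (2C,)
    reweighted = fused * map_weights[:, None, None]

    # Level 2: spatial attention. A 1x1 convolution mixes channels at each
    # pixel, giving one weight in (0, 1) per spatial location.
    spatial = sigmoid(np.tensordot(params["conv"], reweighted, axes=1))  # (H, W)
    return reweighted * spatial[None], map_weights, spatial
```

For example, with two random (4, 5, 5) inputs and init_params(n_maps=8, map_size=25), two_level_attention returns an (8, 5, 5) attended tensor together with 8 per-map weights and a (5, 5) spatial mask. In a trained model these weights would be learned jointly with the classifier rather than drawn at random.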

Similar papers

6D Pose Estimation with Correlation Fusion

Yi Cheng, Hongyuan Zhu, Ying Sun, Cihan Acar, Wei Jing, Yan Wu, Liyuan Li, Cheston Tan, Joo-Hwee Lim

Auto-TLDR; Intra- and Inter-modality Fusion for 6D Object Pose Estimation with Attention Mechanism

Attentive Hybrid Feature Based a Two-Step Fusion for Facial Expression Recognition

Gait Recognition Using Multi-Scale Partial Representation Transformation with Capsules

Multi-Stage Attention Based Visual Question Answering

MANet: Multimodal Attention Network Based Point-View Fusion for 3D Shape Recognition

Facial Expression Recognition Using Residual Masking Network

Depth Videos for the Classification of Micro-Expressions

Face Anti-Spoofing Using Spatial Pyramid Pooling

Attention-Driven Body Pose Encoding for Human Activity Recognition

Collaborative Human Machine Attention Module for Character Recognition

An Improved Bilinear Pooling Method for Image-Based Action Recognition

Question-Agnostic Attention for Visual Question Answering

Attentive Part-Aware Networks for Partial Person Re-Identification

FatNet: A Feature-Attentive Network for 3D Point Cloud Processing

A Cross Domain Multi-Modal Dataset for Robust Face Anti-Spoofing

MixedFusion: 6D Object Pose Estimation from Decoupled RGB-Depth Features

PSDNet: A Balanced Architecture of Accuracy and Parameters for Semantic Segmentation

SAT-Net: Self-Attention and Temporal Fusion for Facial Action Unit Detection

Vision-Based Multi-Modal Framework for Action Recognition

Weight Estimation from an RGB-D Camera in Top-View Configuration

Rotation Invariant Aerial Image Retrieval with Group Convolutional Metric Learning

Exploring Spatial-Temporal Representations for fNIRS-based Intimacy Detection via an Attention-enhanced Cascade Convolutional Recurrent Neural Network

PrivAttNet: Predicting Privacy Risks in Images Using Visual Attention

Pose-Robust Face Recognition by Deep Meta Capsule Network-Based Equivariant Embedding

Incorporating Depth Information into Few-Shot Semantic Segmentation

Adaptive Feature Fusion Network for Gaze Tracking in Mobile Tablets

Video-Based Facial Expression Recognition Using Graph Convolutional Networks

Enhancing Deep Semantic Segmentation of RGB-D Data with Entangled Forests

Video Face Manipulation Detection through Ensemble of CNNs

Dual-Attention Guided Dropblock Module for Weakly Supervised Object Localization

Attention Pyramid Module for Scene Recognition

Flow-Guided Spatial Attention Tracking for Egocentric Activity Recognition

Answer-Checking in Context: A Multi-Modal Fully Attention Network for Visual Question Answering

Improving Visual Relation Detection Using Depth Maps

Multi-Scale Cascading Network with Compact Feature Learning for RGB-Infrared Person Re-Identification

Space-Time Domain Tensor Neural Networks: An Application on Human Pose Classification

Generalized Iris Presentation Attack Detection Algorithm under Cross-Database Settings

Multi-Scale Residual Pyramid Attention Network for Monocular Depth Estimation

Context-Aware Residual Module for Image Classification

Attention As Activation

CSpA-DN: Channel and Spatial Attention Dense Network for Fusing PET and MRI Images

Efficient-Receptive Field Block with Group Spatial Attention Mechanism for Object Detection

Joint Face Alignment and 3D Face Reconstruction with Efficient Convolution Neural Networks

Unsupervised Disentangling of Viewpoint and Residues Variations by Substituting Representations for Robust Face Recognition

Ordinal Depth Classification Using Region-Based Self-Attention

Age Gap Reducer-GAN for Recognizing Age-Separated Faces

Lightweight Low-Resolution Face Recognition for Surveillance Applications

3D Attention Mechanism for Fine-Grained Classification of Table Tennis Strokes Using a Twin Spatio-Temporal Convolutional Neural Networks