ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Directed Variational Cross-encoder Network for Few-Shot Multi-image Co-segmentation

Sayan Banerjee, Divakar Bhat S, Subhasis Chaudhuri, Rajbabu Velmurugan

Auto-TLDR; Directed Variational Inference Cross Encoder for Class Agnostic Co-Segmentation of Multiple Images


In this paper, we propose a novel framework for class-agnostic co-segmentation of multiple images using comparatively small datasets. We have developed a novel encoder-decoder network termed DVICE (Directed Variational Inference Cross Encoder), which learns a continuous embedding space to ensure better similarity learning. We combine the proposed variational encoder-decoder with a novel few-shot learning approach to tackle the small-sample-size problem in co-segmentation. Furthermore, the proposed framework does not use any semantic class labels and is entirely class agnostic. Through exhaustive experimentation using a small volume of data over multiple datasets, we demonstrate that our approach outperforms all existing state-of-the-art techniques.

Similar papers

Multiscale Attention-Based Prototypical Network for Few-Shot Semantic Segmentation

Yifei Zhang, Desire Sidibe, Olivier Morel, Fabrice Meriaudeau

Auto-TLDR; Few-shot Semantic Segmentation with Multiscale Feature Attention


Deep learning-based image understanding techniques require a large number of labeled images for training. Few-shot semantic segmentation, in contrast, aims to generalize the segmentation ability of the model to new categories given only a few labeled samples. To tackle this problem, we propose a novel prototypical network (MAPnet) with multiscale feature attention. To fully exploit the representative features of target classes, we first extract rich contextual information from labeled support images via a multiscale feature enhancement module. The prototypes learned from support features provide further semantic guidance on the query image. We then adaptively integrate multiple similarity-guided probability maps through an attention mechanism, yielding an optimal pixel-wise prediction. The proposed method is validated on the PASCAL-5i dataset under 1-way N-shot evaluation. We also test the model with weak annotations, including scribble and bounding box annotations. Both the qualitative and quantitative results demonstrate the advantages of our approach over other state-of-the-art methods.
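To make the prototype idea in the abstract above concrete, here is a minimal NumPy sketch of how a class prototype might be derived from support features and compared against query features. It is not the MAPnet implementation: masked average pooling and cosine similarity are assumptions borrowed from common prototypical segmentation practice, and the function names `masked_average_prototype` and `cosine_similarity_map` are hypothetical.

```python
import numpy as np


def masked_average_prototype(features, mask):
    """Average the support features over the foreground pixels.

    features: array of shape (C, H, W), support feature map (assumed layout).
    mask:     array of shape (H, W), 1 for foreground and 0 for background.
    Returns a prototype vector of shape (C,).
    """
    mask = mask.astype(features.dtype)
    total = (features * mask[None]).sum(axis=(1, 2))
    # Guard against an empty mask to avoid division by zero.
    return total / max(mask.sum(), 1e-8)


def cosine_similarity_map(query_features, prototype, eps=1e-8):
    """Cosine similarity between the prototype and every query pixel.

    query_features: array of shape (C, H, W).
    prototype:      array of shape (C,).
    Returns a similarity map of shape (H, W) with values in [-1, 1].
    """
    q_norm = np.linalg.norm(query_features, axis=0) + eps
    p_norm = np.linalg.norm(prototype) + eps
    # Contract the channel axis: (C,) with (C, H, W) gives (H, W).
    dots = np.tensordot(prototype, query_features, axes=([0], [0]))
    return dots / (q_norm * p_norm)
```

In this sketch, a high value in the similarity map marks a query pixel whose features resemble the support foreground. The abstract describes several such similarity-guided maps being fused by attention, which is beyond what this example attempts.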

TAAN: Task-Aware Attention Network for Few-Shot Classification

Zhe Wang, Li Liu, Fanzhang Li

Auto-TLDR; TAAN: Task-Aware Attention Network for Few-Shot Classification

Dual-Attention Guided Dropblock Module for Weakly Supervised Object Localization

Incorporating Depth Information into Few-Shot Semantic Segmentation

Augmented Bi-Path Network for Few-Shot Learning

Enhanced Feature Pyramid Network for Semantic Segmentation

Free-Form Image Inpainting Via Contrastive Attention Network

Complementing Representation Deficiency in Few-Shot Image Classification: A Meta-Learning Approach

PSDNet: A Balanced Architecture of Accuracy and Parameters for Semantic Segmentation

Progressive Scene Segmentation Based on Self-Attention Mechanism

Video Semantic Segmentation Using Deep Multi-View Representation Learning

Attention Based Coupled Framework for Road and Pothole Segmentation

Revisiting Sequence-To-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

Efficient-Receptive Field Block with Group Spatial Attention Mechanism for Object Detection

Boundary-Aware Graph Convolution for Semantic Segmentation

Few-Shot Few-Shot Learning and the Role of Spatial Attention

Few-Shot Learning Based on Metric Learning Using Class Augmentation

Explanation-Guided Training for Cross-Domain Few-Shot Classification

Coarse-To-Fine Foreground Segmentation Based on Co-Occurrence Pixel-Block and Spatio-Temporal Attention Model

3D Medical Multi-Modal Segmentation Network Guided by Multi-Source Correlation Constraint

A Joint Representation Learning and Feature Modeling Approach for One-Class Recognition

Aggregating Object Features Based on Attention Weights for Fine-Grained Image Retrieval

ACRM: Attention Cascade R-CNN with Mix-NMS for Metallic Surface Defect Detection

GSTO: Gated Scale-Transfer Operation for Multi-Scale Feature Learning in Semantic Segmentation

Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting

Deeply-Fused Attentive Network for Stereo Matching

Makeup Style Transfer on Low-Quality Images with Weighted Multi-Scale Attention

Learning Object Deformation and Motion Adaption for Semi-Supervised Video Object Segmentation

Multi-Label Contrastive Focal Loss for Pedestrian Attribute Recognition

An Improved Bilinear Pooling Method for Image-Based Action Recognition

Global-Local Attention Network for Semantic Segmentation in Aerial Images

Two-Level Attention-Based Fusion Learning for RGB-D Face Recognition

CT-UNet: An Improved Neural Network Based on U-Net for Building Segmentation in Remote Sensing Images

Multi-Scale Cascading Network with Compact Feature Learning for RGB-Infrared Person Re-Identification

Rotation Invariant Aerial Image Retrieval with Group Convolutional Metric Learning

Coarse to Fine: Progressive and Multi-Task Learning for Salient Object Detection

Dynamic Guided Network for Monocular Depth Estimation

Do Not Treat Boundaries and Regions Differently: An Example on Heart Left Atrial Segmentation

GuCNet: A Guided Clustering-Based Network for Improved Classification

MetaMix: Improved Meta-Learning with Interpolation-based Consistency Regularization

Attentive Part-Aware Networks for Partial Person Re-Identification

Generalized Local Attention Pooling for Deep Metric Learning

Transitional Asymmetric Non-Local Neural Networks for Real-World Dirt Road Segmentation

Semantics to Space(S2S): Embedding Semantics into Spatial Space for Zero-Shot Verb-Object Query Inferencing

Pose-Robust Face Recognition by Deep Meta Capsule Network-Based Equivariant Embedding

Image Representation Learning by Transformation Regression

Forground-Guided Vehicle Perception Framework

A Prototype-Based Generalized Zero-Shot Learning Framework for Hand Gesture Recognition