ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Coarse to Fine: Progressive and Multi-Task Learning for Salient Object Detection

Dong-Goo Kang, Sangwoo Park, Joonki Paik

Auto-TLDR; Progressive and multi-task learning scheme for salient object detection

Most deep learning-based salient object detection (SOD) methods manipulate the convolution block to capture object context effectively. In this paper, we propose a novel method, called the progressive and multi-task learning scheme, which extracts object context by manipulating only the learning scheme, without changing the network architecture. The progressive learning scheme grows the decoder progressively during the training phase: starting from the easier low-resolution layers, it gradually adds high-resolution layers. Although progressive learning successfully captures object context, its output boundaries tend to be rough. To solve this problem, we also propose a multi-task learning (MTL) scheme that jointly processes the object saliency map and contour in a single network. The proposed MTL scheme trains the network in an edge-preserving direction through an auxiliary branch that learns contours. The proposed learning scheme can be combined with other convolution block manipulation methods. Extensive experiments on five datasets show that the proposed method outperforms state-of-the-art methods in most cases.
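
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the growth schedule, the number of decoder stages, or the loss functions, so everything below is an assumption made for illustration. The names `active_stages`, `output_resolution`, `bce`, and `joint_loss` are hypothetical, as are the fixed number of epochs per stage, the doubling of resolution per stage, the use of binary cross-entropy for both tasks, and the `contour_weight` parameter.

```python
import math


def active_stages(epoch, epochs_per_stage, num_stages):
    """Number of decoder stages trained at `epoch`, coarse stages first.

    Assumption: one more (higher-resolution) stage is added every
    `epochs_per_stage` epochs until all `num_stages` are active.
    """
    return min(num_stages, 1 + epoch // epochs_per_stage)


def output_resolution(base_resolution, stages):
    """Output size, assuming each added stage doubles the resolution."""
    return base_resolution * 2 ** (stages - 1)


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


def joint_loss(sal_pred, sal_gt, con_pred, con_gt, contour_weight=1.0):
    """Saliency loss plus a weighted loss from the auxiliary contour branch."""
    return bce(sal_pred, sal_gt) + contour_weight * bce(con_pred, con_gt)


if __name__ == "__main__":
    # Print how the decoder would grow under the assumed schedule.
    for epoch in range(0, 12, 3):
        k = active_stages(epoch, epochs_per_stage=3, num_stages=4)
        print(epoch, k, output_resolution(14, k))
```

In this sketch the saliency and contour predictions share one objective, so gradients from the contour term also shape the shared features. That is one plausible reading of training "in an edge-preserving direction", not a confirmed detail of the paper.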

Similar papers

Enhanced Feature Pyramid Network for Semantic Segmentation

Mucong Ye, Ouyang Jinpeng, Ge Chen, Jing Zhang, Xiaogang Yu

Auto-TLDR; EFPN: Enhanced Feature Pyramid Network for Semantic Segmentation

FastSal: A Computationally Efficient Network for Visual Saliency Prediction

Point In: Counting Trees with Weakly Supervised Segmentation Network

Boundary-Aware Graph Convolution for Semantic Segmentation

Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting

TinyVIRAT: Low-Resolution Video Action Recognition

Revisiting Sequence-To-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

TSMSAN: A Three-Stream Multi-Scale Attentive Network for Video Saliency Detection

Saliency Prediction on Omnidirectional Images with Brain-Like Shallow Neural Network

Super-Resolution Guided Pore Detection for Fingerprint Recognition

Do Not Treat Boundaries and Regions Differently: An Example on Heart Left Atrial Segmentation

Utilising Visual Attention Cues for Vehicle Detection and Tracking

Multi-Scale Residual Pyramid Attention Network for Monocular Depth Estimation

Dynamic Guided Network for Monocular Depth Estimation

Transitional Asymmetric Non-Local Neural Networks for Real-World Dirt Road Segmentation

CT-UNet: An Improved Neural Network Based on U-Net for Building Segmentation in Remote Sensing Images

Video Semantic Segmentation Using Deep Multi-View Representation Learning

SFPN: Semantic Feature Pyramid Network for Object Detection

PSDNet: A Balanced Architecture of Accuracy and Parameters for Semantic Segmentation

Learning to Segment Clustered Amoeboid Cells from Brightfield Microscopy Via Multi-Task Learning with Adaptive Weight Selection

Feature Embedding Based Text Instance Grouping for Largely Spaced and Occluded Text Detection

An Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped Text

Multi-Resolution Fusion and Multi-Scale Input Priors Based Crowd Counting

Boosting High-Level Vision with Joint Compression Artifacts Reduction and Super-Resolution

Multi-Laplacian GAN with Edge Enhancement for Face Super Resolution

Early Wildfire Smoke Detection in Videos

Real-Time Monocular Depth Estimation with Extremely Light-Weight Neural Network

Free-Form Image Inpainting Via Contrastive Attention Network

DE-Net: Dilated Encoder Network for Automated Tongue Segmentation

Robust Localization of Retinal Lesions Via Weakly-Supervised Learning

Adaptive Image Compression Using GAN Based Semantic-Perceptual Residual Compensation

VGG-Embedded Adaptive Layer-Normalized Crowd Counting Net with Scale-Shuffling Modules

Learning a Dynamic High-Resolution Network for Multi-Scale Pedestrian Detection

Learning from Web Data: Improving Crowd Counting Via Semi-Supervised Learning

Small Object Detection by Generative and Discriminative Learning

BG-Net: Boundary-Guided Network for Lung Segmentation on Clinical CT Images

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

DA-RefineNet: Dual-Inputs Attention RefineNet for Whole Slide Image Segmentation

Learning Object Deformation and Motion Adaption for Semi-Supervised Video Object Segmentation

Global-Local Attention Network for Semantic Segmentation in Aerial Images

HANet: Hybrid Attention-Aware Network for Crowd Counting

Learning Semantic Representations Via Joint 3D Face Reconstruction and Facial Attribute Estimation

Delivering Meaningful Representation for Monocular Depth Estimation

Spatial-Related and Scale-Aware Network for Crowd Counting

PrivAttNet: Predicting Privacy Risks in Images Using Visual Attention

Semantic Segmentation of Breast Ultrasound Image with Pyramid Fuzzy Uncertainty Reduction and Direction Connectedness Feature

MRP-Net: A Light Multiple Region Perception Neural Network for Multi-Label AU Detection

Machine-Learned Regularization and Polygonization of Building Segmentation Masks