ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

FourierNet: Compact Mask Representation for Instance Segmentation Using Differentiable Shape Decoders

Hamd Ul Moqeet Riaz, Nuri Benbarka, Andreas Zell

Auto-TLDR; FourierNet: A Single shot, anchor-free, fully convolutional instance segmentation method that predicts a shape vector

We present FourierNet, a single-shot, anchor-free, fully convolutional instance segmentation method that predicts a shape vector. This shape vector is then converted into the mask's contour points using a fast numerical transform. Compared to previous methods, we introduce a new training technique in which a differentiable shape decoder manages the automatic weight balancing of the shape vector's coefficients. We use the Fourier series as a shape encoder because of its coefficient interpretability and fast implementation. FourierNet shows promising results compared to polygon representation methods, achieving 30.6 mAP on the MS COCO 2017 benchmark. At lower image resolutions, it runs at 26.6 FPS with 24.3 mAP. It reaches 23.3 mAP using just eight parameters to represent the mask (note that at least four parameters are needed for bounding box prediction alone). Qualitative analysis shows that suppressing a reasonable proportion of the higher frequencies of the Fourier series still generates meaningful masks. These results support our understanding that lower-frequency components carry more information for the segmentation task, and that a compressed representation is therefore achievable. Code is available at: github.com/cogsys-tuebingen/FourierNet.
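The idea that a few low-frequency Fourier coefficients can describe a mask contour can be illustrated with a small, self-contained sketch. This is not the authors' implementation: it assumes the contour is given as complex numbers x + iy sampled around the boundary, uses a plain O(n^2) discrete Fourier transform for readability where a real system would use an FFT, and all function names (`dft`, `keep_low_frequencies`, `decode`, `signed_frequency`) are hypothetical.

```python
import cmath
import math


def signed_frequency(k, n):
    """Map DFT bin k (0..n-1) to its signed frequency (e.g. bin n-1 is -1)."""
    return k if k <= n // 2 else k - n


def dft(points):
    """Fourier coefficients of a closed contour given as complex x + iy samples.

    Naive O(n^2) transform, normalised by 1/n (assumption: clarity over speed).
    """
    n = len(points)
    return [
        sum(p * cmath.exp(-2j * math.pi * k * t / n) for t, p in enumerate(points)) / n
        for k in range(n)
    ]


def keep_low_frequencies(coeffs, num_kept):
    """Zero every coefficient except the num_kept with the lowest |frequency|."""
    n = len(coeffs)
    ranked = sorted(range(n), key=lambda k: (abs(signed_frequency(k, n)), k))
    kept = set(ranked[:num_kept])
    return [c if k in kept else 0j for k, c in enumerate(coeffs)]


def decode(coeffs, num_points):
    """Evaluate the Fourier series at num_points evenly spaced parameter values."""
    n = len(coeffs)
    return [
        sum(
            c * cmath.exp(2j * math.pi * signed_frequency(k, n) * t / num_points)
            for k, c in enumerate(coeffs)
        )
        for t in range(num_points)
    ]
```

Under these assumptions, `decode(keep_low_frequencies(dft(contour), 8), len(contour))` returns a smoothed version of `contour` described by only eight complex coefficients. A circle is recovered from just two (its centre and its radius term), while shapes with sharp corners need more coefficients to be reproduced closely.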

Similar papers

SyNet: An Ensemble Network for Object Detection in UAV Images

Berat Mert Albaba, Sedat Ozer

Auto-TLDR; SyNet: Combining Multi-Stage and Single-Stage Object Detection for Aerial Images

Recent advances in camera-equipped drone applications and their widespread use have increased the demand for vision-based object detection algorithms for aerial images. Object detection is inherently challenging as a generic computer vision problem; because the use of object detection algorithms on UAVs (or drones) is a relatively new area, detecting objects in aerial images remains an even more challenging problem. There are several reasons for this, including: (i) the lack of large drone datasets with large object variance, (ii) the larger orientation and scale variance in drone images compared to ground images, and (iii) the difference in texture and shape features between ground and aerial images. Deep learning based object detection algorithms can be classified into two main categories: (a) single-stage detectors and (b) multi-stage detectors. Each has advantages and disadvantages relative to the other. However, a technique that combines the strengths of both could yield a stronger solution than either one individually. In this paper, we propose an ensemble network, SyNet, that combines a multi-stage method with a single-stage one, with the motivation of decreasing the high false negative rate of multi-stage detectors and increasing the quality of the single-stage detector proposals. As building blocks, CenterNet and Cascade R-CNN with pretrained feature extractors are utilized along with an ensembling strategy. We report state-of-the-art results obtained by our proposed solution on two different datasets, MS-COCO and VisDrone: 52.1% mAP (IoU = 0.75) on the MS-COCO val2017 dataset and 26.2% mAP (IoU = 0.75) on the VisDrone test set. Our code is available at: https://github.com/mertalbaba/SyNet

SFPN: Semantic Feature Pyramid Network for Object Detection

Yi Gan, Wei Xu, Jianbo Su

Auto-TLDR; SFPN: Semantic Feature Pyramid Network to Address Information Dilution Issue in FPN

CASNet: Common Attribute Support Network for Image Instance and Panoptic Segmentation

A Novel Region of Interest Extraction Layer for Instance Segmentation

Bidirectional Matrix Feature Pyramid Network for Object Detection

CenterRepp: Predict Central Representative Point Set's Distribution for Detection

Small Object Detection by Generative and Discriminative Learning

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

Detecting Objects with High Object Region Percentage

Forground-Guided Vehicle Perception Framework

Scene Text Detection with Selected Anchors

Siamese Dynamic Mask Estimation Network for Fast Video Object Segmentation

HPERL: 3D Human Pose Estimation from RGB and LiDAR

Detective: An Attentive Recurrent Model for Sparse Object Detection

StrongPose: Bottom-up and Strong Keypoint Heat Map Based Pose Estimation

Revisiting Sequence-To-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

FeatureNMS: Non-Maximum Suppression by Learning Feature Embeddings

One-Stage Multi-Task Detector for 3D Cardiac MR Imaging

Yolo+FPN: 2D and 3D Fused Object Detection with an RGB-D Camera

Object Detection Model Based on Scene-Level Region Proposal Self-Attention

Object Detection in the DCT Domain: Is Luminance the Solution?

Efficient Grouping for Keypoint Detection

Construction Worker Hardhat-Wearing Detection Based on an Improved BiFPN

Foreground-Focused Domain Adaption for Object Detection

Cascade Saliency Attention Network for Object Detection in Remote Sensing Images

Learning a Dynamic High-Resolution Network for Multi-Scale Pedestrian Detection

Hybrid Cascade Point Search Network for High Precision Bar Chart Component Detection

End-To-End Deep Learning Methods for Automated Damage Detection in Extreme Events at Various Scales

Utilising Visual Attention Cues for Vehicle Detection and Tracking

Point In: Counting Trees with Weakly Supervised Segmentation Network

Object Detection on Monocular Images with Two-Dimensional Canonical Correlation Analysis

SynDHN: Multi-Object Fish Tracker Trained on Synthetic Underwater Videos

Superpixel-Based Refinement for Object Proposal Generation

Hierarchical Head Design for Object Detectors

An Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped Text

Nighttime Pedestrian Detection Based on Feature Attention and Transformation

VTT: Long-Term Visual Tracking with Transformers

Tiny Object Detection in Aerial Images

Convolutional STN for Weakly Supervised Object Localization

Neural Compression and Filtering for Edge-assisted Real-time Object Detection in Challenged Networks

Simple Multi-Resolution Representation Learning for Human Pose Estimation

P2 Net: Augmented Parallel-Pyramid Net for Attention Guided Pose Estimation

MagnifierNet: Learning Efficient Small-Scale Pedestrian Detector towards Multiple Dense Regions

RSINet: Rotation-Scale Invariant Network for Online Visual Tracking

Machine-Learned Regularization and Polygonization of Building Segmentation Masks

DualBox: Generating BBox Pair with Strong Correspondence Via Occlusion Pattern Clustering and Proposal Refinement

Uncertainty Guided Recognition of Tiny Craters on the Moon

S-VoteNet: Deep Hough Voting with Spherical Proposal for 3D Object Detection