ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

VGG-Embedded Adaptive Layer-Normalized Crowd Counting Net with Scale-Shuffling Modules

Dewen Guo, Jie Feng, Bingfeng Zhou

Auto-TLDR; VadaLN: VGG-embedded Adaptive Layer Normalization for Crowd Counting

Crowd counting is widely used in real-time congestion monitoring and public security. Because training data are limited, many methods generalize poorly, as they do not take the differences between feature domains into account. We propose VGG-embedded adaptive layer normalization (VadaLN) to filter out features that are irrelevant to the counting task, so that the counting results are not affected by image quality, color, or illumination. VadaLN is implemented on the pretrained VGG-16 backbone and requires no additional learnable parameters. VadaLN incorporates the proposed scale-shuffling modules (SSM) to mitigate the distortions introduced by upsampling operations. In addition, a non-aligned training methodology for density map estimation is enabled through an adversarial contextual loss (ACL) to improve counting performance. Based on the proposed method, we construct an end-to-end trainable baseline model without bells and whistles, namely VadaLNet, which outperforms several recent state-of-the-art methods on commonly used, challenging benchmarks. The intermediate scale-shuffled results are combined in a scale-complementary strategy to form a more powerful network, namely VadaLNeSt. We evaluate VadaLNeSt on standard benchmarks, e.g., ShanghaiTech (Part A & Part B), UCF_CC_50, and UCF_QNRF, to show the superiority of our method.
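
The abstract does not spell out how VadaLN is computed. As a rough illustration only of what a normalization step with no learnable parameters can look like, the sketch below applies plain layer normalization, without any affine scale or shift, to a single feature map. This is an assumption made for illustration and not the authors' formulation; the function name `layer_norm_no_affine`, the `(C, H, W)` shape, and the synthetic "brightness" change are all hypothetical.

```python
import numpy as np


def layer_norm_no_affine(features: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize a (C, H, W) feature map over all of its elements.

    Illustrative only: no scale or shift parameters are learned, so the
    operation adds nothing to the parameter count of the backbone.
    """
    mean = features.mean()
    var = features.var()
    return (features - mean) / np.sqrt(var + eps)


# Hypothetical feature map, plus a copy with a global gain and offset applied
# (a crude stand-in for an illumination or contrast change).
rng = np.random.default_rng(0)
feat = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 4))
shifted = 1.5 * feat + 10.0

out_a = layer_norm_no_affine(feat)
out_b = layer_norm_no_affine(shifted)

# After normalization the two maps agree up to the small effect of eps.
print(np.allclose(out_a, out_b, atol=1e-4))
```

The point of the sketch is only that a statistics-based normalization can cancel a global gain and offset in the features without introducing trainable weights; how VadaLN adapts these statistics inside VGG-16 is described in the paper itself.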

Similar papers

Spatial-Related and Scale-Aware Network for Crowd Counting

Lei Li, Yuan Dong, Hongliang Bai

Auto-TLDR; Spatial Attention for Crowd Counting

Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting

HANet: Hybrid Attention-Aware Network for Crowd Counting

Multi-Resolution Fusion and Multi-Scale Input Priors Based Crowd Counting

PHNet: Parasite-Host Network for Video Crowd Counting

Learning from Web Data: Improving Crowd Counting Via Semi-Supervised Learning

Learning Error-Driven Curriculum for Crowd Counting

DAPC: Domain Adaptation People Counting Via Style-Level Transfer Learning and Scene-Aware Estimation

Point In: Counting Trees with Weakly Supervised Segmentation Network

Boosting High-Level Vision with Joint Compression Artifacts Reduction and Super-Resolution

Learning a Dynamic High-Resolution Network for Multi-Scale Pedestrian Detection

MagnifierNet: Learning Efficient Small-Scale Pedestrian Detector towards Multiple Dense Regions

Distortion-Adaptive Grape Bunch Counting for Omnidirectional Images

Hierarchically Aggregated Residual Transformation for Single Image Super Resolution

PRF-Ped: Multi-Scale Pedestrian Detector with Prior-Based Receptive Field

Delivering Meaningful Representation for Monocular Depth Estimation

Bidirectional Matrix Feature Pyramid Network for Object Detection

Mutual-Supervised Feature Modulation Network for Occluded Pedestrian Detection

Small Object Detection by Generative and Discriminative Learning

Residual Fractal Network for Single Image Super Resolution by Widening and Deepening

Dynamic Guided Network for Monocular Depth Estimation

Free-Form Image Inpainting Via Contrastive Attention Network

Multi-Scale Residual Pyramid Attention Network for Monocular Depth Estimation

TinyVIRAT: Low-Resolution Video Action Recognition

Nighttime Pedestrian Detection Based on Feature Attention and Transformation

Efficient-Receptive Field Block with Group Spatial Attention Mechanism for Object Detection

Attention Pyramid Module for Scene Recognition

Enhanced Feature Pyramid Network for Semantic Segmentation

Coarse to Fine: Progressive and Multi-Task Learning for Salient Object Detection

Thermal Image Enhancement Using Generative Adversarial Network for Pedestrian Detection

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

Progressive Splitting and Upscaling Structure for Super-Resolution

SFPN: Semantic Feature Pyramid Network for Object Detection

Face Super-Resolution Network with Incremental Enhancement of Facial Parsing Information

Wavelet Attention Embedding Networks for Video Super-Resolution

TSMSAN: A Three-Stream Multi-Scale Attentive Network for Video Saliency Detection

PSDNet: A Balanced Architecture of Accuracy and Parameters for Semantic Segmentation

FastSal: A Computationally Efficient Network for Visual Saliency Prediction

P2 Net: Augmented Parallel-Pyramid Net for Attention Guided Pose Estimation

Small Object Detection Leveraging on Simultaneous Super-Resolution

Boundary-Aware Graph Convolution for Semantic Segmentation

Super-Resolution Guided Pore Detection for Fingerprint Recognition

Cascade Saliency Attention Network for Object Detection in Remote Sensing Images

Global-Local Attention Network for Semantic Segmentation in Aerial Images

Deeply-Fused Attentive Network for Stereo Matching

Single Image Super-Resolution with Dynamic Residual Connection

Learning to Rank for Active Learning: A Listwise Approach

Cross-Domain Semantic Segmentation of Urban Scenes Via Multi-Level Feature Alignment