ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Context for Object Detection Via Lightweight Global and Mid-Level Representations

Mesut Erhan Unal, Adriana Kovashka

Auto-TLDR; Context-Based Object Detection with Semantic Similarity

We propose an approach for explicitly capturing context in object detection. We model visual and geometric relationships between object regions, and also model the global scene as a first-class participant. In contrast to prior approaches, both the context we rely on and our proposed mechanism for belief propagation over regions are lightweight. We also experiment with capturing similarities between regions at a semantic level, by modeling class co-occurrence and linguistic similarity between class names. We show that our approach significantly outperforms Faster R-CNN, and performs competitively with a much more costly approach that also models context.
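
The abstract mentions modeling class co-occurrence as one source of semantic similarity between regions. As a rough illustration only, and not the authors' implementation, the sketch below estimates conditional co-occurrence probabilities from per-image label sets. The function name `cooccurrence_probs` and the toy annotations are hypothetical and chosen purely for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_probs(image_labels):
    """Estimate P(b appears | a appears) from a list of per-image label lists.

    Hypothetical helper for illustration; the paper may define
    co-occurrence differently.
    """
    class_counts = Counter()
    pair_counts = Counter()
    for labels in image_labels:
        unique = sorted(set(labels))  # count each class once per image
        class_counts.update(unique)
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    # Normalize each ordered pair by how often the first class appears.
    return {(a, b): n / class_counts[a] for (a, b), n in pair_counts.items()}


# Toy annotations, made up for this example.
toy = [
    ["person", "horse"],
    ["person", "dog"],
    ["person", "horse", "dog"],
    ["car"],
]
probs = cooccurrence_probs(toy)
```

Note that the resulting table is asymmetric: in the toy data every image with a horse also contains a person, but not every image with a person contains a horse.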

Similar papers

Object Detection Using Dual Graph Network

Shengjia Chen, Zhixin Li, Feicheng Huang, Canlong Zhang, Huifang Ma

Auto-TLDR; A Graph Convolutional Network for Object Detection with Key Relation Information

Most object detection methods focus only on local information near the region proposal and ignore the object's global semantic relations and local spatial relations, which limits performance. To capture and exploit these relations, we propose a detection method based on a graph convolutional network (GCN). Two independent relation graph networks are used to obtain the global semantic information of objects from labels and the local spatial information from images. The semantic relation network implicitly acquires global knowledge: a directed graph is constructed on the dataset, each node is represented by the word embedding of its label, and the graph is passed to the GCN to obtain a high-level semantic representation. The spatial relation network encodes relations through a positional relation module and a visual connection module, and enriches the object features with local key information from objects. The feature representation is further improved by aggregating the outputs of the two networks. Instead of directly propagating visual features through the network, the dual-graph network explores higher-level feature information, giving the detector the ability to capture key relations in labels and region proposals. Experiments on the PASCAL VOC and MS COCO datasets demonstrate that key relation information significantly improves detection performance, with a better ability to detect small objects and more reasonable bounding boxes. Results on the COCO dataset show that our method obtains an improvement of around 32.3% in AP on small objects.
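
The semantic branch described above passes label word embeddings through a GCN over a directed class graph. The sketch below shows one generic graph-convolution step under stated assumptions: self-loops are added, the adjacency matrix is row-normalized (a common choice for directed graphs), and ReLU is the activation. The abstract does not specify any of these details, and the graph, embeddings, and weights here are random placeholders rather than anything from the paper.

```python
import numpy as np


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1 (A + I) X W).

    Row normalization and self-loops are assumptions made for this
    sketch, not details taken from the paper.
    """
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)   # out-degree including self-loop
    norm = a_hat / deg                       # row-normalize
    return np.maximum(norm @ feats @ weight, 0.0)


rng = np.random.default_rng(0)
# Hypothetical directed graph over 3 classes: 0 -> 1 -> 2.
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [0, 0, 0]], dtype=float)
embeddings = rng.normal(size=(3, 4))  # stand-in for label word embeddings
weight = rng.normal(size=(4, 2))      # stand-in for learned parameters
out = gcn_layer(adj, embeddings, weight)
```

Each row of `out` mixes a class's own embedding with those of the classes it points to; class 2 has no outgoing edges, so its output depends only on its own embedding.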

Adaptive Word Embedding Module for Semantic Reasoning in Large-Scale Detection

Yu Zhang, Xiaoyu Wu, Ruolin Zhu

Auto-TLDR; Adaptive Word Embedding Module for Object Detection

Using Scene Graphs for Detecting Visual Relationships

MAGNet: Multi-Region Attention-Assisted Grounding of Natural Language Queries at Phrase Level

Incrementally Zero-Shot Detection by an Extreme Value Analyzer

Boundary-Aware Graph Convolution for Semantic Segmentation

Human-Centric Parsing Network for Human-Object Interaction Detection

Exploring and Exploiting the Hierarchical Structure of a Scene for Scene Graph Generation

More Correlations Better Performance: Fully Associative Networks for Multi-Label Image Classification

Detective: An Attentive Recurrent Model for Sparse Object Detection

Improving Visual Relation Detection Using Depth Maps

Zero-Shot Text Classification with Semantically Extended Graph Convolutional Network

Detecting Objects with High Object Region Percentage

SyNet: An Ensemble Network for Object Detection in UAV Images

Semantics to Space(S2S): Embedding Semantics into Spatial Space for Zero-Shot Verb-Object Query Inferencing

ScarfNet: Multi-Scale Features with Deeply Fused and Redistributed Semantics for Enhanced Object Detection

A Novel Region of Interest Extraction Layer for Instance Segmentation

Prior Knowledge about Attributes: Learning a More Effective Potential Space for Zero-Shot Recognition

CASNet: Common Attribute Support Network for Image Instance and Panoptic Segmentation

SFPN: Semantic Feature Pyramid Network for Object Detection

Open Set Domain Recognition Via Attention-Based GCN and Semantic Matching Optimization

Forground-Guided Vehicle Perception Framework

Multi-Modal Contextual Graph Neural Network for Text Visual Question Answering

Small Object Detection by Generative and Discriminative Learning

Question-Agnostic Attention for Visual Question Answering

Small Object Detection Leveraging on Simultaneous Super-Resolution

Tiny Object Detection in Aerial Images

Foreground-Focused Domain Adaption for Object Detection

Learning a Dynamic High-Resolution Network for Multi-Scale Pedestrian Detection

Privacy Attributes-Aware Message Passing Neural Network for Visual Privacy Attributes Classification

ACRM: Attention Cascade R-CNN with Mix-NMS for Metallic Surface Defect Detection

Exploiting Knowledge Embedded Soft Labels for Image Recognition

Activity and Relationship Modeling Driven Weakly Supervised Object Detection

MagnifierNet: Learning Efficient Small-Scale Pedestrian Detector towards Multiple Dense Regions

Context Aware Group Activity Recognition

Hierarchical Head Design for Object Detectors

Label Incorporated Graph Neural Networks for Text Classification

VSR++: Improving Visual Semantic Reasoning for Fine-Grained Image-Text Matching

Object Detection Model Based on Scene-Level Region Proposal Self-Attention

Few-Shot Few-Shot Learning and the Role of Spatial Attention

Efficient-Receptive Field Block with Group Spatial Attention Mechanism for Object Detection

Transformer-Encoder Detector Module: Using Context to Improve Robustness to Adversarial Attacks on Object Detection

Iterative Bounding Box Annotation for Object Detection

Hybrid Cascade Point Search Network for High Precision Bar Chart Component Detection

Self-Selective Context for Interaction Recognition

HPERL: 3D Human Pose Estimation from RGB and LiDAR

Dual Path Multi-Modal High-Order Features for Textual Content Based Visual Question Answering

StrongPose: Bottom-up and Strong Keypoint Heat Map Based Pose Estimation