ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strongly forbidden.

Exploring and Exploiting the Hierarchical Structure of a Scene for Scene Graph Generation

Ikuto Kurosawa, Tetsunori Kobayashi, Yoshihiko Hayashi

Auto-TLDR; A Hierarchical Model for Scene Graph Generation

The scene graph of an image is an explicit, concise representation of the image; hence, it can be used in various applications such as visual question answering or robot vision. We propose a novel neural network model that generates scene graphs while maintaining global consistency; this prevents the generation of unrealistic scene graphs and is expected to improve performance on the scene graph generation task. The proposed model constructs a hierarchical structure whose leaf nodes correspond to the objects depicted in the image, and passes messages along the estimated structure on the fly. To this end, we aggregate the features of all objects into the root node of the hierarchical structure, and the resulting global context is then propagated back from the root node to all the object nodes. Experimental results on the Visual Genome dataset indicate that the proposed model outperforms existing models in scene graph generation tasks. We further confirmed qualitatively that the hierarchical structures captured by the proposed model appear to be valid.
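The two passes described in the abstract (aggregating object features into the root, then sending the global context back to the object nodes) can be illustrated with a toy sketch. This is not the authors' model: the hierarchy here is fixed by hand rather than estimated, the mean is used in place of learned aggregation and update functions, and the names `Node`, `bottom_up`, `top_down`, and the example objects are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    feature: Optional[List[float]] = None   # given for leaves (objects)
    children: List["Node"] = field(default_factory=list)
    context: Optional[List[float]] = None   # filled in by the top-down pass


def mean(vectors):
    """Element-wise mean of equal-length vectors (stand-in for a learned function)."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def bottom_up(node):
    """Aggregate leaf features upward; each internal node averages its children."""
    if node.children:
        node.feature = mean([bottom_up(child) for child in node.children])
    return node.feature


def top_down(node, parent_context=None):
    """Send context downward; each node mixes its own feature with its parent's context."""
    if parent_context is None:
        node.context = list(node.feature)    # the root holds the global context
    else:
        node.context = mean([node.feature, parent_context])
    for child in node.children:
        top_down(child, node.context)


# Hand-built toy hierarchy (an assumption; the paper estimates the structure).
person = Node("person", [1.0, 0.0])
horse = Node("horse", [0.0, 1.0])
tree = Node("tree", [4.0, 4.0])
group = Node("person+horse", children=[person, horse])
root = Node("scene", children=[group, tree])

bottom_up(root)   # root.feature now summarizes every object
top_down(root)    # every leaf's context now reflects the whole scene

for leaf in (person, horse, tree):
    print(leaf.name, leaf.context)
```

After both passes, each object's `context` depends on every other object in the image through the root, which is the sense in which a hierarchical message-passing scheme can enforce global consistency.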

Similar papers

Human-Centric Parsing Network for Human-Object Interaction Detection

Guanyu Chen, Chong Chen, Zhicheng Zhao, Fei Su

Auto-TLDR; Human-Centric Parsing Network for Human-Object Interaction Detection

Using Scene Graphs for Detecting Visual Relationships

Context for Object Detection Via Lightweight Global and Mid-Level Representations

Privacy Attributes-Aware Message Passing Neural Network for Visual Privacy Attributes Classification

What Nodes Vote To? Graph Classification without Readout Phase

TreeRNN: Topology-Preserving Deep Graph Embedding and Learning

Equation Attention Relationship Network (EARN): A Geometric Deep Metric Framework for Learning Similar Math Expression Embedding

Multi-Modal Contextual Graph Neural Network for Text Visual Question Answering

Classification of Intestinal Gland Cell-Graphs Using Graph Neural Networks

Region and Relations Based Multi Attention Network for Graph Classification

Object Detection Using Dual Graph Network

Improving Visual Relation Detection Using Depth Maps

A General Model for Learning Node and Graph Representations Jointly

MAGNet: Multi-Region Attention-Assisted Grounding of Natural Language Queries at Phrase Level

Boundary-Aware Graph Convolution for Semantic Segmentation

Integrating Historical States and Co-Attention Mechanism for Visual Dialog

PICK: Processing Key Information Extraction from Documents Using Improved Graph Learning-Convolutional Networks

FashionGraph: Understanding Fashion Data Using Scene Graph Generation

Image Inpainting with Contrastive Relation Network

Zero-Shot Text Classification with Semantically Extended Graph Convolutional Network

Cross-View Relation Networks for Mammogram Mass Detection

Graph Discovery for Visual Test Generation

More Correlations Better Performance: Fully Associative Networks for Multi-Label Image Classification

Multi-Stage Attention Based Visual Question Answering

Edge-Aware Graph Attention Network for Ratio of Edge-User Estimation in Mobile Networks

Question-Agnostic Attention for Visual Question Answering

A Novel Region of Interest Extraction Layer for Instance Segmentation

Object Detection Model Based on Scene-Level Region Proposal Self-Attention

Detective: An Attentive Recurrent Model for Sparse Object Detection

SIMCO: SIMilarity-Based Object COunting

Detecting Objects with High Object Region Percentage

Label Incorporated Graph Neural Networks for Text Classification

Transformer Reasoning Network for Image-Text Matching and Retrieval

Modeling Extent-Of-Texture Information for Ground Terrain Recognition

GCNs-Based Context-Aware Short Text Similarity Model

Learning Connectivity with Graph Convolutional Networks

CAggNet: Crossing Aggregation Network for Medical Image Segmentation

AOAM: Automatic Optimization of Adjacency Matrix for Graph Convolutional Network

DualBox: Generating BBox Pair with Strong Correspondence Via Occlusion Pattern Clustering and Proposal Refinement

CASNet: Common Attribute Support Network for Image Instance and Panoptic Segmentation

Adaptive Word Embedding Module for Semantic Reasoning in Large-Scale Detection

Point In: Counting Trees with Weakly Supervised Segmentation Network

Dual Path Multi-Modal High-Order Features for Textual Content Based Visual Question Answering

Delivering Meaningful Representation for Monocular Depth Estimation

Reinforcement Learning with Dual Attention Guided Graph Convolution for Relation Extraction

Named Entity Recognition and Relation Extraction with Graph Neural Networks in Semi Structured Documents

Revisiting Graph Neural Networks: Graph Filtering Perspective

A Grid-Based Representation for Human Action Recognition