ICPR2020 Paper Browser

End-To-End Hierarchical Relation Extraction for Generic Form Understanding

Tuan Anh Nguyen Dang, Duc-Thanh Hoang, Quang Bach Tran, Chih-Wei Pan, Thanh-Dat Nguyen

Auto-TLDR; Joint Entity Labeling and Link Prediction for Form Understanding in Noisy Scanned Documents

Form understanding is a challenging problem that aims to recognize the semantic entities in an input document and their hierarchical relations. Previous approaches struggle with the complexity of the task and therefore treat these objectives separately. To address this, we present a novel deep neural network that jointly performs Entity Labeling and link prediction in an end-to-end fashion. Our model extends the Multi-stage Attentional U-Net architecture with Part-Intensity Fields and Part-Association Fields for link prediction, enriching the spatial information flow with additional supervision from Entity Linking. We demonstrate the effectiveness of the model on the Form Understanding in Noisy Scanned Documents (FUNSD) dataset, where our method substantially outperforms the original model and state-of-the-art baselines on both the Entity Labeling and Entity Linking tasks.

Similar papers

Named Entity Recognition and Relation Extraction with Graph Neural Networks in Semi Structured Documents

Manuel Carbonell, Pau Riba, Mauricio Villegas, Alicia Fornés, Josep Llados

Auto-TLDR; Graph Neural Network for Entity Recognition and Relation Extraction in Semi-Structured Documents

The use of administrative documents to communicate and keep a record of business information requires methods that can automatically extract and understand the content of such documents in a robust and efficient way. In addition, the semi-structured nature of these documents is especially suited to graph-based representations, which are flexible enough to adapt to the deformations introduced by different document templates. Moreover, Graph Neural Networks provide a suitable methodology for learning relations among the data elements in these documents. In this work we study the use of Graph Neural Network architectures to tackle the problem of entity recognition and relation extraction in semi-structured documents. Our approach achieves state-of-the-art results on the three tasks involved in the process. Moreover, experiments with two datasets of different nature demonstrate the good generalization ability of our approach.

PICK: Processing Key Information Extraction from Documents Using Improved Graph Learning-Convolutional Networks

Wenwen Yu, Ning Lu, Xianbiao Qi, Ping Gong, Rong Xiao

Auto-TLDR; PICK: A Graph Learning Framework for Key Information Extraction from Documents

Multiple Document Datasets Pre-Training Improves Text Line Detection with Deep Neural Networks

Combining Deep and Ad-Hoc Solutions to Localize Text Lines in Ancient Arabic Document Images

Text Baseline Recognition Using a Recurrent Convolutional Neural Network

DUET: Detection Utilizing Enhancement for Text in Scanned or Captured Documents

An Integrated Approach of Deep Learning and Symbolic Analysis for Digital PDF Table Extraction

Segmenting Messy Text: Detecting Boundaries in Text Derived from Historical Newspaper Images

The HisClima Database: Historical Weather Logs for Automatic Transcription and Information Extraction

Feature Embedding Based Text Instance Grouping for Largely Spaced and Occluded Text Detection

Watch Your Strokes: Improving Handwritten Text Recognition with Deformable Convolutions

CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images

Unsupervised deep learning for text line segmentation

StrongPose: Bottom-up and Strong Keypoint Heat Map Based Pose Estimation

Learning to Sort Handwritten Text Lines in Reading Order through Estimated Binary Order Relations

Multi-Modal Contextual Graph Neural Network for Text Visual Question Answering

Zero-Shot Text Classification with Semantically Extended Graph Convolutional Network

ID Documents Matching and Localization with Multi-Hypothesis Constraints

Dual Path Multi-Modal High-Order Features for Textual Content Based Visual Question Answering

Efficient Grouping for Keypoint Detection

A Gated and Bifurcated Stacked U-Net Module for Document Image Dewarping

Light3DPose: Real-Time Multi-Person 3D Pose Estimation from Multiple Views

Reinforcement Learning with Dual Attention Guided Graph Convolution for Relation Extraction

Simple Multi-Resolution Representation Learning for Human Pose Estimation

Vision-Based Layout Detection from Scientific Literature Using Recurrent Convolutional Neural Networks

PIN: A Novel Parallel Interactive Network for Spoken Language Understanding

LODENet: A Holistic Approach to Offline Handwritten Chinese and Japanese Text Line Recognition

CKG: Dynamic Representation Based on Context and Knowledge Graph

RefiNet: 3D Human Pose Refinement with Depth Maps

Using Scene Graphs for Detecting Visual Relationships

A Novel Attention-Based Aggregation Function to Combine Vision and Language

HPERL: 3D Human Pose Estimation from RGB and LiDAR

A Grid-Based Representation for Human Action Recognition

Scene Text Detection with Selected Anchors

Cross-Supervised Joint-Event-Extraction with Heterogeneous Information Networks

Mutually Guided Dual-Task Network for Scene Text Detection

GCNs-Based Context-Aware Short Text Similarity Model

An Evaluation of DNN Architectures for Page Segmentation of Historical Newspapers

ConvMath: A Convolutional Sequence Network for Mathematical Expression Recognition

Visual Style Extraction from Chart Images for Chart Restyling

P2 Net: Augmented Parallel-Pyramid Net for Attention Guided Pose Estimation

ReADS: A Rectified Attentional Double Supervised Network for Scene Text Recognition

Revisiting Sequence-To-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

Weakly Supervised Attention Rectification for Scene Text Recognition

Boundary-Aware Graph Convolution for Semantic Segmentation

Image-Based Table Cell Detection: A New Dataset and an Improved Detection Method

Improving Visual Relation Detection Using Depth Maps

Label Incorporated Graph Neural Networks for Text Classification