ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Weakly Supervised Attention Rectification for Scene Text Recognition

Chengyu Gu, Shilin Wang, Yiwei Zhu, Zheng Huang, Kai Chen

Auto-TLDR; An auxiliary supervision branch for attention-based scene text recognition

Scene text recognition has become a hot topic in recent years due to its booming real-life applications. The attention-based encoder-decoder framework has become one of the most popular approaches, especially for irregular text. However, the “attention drift” problem degrades recognition performance in most existing attention-based scene text recognition methods. To solve this problem, we propose an auxiliary supervision branch that works alongside the attention-based encoder-decoder framework. A new loss function is designed to refine the feature map and to help the attention region align with the target character area. Compared with existing attention rectification mechanisms, our method neither requires character-level annotations nor introduces any additional trainable parameters. Furthermore, our method improves performance for both RNN-Attention and Scaled Dot-Product Attention. Experimental results on various benchmarks demonstrate that the proposed approach outperforms state-of-the-art methods in both regular and irregular text recognition scenarios.

Similar papers

ReADS: A Rectified Attentional Double Supervised Network for Scene Text Recognition

Qi Song, Qianyi Jiang, Xiaolin Wei, Nan Li, Rui Zhang

Auto-TLDR; ReADS: Rectified Attentional Double Supervised Network for General Scene Text Recognition

A Multi-Head Self-Relation Network for Scene Text Recognition

Gaussian Constrained Attention Network for Scene Text Recognition

MEAN: A Multi-Element Attention Based Network for Scene Text Recognition

IBN-STR: A Robust Text Recognizer for Irregular Text in Natural Scenes

Text Recognition in Real Scenarios with a Few Labeled Samples

Recognizing Multiple Text Sequences from an Image by Pure End-To-End Learning

2D License Plate Recognition based on Automatic Perspective Rectification

Cost-Effective Adversarial Attacks against Scene Text Recognition

Text Recognition - Real World Data and Where to Find Them

Stratified Multi-Task Learning for Robust Spotting of Scene Texts

Sample-Aware Data Augmentor for Scene Text Recognition

Cross-Lingual Text Image Recognition Via Multi-Task Sequence to Sequence Learning

Robust Lexicon-Free Confidence Prediction for Text Recognition

Global Context-Based Network with Transformer for Image2latex

DUET: Detection Utilizing Enhancement for Text in Scanned or Captured Documents

A Transformer-Based Radical Analysis Network for Chinese Character Recognition

Feature Embedding Based Text Instance Grouping for Largely Spaced and Occluded Text Detection

Dual Path Multi-Modal High-Order Features for Textual Content Based Visual Question Answering

Watch Your Strokes: Improving Handwritten Text Recognition with Deformable Convolutions

Scene Text Detection with Selected Anchors

Self-Training for Domain Adaptive Scene Text Detection

LODENet: A Holistic Approach to Offline Handwritten Chinese and Japanese Text Line Recognition

Mutually Guided Dual-Task Network for Scene Text Detection

ConvMath: A Convolutional Sequence Network for Mathematical Expression Recognition

An Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped Text

TCATD: Text Contour Attention for Scene Text Detection

Local Gradient Difference Based Mass Features for Classification of 2D-3D Natural Scene Text Images

Multi-Task Learning Based Traditional Mongolian Words Recognition

RLST: A Reinforcement Learning Approach to Scene Text Detection Refinement

Enhancing Handwritten Text Recognition with N-Gram Sequence Decomposition and Multitask Learning

Improving Word Recognition Using Multiple Hypotheses and Deep Embeddings

PICK: Processing Key Information Extraction from Documents Using Improved Graph Learning-Convolutional Networks

PIN: A Novel Parallel Interactive Network for Spoken Language Understanding

Context Matters: Self-Attention for Sign Language Recognition

A Fast and Accurate Object Detector for Handwritten Digit String Recognition

Transferable Adversarial Attacks for Deep Scene Text Detection

Multi-Modal Contextual Graph Neural Network for Text Visual Question Answering

End-To-End Hierarchical Relation Extraction for Generic Form Understanding

Label or Message: A Large-Scale Experimental Survey of Texts and Objects Co-Occurrence

Attentive Part-Aware Networks for Partial Person Re-Identification

Stroke Based Posterior Attention for Online Handwritten Mathematical Expression Recognition

Radical Counter Network for Robust Chinese Character Recognition

Convolutional STN for Weakly Supervised Object Localization

Fusion of Global-Local Features for Image Quality Inspection of Shipping Label

The HisClima Database: Historical Weather Logs for Automatic Transcription and Information Extraction

Recursive Recognition of Offline Handwritten Mathematical Expressions

Context Visual Information-Based Deliberation Network for Video Captioning