ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

ReADS: A Rectified Attentional Double Supervised Network for Scene Text Recognition

Qi Song, Qianyi Jiang, Xiaolin Wei, Nan Li, Rui Zhang

Auto-TLDR; ReADS: Rectified Attentional Double Supervised Network for General Scene Text Recognition

In recent years, scene text recognition has commonly been regarded as a sequence-to-sequence problem. Connectionist Temporal Classification (CTC) and attentional sequence recognition (Attn) are two prevailing approaches to this problem, but each may fail in some scenarios. CTC concentrates more on each individual character but is weak at modeling semantic dependencies in text. Attn-based methods model contextual semantics better but tend to overfit on limited training data. In this paper, we design a Rectified Attentional Double Supervised Network (ReADS) for general scene text recognition. To overcome the weaknesses of CTC and Attn, both are applied in our method, but with different modules in two supervised branches that complement each other. Moreover, effective spatial and channel attention mechanisms are introduced to eliminate background noise and extract valid foreground information. Finally, a simple rectification network is implemented to rectify irregular text. ReADS can be trained end-to-end and requires only word-level annotations. Extensive experiments on various benchmarks verify the effectiveness of ReADS, which achieves state-of-the-art performance.

Similar papers

Robust Lexicon-Free Confidence Prediction for Text Recognition

Qi Song, Qianyi Jiang, Rui Zhang, Xiaolin Wei

Auto-TLDR; Confidence Measurement for Optical Character Recognition using Single-Input Multi-Output Network

IBN-STR: A Robust Text Recognizer for Irregular Text in Natural Scenes

Weakly Supervised Attention Rectification for Scene Text Recognition

A Multi-Head Self-Relation Network for Scene Text Recognition

Gaussian Constrained Attention Network for Scene Text Recognition

2D License Plate Recognition based on Automatic Perspective Rectification

MEAN: A Multi-Element Attention Based Network for Scene Text Recognition

Sample-Aware Data Augmentor for Scene Text Recognition

Recognizing Multiple Text Sequences from an Image by Pure End-To-End Learning

Text Recognition - Real World Data and Where to Find Them

Text Recognition in Real Scenarios with a Few Labeled Samples

Cost-Effective Adversarial Attacks against Scene Text Recognition

Cross-Lingual Text Image Recognition Via Multi-Task Sequence to Sequence Learning

Stratified Multi-Task Learning for Robust Spotting of Scene Texts

LODENet: A Holistic Approach to Offline Handwritten Chinese and Japanese Text Line Recognition

Global Context-Based Network with Transformer for Image2latex

Watch Your Strokes: Improving Handwritten Text Recognition with Deformable Convolutions

ConvMath: A Convolutional Sequence Network for Mathematical Expression Recognition

Enhancing Handwritten Text Recognition with N-Gram Sequence Decomposition and Multitask Learning

Feature Embedding Based Text Instance Grouping for Largely Spaced and Occluded Text Detection

DUET: Detection Utilizing Enhancement for Text in Scanned or Captured Documents

An Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped Text

A Transformer-Based Radical Analysis Network for Chinese Character Recognition

Dual Path Multi-Modal High-Order Features for Textual Content Based Visual Question Answering

Recursive Recognition of Offline Handwritten Mathematical Expressions

Mutually Guided Dual-Task Network for Scene Text Detection

Multi-Task Learning Based Traditional Mongolian Words Recognition

Improving Word Recognition Using Multiple Hypotheses and Deep Embeddings

Scene Text Detection with Selected Anchors

PICK: Processing Key Information Extraction from Documents Using Improved Graph Learning-Convolutional Networks

Local Gradient Difference Based Mass Features for Classification of 2D-3D Natural Scene Text Images

A Fast and Accurate Object Detector for Handwritten Digit String Recognition

RLST: A Reinforcement Learning Approach to Scene Text Detection Refinement

TCATD: Text Contour Attention for Scene Text Detection

Transferable Adversarial Attacks for Deep Scene Text Detection

Context Matters: Self-Attention for Sign Language Recognition

Online Trajectory Recovery from Offline Handwritten Japanese Kanji Characters of Multiple Strokes

PSDNet: A Balanced Architecture of Accuracy and Parameters for Semantic Segmentation

Attention As Activation

On-Device Text Image Super Resolution

Flow-Guided Spatial Attention Tracking for Egocentric Activity Recognition

Self-Training for Domain Adaptive Scene Text Detection

Attentive Part-Aware Networks for Partial Person Re-Identification

Boosting High-Level Vision with Joint Compression Artifacts Reduction and Super-Resolution

3D Attention Mechanism for Fine-Grained Classification of Table Tennis Strokes Using a Twin Spatio-Temporal Convolutional Neural Networks

A Gated and Bifurcated Stacked U-Net Module for Document Image Dewarping

Two-Level Attention-Based Fusion Learning for RGB-D Face Recognition

Global-Local Attention Network for Semantic Segmentation in Aerial Images