ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Enhancing Handwritten Text Recognition with N-Gram Sequence Decomposition and Multitask Learning

Vasiliki Tassopoulou, George Retsinas, Petros Maragos

Auto-TLDR; Multi-task Learning for Handwritten Text Recognition


Current state-of-the-art approaches in the field of Handwritten Text Recognition are predominantly single-task, with unigram, character-level target units. In our work, we utilize a Multi-task Learning scheme, training the model to perform decompositions of the target sequence with target units of different granularity, from fine to coarse. We consider this method as a way to utilize n-gram information, implicitly, in the training process, while the final recognition is performed using only the unigram output. Unigram decoding of such a multi-task approach highlights the capability of the learned internal representations, imposed by the different n-grams at the training step. We select n-grams as our target units and we experiment from unigrams to fourgrams, namely subword-level granularities. These multiple decompositions are learned by the network with task-specific CTC losses. Concerning network architectures, we propose two alternatives, namely the Hierarchical and the Block Multi-task. Overall, our proposed model, even though evaluated only on the unigram task, outperforms its single-task counterpart by an absolute 2.52% WER and 1.02% CER in greedy decoding, without any computational overhead during inference, hinting towards successfully imposing an implicit language model.
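To make the idea of decomposing one target sequence at several granularities concrete, here is a minimal Python sketch. It is not the authors' implementation: the sliding-window (stride 1) n-gram scheme, the handling of sequences shorter than n, the blank symbol "-", and the function names `ngram_decompose`, `multitask_targets`, and `ctc_greedy_collapse` are all assumptions made for illustration. The abstract only states that unigram to fourgram targets are used during training and that recognition uses greedy decoding of the unigram output.

```python
from typing import Dict, List


def ngram_decompose(text: str, n: int) -> List[str]:
    """Split text into overlapping character n-grams.

    Assumption: sliding window with stride 1; the abstract does not
    specify the exact decomposition scheme.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    if len(text) < n:
        # Assumption: a non-empty sequence shorter than n stays one unit.
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def multitask_targets(text: str, max_n: int = 4) -> Dict[int, List[str]]:
    """Build one target sequence per granularity, from unigrams up to max_n-grams."""
    return {n: ngram_decompose(text, n) for n in range(1, max_n + 1)}


def ctc_greedy_collapse(frame_labels: List[str], blank: str = "-") -> str:
    """Standard CTC greedy post-processing: merge repeated labels, then drop blanks."""
    out: List[str] = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return "".join(out)


# Training-time targets for each task (hypothetical example word).
targets = multitask_targets("word")

# Inference-time decoding uses only the unigram (character) output.
decoded = ctc_greedy_collapse(list("hh-e-ll-l-oo"))
```

In a training setup along these lines, each entry of `targets` would feed its own CTC loss, and only the unigram head would be decoded at inference time, which is consistent with the abstract's claim of no added inference cost.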

Similar papers

Watch Your Strokes: Improving Handwritten Text Recognition with Deformable Convolutions

Iulian Cojocaru, Silvia Cascianelli, Lorenzo Baraldi, Massimiliano Corsini, Rita Cucchiara

Auto-TLDR; Deformable Convolutional Neural Networks for Handwritten Text Recognition

LODENet: A Holistic Approach to Offline Handwritten Chinese and Japanese Text Line Recognition

Cross-Lingual Text Image Recognition Via Multi-Task Sequence to Sequence Learning

Recursive Recognition of Offline Handwritten Mathematical Expressions

ReADS: A Rectified Attentional Double Supervised Network for Scene Text Recognition

Multi-Task Learning Based Traditional Mongolian Words Recognition

Improving Word Recognition Using Multiple Hypotheses and Deep Embeddings

Robust Lexicon-Free Confidence Prediction for Text Recognition

Online Trajectory Recovery from Offline Handwritten Japanese Kanji Characters of Multiple Strokes

The HisClima Database: Historical Weather Logs for Automatic Transcription and Information Extraction

ConvMath: A Convolutional Sequence Network for Mathematical Expression Recognition

Context Matters: Self-Attention for Sign Language Recognition

A Few-Shot Learning Approach for Historical Ciphered Manuscript Recognition

Continuous Sign Language Recognition with Iterative Spatiotemporal Fine-Tuning

Weakly Supervised Attention Rectification for Scene Text Recognition

Stroke Based Posterior Attention for Online Handwritten Mathematical Expression Recognition

Recognizing Multiple Text Sequences from an Image by Pure End-To-End Learning

Writer Identification Using Deep Neural Networks: Impact of Patch Size and Number of Patches

IBN-STR: A Robust Text Recognizer for Irregular Text in Natural Scenes

Global Context-Based Network with Transformer for Image2latex

Radical Counter Network for Robust Chinese Character Recognition

Stratified Multi-Task Learning for Robust Spotting of Scene Texts

PICK: Processing Key Information Extraction from Documents Using Improved Graph Learning-Convolutional Networks

PIN: A Novel Parallel Interactive Network for Spoken Language Understanding

A Transformer-Based Radical Analysis Network for Chinese Character Recognition

Segmenting Messy Text: Detecting Boundaries in Text Derived from Historical Newspaper Images

A Fast and Accurate Object Detector for Handwritten Digit String Recognition

2D License Plate Recognition based on Automatic Perspective Rectification

Predicting Chemical Properties Using Self-Attention Multi-Task Learning Based on SMILES Representation

Textual-Content Based Classification of Bundles of Untranscribed Manuscript Images

Extracting Action Hierarchies from Action Labels and their Use in Deep Action Recognition

Exploiting the Logits: Joint Sign Language Recognition and Spell-Correction

Handwritten Digit String Recognition Using Deep Autoencoder Based Segmentation and ResNet Based Recognition Approach

Multimodal Side-Tuning for Document Classification

Fast Approximate Modelling of the Next Combination Result for Stopping the Text Recognition in a Video

Text Recognition - Real World Data and Where to Find Them

Gaussian Constrained Attention Network for Scene Text Recognition

Text Synopsis Generation for Egocentric Videos

Enriching Video Captions with Contextual Text

MA-LSTM: A Multi-Attention Based LSTM for Complex Pattern Extraction

Cross-People Mobile-Phone Based Airwriting Character Recognition

Recognizing Bengali Word Images - A Zero-Shot Learning Perspective

Text Recognition in Real Scenarios with a Few Labeled Samples

Revisiting Sequence-To-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

Audio-Visual Speech Recognition Using a Two-Step Feature Fusion Strategy

Trainable Spectrally Initializable Matrix Transformations in Convolutional Neural Networks

Exploring Spatial-Temporal Representations for fNIRS-based Intimacy Detection via an Attention-enhanced Cascade Convolutional Recurrent Neural Network

Moto: Enhancing Embedding with Multiple Joint Factors for Chinese Text Classification