Feature Embedding Based Text Instance Grouping for Largely Spaced and Occluded Text Detection

Pan Gao, Qi Wan, Renwu Gao, Linlin Shen

Auto-TLDR; Text Instance Embedding Based Feature Embeddings for Multiple Text Instance Grouping

A text instance can easily be detected as multiple ones due to large spaces between texts/characters, curved shapes and partial occlusion. In this paper, a feature embedding based text instance grouping algorithm is proposed to solve this problem. To learn the feature space, a Text Instance Embedding Module (TIEM) is trained to minimize the within-instance scatter and maximize the between-instance scatter. Similarity between different text instances is measured in this feature space, and instances are merged if they meet certain conditions. Experimental results show that our approach can effectively connect text regions that belong to the same text instance. Competitive performance is achieved on CTW1500, Total-Text, IC15 and a subset consisting of texts with large spacing and occlusion selected from the three datasets.
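
As a rough illustration of the within/between-instance scatter objective described above, the sketch below implements a pull/push style embedding loss. The hinge formulation and both margin values are assumptions for illustration; the abstract does not spell out the exact loss used by TIEM.

```python
import torch

def scatter_loss(embeddings, instance_ids, pull_margin=0.5, push_margin=3.0):
    """Pull embeddings of one text instance together; push instance means apart.

    embeddings:   (N, D) per-region feature embeddings
    instance_ids: (N,) ground-truth text-instance index for each region
    The hinge form and both margins are hypothetical hyper-parameters.
    """
    means, pull = [], embeddings.new_tensor(0.0)
    for i in instance_ids.unique():
        inst = embeddings[instance_ids == i]          # regions of one instance
        mu = inst.mean(dim=0)
        means.append(mu)
        # within-instance scatter: distance of each region to its instance mean
        pull = pull + (torch.relu((inst - mu).norm(dim=1) - pull_margin) ** 2).mean()
    means = torch.stack(means)                        # (K, D) instance centers
    if len(means) < 2:
        return pull / len(means)
    # between-instance scatter: penalize centers closer than push_margin
    dist = torch.cdist(means, means)
    off_diag = ~torch.eye(len(means), dtype=torch.bool)
    push = (torch.relu(push_margin - dist[off_diag]) ** 2).mean()
    return pull / len(means) + push
```

At inference time, text regions whose embeddings fall closer than a threshold in this space would then be merged into one instance.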

Similar papers

An Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped Text

Xijun Qian, Yifan Liu, Yu-Bin Yang

Auto-TLDR; TIKD: threshold insensitive kernel detector for arbitrary shaped text

Recently, segmentation-based methods have become popular in scene text detection because segmentation results can easily represent scene text of arbitrary shapes. However, previous works segment text instances in the same way as ordinary objects, even though the edges of text instances clearly differ from those of ordinary objects. In this paper, we propose a threshold insensitive kernel detector for arbitrary shaped text called TIKD, which includes a simple but stable base model and a new loss weight called Decay Loss Weight (DLW). By suppressing outlier pixels in a gradual way, the DLW leads the network to detect more accurate text instances. Our method shows great strength in accuracy and stability: we achieve precision, recall and f-measure of 88.7%, 83.7% and 86.1% respectively on the Total-Text dataset, at a fast speed of 16.3 frames per second. Moreover, even when the threshold is set anywhere in the extreme range from 0.1 to 0.9, our method always achieves a stable f-measure above 79.9% on the Total-Text dataset.
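
The abstract only describes the DLW informally, as gradually suppressing outlier pixels. One plausible reading is a per-pixel loss weight that softly down-weights the highest-loss pixels, more aggressively as training proceeds; the schedule, keep ratios and soft decay below are all hypothetical stand-ins, not the paper's formulation.

```python
import torch

def decay_weighted_loss(pixel_loss, epoch, max_epochs,
                        keep_start=1.0, keep_end=0.7):
    """Down-weight the highest-loss (outlier) pixels, more so over time.

    pixel_loss: (H*W,) per-pixel segmentation loss.
    keep_start/keep_end: hypothetical fraction of pixels kept at full weight.
    """
    keep = keep_start + (keep_end - keep_start) * epoch / max_epochs
    k = max(1, int(keep * pixel_loss.numel()))
    # keep the k smallest losses at full weight, softly decay the rest
    thresh = pixel_loss.topk(k, largest=False).values.max()
    weight = torch.where(pixel_loss <= thresh,
                         torch.ones_like(pixel_loss),
                         thresh / pixel_loss.clamp(min=1e-6))
    return (weight.detach() * pixel_loss).mean()
```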

TCATD: Text Contour Attention for Scene Text Detection

Ziling Hu, Wu Xingjiao, Jing Yang

Auto-TLDR; Text Contour Attention Text Detector

Segmentation-based approaches have enabled state-of-the-art performance in long or curved text detection tasks. However, false detections remain a challenge when two text instances are close to each other. To address this problem, we propose a Text Contour Attention Text Detector (TCATD), which can accurately locate scene text of arbitrary orientation and shape. Different from previous work, TCATD focuses on a text contour map (TC), a text center intensity map (TCI) and text kernel maps (TK): the TC introduces text contour information, the TCI helps to learn accurate text segmentation, and the TK generates the complete shape of text instances. We further propose a Text Contour Attention Module to process contour information, from which the TC, TCI and TK are obtained. Extensive experiments on ICDAR2015, CTW1500 and Total-Text demonstrate that the proposed method achieves state-of-the-art performance.

Mutually Guided Dual-Task Network for Scene Text Detection

Mengbiao Zhao, Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu

Auto-TLDR; A dual-task network for word-level and line-level text detection

Scene text detection has been studied extensively. Existing methods detect either words or text lines, and use either word-level or line-level annotated data for training. In this paper, we propose a dual-task network that performs word-level and line-level text detection simultaneously and uses training data of both annotation levels to boost performance. The dual-task network has two detection heads, for word-level and line-level text detection respectively. We then propose a mutual guidance scheme for jointly training the two tasks, with two modules: the line filtering module uses the output of the text line detector to filter out non-text regions for the word detector, and the word enhancing module provides prior word positions for the text line detector based on the output of the word detector. Experimental results on word-level and line-level text detection demonstrate the effectiveness of the proposed dual-task network and mutual guidance scheme, and the results of our method are competitive with state-of-the-art methods.
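
A minimal sketch of what a line filtering step could look like: word boxes are kept only if they are mostly covered by some detected text line. The coverage criterion and threshold are hypothetical; the paper's module may implement the filtering differently.

```python
def filter_words_by_lines(word_boxes, line_boxes, min_coverage=0.7):
    """Keep word boxes that are mostly covered by some detected text line.

    Boxes are (x1, y1, x2, y2) tuples; min_coverage is a hypothetical value.
    """
    def inter_area(a, b):
        w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        return w * h

    kept = []
    for wb in word_boxes:
        area = max(1e-6, (wb[2] - wb[0]) * (wb[3] - wb[1]))
        # a word inside a text line is likely real text: keep it
        if any(inter_area(wb, lb) / area >= min_coverage for lb in line_boxes):
            kept.append(wb)
    return kept
```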

Scene Text Detection with Selected Anchors

Anna Zhu, Hang Du, Shengwu Xiong

Auto-TLDR; AS-RPN: Anchor Selection-based Region Proposal Network for Scene Text Detection

Object proposal techniques with dense anchoring schemes have frequently been applied in scene text detection to achieve high recall. This improves accuracy significantly but wastes computation on searching, regression and classification. In this paper, we propose an anchor selection-based region proposal network (AS-RPN) that uses effectively selected anchors instead of dense anchors to extract text proposals. The centers, scales, aspect ratios and orientations of anchors are learnable instead of fixed, which leads to high recall and a greatly reduced number of anchors. By replacing the anchor-based RPN in Faster R-CNN, the AS-RPN-based Faster R-CNN achieves performance comparable with previous state-of-the-art text detection approaches on standard benchmarks, including COCO-Text, ICDAR2013, ICDAR2015 and MSRA-TD500, when using only single-scale, single-model (ResNet50) testing.

Transferable Adversarial Attacks for Deep Scene Text Detection

Shudeng Wu, Tao Dai, Guanghao Meng, Bin Chen, Jian Lu, Shutao Xia

Auto-TLDR; Robustness of DNN-based STD methods against Adversarial Attacks

Scene text detection (STD) aims to locate text in images and plays an important role in many computer vision tasks, including automatic driving and text recognition systems. Recently, deep neural networks (DNNs) have been widely and successfully used in scene text detection, leading to plenty of DNN-based STD methods, both regression-based and segmentation-based. However, recent studies have also shown that DNNs are vulnerable to adversarial attacks, which can significantly degrade the performance of DNN models. In this paper, we investigate the robustness of DNN-based STD methods against adversarial attacks. To this end, we propose a generic and efficient attack method to generate adversarial examples, produced by adding small but imperceptible adversarial perturbations to the input images. Experiments on attacking four different models and a real-world STD engine, Google optical character recognition (OCR), show that state-of-the-art DNN-based STD methods, both regression-based and segmentation-based, are vulnerable to adversarial attacks.
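
For context, the sketch below shows the generic one-step sign-gradient (FGSM-style) recipe for producing such small bounded perturbations against a detector's loss. This is a standard baseline attack, not the paper's specific method; `model`, `loss_fn` and `targets` stand in for an arbitrary STD network and its training loss.

```python
import torch

def fgsm_perturb(model, images, loss_fn, targets, eps=2.0 / 255):
    """One-step sign-gradient perturbation bounded by eps (generic FGSM sketch)."""
    images = images.clone().detach().requires_grad_(True)
    loss = loss_fn(model(images), targets)
    loss.backward()
    # step along the sign of the input gradient to increase the detection loss
    adv = images + eps * images.grad.sign()
    return adv.clamp(0, 1).detach()
```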

DUET: Detection Utilizing Enhancement for Text in Scanned or Captured Documents

Eun-Soo Jung, Hyeonggwan Son, Kyusam Oh, Yongkeun Yun, Soonhwan Kwon, Min Soo Kim

Auto-TLDR; Text Detection for Document Images Using Synthetic and Real Data

We present a novel approach to text detection for document images. For robust text detection on noisy scanned or captured document images, we adopt the advantages of multi-task learning by adding an auxiliary text enhancement task. Consequently, our proposed model learns to reduce noise and enhance text regions as well as to detect text. To overcome the insufficiency of document image data for text detection, the training data for our model is enriched with synthesized document images that are fully labeled for text detection and enhancement. For the effective use of synthetic and real data, the proposed model is trained in two phases. The first phase trains on synthetic data only, in a fully supervised manner; real data with only detection labels is added in the second phase, where the enhancement task for real data is weakly supervised using information from the detection labels. Our method is demonstrated on a real document dataset with performance exceeding that of other methods, and we conduct ablations to analyze the effects of the synthetic data, multi-task learning and weak supervision. Whereas existing text detection studies mostly focus on text in scenes, our proposed method is optimized for applications involving text in scanned or captured documents.

Self-Training for Domain Adaptive Scene Text Detection

Yudi Chen, Wei Wang, Yu Zhou, Fei Yang, Dongbao Yang, Weiping Wang

Auto-TLDR; A self-training framework for image-based scene text detection

Though deep learning based scene text detection has achieved great progress, well-trained detectors suffer severe performance degradation in different domains. In general, a tremendous amount of data is indispensable to train a detector in the target domain, yet data collection and annotation are expensive and time-consuming. To address this problem, we propose a self-training framework to automatically mine hard examples with pseudo-labels from unannotated videos or images. To reduce the noise of hard examples, a novel text mining module is implemented based on the fusion of detection and tracking results. An image-to-video generation method is then designed for cases where videos are unavailable and only images can be used. Experimental results on standard benchmarks, including ICDAR2015, MSRA-TD500 and ICDAR2017 MLT, demonstrate the effectiveness of our self-training method. A simple Mask R-CNN adapted with self-training and fine-tuned on real data achieves results comparable or even superior to state-of-the-art methods.
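
A very coarse sketch of the idea of fusing detection and tracking evidence to select pseudo-labels: a box is accepted if the detector is confident or the box appears in a frame supported by a sufficiently long track. The thresholds and the frame-level (rather than box-level) track check are simplifying assumptions, not the paper's actual mining rule.

```python
def mine_pseudo_labels(detections, tracks, det_thresh=0.9, min_track_len=5):
    """Select reliable pseudo-labels from unannotated video frames.

    detections: list of (frame_id, box, score) from the detector
    tracks:     dict track_id -> list of (frame_id, box) from the tracker
    Thresholds are hypothetical; the real module fuses evidence more finely.
    """
    long_track_frames = {fid for t in tracks.values() if len(t) >= min_track_len
                         for fid, _ in t}
    pseudo = []
    for fid, box, score in detections:
        if score >= det_thresh or fid in long_track_frames:
            pseudo.append((fid, box))   # confident or track-supported detection
    return pseudo
```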

Stratified Multi-Task Learning for Robust Spotting of Scene Texts

Kinjal Dasgupta, Sudip Das, Ujjwal Bhattacharya

Auto-TLDR; Feature Representation Block for Multi-task Learning of Scene Text

Gaining control over the dynamics of multi-task learning should help to unlock the potential of deep networks to a great extent. In existing multi-task learning (MTL) approaches for deep networks, all the parameters of the feature encoding part are adjusted for each of the underlying sub-tasks. In the human brain, by contrast, distinct functional areas are responsible for distinct functions: the Broca's area of the cerebrum is responsible for speech formation, whereas the Wernicke's area is related to language comprehension. Inspired by this, in the present study we propose to introduce a block of connection weights (termed the Feature Representation Block) spanning a few successive layers of a deep multi-task learning architecture, and to stratify it into distinct subsets that are adjusted exclusively for the different sub-tasks. Additionally, we introduce a novel regularization component for controlled training of this Feature Representation Block. The purpose of this learning framework is efficient end-to-end recognition of scene texts. Results of the proposed strategy on benchmark scene text datasets such as ICDAR 2015, ICDAR 2017 MLT, COCO-Text and MSRA-TD500 improve on the respective state-of-the-art performance.

Recognizing Multiple Text Sequences from an Image by Pure End-To-End Learning

Zhenlong Xu, Shuigeng Zhou, Fan Bai, Cheng Zhanzhan, Yi Niu, Shiliang Pu

Auto-TLDR; Pure End-to-End Learning for Multiple Text Sequences Recognition from Images

We address a challenging problem: recognizing multiple text sequences from an image by pure end-to-end learning. The problem is twofold. 1) Multiple text sequence recognition: each image may contain multiple text sequences of different content, location and orientation, and we try to recognize all of them. 2) Pure end-to-end (PEE) learning: we solve the problem in a pure end-to-end way, where each training image is labeled only by the text transcripts of the contained sequences, without any geometric annotations. Most existing works recognize multiple text sequences from an image in a non-end-to-end (NEE) or quasi-end-to-end (QEE) way, in which each image is trained with both text transcripts and text locations. Only recently was a PEE method proposed to recognize text sequences from an image where the text sequence is split into several lines; however, it cannot be directly applied to recognizing multiple text sequences from an image. In this paper, we therefore propose a pure end-to-end learning method to recognize multiple text sequences from an image. Our method directly learns the probability distribution of multiple sequences conditioned on each input image, and outputs multiple text transcripts with a well-designed decoding strategy. To evaluate the proposed method, we construct several datasets, mainly based on an existing public dataset and two real application scenarios. Experimental results show that the proposed method can effectively recognize multiple text sequences from images, and outperforms CTC-based and attention-based baseline methods.

Cascade Saliency Attention Network for Object Detection in Remote Sensing Images

Dayang Yu, Rong Zhang, Shan Qin

Auto-TLDR; Cascade Saliency Attention Network for Object Detection in Remote Sensing Images

Object detection in remote sensing images is a challenging task because objects in the bird's-eye perspective appear with arbitrary orientations. Though considerable progress has been made, challenges remain with interference from complex backgrounds, dense arrangement and large scale variations. In this paper, we propose an oriented detector named Cascade Saliency Attention Network (CSAN), designed to comprehensively suppress interference in remote sensing images. Specifically, we first combine context and pixel attention on feature maps to enhance the saliency of objects and suppress background interference. Then, in the cascade network, we apply instance segmentation on the RoI to increase the saliency of the central object, preventing object features from interfering with each other in dense arrangements. Additionally, to alleviate large scale variations, we devise a multi-scale merge module for the FPN merging process to learn richer scale representations. Experimental results on the DOTA and HRSC2016 datasets show that our method outperforms other state-of-the-art object detection methods and verify its effectiveness.

Automated Whiteboard Lecture Video Summarization by Content Region Detection and Representation

Bhargava Urala Kota, Alexander Stone, Kenny Davila, Srirangaraj Setlur, Venu Govindaraju

Auto-TLDR; A Framework for Summarizing Whiteboard Lecture Videos Using Feature Representations of Handwritten Content Regions

Lecture videos are rapidly becoming an invaluable source of information for students across the globe. Given the large number of online courses currently available, it is important to condense the information within these videos into a compact yet representative summary that can be used for search-based applications. We propose a framework to summarize whiteboard lecture videos by finding feature representations of detected handwritten content regions to determine unique content. We investigate multi-scale histograms of gradients and embeddings from deep metric learning for feature representation, and we explicitly handle occluded, growing and disappearing handwritten content. Our method can produce two kinds of lecture video summaries: the unique regions themselves (so-called key content), and keyframes, which contain all unique content in a video segment. We use weighted spatio-temporal conflict minimization to segment the lecture and produce keyframes from detected regions and features. We evaluate both types of summaries and find that we obtain state-of-the-art performance in terms of the number of summary keyframes, while our unique content recall and precision are comparable to the state of the art.

Local Gradient Difference Based Mass Features for Classification of 2D-3D Natural Scene Text Images

Lokesh Nandanwar, Shivakumara Palaiahnakote, Raghavendra Ramachandra, Tong Lu, Umapada Pal, Daniel Lopresti, Nor Badrul Anuar

Auto-TLDR; Classification of 2D and 3D Natural Scene Images Using COLD

Methods developed for normal 2D text detection do not work well for text that is rendered with decorative 3D effects. This paper proposes a new method for classifying 2D and 3D natural scene images so that an appropriate method can be chosen or modified according to the complexity of each class. The proposed method explores local gradient differences to obtain candidate pixels that represent a stroke. To study the spatial distribution of candidate pixels, we propose a measure we call COLD, which is denser for pixels toward the center of strokes and scattered for non-stroke pixels. This observation leads us to introduce mass features that extract the regular spatial pattern of COLD, which indicates a 2D text image. The extracted features are fed to a neural network (NN) for classification. The proposed method is tested on both a new dataset introduced in this work and a standard dataset assembled from different natural scene datasets, and is compared with existing methods to show its effectiveness. The approach improves text detection performance significantly after classification.

A Multi-Head Self-Relation Network for Scene Text Recognition

Zhou Junwei, Hongchao Gao, Jiao Dai, Dongqin Liu, Jizhong Han

Auto-TLDR; Multi-head Self-relation Network for Scene Text Recognition

Text embedded in scene images can be seen everywhere in our lives. However, recognizing text from natural scene images is still a challenge because of its diverse shapes and distorted patterns. Recently, advanced recognition networks generally treat scene text recognition as a sequence prediction task. Although they achieve excellent performance, these recognition networks treat feature map cells as independent individuals and update cell states without utilizing information from neighboring cells; moreover, the local receptive field of a traditional convolutional neural network (CNN) means that a single cell cannot cover the whole text region in an image. Due to these issues, existing recognition networks cannot extract the global context of a visual scene. To deal with these problems, we propose a Multi-head Self-relation Network (MSRN) for scene text recognition. The MSRN consists of several multi-head self-relation layers, designed to extract the global context of a visual scene by transforming each cell into a new cell that fuses the information of related cells. Experiments on several public datasets demonstrate that our proposed recognition network achieves superior performance on benchmark datasets including IC03, IC13, IC15 and SVT-Perspective.

RLST: A Reinforcement Learning Approach to Scene Text Detection Refinement

Xuan Peng, Zheng Huang, Kai Chen, Jie Guo, Weidong Qiu

Auto-TLDR; Saccadic Eye Movements and Peripheral Vision for Scene Text Detection using Reinforcement Learning

Within scene text detection research, some previous work has already achieved significant accuracy and efficiency. However, most of this work was done without considering the implicit relationship between detection and eye movements. In this paper, we propose a new method for scene text detection, especially its refinement, based on reinforcement learning. The idea is inspired by saccadic eye movements and peripheral vision: a saccade makes it possible for humans to orient the gaze to the location where a visual object has appeared, while peripheral vision gathers visual information from the surroundings, supplementing foveal vision during gazing. We propose a simple pipeline that imitates the way human eyes perform a saccade and collect peripheral information, locating scene text roughly and then refining the multi-scale field of view iteratively using reinforcement learning. For both training and evaluation, we use the ICDAR2015 Challenge 4 dataset as a base and design several criteria to measure the feasibility of our approach.

MEAN: A Multi-Element Attention Based Network for Scene Text Recognition

Ruijie Yan, Liangrui Peng, Shanyu Xiao, Gang Yao, Jaesik Min

Auto-TLDR; Multi-element Attention Network for Scene Text Recognition

Scene text recognition is a challenging problem due to the wide variance in content, style, orientation and image quality of text instances in natural scene images. To learn the intrinsic representation of scene texts, a novel multi-element attention (MEA) mechanism is proposed to exploit geometric structures from local to global levels in the feature map extracted from a scene text image. The MEA mechanism is a generalized form of the self-attention technique incorporating graph structure modeling. The elements in feature maps are taken as the nodes of an undirected graph, and three kinds of adjacency matrices are introduced to aggregate information at local, neighborhood and global levels before calculating the attention weights; if only the local adjacency matrix is used, the MEA mechanism degenerates to a standard self-attention form. A multi-element attention network (MEAN) is implemented, comprising a CNN for feature extraction, an encoder with the MEA mechanism, and a decoder for predicting text codes. Orientation positional encoding information is further added to the feature map output by the CNN, and a feature sequence serving as the encoder's input is obtained by element-level decomposition of the feature map. Experimental results show that MEAN achieves state-of-the-art or competitive performance on public English scene text datasets, and further experiments on both English and Chinese scene text datasets show that MEAN can handle horizontal, vertical and irregular scene text samples.
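
A minimal sketch of the adjacency-based aggregation idea: node features are first aggregated through a (row-normalized) adjacency matrix, and attention weights are then computed on the aggregated features. The projection details and scaling are assumptions; only the adjacency-before-attention structure follows the abstract.

```python
import torch

def mea_attention(x, adj):
    """Self-attention with adjacency-based aggregation (MEA-style sketch).

    x:   (N, D) feature-map elements flattened into graph nodes
    adj: (N, N) row-normalized adjacency (local, neighborhood or global)
    """
    agg = adj @ x                                   # aggregate related nodes
    attn = torch.softmax(agg @ agg.t() / x.shape[1] ** 0.5, dim=-1)
    return attn @ x                                 # weighted sum of values
```

With an identity adjacency (purely local), `agg` equals `x` and the sketch reduces to plain scaled dot-product self-attention, mirroring the degeneration noted in the abstract.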

SFPN: Semantic Feature Pyramid Network for Object Detection

Yi Gan, Wei Xu, Jianbo Su

Auto-TLDR; SFPN: Semantic Feature Pyramid Network to Address Information Dilution Issue in FPN

The Feature Pyramid Network (FPN) employs a top-down path to enhance low-level features with high-level features. However, further improvement of detectors is greatly hindered by an inner defect of FPN. This paper analyzes the information dilution issue in FPN and introduces a new architecture named Semantic Feature Pyramid Network (SFPN) to address the information imbalance caused by dilution. The proposed method consists of two simple and effective components: the Semantic Pyramid Module (SPM) and the Semantic Feature Fusion Module (SFFM). To compensate for the weaknesses of FPN, the semantic segmentation result is utilized as an extra information source in our architecture: by constructing a semantic pyramid based on the segmentation result and fusing it with FPN, feature maps at each level obtain the necessary information without suffering from dilution. The proposed architecture can be applied to many detectors with non-negligible improvement, and although designed for object detection, other tasks such as instance segmentation also benefit from it. The proposed method improves both Faster R-CNN and Mask R-CNN with a ResNet-50 backbone by 1.8 AP, and improves Cascade R-CNN with a ResNet-101 backbone from 42.4 AP to 43.5 AP.

Robust Lexicon-Free Confidence Prediction for Text Recognition

Qi Song, Qianyi Jiang, Rui Zhang, Xiaolin Wei

Auto-TLDR; Confidence Measurement for Optical Character Recognition using Single-Input Multi-Output Network

Benefiting from the success of deep learning, Optical Character Recognition (OCR) has boomed in recent years. Text recognition results are vulnerable to slight perturbations of the input images, so a method for measuring how reliable the results are is crucial. In this paper, we present a novel method for measuring the confidence of a text recognition result, which can be embedded in any text recognizer with little overhead. Our method consists of two stages in a coarse-to-fine style: the first stage generates multiple candidates for voting coarse scores using a Single-Input Multi-Output network (SIMO); the second stage calculates a refined confidence score from the voting result and the conditional probabilities of the Top-1 probable recognition sequence. Highly competitive performance achieved on several standard benchmarks validates the efficiency and effectiveness of the proposed method. Moreover, it can be adopted for both Latin and non-Latin languages.
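
One plausible shape of such a coarse-to-fine score is sketched below: the coarse score is the fraction of SIMO candidates agreeing with the Top-1 transcript, refined by the product of per-character probabilities. The linear blend and `alpha` are hypothetical; the paper's exact combination is not given in the abstract.

```python
def confidence_score(candidates, top1_text, top1_char_probs, alpha=0.5):
    """Coarse-to-fine confidence for one recognition result.

    candidates:      transcripts from the multi-output heads (SIMO)
    top1_text:       the Top-1 recognition sequence
    top1_char_probs: per-character conditional probabilities of top1_text
    """
    # coarse score: how many heads voted for the Top-1 transcript
    vote = sum(c == top1_text for c in candidates) / max(1, len(candidates))
    seq_prob = 1.0
    for p in top1_char_probs:        # product of per-step probabilities
        seq_prob *= p
    return alpha * vote + (1 - alpha) * seq_prob
```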

ReADS: A Rectified Attentional Double Supervised Network for Scene Text Recognition

Qi Song, Qianyi Jiang, Xiaolin Wei, Nan Li, Rui Zhang

Auto-TLDR; ReADS: Rectified Attentional Double Supervised Network for General Scene Text Recognition

In recent years, scene text recognition has usually been regarded as a sequence-to-sequence problem. Connectionist Temporal Classification (CTC) and attentional sequence recognition (Attn) are two prevailing approaches to this problem, yet each may fail in certain scenarios: CTC concentrates more on every individual character but is weak at modeling text semantic dependencies, while Attn-based methods have better contextual semantic modeling ability but tend to overfit on limited training data. In this paper, we elaborately design a Rectified Attentional Double Supervised Network (ReADS) for general scene text recognition. To overcome the weaknesses of CTC and Attn, both are applied in our method, but with different modules in two supervised branches that complement each other. Moreover, effective spatial and channel attention mechanisms are introduced to eliminate background noise and extract valid foreground information, and a simple rectification network is implemented to rectify irregular text. ReADS can be trained end-to-end and requires only word-level annotations. Extensive experiments on various benchmarks verify the effectiveness of ReADS, which achieves state-of-the-art performance.

Vision-Based Layout Detection from Scientific Literature Using Recurrent Convolutional Neural Networks

Huichen Yang, William Hsu

Auto-TLDR; Transfer Learning for Scientific Literature Layout Detection Using Convolutional Neural Networks

We present an approach for adapting convolutional neural networks for object recognition and classification to scientific literature layout detection (SLLD), a shared subtask of several information extraction problems. Scientific publications contain multiple types of information sought by researchers in various disciplines, organized into an abstract, a bibliography, and sections documenting related work, experimental methods, and results; however, there is no effective way to extract this information due to their diverse layouts. In this paper, we present a novel approach to developing an end-to-end learning framework that segments and classifies the major regions of a scientific document. We treat scientific document layout analysis as an object detection task over digital images, without adding any text features to the network during training. Our technical objective is to implement transfer learning via fine-tuning of pre-trained networks, and thereby demonstrate that this deep learning architecture is suitable for tasks that lack very large document corpora for training. As part of the experimental test bed for empirical evaluation of this approach, we created a merged multi-corpus dataset for scientific publication layout detection tasks. Our results show good improvement from fine-tuning a pre-trained base network using this merged dataset, compared to the baseline convolutional neural network architecture.

IBN-STR: A Robust Text Recognizer for Irregular Text in Natural Scenes

Xiaoqian Li, Jie Liu, Shuwu Zhang

Auto-TLDR; IBN-STR: A Robust Text Recognition System Based on Data and Feature Representation

Although text recognition methods based on deep neural networks show promising performance, challenges remain due to the variety of text styles, perspective distortion, text with large curvature, and so on. To obtain a robust text recognizer, we improve performance in two respects: data and feature representation. In terms of data, we transform the input images into S-shape distorted images to increase the diversity of the training data, and we explore the effects of different training data. In terms of feature representation, the combination of instance normalization and batch normalization improves the model's capacity and generalization ability. This paper proposes a robust, attention-based text recognizer called IBN-STR. Through extensive experiments, model analysis and comparison are carried out from the aspects of data and feature representation, verifying the effectiveness of IBN-STR on both regular and irregular text instances. Furthermore, IBN-STR is an end-to-end recognition system that achieves state-of-the-art performance.
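
The common way to combine the two normalizations, in the style of IBN-Net, is to apply instance normalization to half the channels and batch normalization to the rest; whether IBN-STR splits channels exactly this way is an assumption, as the abstract only states that the two are combined.

```python
import torch
import torch.nn as nn

class IBN(nn.Module):
    """InstanceNorm on the first half of channels, BatchNorm on the rest."""
    def __init__(self, channels):
        super().__init__()
        self.half = channels // 2
        self.inorm = nn.InstanceNorm2d(self.half, affine=True)
        self.bnorm = nn.BatchNorm2d(channels - self.half)

    def forward(self, x):
        a, b = x[:, :self.half], x[:, self.half:]
        # IN captures appearance/style invariance; BN preserves content statistics
        return torch.cat([self.inorm(a), self.bnorm(b)], dim=1)
```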

Text Recognition - Real World Data and Where to Find Them

Klára Janoušková, Lluis Gomez, Dimosthenis Karatzas, Jiri Matas

Auto-TLDR; Exploiting Weakly Annotated Images for Text Extraction

We present a method for exploiting weakly annotated images to improve text extraction pipelines. The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their, possibly erroneous, transcriptions. The proposed method includes matching of imprecise transcriptions to weak annotations and edit-distance-guided neighbourhood search. It produces nearly error-free, localised instances of scene text, which we treat as "pseudo ground truth" (PGT). We apply the method to two weakly annotated datasets. Training with the extracted PGT consistently improves the accuracy of a state-of-the-art recognition model, by 3.7% on average across different benchmark datasets (image domains), and by 24.5% on one of the weakly annotated datasets.
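
A minimal sketch of the matching step: each transcription is matched to its nearest weak annotation by Levenshtein distance and accepted as pseudo ground truth only if the relative distance is small. The acceptance threshold is a hypothetical choice; the paper additionally runs an edit-distance-guided neighbourhood search not shown here.

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def match_to_weak_labels(transcript, weak_annotations, max_rel_dist=0.2):
    """Return the matched annotation if close enough, else None."""
    best = min(weak_annotations, key=lambda w: edit_distance(transcript, w))
    if edit_distance(transcript, best) <= max_rel_dist * max(1, len(best)):
        return best
    return None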

Weakly Supervised Attention Rectification for Scene Text Recognition

Chengyu Gu, Shilin Wang, Yiwei Zhu, Zheng Huang, Kai Chen

Auto-TLDR; An auxiliary supervision branch for attention-based scene text recognition

Scene text recognition has become a hot topic in recent years due to its booming real-life applications. The attention-based encoder-decoder framework has become one of the most popular frameworks, especially in irregular text scenarios. However, the "attention drift" problem reduces recognition performance for most existing attention-based scene text recognition methods. To solve this problem, we propose an auxiliary supervision branch alongside the attention-based encoder-decoder framework. A new loss function is designed to refine the feature map and to help the attention region align with the target character area. Compared with existing attention rectification mechanisms, our method neither requires character-level annotations nor introduces any additional trainable parameters. Furthermore, our method can improve the performance of both RNN attention and scaled dot-product attention. Experimental results on various benchmarks demonstrate that the proposed approach outperforms state-of-the-art methods in both regular and irregular text recognition scenarios.

CASNet: Common Attribute Support Network for Image Instance and Panoptic Segmentation

Xiaolong Liu, Yuqing Hou, Anbang Yao, Yurong Chen, Keqiang Li

Auto-TLDR; Common Attribute Support Network for instance segmentation and panoptic segmentation

Instance segmentation and panoptic segmentation have received more and more attention in recent years. In comparison with bounding-box-based object detection and semantic segmentation, instance segmentation provides more analytical results at the pixel level. Based on the insight that pixels belonging to one instance share one or more common attributes of that instance, we propose a one-stage instance segmentation network named Common Attribute Support Network (CASNet), which realizes instance segmentation by predicting and clustering common attributes. CASNet is fully convolutional and can be trained and run end to end, and it predicts instances without overlaps and holes, a problem that exists in most current instance segmentation algorithms. Furthermore, it can easily be extended to panoptic segmentation through minor modifications with little computational overhead. CASNet builds a bridge between semantic and instance segmentation, going from finding pixel class IDs to obtaining class and instance IDs through operations on common attributes. In experiments on instance and panoptic segmentation, CASNet obtains mAP 32.8% and PQ 59.0% on the Cityscapes validation dataset with joint training, and mAP 36.3% and PQ 66.1% in separated training mode. For panoptic segmentation, CASNet achieves state-of-the-art performance on the Cityscapes validation dataset.

MagnifierNet: Learning Efficient Small-Scale Pedestrian Detector towards Multiple Dense Regions

Qi Cheng, Mingqin Chen, Yingjie Wu, Fei Chen, Shiping Lin

Auto-TLDR; MagnifierNet: A Simple but Effective Small-Scale Pedestrian Detection Towards Multiple Dense Regions

Despite the success of pedestrian detection, there is still a significant gap between detection performance at different scales. Detecting small-scale pedestrians is extremely challenging due to the low resolution of their convolutional features, which are essential for downstream classifiers. To address this issue, we observe that pedestrians often gather together in crowded public places, and we propose MagnifierNet, a simple but effective small-scale pedestrian detector targeting multiple dense regions. MagnifierNet uses our proposed sweep-line based grouping algorithm to find dense regions based on the number of pedestrians in the grouped region, and we adopt a new definition of small-scale pedestrians through grid search and KL-divergence. Our grouping method can also be used as a new strategy for pedestrian data augmentation. An ablation study demonstrates that MagnifierNet improves the representation of small-scale pedestrians. We validate the effectiveness of MagnifierNet on the CityPersons and KITTI datasets: it achieves the best small-scale pedestrian detection performance on the CityPersons benchmark without any external data, and competitive performance for detecting small-scale pedestrians on KITTI without bells and whistles.
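
As a rough sketch of sweep-line grouping: boxes are sorted by their left edge, and a horizontal sweep merges boxes whose extents lie within a gap threshold of the running group; groups with enough members are reported as dense regions. The gap and count thresholds are hypothetical stand-ins for the paper's criteria.

```python
def sweep_line_groups(boxes, max_gap=30, min_count=3):
    """Group pedestrian boxes into dense regions with a left-to-right sweep.

    boxes: list of (x1, y1, x2, y2); thresholds are hypothetical.
    """
    events = sorted(boxes, key=lambda b: b[0])     # sweep left to right
    groups, current, right_edge = [], [], None
    for b in events:
        if current and b[0] - right_edge > max_gap:
            # gap too large: close the current group
            if len(current) >= min_count:
                groups.append(current)
            current, right_edge = [], None
        current.append(b)
        right_edge = b[2] if right_edge is None else max(right_edge, b[2])
    if len(current) >= min_count:
        groups.append(current)
    return groups
```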

Gaussian Constrained Attention Network for Scene Text Recognition

Zhi Qiao, Xugong Qin, Yu Zhou, Fei Yang, Weiping Wang

Auto-TLDR; Gaussian Constrained Attention Network for Scene Text Recognition

Scene text recognition has been a hot topic in computer vision. Recent methods adopt the attention mechanism for sequence prediction and achieve convincing results. However, we argue that the existing attention mechanism faces a problem of attention diffusion, in which the model may fail to focus on a certain character area. In this paper, we propose the Gaussian Constrained Attention Network to deal with this problem. It is a 2D attention-based method integrated with a novel Gaussian Constrained Refinement Module, which predicts an additional Gaussian mask to refine the attention weights. Rather than simply adopting additional supervision on the attention weights, our proposed method introduces an explicit refinement: the attention weights become more concentrated and the attention-based recognition network achieves better performance. The proposed Gaussian Constrained Refinement Module is flexible and can be applied directly to existing attention-based methods. Experiments on several benchmark datasets demonstrate the effectiveness of our proposed method. Our code is available at https://github.com/Pay20Y/GCAN.
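
One plausible reading of the refinement step is sketched below: a predicted 2D Gaussian mask multiplies the raw attention map, which is then renormalized. The Gaussian parameterization and elementwise-product form are assumptions; the paper's module may combine mask and weights differently.

```python
import torch

def gaussian_refine(attn, mu, sigma, grid_y, grid_x):
    """Refine a 2D attention map with a predicted Gaussian mask.

    attn:           (H, W) raw attention weights for one decoding step
    mu, sigma:      predicted Gaussian center (2,) and scale (2,)
    grid_y, grid_x: (H, W) coordinate grids, e.g. from torch.meshgrid
    """
    gauss = torch.exp(-(((grid_x - mu[0]) / sigma[0]) ** 2
                        + ((grid_y - mu[1]) / sigma[1]) ** 2) / 2)
    refined = attn * gauss                    # concentrate weights near center
    return refined / refined.sum().clamp(min=1e-6)
```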

Sample-Aware Data Augmentor for Scene Text Recognition

Guanghao Meng, Tao Dai, Shudeng Wu, Bin Chen, Jian Lu, Yong Jiang, Shutao Xia

Auto-TLDR; Sample-Aware Data Augmentation for Scene Text Recognition

Deep neural networks (DNNs) have been widely used in scene text recognition and have achieved remarkable performance. Such DNN-based scene text recognizers usually require plenty of training data, but data collection and annotation are costly in practice. To alleviate this issue, data augmentation is often applied when training scene text recognizers. However, existing data augmentation methods, including affine and elastic transformations, suffer from under- and over-diversity problems due to the complexity of text contents and shapes. In this paper, we propose a sample-aware data augmentor that transforms samples adaptively based on their contents. Specifically, our data augmentor consists of three parts: a gated module, an affine transformation module, and an elastic transformation module. The affine transformation module focuses on keeping the affinity of samples, while the elastic transformation module aims to improve their diversity. With the gated module, our data augmentor determines the transformation type adaptively, based on the properties of the training samples and the recognizer's capability during the training process. Besides, our framework introduces an adversarial learning strategy to optimize the augmentor and the recognizer jointly. Extensive experiments on scene text recognition benchmarks show that our sample-aware data augmentor significantly improves the performance of a state-of-the-art scene text recognizer.

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

Jiaqi Luo, Zhicheng Zhao, Fei Su, Limei Guo

Auto-TLDR; Triplet-path Network for One-Stage Object Detection and Segmentation in Pathological Images

Deep learning has been widely applied in the field of medical image processing. However, compared with flourishing visual tasks on natural images, the progress achieved on pathological images is less remarkable, and detection and segmentation, which are among the basic tasks of computer vision, are treated as two independent tasks. In this paper, we make full use of existing datasets and construct a triplet-path network using dilated convolutions to cooperatively accomplish one-stage object detection and nuclei segmentation for general pathological images. First, to meet the requirements of detection and segmentation, a novel structure called triplet feature generation (TFG) is designed to extract high-resolution and multi-scale features, where features from different layers can be properly integrated. Second, considering that pathological datasets are usually small, a location-aware and partially truncated loss function is proposed to improve the classification accuracy on datasets with few images and widely varying targets. We compare the performance of both object detection and instance segmentation with state-of-the-art methods. Experimental results demonstrate the effectiveness and efficiency of the proposed network on two datasets collected from multiple organs.

Construction Worker Hardhat-Wearing Detection Based on an Improved BiFPN

Chenyang Zhang, Zhiqiang Tian, Jingyi Song, Yaoyue Zheng, Bo Xu

Auto-TLDR; A One-Stage Object Detection Method for Hardhat-Wearing in Construction Site

Construction work is considered one of the occupations with the highest safety risk, so safety plays an important role on construction sites. One of the most fundamental safety rules on a construction site is to wear a hardhat. To strengthen site safety, most current methods use multi-stage pipelines for hardhat-wearing detection, which have limitations in adaptability and generalizability. In this paper, we propose a one-stage object detection method based on a convolutional neural network. We present a multi-scale strategy that selects the high-resolution feature maps of DarkNet-53 to effectively identify small-scale hardhats. In addition, we propose an improved weighted bi-directional feature pyramid network (BiFPN), which fuses more semantic features from more scales. The proposed method can not only detect hardhat wearing, but also identify the color of the hardhat. Experimental results show that the proposed method achieves a mAP of 87.04%, outperforming several state-of-the-art methods on a public dataset.

Dual Path Multi-Modal High-Order Features for Textual Content Based Visual Question Answering

Yanan Li, Yuetan Lin, Hongrui Zhao, Donghui Wang

Auto-TLDR; TextVQA: An End-to-End Visual Question Answering Model for Text-Based VQA

As a typical cross-modal problem, visual question answering (VQA) has received increasing attention from the computer vision and natural language processing communities. Reading and reasoning about texts and visual contents in images is a burgeoning and important research topic in VQA, especially for visually impaired assistance applications. Given an image, the task aims to predict an answer to a provided natural language question closely related to the image's textual contents. In this paper, we propose a novel end-to-end textual content based VQA model, which grounds question answering in both visual and textual information. After encoding the image, the question and the recognized text words, it uses multi-modal factorized high-order modules and the attention mechanism to fuse question-image and question-text features respectively, capturing the complex correlations among different features efficiently. To ensure the model's extensibility, it embeds candidate answers and recognized texts in a semantic embedding space and adopts a semantic embedding based classifier for answer prediction. Extensive experiments on the newly proposed TextVQA benchmark demonstrate that the proposed model achieves promising results.

Learning a Dynamic High-Resolution Network for Multi-Scale Pedestrian Detection

Mengyuan Ding, Shanshan Zhang, Jian Yang

Auto-TLDR; Learnable Dynamic HRNet for Pedestrian Detection

Pedestrian detection is a canonical instance of object detection in computer vision. In practice, scale variation is one of the key challenges, resulting in unbalanced performance across different scales. Recently, the High-Resolution Network (HRNet) has become popular because high-resolution feature representations are more friendly to small objects. However, when we apply HRNet to pedestrian detection, we observe that it improves performance for small pedestrians but hurts it for larger ones. To overcome this problem, we propose a learnable Dynamic HRNet (DHRNet) that generates different network paths adapted to different scales. Specifically, we construct a parallel multi-branch architecture and add a soft conditional gate module allowing for dynamic feature fusion; both branches share all parameters except the soft gate module. Experimental results on the CityPersons and Caltech benchmarks indicate that our proposed dynamic HRNet is more capable of dealing with pedestrians of various scales, and thus improves performance consistently across scales.
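
A minimal sketch of a soft conditional gate that mixes two parallel branch outputs per sample: global pooling provides context, and a linear layer predicts the mixing weights. The pooling-plus-linear gate design is an assumption; the paper's gate architecture may differ.

```python
import torch
import torch.nn as nn

class SoftGate(nn.Module):
    """Soft conditional gate mixing two parallel branches (DHRNet-style sketch)."""
    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Linear(channels, 2)

    def forward(self, feat_a, feat_b):
        # global context from both branches drives the per-sample gate
        ctx = feat_a.mean(dim=(2, 3)) + feat_b.mean(dim=(2, 3))
        w = torch.softmax(self.fc(ctx), dim=1)            # (N, 2) mixing weights
        w = w[:, :, None, None]                           # broadcast over H, W
        return w[:, 0:1] * feat_a + w[:, 1:2] * feat_b
```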

Bidirectional Matrix Feature Pyramid Network for Object Detection

Wei Xu, Yi Gan, Jianbo Su

Auto-TLDR; BMFPN: Bidirectional Matrix Feature Pyramid Network for Object Detection

Feature pyramids are widely used to improve scale invariance for object detection. Most methods simply map objects to feature maps with relevant square receptive fields, but rarely pay attention to aspect ratio variation, which is also an important property of object instances. This leads to a poor match between rectangular objects and assigned features with square receptive fields, preventing accurate recognition and localization. Besides, information propagation among feature layers is sparse: each feature in the pyramid may mainly or only contain single-level information, which is not representative enough for the classification and localization sub-tasks. In this paper, the Bidirectional Matrix Feature Pyramid Network (BMFPN) is proposed to address these issues. It consists of three modules: the Diagonal Layer Generation Module (DLGM), the Top-down Module (TDM) and the Bottom-up Module (BUM). First, multi-level features extracted by the backbone are fed into the DLGM to produce the base features; these base features are then used to construct the final feature pyramid through the TDM and BUM in series. The receptive fields of the designed feature layers in BMFPN have various scales and aspect ratios, so objects can be correctly assigned to appropriate and representative feature maps with relevant receptive fields depending on their scale and aspect ratio properties. Moreover, the TDM and BUM form bidirectional and reticular information flows, which effectively fuse multi-level information in top-down and bottom-up manners respectively. To evaluate the effectiveness of the proposed architecture, an end-to-end anchor-free detector is designed and trained by integrating BMFPN into FCOS, and the center-ness branch in FCOS is modified with our Gaussian center-ness branch (GCB), which brings another slight improvement. Without bells and whistles, our method gains +3.3%, +2.4% and +2.6% AP on the MS COCO dataset over baselines with ResNet-50, ResNet-101 and ResNeXt-101 backbones, respectively.

Text Recognition in Real Scenarios with a Few Labeled Samples

Jinghuang Lin, Cheng Zhanzhan, Fan Bai, Yi Niu, Shiliang Pu, Shuigeng Zhou

Auto-TLDR; Few-shot Adversarial Sequence Domain Adaptation for Scene Text Recognition

Scene text recognition (STR) is still a hot research topic in computer vision due to its various applications. Existing works mainly focus on learning a general model with a huge number of synthetic text images to recognize unconstrained scene texts, and have achieved substantial progress. However, these methods are not quite applicable in many real-world scenarios where 1) high recognition accuracy is required, while 2) labeled samples are lacking. To tackle this challenging problem, this paper proposes a few-shot adversarial sequence domain adaptation (FASDA) approach to build sequence adaptation between a synthetic source domain (with many synthetic labeled samples) and a specific target domain (with only some or a few real labeled samples). This is done by simultaneously learning each character's feature representation with an attention mechanism and establishing the corresponding character-level latent subspace with adversarial learning. Our approach can maximize the character-level confusion between the source and target domains, thus achieving sequence-level adaptation even with a small number of labeled samples in the target domain. Extensive experiments on various datasets show that our method significantly outperforms the fine-tuning scheme, and obtains performance comparable to state-of-the-art STR methods.

2D License Plate Recognition based on Automatic Perspective Rectification

Hui Xu, Zhao-Hong Guo, Da-Han Wang, Xiang-Dong Zhou, Yu Shi

Auto-TLDR; Perspective Rectification Network for License Plate Recognition

License plate recognition (LPR) remains a challenging task due to difficulties such as image deformation and multi-line character distribution. Text rectification, which is crucial to eliminate the effects of image deformation, has attracted increasing attention in scene text recognition. However, current text rectification methods are not designed specifically for LPR and do not take the characteristics of plate deformation into account. Considering that a license plate (LP) can only undergo perspective distortion in the image due to its rigidity, in this paper we propose a novel perspective rectification network (PRN) to automatically estimate the perspective transformation and rectify the distorted LP accordingly. For recognition, we propose a location-aware 2D attention based recognition network that is capable of recognizing both single-line and double-line plates under perspective deformation. The rectification and recognition networks are connected for end-to-end training. Experiments on common datasets show that the proposed method achieves state-of-the-art performance, demonstrating the effectiveness of the proposed approach.
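
For intuition, the sketch below shows the perspective rectification step once four plate corners have been estimated, using standard OpenCV calls. The output size is a hypothetical choice, and note the PRN regresses the transformation inside the network rather than calling OpenCV.

```python
import cv2
import numpy as np

def rectify_plate(image, corners, out_w=144, out_h=48):
    """Warp a distorted license plate to a fronto-parallel view.

    corners: (4, 2) estimated plate corners ordered top-left, top-right,
             bottom-right, bottom-left.
    """
    src = np.asarray(corners, dtype=np.float32)
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    H = cv2.getPerspectiveTransform(src, dst)   # 3x3 homography
    return cv2.warpPerspective(image, H, (out_w, out_h))
```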

A Fast and Accurate Object Detector for Handwritten Digit String Recognition

Jun Guo, Wenjing Wei, Yifeng Ma, Cong Peng

Auto-TLDR; ChipNet: An anchor-free object detector for handwritten digit string recognition

Focusing on handwritten digit string recognition (HDSR), we propose an anchor-free object detector called ChipNet with a novel encoding method. The input image is divided into columns, and these columns are encoded by the ground truth; adjacent columns are responsible for detecting the same target, which addresses the class imbalance problem while reducing network computation. ChipNet is composed of convolutional and bidirectional long short-term memory networks. Unlike typical detectors, it uses no region proposals, anchors or region-of-interest pooling, and hence overcomes the shortcomings of anchor-based and dense detectors in HDSR. Experiments are conducted on synthetic digit strings, the CVL HDS database, and the ORAND-CAR-A & B databases. High accuracies are achieved, surpassing reported results by a large margin (up to 6.62%). Furthermore, ChipNet reaches 219 FPS on 160×32 px images using a Tesla P100 GPU. The results also show that ChipNet can handle touching, connected and arbitrary-length digit strings, with accuracies in HDSR as high as those in single handwritten digit recognition.

End-To-End Hierarchical Relation Extraction for Generic Form Understanding

Tuan Anh Nguyen Dang, Duc-Thanh Hoang, Quang Bach Tran, Chih-Wei Pan, Thanh-Dat Nguyen

Auto-TLDR; Joint Entity Labeling and Link Prediction for Form Understanding in Noisy Scanned Documents

Form understanding is a challenging problem that aims to recognize semantic entities from an input document and their hierarchical relations. Previous approaches face significant difficulty dealing with the complexity of the task and thus treat these objectives separately. To this end, we present a novel deep neural network that jointly performs Entity Labeling and link prediction in an end-to-end fashion. Our model extends the Multi-stage Attentional U-Net architecture with Part-Intensity Fields and Part-Association Fields for link prediction, enriching the spatial information flow with additional supervision from Entity Linking. We demonstrate the effectiveness of the model on the Form Understanding in Noisy Scanned Documents (FUNSD) dataset, where our method substantially outperforms the original model and state-of-the-art baselines in both the Entity Labeling and Entity Linking tasks.

Watch Your Strokes: Improving Handwritten Text Recognition with Deformable Convolutions

Iulian Cojocaru, Silvia Cascianelli, Lorenzo Baraldi, Massimiliano Corsini, Rita Cucchiara

Auto-TLDR; Deformable Convolutional Neural Networks for Handwritten Text Recognition

Handwritten Text Recognition (HTR) in free-layout pages is a valuable yet challenging task that aims to automatically understand handwritten texts. State-of-the-art approaches in this field usually encode input images with Convolutional Neural Networks, whose kernels are typically defined on a fixed grid and attend to all input pixels independently. However, this is at odds with the sparse nature of handwritten pages, in which only the pixels representing the ink of the writing are useful for the recognition task. Furthermore, the standard convolution operator is not explicitly designed to take into account the great variability in shape, scale, and orientation of handwritten characters. To overcome these limitations, we investigate the use of deformable convolutions for handwriting recognition. This type of convolution deforms the convolution kernel according to the content of the neighborhood, and can therefore adapt better to geometric variations and other deformations of the text. Experiments conducted on the IAM and RIMES datasets demonstrate that deformable convolutions are a promising direction for the design of novel architectures for handwritten text recognition.
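
For reference, torchvision ships a deformable convolution operator; the common wiring, sketched below, has a small standard conv predict per-location (x, y) offsets that DeformConv2d then uses to sample the input at deformed positions. How the paper's architectures wire the offset branch is not specified in the abstract; this follows the usual torchvision pattern.

```python
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformBlock(nn.Module):
    """Drop-in replacement for a 3x3 conv using deformable convolution."""
    def __init__(self, in_ch, out_ch, k=3, pad=1):
        super().__init__()
        # 2 offsets (x, y) per kernel location per output pixel
        self.offset = nn.Conv2d(in_ch, 2 * k * k, k, padding=pad)
        self.deform = DeformConv2d(in_ch, out_ch, k, padding=pad)

    def forward(self, x):
        return self.deform(x, self.offset(x))
```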

An Integrated Approach of Deep Learning and Symbolic Analysis for Digital PDF Table Extraction

Mengshi Zhang, Daniel Perelman, Vu Le, Sumit Gulwani

Auto-TLDR; Deep Learning and Symbolic Reasoning for Unstructured PDF Table Extraction

Deep learning has shown great success at interpreting unstructured data such as object recognition in images. Symbolic/logical-reasoning techniques have shown great success at interpreting structured data such as table extraction from webpages, custom text files, and spreadsheets. The tables in PDF documents are often generated from such structured sources (text-based Word/LaTeX documents, spreadsheets, webpages) but end up unstructured. We thus explore novel combinations of deep learning and symbolic reasoning techniques to build an effective solution for PDF table extraction. We evaluate effectiveness without granting partial credit for matching part of a table (which may cause silent errors in downstream data processing). Our method achieves a 0.725 F1 score (vs. 0.339 for the state of the art) on detecting correct table bounds, a much stricter metric than the common one of detecting characters within tables, on a well-known public benchmark (ICDAR 2013), and a 0.404 F1 score (vs. 0.144 for the state of the art) on our private benchmark with more widely varied table structures.

Text Baseline Recognition Using a Recurrent Convolutional Neural Network

Matthias Wödlinger, Robert Sablatnig

Auto-TLDR; Automatic Baseline Detection of Handwritten Text Using Recurrent Convolutional Neural Network

The detection of text baselines is a necessary pre-processing step for many modern methods of automatic handwriting recognition. In this work, a two-stage system for the automatic detection of text baselines in handwritten text is presented. In the first step, pixel-wise segmentation of the document image is performed to classify pixels as baselines, start points and end points. This segmentation is then used to extract the start points of lines; starting from these points, the baseline is extracted using a recurrent convolutional neural network that directly outputs the baseline coordinates. This allows the direct extraction of baseline coordinates as the output of a neural network without any post-processing steps. The model is evaluated on the cBAD dataset from the ICDAR 2019 competition on baseline detection.

A Novel Region of Interest Extraction Layer for Instance Segmentation

Leonardo Rossi, Akbar Karimi, Andrea Prati

Responsive image

Auto-TLDR; Generic RoI Extractor for Two-Stage Neural Network for Instance Segmentation

Slides Poster Similar

Given the wide diffusion of deep neural network architectures for computer vision tasks, several new applications are now feasible. Among them, particular attention has recently been given to instance segmentation, exploiting the results achievable by two-stage networks (such as Mask R-CNN or Faster R-CNN) derived from R-CNN. In these complex architectures, a crucial role is played by the Region of Interest (RoI) extraction layer, devoted to extracting a coherent subset of features from a single Feature Pyramid Network (FPN) layer attached on top of a backbone. This paper is motivated by the need to overcome the limitations of existing RoI extractors, which select only one (the best) layer from the FPN. Our intuition is that all the layers of the FPN retain useful information. Therefore, the proposed layer (called Generic RoI Extractor - GRoIE) introduces non-local building blocks and attention mechanisms to boost performance. A comprehensive component-level ablation study is conducted to find the best set of algorithms and parameters for the GRoIE layer. Moreover, GRoIE can be integrated seamlessly with any two-stage architecture for both object detection and instance segmentation tasks, so the improvements brought by GRoIE in different state-of-the-art architectures are also evaluated. The proposed layer yields gains of up to 1.1% AP on bounding box detection and 1.7% AP on instance segmentation. The code is publicly available at https://github.com/IMPLabUniPr/mmdetection-groie
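
A stripped-down version of the core idea, assuming pooling from every FPN level followed by a simple learned fusion in place of the paper's non-local and attention blocks, could look like this:

```python
# Sketch of a GRoIE-style extractor: instead of picking one FPN level
# per RoI, pool from every level and fuse. The 1x1-conv fusion is a
# simplification; scales and channel counts are illustrative.
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class AllLevelsRoIExtractor(nn.Module):
    def __init__(self, channels=256, out_size=7):
        super().__init__()
        self.out_size = out_size
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, fpn_feats, boxes, scales):
        pooled = 0
        for feat, scale in zip(fpn_feats, scales):
            # Pool the same RoI from every pyramid level, then sum.
            pooled = pooled + roi_align(feat, boxes, self.out_size,
                                        spatial_scale=scale, sampling_ratio=2)
        return self.fuse(pooled)

fpn = [torch.randn(1, 256, 64 // 2**i, 64 // 2**i) for i in range(4)]
boxes = [torch.tensor([[4.0, 4.0, 32.0, 32.0]])]  # one RoI on image 0
out = AllLevelsRoIExtractor()(fpn, boxes, scales=[1/4, 1/8, 1/16, 1/32])
print(out.shape)  # torch.Size([1, 256, 7, 7])
```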

Object Detection Model Based on Scene-Level Region Proposal Self-Attention

Yu Quan, Zhixin Li, Canlong Zhang, Huifang Ma

Responsive image

Auto-TLDR; Exploiting Semantic Informations for Object Detection

Slides Poster Similar

Improvements to object detection performance have mostly focused on extracting local information near the region of interest in the image, which limits the detection performance that can be achieved. First, a depth-wise separable convolution network (D_SCNet-127 R-CNN) is built on the backbone network. Considering the importance of scene and semantic information for visual recognition, the feature map is fed into a semantic segmentation module, a region proposal network module, and a region proposal self-attention module, building a network with scene-level and region-proposal self-attention. Second, deep reinforcement learning is utilized to achieve accurate positioning for border regression, and the computation speed of the whole model is improved by a light-weight head network. This model can effectively overcome the feature-extraction limitations of traditional object detection and obtain more comprehensive, detailed features. Experimental verification on the MSCOCO17, VOC12, and Cityscapes datasets shows that the proposed method has good validity and scalability.
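
As a hedged illustration of the self-attention branch, the sketch below applies a standard multi-head self-attention over all spatial positions of a shared feature map; it is a generic stand-in, not the paper's exact module.

```python
# Generic scene-level self-attention over a feature map: every spatial
# position attends to the whole scene, adding global context on top of
# the local features. A stand-in for the paper's self-attention module.
import torch
import torch.nn as nn

class SceneSelfAttention(nn.Module):
    def __init__(self, channels=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)      # (B, H*W, C): pixels as tokens
        out, _ = self.attn(seq, seq, seq)       # each pixel attends to the scene
        return x + out.transpose(1, 2).view(b, c, h, w)  # residual connection
```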

Foreground-Guided Vehicle Perception Framework

Kun Tian, Tong Zhou, Shiming Xiang, Chunhong Pan

Responsive image

Auto-TLDR; A foreground segmentation branch for vehicle detection

Slides Poster Similar

As the basis of advanced visual tasks such as vehicle tracking and traffic flow analysis, vehicle detection needs to accurately predict the position and category of vehicle objects. In the past decade, deep learning based methods have made great progress. However, we notice that some existing issues have not been studied thoroughly. First, false positives on background regions are a critical problem. Second, most previous approaches only optimize a single vehicle detection model, ignoring the relationship between different visual perception tasks. In response to these two findings, we introduce a foreground segmentation branch for the first time, which predicts vehicle regions at the pixel level in advance. Furthermore, two attention modules are designed to guide the work of the detection branch. The proposed method can be easily grafted onto one-stage and two-stage detection frameworks. We evaluate the effectiveness of our model on LSVH, a dataset with large variations in vehicle scales, and achieve state-of-the-art detection accuracy.
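
A minimal sketch of the guidance mechanism, assuming the segmentation branch gates the detection features with a sigmoid foreground mask, might be:

```python
# Foreground guidance as spatial attention: a segmentation head predicts
# a vehicle foreground mask that gates the detection features, damping
# background activations. Module sizes are illustrative.
import torch
import torch.nn as nn

class ForegroundGate(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        self.seg_head = nn.Conv2d(channels, 1, kernel_size=1)  # foreground logits

    def forward(self, feats):
        mask = torch.sigmoid(self.seg_head(feats))  # (B, 1, H, W) in [0, 1]
        gated = feats * mask                        # suppress background regions
        return gated, mask  # mask can also be supervised with pixel labels
```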

Image-Based Table Cell Detection: A New Dataset and an Improved Detection Method

Dafeng Wei, Hongtao Lu, Yi Zhou, Kai Chen

Responsive image

Auto-TLDR; TableCell: A Semi-supervised Dataset for Table-wise Detection and Recognition

Slides Poster Similar

The topic of table detection and recognition has been in the spotlight in recent years; however, the latest works only address the coarse task of table-wise detection. In this paper, we present TableCell, a new image-based dataset which contains 5262 samples with 170K high-precision cell-wise annotations produced by a novel semi-supervised method. Several classical deep learning detection models are evaluated to build a strong baseline on the proposed dataset. Furthermore, we devise an efficient table projection method, consisting of a row projection and a column projection, to facilitate capturing long-range global features. Experiments demonstrate that our proposed method improves the accuracy of table detection. Our dataset and code will be made available at https://github.com/weidafeng/TableCell upon publication.
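
The projection idea can be sketched as follows, assuming the row and column projections are simple means broadcast back over the feature map; the fusion by addition is an illustrative choice, not necessarily the paper's.

```python
# Row/column projection sketch: features are averaged along each axis
# and broadcast back, so every position receives a long-range summary
# of its entire row and column (the axes along which table cells align).
import torch
import torch.nn as nn

class TableProjection(nn.Module):
    def forward(self, x):                      # x: (B, C, H, W)
        row = x.mean(dim=3, keepdim=True)      # (B, C, H, 1): row projection
        col = x.mean(dim=2, keepdim=True)      # (B, C, 1, W): column projection
        # Broadcast the projections back over the full feature map.
        return x + row + col

out = TableProjection()(torch.randn(2, 64, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```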

P2 Net: Augmented Parallel-Pyramid Net for Attention Guided Pose Estimation

Luanxuan Hou, Jie Cao, Yuan Zhao, Haifeng Shen, Jian Tang, Ran He

Responsive image

Auto-TLDR; Parallel-Pyramid Net with Partial Attention for Human Pose Estimation

Slides Poster Similar

The target of human pose estimation is to determine the body parts and joint locations of persons in an image. Angular changes, motion blur, occlusion, etc. in natural scenes make this task challenging, and some joints are more difficult to detect than others. In this paper, we propose an augmented Parallel-Pyramid Net (P^2Net) with a partial attention module. During data preprocessing, we propose a differentiable auto data augmentation (DA^2) method, in which sequences of data augmentations are formulated as a trainable and operational Convolutional Neural Network (CNN) component. DA^2 improves training efficiency and effectiveness. To address the information loss problem in the backbone network, we adopt a new parallel pyramid structure that compensates for this loss without increasing the overall computational complexity. To further refine the global predictions, a Partial Attention Module (PAM) is defined to extract weighted features from the different-scale feature maps generated by the parallel pyramid structure. Compared with traditional up-sampling refinement, PAM can better capture the relationships between channels. Experiments corroborate the effectiveness of our proposed method. Notably, our method achieves the best performance on the challenging MSCOCO and MPII datasets.
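
As an illustrative approximation of PAM, the sketch below fuses resized multi-scale maps with learned per-scale, per-channel weights; the actual module is likely more elaborate.

```python
# Weighted multi-scale fusion sketch: pyramid maps are resized to a
# common resolution and combined with learned per-scale, per-channel
# weights, a crude stand-in for the paper's Partial Attention Module.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartialAttentionFusion(nn.Module):
    def __init__(self, channels=64, num_scales=3):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_scales, channels))

    def forward(self, feats):  # list of (B, C, Hi, Wi), finest scale first
        target = feats[0].shape[-2:]
        fused = 0
        for i, f in enumerate(feats):
            f = F.interpolate(f, size=target, mode='bilinear', align_corners=False)
            # Per-scale channel weights capture relationships between channels.
            fused = fused + f * self.weights[i].view(1, -1, 1, 1)
        return fused
```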

Boosting High-Level Vision with Joint Compression Artifacts Reduction and Super-Resolution

Xiaoyu Xiang, Qian Lin, Jan Allebach

Responsive image

Auto-TLDR; A Context-Aware Joint CAR and SR Neural Network for High-Resolution Text Recognition and Face Detection

Slides Poster Similar

Due to the limits of bandwidth and storage space, digital images are usually down-scaled and compressed when transmitted over networks, resulting in loss of details and jarring artifacts that can lower the performance of high-level visual tasks. In this paper, we aim to generate an artifact-free high-resolution image from a low-resolution one compressed with an arbitrary quality factor, by jointly exploring compression artifacts reduction (CAR) and super-resolution (SR). First, we propose a context-aware joint CAR and SR neural network (CAJNN) that integrates both local and non-local features to solve CAR and SR in a single stage. Then, a deep reconstruction network is adopted to predict high-quality, high-resolution images. Evaluation on CAR and SR benchmark datasets shows that our CAJNN model outperforms previous methods while taking 26.2% less runtime. Based on this model, we explore two critical challenges in high-level computer vision: optical character recognition of low-resolution texts, and extremely tiny face detection. We demonstrate that CAJNN can serve as an effective image preprocessing method, improving the accuracy of real-scene text recognition (from 85.30% to 85.75%) and the average precision of tiny face detection (from 0.317 to 0.611).
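
A toy version of a single-stage joint CAR+SR head, assuming a shared convolutional trunk followed by sub-pixel upsampling, could be sketched as:

```python
# Single-stage joint restoration sketch: shared features both clean up
# compression artifacts and upsample via sub-pixel convolution. Layer
# sizes and the x2 factor are illustrative, not CAJNN's architecture.
import torch
import torch.nn as nn

class JointCARSR(nn.Module):
    def __init__(self, scale=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU())
        # Sub-pixel convolution: one stage handles both CAR and SR.
        self.up = nn.Sequential(
            nn.Conv2d(64, 3 * scale**2, 3, padding=1),
            nn.PixelShuffle(scale))

    def forward(self, lr_compressed):
        return self.up(self.body(lr_compressed))

sr = JointCARSR()(torch.randn(1, 3, 48, 48))
print(sr.shape)  # torch.Size([1, 3, 96, 96])
```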

Cost-Effective Adversarial Attacks against Scene Text Recognition

Mingkun Yang, Haitian Zheng, Xiang Bai, Jiebo Luo

Responsive image

Auto-TLDR; Adversarial Attacks on Scene Text Recognition

Slides Poster Similar

Scene text recognition is a challenging task due to the diversity of text appearance and the complexity of natural scenes. Thanks to the development of deep learning and large volumes of training data, scene text recognition has made impressive progress in recent years. However, recent research on adversarial examples has shown that deep learning models are vulnerable to adversarial inputs with imperceptible changes. As one of the most practical tasks in computer vision, scene text recognition is thus also facing significant security risks. To the best of our knowledge, there has been no prior work on adversarial attacks against scene text recognition. To investigate their effects, we make the first attempt to attack the state-of-the-art scene text recognizer, i.e., an attention-based recognizer. To that end, we first adapt the objective functions designed for non-sequential tasks, such as image classification, semantic segmentation, and image retrieval, to the sequential form. We then propose a novel and effective objective function that further reduces the amount of perturbation while achieving a higher attack success rate. Comprehensive experiments on several standard benchmarks clearly demonstrate the effectiveness of the proposed attacks on scene text recognition.
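
To make the "sequential form" concrete, the sketch below sums per-time-step cross-entropy toward a target transcription and adds an L2 penalty on the perturbation; the recognizer interface and the update rule are simplified placeholders, not the paper's exact objective.

```python
# One gradient step of a targeted attack in sequential form: the loss
# sums cross-entropy over decoder time steps so every character is
# attacked, while an L2 term keeps the perturbation small.
import torch
import torch.nn.functional as F

def sequential_attack_step(recognizer, image, target_seq, delta, alpha=0.01, c=1.0):
    # delta should be created with requires_grad=True, e.g.
    # delta = torch.zeros_like(image, requires_grad=True)
    adv = (image + delta).clamp(0, 1)
    logits = recognizer(adv)               # (T, num_classes), one row per step
    # Sequential objective: sum the per-step losses over the sequence.
    attack_loss = sum(F.cross_entropy(logits[t:t + 1], target_seq[t:t + 1])
                      for t in range(target_seq.size(0)))
    loss = attack_loss + c * delta.norm()  # attack success vs. perceptibility
    loss.backward()
    with torch.no_grad():
        delta -= alpha * delta.grad.sign()  # descend toward the target sequence
        delta.grad.zero_()
    return delta
```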

Multiple Document Datasets Pre-Training Improves Text Line Detection with Deep Neural Networks

Mélodie Boillet, Christopher Kermorvant, Thierry Paquet

Responsive image

Auto-TLDR; A fully convolutional network for document layout analysis

Slides Similar

In this paper, we introduce a fully convolutional network for the document layout analysis task. While state-of-the-art methods use models pre-trained on natural scene images, our method relies on a U-shaped model trained from scratch to detect objects in historical documents. We treat the line segmentation task, and more generally the layout analysis problem, as a pixel-wise classification task, so our model outputs a pixel labeling of the input images. We show that our method outperforms state-of-the-art methods on various datasets, and demonstrate that parts pre-trained on natural scene images are not required to reach good results. In addition, we show that pre-training on multiple document datasets can improve performance. We evaluate the models using various metrics to provide a fair and complete comparison between methods.
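
A toy U-shaped pixel classifier in the same spirit, trained from scratch with one skip connection, might look like this (the real model is much deeper):

```python
# Tiny U-shaped network for pixel-wise labeling: encoder downsamples,
# decoder upsamples, a skip connection preserves detail, and the head
# outputs per-pixel line/background logits. Widths are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUNet(nn.Module):
    def __init__(self, classes=2):
        super().__init__()
        self.enc = nn.Conv2d(1, 32, 3, padding=1)
        self.down = nn.Conv2d(32, 64, 3, stride=2, padding=1)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.head = nn.Conv2d(32, classes, 1)

    def forward(self, x):
        e = F.relu(self.enc(x))
        d = F.relu(self.down(e))
        u = F.relu(self.up(d))
        return self.head(u + e)  # skip connection, then pixel-wise logits

logits = TinyUNet()(torch.randn(1, 1, 64, 64))
print(logits.shape)  # torch.Size([1, 2, 64, 64])
```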

Hybrid Cascade Point Search Network for High Precision Bar Chart Component Detection

Junyu Luo, Jinpeng Wang, Chin-Yew Lin

Responsive image

Auto-TLDR; Object Detection of Chart Components in Chart Images Using Point-based and Region-Based Object Detection Framework

Slides Poster Similar

Charts are commonly used for data visualization, and one common form of chart distribution is the chart image. To enable machine comprehension of chart images, precise detection of chart components is a critical step. Existing image object detection methods do not perform well on chart component detection, which requires high boundary-detection precision, and traditional rule-based approaches lack sufficient generalization ability. To address this problem, we design a novel two-stage object detection framework that combines point-based and region-based ideas, simulating the process by which humans create bounding boxes for objects. Experiments on our labeled ChartDet dataset show that our method greatly improves the performance of chart object detection. We further extend our method to a general object detection task and obtain comparable performance.

Detecting Objects with High Object Region Percentage

Fen Fang, Qianli Xu, Liyuan Li, Ying Gu, Joo-Hwee Lim

Responsive image

Auto-TLDR; Faster R-CNN for High-ORP Object Detection

Slides Poster Similar

Object shape is a subtle but important factor in object detection. It has been observed that the object-region percentage (ORP) can be utilized to improve detection accuracy for elongated objects, which have much lower ORPs than other types of objects. In this paper, we propose an approach to improve detection performance for objects whose ORPs are relatively high. To address the problem of high-ORP object detection, we propose a method consisting of three steps. First, we adjust the ground-truth bounding boxes of high-ORP objects to an optimal range. Second, we train an object detector, Faster R-CNN, on the adjusted bounding boxes to achieve high recall. Finally, we train a DCNN to learn the adjustment ratios in four directions and adjust the detected bounding boxes to obtain better localization and higher precision. We evaluate the effectiveness of our method on 12 high-ORP objects in COCO and 8 objects in a proprietary gearbox dataset. The experimental results show that our method achieves state-of-the-art performance on these objects while consuming fewer resources in the training and inference stages.
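
The box-adjustment step can be sketched as below, where hypothetical per-direction ratios expand ground-truth boxes before training and are inverted to refine detections at test time; the specific ratio values are made up for illustration.

```python
# Box adjustment sketch: expand ground-truth boxes by per-direction
# ratios before training, then invert the (learned) ratios to shrink
# detected boxes back at inference. Ratio values here are illustrative.
def adjust_box(box, ratios):
    """box: (x0, y0, x1, y1); ratios: (left, top, right, bottom) fractions."""
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    return (x0 - ratios[0] * w, y0 - ratios[1] * h,
            x1 + ratios[2] * w, y1 + ratios[3] * h)

# Expand the ground truth for training, ...
expanded = adjust_box((10, 10, 50, 90), (0.1, 0.1, 0.1, 0.1))
# ... then invert the ratios (re-expressed relative to the new width
# of 1.2x) to recover the tight box from a detection at test time.
refined = adjust_box(expanded, (-0.1 / 1.2,) * 4)
print(refined)  # (10.0, 10.0, 50.0, 90.0)
```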