Jizhong Han

Papers from this author

A Multi-Head Self-Relation Network for Scene Text Recognition

Zhou Junwei, Hongchao Gao, Jiao Dai, Dongqin Liu, Jizhong Han

Responsive image

Auto-TLDR; Multi-head Self-relation Network for Scene Text Recognition

Slides Poster Similar

The text embedded in scene images can be seen everywhere in our lives. However, recognizing text from natural scene images is still a challenge because of its diverse shapes and distorted patterns. Recently, advanced recognition networks generally treat scene text recognition as a sequence prediction task. Although achieving excellent performance, these recognition networks consider the feature map cells as independent individuals and update cells state without utilizing the information of their neighboring cells. And the local receptive field of traditional convolutional neural network (CNN) makes a single cell that cannot cover the whole text region in an image. Due to these issues, the existing recognition networks cannot extract the global context in a visual scene. To deal with the above problems, we propose a Multi-head Self-relation Network(MSRN) for scene text recognition in this paper. The MSRN consists of several multi-head self-relation layers, which is designed for extracting the global context of a visual scene, so that transforms a cell into a new cell that fuses the information of the related cells. Furthermore, experiments over several public datasets demonstrate that our proposed recognition network achieves superior performance on several benchmark datasets including IC03, IC13, IC15, SVT-Perspective.