Qing Wang

Papers from this author

A Transformer-Based Radical Analysis Network for Chinese Character Recognition

Chen Yang, Qing Wang, Jun Du, Jianshu Zhang, Changjie Wu, Jiaming Wang

Auto-TLDR; Transformer-based Radical Analysis Network for Chinese Character Recognition

The recently proposed radical analysis network (RAN) can effectively recognize unseen Chinese character classes and greatly reduce the amount of training data required by treating a Chinese character as a hierarchical composition of radicals rather than as a single character class. However, on more challenging problems, such as recognizing complicated characters, low-frequency character categories, and characters in natural scenes, RAN still has considerable room for improvement. In this paper, we explore options to further improve the structure generalization and robustness of RAN with the Transformer architecture, which has achieved state-of-the-art results on many sequence-to-sequence tasks. More specifically, we propose to replace the original attention module in RAN with a Transformer decoder; the resulting model is named the transformer-based radical analysis network (RTN). Experimental results show that the proposed approach significantly outperforms RAN on both a printed Chinese character database and a natural scene Chinese character database. Further analysis shows that RTN generalizes better to complex samples and low-frequency characters, and is more robust in recognizing Chinese characters with different attributes.
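To make the idea of a character as a hierarchical composition of radicals concrete, the sketch below parses a prefix-ordered radical sequence into a tree. It is only an illustration: it assumes Unicode ideographic description characters (such as ⿰ for left-right) as the structure symbols, which may differ from the notation RAN and RTN actually use, and the function name `parse_radical_sequence` is hypothetical.

```python
# Illustrative sketch only. Structure symbols are assumed to be Unicode
# ideographic description characters; the paper's own notation may differ.
BINARY = set("⿰⿱⿴⿵⿶⿷⿸⿹⿺⿻")  # structures with two components
TERNARY = set("⿲⿳")              # structures with three components


def parse_radical_sequence(tokens):
    """Parse a prefix-ordered radical sequence into a nested tuple tree."""

    def walk(i):
        tok = tokens[i]
        arity = 2 if tok in BINARY else 3 if tok in TERNARY else 0
        if arity == 0:
            # A plain radical is a leaf of the tree.
            return tok, i + 1
        children = []
        i += 1
        for _ in range(arity):
            child, i = walk(i)
            children.append(child)
        return (tok, *children), i

    tree, end = walk(0)
    if end != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return tree


# 湘 decomposes as left-right(氵, left-right(木, 目)).
tree = parse_radical_sequence(list("⿰氵⿰木目"))
```

Because a decoder emits such a sequence symbol by symbol, a character class never seen in training can still be produced as long as its radicals and structures were seen in other characters.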

Stroke Based Posterior Attention for Online Handwritten Mathematical Expression Recognition

Changjie Wu, Qing Wang, Jianshu Zhang, Jun Du, Jiaming Wang, Jiajia Wu, Jin-Shui Hu

Auto-TLDR; Posterior Attention for Online Handwritten Mathematical Expression Recognition

Recently, many studies have proposed attention-based encoder-decoder models that convert a sequence of trajectory points into a LaTeX string for online handwritten mathematical expression recognition (OHMER), and the recognition performance of these models relies critically on the accuracy of the attention. In this paper, unlike previous methods, which mostly employ a soft attention model, we propose a posterior attention model, which modifies the attention probabilities after observing the output probabilities generated by the soft attention model. To further improve the posterior attention mechanism, we propose a stroke average pooling layer that aggregates the point-level features obtained from the encoder into stroke-level features. We argue that posterior attention is better implemented on stroke-level features than on point-level features, because the output probabilities generated from strokes are more reliable than those generated from points, and we support this claim through experimental analysis. Validated on the CROHME competition task, stroke-based posterior attention achieves expression recognition rates of 54.26% on CROHME 2014 and 51.75% on CROHME 2016. Attention visualization analysis further shows empirically that the posterior attention mechanism achieves better alignment accuracy than the soft attention mechanism.
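The two ideas in this abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each point is assumed to carry an integer stroke index from 0 to S-1 with every stroke non-empty, and the posterior update is assumed to take the Bayes-rule form (prior attention times the per-stroke probability of the emitted symbol, renormalized), which the abstract does not spell out.

```python
import numpy as np


def stroke_average_pool(point_feats, stroke_ids):
    """Average point-level features that belong to the same stroke.

    point_feats: (T, D) array of encoder outputs, one row per point.
    stroke_ids:  (T,) integer stroke index of each point, assumed to
                 cover 0..S-1 with no empty stroke.
    Returns an (S, D) array of stroke-level features.
    """
    point_feats = np.asarray(point_feats, dtype=float)
    stroke_ids = np.asarray(stroke_ids)
    n_strokes = int(stroke_ids.max()) + 1
    sums = np.zeros((n_strokes, point_feats.shape[1]))
    np.add.at(sums, stroke_ids, point_feats)  # accumulate rows per stroke
    counts = np.bincount(stroke_ids, minlength=n_strokes)[:, None]
    return sums / counts


def posterior_attention(prior, likelihood):
    """Reweight prior attention by how well each stroke explains the output.

    prior:      (S,) soft attention probabilities over strokes.
    likelihood: (S,) probability of the emitted symbol given each stroke
                (assumed form; see the lead-in).
    """
    joint = np.asarray(prior, dtype=float) * np.asarray(likelihood, dtype=float)
    return joint / joint.sum()


strokes = stroke_average_pool([[1, 2], [3, 4], [10, 10]], [0, 0, 1])
post = posterior_attention([0.5, 0.5], [0.9, 0.1])
```

In this toy case the prior attends equally to both strokes, and the posterior shifts most of the weight to the stroke that better explains the emitted symbol.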