ICPR2020 Paper Browser

Mélodie Boillet

Paper download is intended for registered attendees only, and is subjected to the IEEE Copyright Policy. Any other use is strongly forbidden.

Papers from this author

Multiple Document Datasets Pre-Training Improves Text Line Detection with Deep Neural Networks

Mélodie Boillet, Christopher Kermorvant, Thierry Paquet

Auto-TLDR; A fully convolutional network for document layout analysis

Abstract Slides Similar

In this paper, we introduce a fully convolutional network for the document layout analysis task. While state-of-the-art methods are using models pre-trained on natural scene images, our method relies on a U-shaped model trained from scratch for detecting objects from historical documents. We consider the line segmentation task and more generally the layout analysis problem as a pixel-wise classification task then our model outputs a pixel-labeling of the input images. We show that our method outperforms state-of-the-art methods on various datasets and also demonstrate that the pre-trained parts on natural scene images are not required to reach good results. In addition, we show that pre-training on multiple document datasets can improve the performances. We evaluate the models using various metrics to have a fair and complete comparison between the methods.