ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

UDBNET: Unsupervised Document Binarization Network Via Adversarial Game

Amandeep Kumar, Shuvozit Ghose, Pinaki Nath Chowdhury, Partha Pratim Roy, Umapada Pal

Auto-TLDR; Three-player Min-max Adversarial Game for Unsupervised Document Binarization

Degraded document image binarization is one of the most challenging tasks in the domain of document image analysis. In this paper, we present a novel approach to document image binarization by introducing a three-player min-max adversarial game. We train the network in an unsupervised setup, assuming that no paired training data is available. In our approach, an Adversarial Texture Augmentation Network (ATANet) first superimposes the texture of a degraded reference image over a clean image. The clean image and its generated degraded version then constitute the pseudo-paired data used to train the Unsupervised Document Binarization Network (UDBNet). This approach also enlarges the document binarization datasets, since it generates multiple images with the same content features but different textural features. The generated noisy images are fed into the UDBNet to recover the clean version. The joint discriminator, which is the third player of our three-player min-max adversarial game, tries to couple the ATANet and the UDBNet. The game stops when the distributions modelled by the ATANet and the UDBNet align to the same joint distribution over time. Thus, the joint discriminator forces the UDBNet to perform better on real degraded images. The experimental results indicate the superior performance of the proposed model over existing state-of-the-art algorithms on the widely used DIBCO datasets. The source code of the proposed system is publicly available at https://github.com/VIROBO-15/UDBNET.
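
The pseudo-pairing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the learned ATANet is replaced here by a fixed linear blend, and the names `blend_texture`, `make_pseudo_pairs`, and the `alpha` parameter are hypothetical, chosen only for illustration. Images are assumed to be nested lists of grayscale values in [0, 1].

```python
def blend_texture(clean, texture, alpha=0.4):
    """Superimpose a texture on a clean image with a fixed linear blend.

    Assumption: this hand-written blend only stands in for ATANet,
    which in the paper is a learned network.
    """
    return [
        [(1 - alpha) * c + alpha * t for c, t in zip(clean_row, texture_row)]
        for clean_row, texture_row in zip(clean, texture)
    ]


def make_pseudo_pairs(clean_images, textures, alpha=0.4):
    """Build (degraded, clean) pseudo pairs.

    Each clean image is combined with every texture, so one clean image
    yields several degraded versions sharing the same content.
    """
    return [
        (blend_texture(clean, texture, alpha), clean)
        for clean in clean_images
        for texture in textures
    ]


if __name__ == "__main__":
    clean = [[1.0, 0.0], [0.0, 1.0]]
    textures = [
        [[0.5, 0.5], [0.5, 0.5]],
        [[0.8, 0.2], [0.2, 0.8]],
    ]
    pairs = make_pseudo_pairs([clean], textures)
    print(len(pairs), "pseudo pairs built from one clean image")
```

In this sketch each pair keeps the clean image as the target, which is the role the pseudo-paired data plays when training a binarization network.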

Similar papers

Efficient Shadow Detection and Removal Using Synthetic Data with Domain Adaptation

Rui Guo, Babajide Ayinde, Hao Sun

Auto-TLDR; Shadow Detection and Removal with Domain Adaptation and Synthetic Image Database

In recent years, learning-based shadow detection and removal approaches have shown promise and, in most cases, yielded state-of-the-art results. The performance of these approaches, however, relies heavily on the construction of a training database of shadow images, shadow-free versions, and shadow maps as ground truth. This conventional data gathering method is time-consuming, expensive, or even practically intractable, especially for outdoor scenes with complicated shadow patterns, which limits the size of the data available for training. In this paper, we leverage a large, high-quality synthetic image database and domain adaptation to eliminate the bottlenecks resulting from insufficient training samples and domain bias. Specifically, our approach uses adversarial training to predict near-pixel-perfect shadow maps from synthetic shadow images for downstream shadow removal steps. At inference time, we capitalize on domain adaptation via image style transfer to map the style of a real-world scene to that of a synthetic scene for the purpose of detecting and subsequently removing shadows. Comprehensive experiments indicate that our approach outperforms state-of-the-art methods on select benchmark datasets.

Unsupervised deep learning for text line segmentation

Berat Kurar Barakat, Ahmad Droby, Reem Alaasam, Borak Madi, Irina Rabaev, Raed Shammes, Jihad El-Sana

Auto-TLDR; Unsupervised Deep Learning for Handwritten Text Line Segmentation without Annotation

We present an unsupervised deep learning method for text line segmentation that is inspired by the relative variance between text lines and the spaces among text lines. Handwritten text line segmentation is important for the efficiency of further processing. A common method is to train a deep learning network to embed the document image into an image of blob lines that trace the text lines. Previous methods learned such an embedding in a supervised manner, requiring the annotation of many document images. This paper presents an unsupervised embedding of document image patches without a need for annotations. The number of foreground pixels over the text lines differs from the number of foreground pixels over the spaces among text lines. Generating similar and different pairs relying on this principle inevitably leads to outliers. However, as the results show, the outliers do not harm the convergence, and the network learns to discriminate the text lines from the spaces between text lines. Remarkably, on a challenging Arabic handwritten text line segmentation dataset, VML-AHTE, we achieved superior performance over the supervised methods. Additionally, the proposed method was evaluated on the ICDAR 2017 and ICFHR 2010 handwritten text line segmentation datasets.
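
The pairing principle mentioned in this abstract, that text-line patches and inter-line spaces differ in foreground pixel count, can be sketched as follows. This is an illustrative assumption about how such pairs might be labelled, not the authors' procedure: the function names and the `rel_tol` threshold are hypothetical, and patches are assumed to be binary nested lists with 1 marking foreground.

```python
def foreground_count(patch):
    """Count foreground pixels in a binary patch (1 = foreground)."""
    return sum(sum(row) for row in patch)


def label_pair(patch_a, patch_b, rel_tol=0.5):
    """Label two patches 'similar' or 'different' by foreground pixel count.

    Assumption: a relative difference above rel_tol marks the pair as
    'different'. The threshold value is illustrative only.
    """
    count_a = foreground_count(patch_a)
    count_b = foreground_count(patch_b)
    largest = max(count_a, count_b)
    if largest == 0:
        return "similar"  # two empty patches
    relative_gap = abs(count_a - count_b) / largest
    return "similar" if relative_gap <= rel_tol else "different"


if __name__ == "__main__":
    text_patch = [[1, 1, 0], [1, 0, 1]]
    other_text_patch = [[1, 0, 1], [1, 1, 1]]
    space_patch = [[0, 0, 0], [0, 1, 0]]
    print(label_pair(text_patch, other_text_patch))
    print(label_pair(text_patch, space_patch))
```

A heuristic like this will mislabel some pairs, which corresponds to the outliers the abstract says the network tolerates.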

Ancient Document Layout Analysis: Autoencoders Meet Sparse Coding

Homa Davoudi, Marco Fiorucci, Arianna Traviglia

Auto-TLDR; Unsupervised Representation Learning for Document Layout Analysis

Boundary Guided Image Translation for Pose Estimation from Ultra-Low Resolution Thermal Sensor

Cycle-Consistent Adversarial Networks and Fast Adaptive Bi-Dimensional Empirical Mode Decomposition for Style Transfer

Few-Shot Font Generation with Deep Metric Learning

Local Facial Attribute Transfer through Inpainting

SIDGAN: Single Image Dehazing without Paired Supervision

Galaxy Image Translation with Semi-Supervised Noise-Reconstructed Generative Adversarial Networks

Stylized-Colorization for Line Arts

A Gated and Bifurcated Stacked U-Net Module for Document Image Dewarping

Unsupervised Domain Adaptation with Multiple Domain Discriminators and Adaptive Self-Training

Unsupervised Face Manipulation Via Hallucination

Combining Deep and Ad-Hoc Solutions to Localize Text Lines in Ancient Arabic Document Images

Shape Consistent 2D Keypoint Estimation under Domain Shift

On-Device Text Image Super Resolution

Thermal Image Enhancement Using Generative Adversarial Network for Pedestrian Detection

The Role of Cycle Consistency for Generating Better Human Action Videos from a Single Frame

Multi-Domain Image-To-Image Translation with Adaptive Inference Graph

Improving Word Recognition Using Multiple Hypotheses and Deep Embeddings

High Resolution Face Age Editing

A GAN-Based Blind Inpainting Method for Masonry Wall Images

Identity-Preserved Face Beauty Transformation with Conditional Generative Adversarial Networks

Unsupervised Multi-Task Domain Adaptation

Semantic-Guided Inpainting Network for Complex Urban Scenes Manipulation

Text Recognition in Real Scenarios with a Few Labeled Samples

Cross-Domain Semantic Segmentation of Urban Scenes Via Multi-Level Feature Alignment

Learning Low-Shot Generative Networks for Cross-Domain Data

Enlarging Discriminative Power by Adding an Extra Class in Unsupervised Domain Adaptation

Robust Pedestrian Detection in Thermal Imagery Using Synthesized Images

Cascade Attention Guided Residue Learning GAN for Cross-Modal Translation

DEN: Disentangling and Exchanging Network for Depth Completion

Towards Artifacts-Free Image Defogging

Stratified Multi-Task Learning for Robust Spotting of Scene Texts

Pose Variation Adaptation for Person Re-Identification

Super-Resolution Guided Pore Detection for Fingerprint Recognition

GarmentGAN: Photo-Realistic Adversarial Fashion Transfer

Detail Fusion GAN: High-Quality Translation for Unpaired Images with GAN-Based Data Augmentation

DUET: Detection Utilizing Enhancement for Text in Scanned or Captured Documents

Multimodal Side-Tuning for Document Classification

Data Augmentation Via Mixed Class Interpolation Using Cycle-Consistent Generative Adversarial Networks Applied to Cross-Domain Imagery

Position-Aware and Symmetry Enhanced GAN for Radial Distortion Correction

Detail-Revealing Deep Low-Dose CT Reconstruction

Combining GANs and AutoEncoders for Efficient Anomaly Detection

Watch Your Strokes: Improving Handwritten Text Recognition with Deformable Convolutions

Adaptive Image Compression Using GAN Based Semantic-Perceptual Residual Compensation

Age Gap Reducer-GAN for Recognizing Age-Separated Faces

UCCTGAN: Unsupervised Clothing Color Transformation Generative Adversarial Network