Wen-Hung Liao

Papers from this author

Defense Mechanism against Adversarial Attacks Using Density-Based Representation of Images

Yen-Ting Huang, Wen-Hung Liao, Chen-Wei Huang


Auto-TLDR; Adversarial Attacks Reduction Using Input Recharacterization


Adversarial examples are slightly modified inputs devised to cause erroneous inference in deep learning models. Protection against the intervention of adversarial examples is a fundamental issue that needs to be addressed before the wide adoption of deep-learning-based intelligent systems. In this research, we utilize a method known as input recharacterization to effectively eliminate the perturbations found in adversarial examples. By converting images from the intensity domain into a density-based representation using a halftoning operation, the performance of the classifier can be largely maintained. Under adversarial attacks generated using FGSM, I-FGSM, and PGD, the hybrid model still achieves top-5 accuracies of 80.97%, 78.77%, and 81.56%, respectively. Although the accuracy is slightly affected, the influence of adversarial examples is significantly diminished. The average improvement over existing input-transform defense mechanisms is approximately 10%.
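The intensity-to-density conversion at the core of this defense can be illustrated with Floyd-Steinberg error diffusion, a standard halftoning operation. This is a minimal sketch, assuming a grayscale image with values in [0, 1]; the abstract does not specify which halftoning variant the paper uses.

```python
import numpy as np

def floyd_steinberg_halftone(img):
    """Convert a grayscale intensity image (floats in [0, 1]) into a
    binary, density-based representation via error diffusion."""
    out = img.astype(np.float64).copy()
    h, w = out.shape
    for y in range(h):
        for x in range(w):
            old = out[y, x]
            new = 1.0 if old >= 0.5 else 0.0   # quantize to black/white
            out[y, x] = new
            err = old - new                     # diffuse the residual
            if x + 1 < w:
                out[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    out[y + 1, x - 1] += err * 3 / 16
                out[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    out[y + 1, x + 1] += err * 1 / 16
    return out

# A mid-gray patch should halftone to roughly half black, half white dots,
# preserving the average density while discarding per-pixel intensity detail.
patch = np.full((32, 32), 0.5)
binary = floyd_steinberg_halftone(patch)
```

Because the binary output encodes gray levels as local dot density rather than per-pixel intensity, small additive perturbations tend to be absorbed by the quantization, which is the intuition behind this defense.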

Investigation of DNN Model Robustness Using Heterogeneous Datasets

Wen-Hung Liao, Yen-Ting Huang


Auto-TLDR; Evaluating the Dependency of Deep Learning on Heterogeneous Data Set for Learning


The deep learning framework has been successfully applied to many challenging tasks in pattern recognition and computer vision thanks to its ability to automatically extract representative features from the training data. Such a data-driven approach, however, is often criticized for depending too heavily on the training set. In this research, we investigate the validity of the statement 'deep learning is only as good as its data' by evaluating the performance of deep learning models using heterogeneous data sets, in which distinct representations of the same source data are employed for training/testing. We examine three cases in this work: low-resolution images, severely compressed inputs, and halftone images. Our preliminary results indicate that such dependency indeed exists: classifier performance drops considerably when the model is tested with modified or transformed input. The best outcomes are obtained when the model is trained with hybrid input.
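Transformed renditions of the same source data, such as the low-resolution case examined here, can be produced with simple operations. A minimal sketch (block-average downsampling followed by nearest-neighbour upsampling; the paper's exact degradation pipeline is not specified in the abstract):

```python
import numpy as np

def to_low_resolution(img, factor=4):
    """Simulate a low-resolution rendition of a grayscale image:
    block-average downsample by `factor`, then nearest-neighbour
    upsample back to the original size. Assumes both image
    dimensions are divisible by `factor`."""
    h, w = img.shape
    small = img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)

# A smooth ramp loses fine detail but keeps its overall appearance,
# while a constant image passes through unchanged.
ramp = np.tile(np.linspace(0.0, 1.0, 16), (16, 1))
flat = np.full((16, 16), 0.7)
low_ramp = to_low_resolution(ramp)
low_flat = to_low_resolution(flat)
```

Training and test sets built from different such renditions of the same images give the heterogeneous pairs used to probe the model's data dependency.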

Toward Text-Independent Cross-Lingual Speaker Recognition Using English-Mandarin-Taiwanese Dataset

Yi-Chieh Wu, Wen-Hung Liao


Auto-TLDR; Cross-lingual Speech for Biometric Recognition


Over 40% of the world's population is bilingual. Existing speaker identification/verification systems, however, assume the same language for both the enrollment and recognition stages. In this work, we investigate the feasibility of employing multilingual speech for biometric applications. We establish a dataset containing audio recorded in English, Mandarin, and Taiwanese. Three acoustic features, namely i-vector, d-vector, and x-vector, have been evaluated for both speaker verification (SV) and speaker identification (SI) tasks. Preliminary experimental results indicate that x-vector achieves the best overall performance. Additionally, the model trained with hybrid data demonstrates the highest accuracy, albeit at the cost of additional data collection effort. In SI tasks, we obtained over 91% cross-lingual accuracy for all models using 3-second audio. In SV tasks, the EER among cross-lingual tests is at most 6.52%, observed on the model trained with the English corpus. These outcomes suggest the feasibility of adopting cross-lingual speech in building text-independent speaker recognition systems.
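The SV metric reported here, the equal error rate (EER), is the operating point where the false-acceptance and false-rejection rates coincide. A minimal sketch of estimating it from lists of genuine and impostor trial scores (illustrative code with assumed names, not the paper's evaluation pipeline):

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Estimate the EER by sweeping a decision threshold over all
    observed scores and returning the mean of FAR and FRR at the
    threshold where they are closest."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best_gap, best_eer = 2.0, 1.0
    for t in thresholds:
        far = np.mean(impostor >= t)   # impostors wrongly accepted
        frr = np.mean(genuine < t)     # genuine speakers wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer

# Well-separated score distributions yield an EER near zero;
# overlapping ones yield a value between 0 and 1.
genuine = np.array([0.9, 0.8, 0.95, 0.85])
impostor = np.array([0.1, 0.2, 0.05, 0.15])
eer_clean = equal_error_rate(genuine, impostor)
eer_mixed = equal_error_rate(np.array([0.6, 0.7, 0.4]),
                             np.array([0.5, 0.3, 0.65]))
```

In practice, evaluation toolkits interpolate the ROC/DET curve rather than sweeping raw scores, but the threshold sweep conveys the same idea.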