Wajdi Ghezaiel

Papers from this author

Hybrid Network for End-To-End Text-Independent Speaker Identification

Wajdi Ghezaiel, Luc Brun, Olivier Lezoray
Track 2: Biometrics, Human Analysis and Behavior Understanding
Wed 13 Jan 2021 at 14:00 in session PS T2.3

Responsive image

Auto-TLDR; Text-Independent Speaker Identification with Scattering Wavelet Network and Convolutional Neural Networks

Underline Similar papers

Deep learning has recently improved the performance of Speaker Identification (SI) systems. Promising results have been obtained with Convolutional Neural Networks (CNNs). This success are mostly driven by the advent of large datasets. However in the context of commercial applications, collection of large amount of training data is not always possible. In addition, robustness of a SI system is adversely effected by short utterances. SI with only a few and short utterances is a challenging problem. Therefore, in this paper, we propose a novel text-independent speaker identification system. The proposed system can identify speakers by learning from only few training short utterances examples. To achieve this, we combine CNN with Scattering Wavelet Network. We propose a two-stage feature extraction framework using a two-layer wavelet scattering network coupled with a CNN for SI system. The proposed architecture takes variable length speech segments. To evaluate the effectiveness of the proposed approach, Timit and Librispeech datasets are used in the experiments. These conducted experiments show that our hybrid architecture performs successfully for SI, even with a small number and short duration of training samples. In comparaison with related methods, the obtained results shows that an hybrid architecture achieve better performance.