Olivier Lezoray

Papers from this author

Graph Signal Active Contours

Olivier Lezoray
Track 5: Image and Signal Processing
Tue 12 Jan 2021 at 15:00 in session OS T5.2

Auto-TLDR; Adaptation of Active Contour Without Edges for Graph Signal Processing

With the advent of data living on the vertices of graphs, there is much interest in processing so-called graph signals for partitioning tasks. As active contours have had much impact in the image processing community, their formulation on graphs is important to the field of graph signal processing. This paper proposes an adaptation to graphs of a model that combines the Geodesic Active Contour and the Active Contour Without Edges models. In addition, specific graph-dependent terms are introduced in the formulation. The adapted model is solved using a level set formulation with a gradient descent that can be expressed as a morphological front evolution process. Experimental results on different kinds of graph signals show the benefit of the approach.
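
To make the region-based part of this idea concrete, here is a minimal Python sketch of a two-region, piecewise-constant partition of a graph signal, in the spirit of Active Contour Without Edges. It is not the paper's algorithm: the geodesic term, the graph-specific terms, and the level set and morphological evolution are all omitted, and the smoothing step (a weighted majority vote over each vertex and its neighbours) is a stand-in chosen for brevity. The function name, the dense adjacency matrix, and the toy path graph are assumptions made for illustration.

```python
import numpy as np

def graph_two_region_partition(adj, signal, init_inside, n_iter=20):
    """Split graph vertices into inside/outside by piecewise-constant fitting.

    adj: (n, n) symmetric non-negative weight matrix (assumed dense here).
    signal: (n,) scalar value per vertex.
    init_inside: (n,) boolean initial partition.
    """
    adj = np.asarray(adj, dtype=float)
    signal = np.asarray(signal, dtype=float)
    inside = np.asarray(init_inside, dtype=bool).copy()
    for _ in range(n_iter):
        # Stop if one region vanished: its mean would be undefined.
        if inside.all() or not inside.any():
            break
        c_in = signal[inside].mean()
        c_out = signal[~inside].mean()
        # Region competition: assign each vertex to the closer region mean.
        new_inside = (signal - c_in) ** 2 < (signal - c_out) ** 2
        # Smoothing stand-in: weighted majority vote of vertex + neighbours.
        votes = adj @ new_inside.astype(float) + new_inside
        total = adj.sum(axis=1) + 1.0
        new_inside = votes > 0.5 * total
        if np.array_equal(new_inside, inside):
            break
        inside = new_inside
    return inside

# Toy example: a path graph with 6 vertices and a step signal.
n = 6
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
signal = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
init = np.array([False, False, True, True, True, False])
print(graph_two_region_partition(adj, signal, init).tolist())
# prints [False, False, False, True, True, True]
```

In this toy case the initial partition straddles the step, and the iteration moves the boundary onto it after one update.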

Hybrid Network for End-To-End Text-Independent Speaker Identification

Wajdi Ghezaiel, Luc Brun, Olivier Lezoray
Track 2: Biometrics, Human Analysis and Behavior Understanding
Wed 13 Jan 2021 at 14:00 in session PS T2.3

Auto-TLDR; Text-Independent Speaker Identification with Scattering Wavelet Network and Convolutional Neural Networks

Deep learning has recently improved the performance of Speaker Identification (SI) systems. Promising results have been obtained with Convolutional Neural Networks (CNNs). This success is mostly driven by the advent of large datasets. However, in the context of commercial applications, collecting large amounts of training data is not always possible. In addition, the robustness of an SI system is adversely affected by short utterances. SI with only a few short utterances is a challenging problem. Therefore, in this paper, we propose a novel text-independent speaker identification system. The proposed system can identify speakers by learning from only a few short training utterances. To achieve this, we combine a CNN with a Scattering Wavelet Network. We propose a two-stage feature extraction framework using a two-layer wavelet scattering network coupled with a CNN for the SI system. The proposed architecture takes variable-length speech segments. To evaluate the effectiveness of the proposed approach, the TIMIT and LibriSpeech datasets are used in the experiments. These experiments show that our hybrid architecture performs successfully for SI, even with a small number of short training samples. In comparison with related methods, the obtained results show that a hybrid architecture achieves better performance.

Learning Recurrent High-Order Statistics for Skeleton-Based Hand Gesture Recognition

Xuan Son Nguyen, Luc Brun, Olivier Lezoray, Sébastien Bougleux
Track 2: Biometrics, Human Analysis and Behavior Understanding
Tue 12 Jan 2021 at 14:00 in session OS T2.1

Auto-TLDR; Exploiting High-Order Statistics in Recurrent Neural Networks for Hand Gesture Recognition

High-order statistics have been proven useful in the framework of Convolutional Neural Networks (CNN) for a variety of computer vision tasks. In this paper, we propose to exploit high-order statistics in the framework of Recurrent Neural Networks (RNN) for skeleton-based hand gesture recognition. Our method is based on the Statistical Recurrent Unit (SRU), an un-gated architecture that has been introduced as an alternative model to Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). The SRU captures sequential information by generating recurrent statistics that depend on a context of previously seen data and by computing moving averages at different scales. The integration of high-order statistics in the SRU significantly improves the performance of the original one, resulting in a model that is competitive with state-of-the-art methods on the Dynamic Hand Gesture (DHG) dataset and outperforms them on the First-Person Hand Action (FPHA) dataset.
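
The moving-average mechanism mentioned in this abstract can be illustrated with a minimal NumPy sketch of a plain, un-gated SRU forward pass. It covers only the original SRU idea (recurrent statistics summarised by exponential moving averages at several scales) and does not include the high-order statistics this paper adds, which the abstract does not specify. The function names, layer sizes, ReLU activations, random initialisation, and the scale values in `alphas` are all assumptions made for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def init_sru_params(rng, input_dim, r_dim, stat_dim, out_dim, n_scales, scale=0.1):
    """Random weights for the sketch (hypothetical sizes, not from the paper)."""
    mu_dim = n_scales * stat_dim
    return (
        scale * rng.normal(size=(r_dim, mu_dim)), np.zeros(r_dim),       # W_r, b_r
        scale * rng.normal(size=(stat_dim, r_dim)),                      # W_phi
        scale * rng.normal(size=(stat_dim, input_dim)), np.zeros(stat_dim),  # W_x, b_phi
        scale * rng.normal(size=(out_dim, mu_dim)), np.zeros(out_dim),   # W_o, b_o
    )

def sru_forward(x_seq, params, alphas):
    """Run an un-gated SRU over x_seq of shape (T, input_dim).

    Returns the outputs, shape (T, out_dim), and the final moving
    averages, shape (n_scales, stat_dim).
    """
    W_r, b_r, W_phi, W_x, b_phi, W_o, b_o = params
    alphas = np.asarray(alphas, dtype=float)
    mu = np.zeros((len(alphas), W_phi.shape[0]))  # one average per scale
    outputs = []
    for x_t in x_seq:
        # Context summary computed from the previous moving averages.
        r_t = relu(W_r @ mu.reshape(-1) + b_r)
        # Recurrent statistics: depend on the input and on that context.
        phi_t = relu(W_phi @ r_t + W_x @ x_t + b_phi)
        # Exponential moving averages, one per scale alpha.
        mu = alphas[:, None] * mu + (1.0 - alphas[:, None]) * phi_t[None, :]
        outputs.append(relu(W_o @ mu.reshape(-1) + b_o))
    return np.stack(outputs), mu

rng = np.random.default_rng(0)
params = init_sru_params(rng, input_dim=3, r_dim=4, stat_dim=5, out_dim=2, n_scales=3)
outputs, mu = sru_forward(rng.normal(size=(7, 3)), params, alphas=[0.0, 0.5, 0.9])
print(outputs.shape, mu.shape)
```

A small `alpha` makes the corresponding average track recent frames, while a value close to 1 retains a longer history; keeping several scales side by side is what lets an un-gated unit mix short- and long-range context.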