ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Dynamically Mitigating Data Discrepancy with Balanced Focal Loss for Replay Attack Detection

Yongqiang Dou, Haocheng Yang, Maolin Yang, Yanyan Xu, Dengfeng Ke

Auto-TLDR; Anti-Spoofing with Balanced Focal Loss Function and Combination Features

With the advancement of high-quality playback devices, it has become urgent to design effective anti-spoofing algorithms for vulnerable automatic speaker verification systems. Current studies mainly treat anti-spoofing as a binary classification problem between bonafide and spoofed utterances, while the lack of indistinguishable samples makes it difficult to train a robust spoofing detector. In this paper, we argue that anti-spoofing modeling should pay more attention to indistinguishable samples than to easily classified ones, so that correct discrimination becomes the top priority. Therefore, to mitigate the data discrepancy between training and inference, we propose to use a balanced focal loss function as the training objective, which dynamically scales the loss based on the traits of each sample. In the experiments, we also select three kinds of features that contain both magnitude-based and phase-based information to form complementary and informative features. Experimental results on the ASVspoof2019 dataset demonstrate the superiority of the proposed methods through a comparison between our systems and top-performing ones. Systems trained with the balanced focal loss perform significantly better than those trained with the conventional cross-entropy loss. With complementary features, our fusion system with only three kinds of features outperforms other systems containing five or more complex single models by 22.5% for min-tDCF and 7% for EER, achieving a min-tDCF of 0.0124 and an EER of 0.55%. Furthermore, we present and discuss evaluation results on real replay data in addition to the simulated ASVspoof2019 data, indicating that research on anti-spoofing still has a long way to go.
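To make the idea of dynamically scaling the loss per sample concrete, here is a minimal sketch of the commonly used alpha-balanced focal loss for a single binary prediction. The abstract does not give the paper's exact formulation, so the function names, the default values of `alpha` and `gamma`, and the label convention (1 = bonafide, 0 = spoofed) are assumptions made for illustration only.

```python
import math

def cross_entropy(p_bonafide, label):
    """Standard binary cross-entropy for one sample (label 1 = bonafide, 0 = spoofed)."""
    eps = 1e-12
    p = min(max(p_bonafide, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if label == 1 else 1.0 - p        # probability assigned to the true class
    return -math.log(p_t)

def balanced_focal_loss(p_bonafide, label, alpha=0.25, gamma=2.0):
    """Alpha-balanced focal loss for one sample (assumed form, not taken from the paper).

    The factor (1 - p_t) ** gamma shrinks the loss of confidently correct samples,
    so hard, indistinguishable samples contribute relatively more to training.
    alpha weights the bonafide class; (1 - alpha) weights the spoofed class.
    """
    eps = 1e-12
    p = min(max(p_bonafide, eps), 1.0 - eps)
    p_t = p if label == 1 else 1.0 - p
    alpha_t = alpha if label == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

if __name__ == "__main__":
    # Compare an easy bonafide sample with a hard one under both losses.
    for p in (0.95, 0.55):
        print(p, cross_entropy(p, 1), balanced_focal_loss(p, 1))
```

With `gamma = 0` and `alpha = 0.5`, this reduces to half the cross-entropy, so the focusing effect comes entirely from the `(1 - p_t) ** gamma` term.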

Similar papers

ResMax: Detecting Voice Spoofing Attacks with Residual Network and Max Feature Map

Il-Youp Kwak, Sungsu Kwag, Junhee Lee, Jun Ho Huh, Choong-Hoon Lee, Youngbae Jeon, Jeonghwan Hwang, Ji Won Yoon

Auto-TLDR; ASVspoof 2019: A Lightweight Automatic Speaker Verification Spoofing and Countermeasures System

Face Anti-Spoofing Using Spatial Pyramid Pooling

Face Anti-Spoofing Based on Dynamic Color Texture Analysis Using Local Directional Number Pattern

A Cross Domain Multi-Modal Dataset for Robust Face Anti-Spoofing

MixNet for Generalized Face Presentation Attack Detection

ESResNet: Environmental Sound Classification Based on Visual Domain Models

Hybrid Network for End-To-End Text-Independent Speaker Identification

Detection of Calls from Smart Speaker Devices

Disentangled Representation Based Face Anti-Spoofing

Feature Engineering and Stacked Echo State Networks for Musical Onset Detection

Are Spoofs from Latent Fingerprints a Real Threat for the Best State-Of-Art Liveness Detectors?

Toward Text-Independent Cross-Lingual Speaker Recognition Using English-Mandarin-Taiwanese Dataset

Improving Gravitational Wave Detection with 2D Convolutional Neural Networks

Generalized Iris Presentation Attack Detection Algorithm under Cross-Database Settings

DenseRecognition of Spoken Languages

Multi-Label Contrastive Focal Loss for Pedestrian Attribute Recognition

Video Face Manipulation Detection through Ensemble of CNNs

Verifying the Causes of Adversarial Examples

Trainable Spectrally Initializable Matrix Transformations in Convolutional Neural Networks

A Weak Coupling of Semi-Supervised Learning with Generative Adversarial Networks for Malware Classification

Adversarially Training for Audio Classifiers

Radar Image Reconstruction from Raw ADC Data Using Parametric Variational Autoencoder with Domain Adaptation

EasiECG: A Novel Inter-Patient Arrhythmia Classification Method Using ECG Waves

Three-Dimensional Lip Motion Network for Text-Independent Speaker Recognition

Detection of Makeup Presentation Attacks Based on Deep Face Representations

3D Facial Matching by Spiral Convolutional Metric Learning and a Biometric Fusion-Net of Demographic Properties

Audio-Visual Speech Recognition Using a Two-Step Feature Fusion Strategy

Attack-Agnostic Adversarial Detection on Medical Data Using Explainable Machine Learning

Detecting Manipulated Facial Videos: A Time Series Solution

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

Building Computationally Efficient and Well-Generalizing Person Re-Identification Models with Metric Learning

A Close Look at Deep Learning with Small Data

Super-Resolution Guided Pore Detection for Fingerprint Recognition

Malware Detection by Exploiting Deep Learning over Binary Programs

An Experimental Evaluation of Recent Face Recognition Losses for Deepfake Detection

Task-based Focal Loss for Adversarially Robust Meta-Learning

The Application of Capsule Neural Network Based CNN for Speech Emotion Recognition

Energy Minimum Regularization in Continual Learning

Level Three Synthetic Fingerprint Generation

Evaluation of Anomaly Detection Algorithms for the Real-World Applications

Cut and Compare: End-To-End Offline Signature Verification Network

Which are the factors affecting the performance of audio surveillance systems?

Defense Mechanism against Adversarial Attacks Using Density-Based Representation of Images

MRP-Net: A Light Multiple Region Perception Neural Network for Multi-Label AU Detection

Color, Edge, and Pixel-Wise Explanation of Predictions Based on Interpretable Neural Network Model

Motion Complementary Network for Efficient Action Recognition

Planar 3D Transfer Learning for End to End Unimodal MRI Unbalanced Data Segmentation

Viability of Optical Coherence Tomography for Iris Presentation Attack Detection