Fingerprints, Forever Young?

Roman Kessler, Olaf Henniger, Christoph Busch

Auto-TLDR; Mated Similarity Scores for Fingerprint Recognition: A Hierarchical Linear Model

In the present study, we analyzed longitudinal fingerprint data of 20 data subjects, acquired over a time span of up to 12 years. Using hierarchical linear modeling, we aimed to delineate mated similarity scores as a function of fingerprint quality and of the time interval between reference and probe images. Our results did not reveal effects on mated similarity scores caused by an increasing time interval across subjects, but rather individual effects on mated similarity scores. The results are in line with the general assumption that the fingerprint as a biometric characteristic and the features extracted from it do not change over the adult life span. However, they contradict several related studies that reported noticeable template ageing effects. We discuss why different findings regarding ageing of references in fingerprint recognition systems were made.
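To make the modeling idea concrete, a generic two-level (random intercept and random slope) linear model for mated similarity scores could be written as below. This is only an illustrative form and an assumption, not the exact specification used in the paper: $s_{ij}$ is the score of comparison $j$ for subject $i$, $\Delta t_{ij}$ the time interval between reference and probe, and $q_{ij}$ a quality measure; all symbols are introduced here for illustration.

```latex
\begin{align}
  s_{ij} &= \beta_0 + \beta_1\,\Delta t_{ij} + \beta_2\,q_{ij}
            + u_{0i} + u_{1i}\,\Delta t_{ij} + \varepsilon_{ij}, \\
  (u_{0i}, u_{1i})^{\top} &\sim \mathcal{N}(\mathbf{0}, \Sigma_u), \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{align}
```

In such a form, a population-wide ageing effect would appear as a nonzero $\beta_1$, whereas subject-specific effects would appear in the variance of $u_{0i}$ and $u_{1i}$.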

Similar papers

How Unique Is a Face: An Investigative Study

Michal Balazia, S L Happy, Francois Bremond, Antitza Dantcheva

Auto-TLDR; Uniqueness of Face Recognition: Exploring the Impact of Factors such as image resolution, feature representation, database size, age and gender

Face recognition has been widely accepted as a means of identification in applications ranging from border control to security in the banking sector. Surprisingly, while widely accepted, we still lack the understanding of the uniqueness or distinctiveness of face as a biometric characteristic. In this work, we study the impact of factors such as image resolution, feature representation, database size, age and gender on uniqueness denoted by the Kullback-Leibler divergence between genuine and impostor distributions. Towards understanding the impact, we present experimental results on the datasets AT&T, LFW, IMDb-Face, as well as ND-TWINS, with the feature extraction algorithms VGGFace, VGG16, ResNet50, InceptionV3, MobileNet and DenseNet121, that reveal the quantitative impact of the named factors. While these are early results, our findings indicate the need for a better understanding of the concept of biometric uniqueness and its implication on face recognition.
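As a rough illustration of the uniqueness measure named above, the following sketch estimates the Kullback-Leibler divergence between genuine and impostor score distributions from histograms. The binning, the smoothing constant, the score range and the toy scores are all assumptions made for this example; the paper's own estimator may differ.

```python
import math

def kl_divergence(genuine, impostor, bins=10, lo=0.0, hi=1.0, eps=1e-9):
    """Histogram estimate of KL(genuine || impostor) for scores in [lo, hi]."""
    def histogram(scores):
        counts = [0] * bins
        for s in scores:
            # Clamp the index so that s == hi falls into the last bin.
            idx = min(int((s - lo) / (hi - lo) * bins), bins - 1)
            counts[max(idx, 0)] += 1
        total = len(scores)
        # Additive smoothing keeps empty bins from producing log(0).
        return [(c + eps) / (total + eps * bins) for c in counts]

    p = histogram(genuine)
    q = histogram(impostor)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy similarity scores (made up for illustration only).
genuine_scores = [0.91, 0.88, 0.95, 0.79, 0.85]
impostor_scores = [0.12, 0.30, 0.25, 0.41, 0.18]

well_separated = kl_divergence(genuine_scores, impostor_scores)
identical = kl_divergence(genuine_scores, genuine_scores)
```

A larger divergence means the genuine and impostor distributions overlap less; identical distributions give a divergence of zero.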

Level Three Synthetic Fingerprint Generation

Andre Wyzykowski, Mauricio Pamplona Segundo, Rubisley Lemes

Auto-TLDR; Synthesis of High-Resolution Fingerprints with Pore Detection Using CycleGAN

Today's legal restrictions that protect the privacy of biometric data are hampering fingerprint recognition research. For instance, all high-resolution fingerprint databases ceased to be publicly available. To address this problem, we present a novel hybrid approach to synthesize realistic, high-resolution fingerprints. First, we improved Anguli, a handcrafted fingerprint generator, to obtain dynamic ridge maps with sweat pores and scratches. Then, we trained a CycleGAN to transform these maps into realistic fingerprints. Unlike other CNN-based works, we can generate several images for the same identity. We used our approach to create a synthetic database with 7400 images in an attempt to propel further studies in this field without raising legal issues. We included sweat pore annotations in 740 images to encourage research developments in pore detection. In our experiments, we employed two fingerprint matching approaches to confirm that real and synthetic databases have similar performance. We conducted a human perception analysis in which sixty volunteers could hardly differentiate between real and synthesized fingerprints. Given that our results also compare favorably with the most advanced works in the literature, our experimentation suggests that our approach is the new state-of-the-art.

Rotation Detection in Finger Vein Biometrics Using CNNs

Bernhard Prommegger, Georg Wimmer, Andreas Uhl

Auto-TLDR; A CNN based rotation detector for finger vein recognition

Finger vein recognition deals with the identification of subjects based on their venous pattern within the fingers. The recognition accuracy of finger vein recognition systems suffers from different internal and external factors. One of the major problems is misplacement of the finger during acquisition. In particular, longitudinal finger rotation poses a severe problem for such recognition systems. The detection and correction of such rotations is a difficult task, as finger vein scanners typically acquire only a single image of the vein pattern. Therefore, important information such as the shape of the finger or the depth of the veins within the finger, which is needed for the rotation detection, is not available. This work presents a CNN-based rotation detector that is capable of estimating the rotational difference between vein images of the same finger without being provided with any additional information. The experiments show not only that the method delivers highly accurate results, but also that it generalizes, so that the trained CNN can be applied to data sets that were not included in its training. Correcting the rotation difference between images using the CNN's rotation prediction leads to EER improvements between 50% and 260% for a well-established vein-pattern based method (Maximum Curvature) on four public finger vein databases.

One-Shot Representational Learning for Joint Biometric and Device Authentication

Sudipta Banerjee, Arun Ross

Auto-TLDR; Joint Biometric and Device Recognition from a Single Biometric Image

In this work, we propose a method to simultaneously perform (i) biometric recognition (i.e., identify the individual), and (ii) device recognition (i.e., identify the device) from a single biometric image, say, a face image, using a one-shot schema. Such a joint recognition scheme can be useful in devices such as smartphones for enhancing security as well as privacy. We propose to automatically learn a joint representation that encapsulates both biometric-specific and sensor-specific features. We evaluate the proposed approach using iris, face and periocular images acquired using near-infrared iris sensors and smartphone cameras. Experiments conducted using 14,451 images from 13 sensors resulted in a rank-1 identification accuracy of up to 99.81% and a verification accuracy of up to 100% at a false match rate of 1%.

Can You Really Trust the Sensor's PRNU? How Image Content Might Impact the Finger Vein Sensor Identification Performance

Dominik Söllinger, Luca Debiasi, Andreas Uhl

Auto-TLDR; Finger vein imagery can cause the PRNU estimate to be biased by image content

We study the impact of highly correlated image content on the estimated sensor PRNU and its impact on the sensor identification performance. Based on eight publicly available finger vein datasets, we show formally and experimentally that the nature of finger vein imagery can cause the estimated PRNU to be biased by image content and lead to a fairly bad PRNU estimate. Such bias can cause a false increase in sensor identification performance depending on the dataset composition. Our results indicate that independent of the biometric modality, examining the quality of the estimated PRNU is essential before claiming the sensor identification performance to be good.

Are Spoofs from Latent Fingerprints a Real Threat for the Best State-Of-Art Liveness Detectors?

Roberto Casula, Giulia Orrù, Daniele Angioni, Xiaoyi Feng, Gian Luca Marcialis, Fabio Roli

Auto-TLDR; ScreenSpoof: Attacks using latent fingerprints against state-of-art fingerprint liveness detectors and verification systems

We investigate the threat level of realistic attacks using latent fingerprints against sensors equipped with state-of-the-art liveness detectors and against fingerprint verification systems that integrate such liveness algorithms. To the best of our knowledge, only one previous investigation has been carried out with spoofs from latent prints. In this paper, we focus on using snapshot pictures of latent fingerprints. These pictures provide molds that, after some digital processing, allow high-quality spoofs to be fabricated. Taking a snapshot picture is much simpler than developing fingerprints left on a surface with magnetic powders and lifting the trace with tape. Our aim is a preliminary evaluation of the extent to which attacks of this kind can be considered a real threat for state-of-the-art fingerprint liveness detectors and verification systems. To this end, we collected a novel data set of live and spoof images fabricated with snapshot pictures of latent fingerprints. This data set provides a set of attacks under the most favourable conditions. We refer to this method and the related data set as "ScreenSpoof". We then used it to test the performance of the best liveness detection algorithms, namely the three winners of the LivDet competition. The reported results indicate that the ScreenSpoof method is a threat of the same level, in terms of detection and verification errors, as attacks using spoofs fabricated with the full consent of the victim. We think that this is a notable result, never reported in previous work.

Finger Vein Recognition and Intra-Subject Similarity Evaluation of Finger Veins Using the CNN Triplet Loss

Georg Wimmer, Bernhard Prommegger, Andreas Uhl

Auto-TLDR; Finger vein recognition using CNNs and hard triplet online selection

Finger vein recognition deals with the identification of subjects based on their venous pattern within the fingers. There is a lot of prior work using hand-crafted features, but only little work using CNN-based recognition systems. This article proposes a new approach using CNNs that utilizes the triplet loss function together with hard triplet online selection for finger vein recognition. The CNNs are used for three different use cases: (1) the classical recognition use case, where every finger of a subject is considered as a separate class, (2) an evaluation of the similarity of left and right hand fingers from the same subject and (3) an evaluation of the similarity of different fingers of the same subject. The results show that the proposed approach achieves superior results compared to prior work on finger vein recognition using the triplet loss function. Furthermore, we show that different fingers of the same subject, especially the same fingers from the left and right hand, show enough similarities to perform recognition. The last statement contradicts the current understanding in the literature on finger vein biometrics, in which it is assumed that different fingers of the same subject are unique identities.
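The combination of triplet loss and online hard triplet selection can be sketched as follows. This is a minimal, framework-free illustration of the common "batch-hard" variant, which is an assumption here; the abstract does not say which selection rule, distance or margin the authors use, and the function name and toy embeddings are hypothetical.

```python
import math

def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Mean triplet loss using the hardest positive and negative per anchor."""
    losses = []
    for i, (anchor, anchor_label) in enumerate(zip(embeddings, labels)):
        # Distances to all other samples of the same class (positives).
        pos = [math.dist(anchor, e)
               for j, (e, l) in enumerate(zip(embeddings, labels))
               if l == anchor_label and j != i]
        # Distances to all samples of other classes (negatives).
        neg = [math.dist(anchor, e)
               for e, l in zip(embeddings, labels) if l != anchor_label]
        if not pos or not neg:
            continue  # No valid triplet for this anchor in the batch.
        # Hardest positive is the farthest, hardest negative the closest.
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0

# Toy 2-D embeddings; each label stands for one finger (one class).
separated = batch_hard_triplet_loss(
    [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (3.0, 1.0)], [0, 0, 1, 1])
interleaved = batch_hard_triplet_loss(
    [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (3.0, 0.0)], [0, 0, 1, 1])
```

Well-separated classes give zero loss, while interleaved classes are penalized, which is what drives the embedding to pull samples of the same finger together during training.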

Super-Resolution Guided Pore Detection for Fingerprint Recognition

Syeda Nyma Ferdous, Ali Dabouei, Jeremy Dawson, Nasser M. Nasarabadi

Auto-TLDR; Super-Resolution Generative Adversarial Network for Fingerprint Recognition Using Pore Features

The performance of fingerprint recognition algorithms relies substantially on fine features extracted from fingerprints. Apart from minutiae and ridge patterns, pore features have proven to be usable for fingerprint recognition. Although features from minutiae and ridge patterns are quite attainable from low-resolution images, using pore features is practical only if the fingerprint image is of high resolution, which necessitates a model that enhances the image quality of conventional 500 ppi legacy fingerprints while preserving the fine details. To recover pore information from low-resolution fingerprints, we adopt a joint learning-based approach that combines super-resolution and pore detection networks. Our modified single image Super-Resolution Generative Adversarial Network (SRGAN) framework helps to reliably reconstruct high-resolution fingerprint samples from low-resolution ones, assisting the pore detection network in identifying pores with high accuracy. The network jointly learns a distinctive feature representation from a real low-resolution fingerprint sample and successfully synthesizes a high-resolution sample from it. To add discriminative information and uniqueness for all the subjects, we have integrated features extracted from a deep fingerprint verifier with the SRGAN quality discriminator. We also add a ridge reconstruction loss, utilizing ridge patterns to make the best use of extracted features. Our proposed method addresses the recognition problem by improving the quality of fingerprint images. The high recognition accuracy of the synthesized samples, which is close to the accuracy achieved using the original high-resolution images, validates the effectiveness of our proposed model.

Attribute-Based Quality Assessment for Demographic Estimation in Face Videos

Fabiola Becerra-Riera, Annette Morales-González, Heydi Mendez-Vazquez, Jean-Luc Dugelay

Auto-TLDR; Facial Demographic Estimation in Video Scenarios Using Quality Assessment

Most existing works on facial demographic estimation focus on still image datasets, although the need to analyze video content in real applications is increasing. We propose to tackle gender, age and ethnicity estimation in the context of video scenarios. Our main contribution is to use an attribute-specific quality assessment procedure to select the best quality frames from a video sequence for each of the three demographic modalities. The best quality frames are classified with fine-tuned MobileNet models, and a final video prediction is obtained with a majority voting strategy among the selected frames. Our validation on three different datasets and our comparison with state-of-the-art models show the effectiveness of the proposed demographic classifiers and of the quality pipeline, which reduces both the number of frames to be classified and the processing time in practical applications, and improves the soft biometrics prediction accuracy.
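The frame selection and voting step described above could look roughly like the following sketch. The quality scores and per-frame labels are assumed to come from an upstream quality assessor and classifier; the function name `predict_video`, the value of `k` and the toy data are hypothetical.

```python
from collections import Counter

def predict_video(frames, k=3):
    """Majority vote over the k frames with the highest quality score.

    frames: list of (quality_score, predicted_label) pairs, one per frame.
    """
    # Keep only the k best-quality frames.
    best = sorted(frames, key=lambda f: f[0], reverse=True)[:k]
    votes = Counter(label for _, label in best)
    # On a tie, Counter.most_common keeps the label seen first,
    # i.e. the one coming from the higher-quality frame.
    return votes.most_common(1)[0][0]

# Toy per-frame outputs: low-quality frames are mostly misclassified.
frames = [(0.90, "female"), (0.20, "male"), (0.80, "female"),
          (0.10, "male"), (0.70, "male"), (0.15, "male")]

quality_aware = predict_video(frames, k=3)
all_frames = predict_video(frames, k=len(frames))
```

In this toy case, voting over all frames is dominated by the low-quality ones, whereas voting over the three best frames is not, and only three frames need to be classified.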

Face Image Quality Assessment for Model and Human Perception

Ken Chen, Yichao Wu, Zhenmao Li, Yudong Wu, Ding Liang

Auto-TLDR; A labour-saving method for FIQA training with contradictory data from multiple sources

Practical face image quality assessment (FIQA) models are trained under the supervision of labeled data, which requires more or less human labor. Human-labeled quality scores are consistent with perceptual intuition but laborious to obtain. On the other hand, models can be trained with data generated automatically by recognition models with artificially selected references. However, the recognition scores are sometimes inaccurate, which may give wrong quality scores during FIQA training. In this paper, we propose a labour-saving method for quality score generation. For the first time, we conduct systematic investigations to show that there exist severe contradictions between different types of target quality, namely the distribution gap (DG). To bridge the gap, we propose a novel framework for training FIQA models by combining the merits of data from different sources. In order to make the target scores from multiple sources compatible, we design a method called quality distribution alignment (QDA). Meanwhile, to correct wrong targets produced by recognition models, contradictory samples selection (CSS) is adopted to select samples from the human-labeled dataset adaptively. Extensive experiments and analysis on public benchmarks including MegaFace have demonstrated the superiority of our method in terms of effectiveness and efficiency.

3D Facial Matching by Spiral Convolutional Metric Learning and a Biometric Fusion-Net of Demographic Properties

Soha Sadat Mahdi, Nele Nauwelaers, Philip Joris, Giorgos Bouritsas, Imperial London, Sergiy Bokhnyak, Susan Walsh, Mark Shriver, Michael Bronstein, Peter Claes

Auto-TLDR; Multi-biometric Fusion for Biometric Verification using 3D Facial Measures

Face recognition is a widely accepted biometric verification tool, as the face contains a lot of information about the identity of a person. In this study, a 2-step neural-based pipeline is presented for matching 3D facial shape to multiple DNA-related properties (sex, age, BMI and genomic background). The first step consists of a triplet loss-based metric learner that compresses facial shape into a lower dimensional embedding while preserving information about the property of interest. Most studies in the field of metric learning have only focused on Euclidean data. In this work, geometric deep learning is employed to learn directly from 3D facial meshes. To this end, spiral convolutions are used along with a novel mesh-sampling scheme that retains uniformly sampled 3D points at different levels of resolution. The second step is a multi-biometric fusion by a fully connected neural network. The network takes an ensemble of embeddings and property labels as input and returns genuine and imposter scores. Since embeddings are accepted as an input, there is no need to train classifiers for the different properties and available data can be used more efficiently. Results obtained by a 10-fold cross-validation for biometric verification show that combining multiple properties leads to stronger biometric systems. Furthermore, the proposed neural-based pipeline outperforms a linear baseline, which consists of principal component analysis, followed by classification with linear support vector machines and a Naïve Bayes-based score-fuser.

InsideBias: Measuring Bias in Deep Networks and Application to Face Gender Biometrics

Ignacio Serna, Alejandro Peña Almansa, Aythami Morales, Julian Fierrez

Auto-TLDR; InsideBias: Detecting Bias in Deep Neural Networks from Face Images

This work explores the biases in learning processes based on deep neural network architectures. We analyze how bias affects deep learning processes through a toy example using the MNIST database and a case study in gender detection from face images. We employ two gender detection models based on popular deep neural networks. We present a comprehensive analysis of the effects of an unbalanced training dataset on the features learned by the models. We show how bias impacts the activations of gender detection models based on face images. We finally propose InsideBias, a novel method to detect biased models. InsideBias is based on how the models represent the information instead of how they perform, which is the normal practice in other existing methods for bias detection. Our strategy with InsideBias allows biased models to be detected with very few samples (only 15 images in our case study). Our experiments include 72K face images from 24K identities and 3 ethnic groups.

Seasonal Inhomogeneous Nonconsecutive Arrival Process Search and Evaluation

Kimberly Holmgren, Paul Gibby, Joseph Zipkin

Auto-TLDR; SINAPSE: Fitting a Sparse Time Series Model to Seasonal Data

Time series often exhibit seasonal patterns, and identification of these patterns is essential to understanding the data and predicting future behavior. Most methods train on large datasets and can fail to predict far past the training data. This limitation becomes more pronounced when data is sparse. This paper presents a method to fit a model to seasonal time series data that maintains predictive power when data is limited. This method, called SINAPSE, combines statistical model fitting with an information criterion to search for disjoint, and possibly nonconsecutive, regimes underlying the data, allowing for a sparse representation resistant to overfitting.

Video Analytics Gait Trend Measurement for Fall Prevention and Health Monitoring

Lawrence O'Gorman, Xinyi Liu, Md Imran Sarker, Mariofanna Milanova

Auto-TLDR; Towards Health Monitoring of Gait with Deep Learning

We design a video analytics system to measure gait over time and detect trend and outliers in the data. The purpose is for health monitoring, the thesis being that trend especially can lead to early detection of declining health and be used to prevent accidents such as falls in the elderly. We use the OpenPose deep learning tool for recognizing the back and neck angle features of walking people, and measure speed as well. Trend and outlier statistics are calculated upon time series of these features. A challenge in this work is lack of testing data of decaying gait. We first designed experiments to measure consistency of the system on a healthy population, then analytically altered this real data to simulate gait decay. Results on about 4000 gait samples of 50 people over 3 months showed good separation of healthy gait subjects from those with trend or outliers, and furthermore the trend measurement was able to detect subtle decay in gait not easily discerned by the human eye.

Exploring Seismocardiogram Biometrics with Wavelet Transform

Po-Ya Hsu, Po-Han Hsu, Hsin-Li Liu

Auto-TLDR; Seismocardiogram Biometric Matching Using Wavelet Transform and Deep Learning Models

The seismocardiogram (SCG) has become easily accessible in the past decade owing to advances in sensor technology. However, SCG biometrics have not been widely explored. In this paper, we propose combining the wavelet transform with deep learning models, machine learning classifiers, or a structural similarity metric to perform SCG biometric matching tasks. We validate the proposed methods on a publicly available dataset from the PhysioNet database. The dataset contains one-hour-long electrocardiogram, breathing, and SCG data of 20 subjects. We train the models on the first five minutes of SCG and conduct identification on the last five minutes. We evaluate the identification and authentication performance with the recognition rate and the equal error rate, respectively. Based on the results, we show that wavelet-transformed SCG biometrics can achieve state-of-the-art performance when combined with deep learning models, machine learning classifiers, or structural similarity.
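For readers unfamiliar with the equal error rate used for the authentication evaluation, the sketch below approximates it from lists of genuine (mated) and impostor (non-mated) similarity scores by sweeping a decision threshold. It is a simple illustration with made-up scores, not the evaluation code of the paper; practical tools usually interpolate between thresholds.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER by sweeping the threshold over all observed scores."""
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        # A comparison is accepted when its similarity score is >= t.
        fnmr = sum(s < t for s in genuine) / len(genuine)     # false non-matches
        fmr = sum(s >= t for s in impostor) / len(impostor)   # false matches
        gap = abs(fnmr - fmr)
        if gap < best_gap:
            # Report the midpoint of the two rates where they are closest.
            best_gap, eer = gap, (fnmr + fmr) / 2
    return eer

# Toy similarity scores (made up for illustration only).
overlapping = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
separable = equal_error_rate([0.9, 0.8], [0.1, 0.2])
```

With the overlapping toy scores, one of four genuine comparisons is rejected and one of four impostor comparisons is accepted at the threshold 0.6, giving an EER of 25%; perfectly separable scores give 0%.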

Detection of Makeup Presentation Attacks Based on Deep Face Representations

Christian Rathgeb, Pawel Drozdowski, Christoph Busch

Auto-TLDR; An Attack Detection Scheme for Face Recognition Using Makeup Presentation Attacks

Facial cosmetics can substantially alter the facial appearance, which can negatively affect the decisions of a face recognition system. In addition, it was recently shown that the application of makeup can be abused to launch so-called makeup presentation attacks. In such attacks, the attacker might apply heavy makeup in order to achieve the facial appearance of a target subject for the purpose of impersonation. In this work, we assess the vulnerability of a COTS face recognition system to makeup presentation attacks employing the publicly available Makeup Induced Face Spoofing (MIFS) database. It is shown that makeup presentation attacks might seriously impact the security of the face recognition system. Further, we propose an attack detection scheme which distinguishes makeup presentation attacks from genuine authentication attempts by analysing differences in deep face representations obtained from potential makeup presentation attacks and corresponding target face images. The proposed detection system employs a machine learning-based classifier, which is trained with synthetically generated makeup presentation attacks utilizing a generative adversarial network for facial makeup transfer in conjunction with image warping. Experimental evaluations conducted using the MIFS database reveal a detection equal error rate of 0.7% for the task of separating genuine authentication attempts from makeup presentation attacks.

SoftmaxOut Transformation-Permutation Network for Facial Template Protection

Hakyoung Lee, Cheng Yaw Low, Andrew Teoh

Auto-TLDR; SoftmaxOut Transformation-Permutation Network for Cancellable Biometrics

In this paper, we propose a data-driven cancellable biometrics scheme, referred to as SoftmaxOut Transformation-Permutation Network (SOTPN). The SOTPN is a neural version of the Random Permutation Maxout (RPM) transform, which was introduced for facial template protection. We present a specialized SoftmaxOut layer integrated with the permutable MaxOut units and the parameterized softmax function to approximate the non-differentiable permutation and the winner-takes-all operations in the RPM transform. On top of that, a novel pairwise ArcFace loss and a code balancing loss are also formulated to ensure that the SOTPN-transformed facial template is cancellable, discriminative, high entropy and free from quantization errors when coupled with the SoftmaxOut layer. The proposed SOTPN is evaluated on three face datasets, namely LFW, YouTube Face and Facescrub, and our experimental results show that the SOTPN outperforms the RPM transform significantly.

Relative Feature Importance

Gunnar König, Christoph Molnar, Bernd Bischl, Moritz Grosse-Wentrup

Auto-TLDR; Relative Feature Importance for Interpretable Machine Learning

Interpretable Machine Learning (IML) methods are used to gain insight into the relevance of a feature of interest for the performance of a model. Commonly used IML methods differ in whether they consider features of interest in isolation, e.g., Permutation Feature Importance (PFI), or in relation to all remaining feature variables, e.g., Conditional Feature Importance (CFI). As such, the perturbation mechanisms inherent to PFI and CFI represent extreme reference points. We introduce Relative Feature Importance (RFI), a generalization of PFI and CFI that allows for a more nuanced feature importance computation beyond the PFI versus CFI dichotomy. With RFI, the importance of a feature relative to any other subset of features can be assessed, including variables that were not available at training time. We derive general interpretation rules for RFI based on a detailed theoretical analysis of the implications of relative feature relevance, and demonstrate the method's usefulness on simulated examples.
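To illustrate one of the two reference points mentioned above, the following sketch computes Permutation Feature Importance (PFI) as the average increase in mean squared error after shuffling one feature column. It does not implement CFI or RFI, which additionally require sampling the feature conditional on other variables. The toy model, the data and the function name are assumptions made for this example.

```python
import random

def permutation_importance(model, X, y, feature, n_repeats=5, seed=0):
    """Average increase in MSE after marginally permuting one feature column."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((model(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    increases = []
    for _ in range(n_repeats):
        column = [row[feature] for row in X]
        rng.shuffle(column)  # Breaks the link between this feature and y.
        permuted = [row[:feature] + [v] + row[feature + 1:]
                    for row, v in zip(X, column)]
        increases.append(mse(permuted) - baseline)
    return sum(increases) / n_repeats

# Toy setup: the target depends on feature 0 only, and so does the model.
X = [[float(i), float(i % 3)] for i in range(8)]
y = [2.0 * row[0] for row in X]
model = lambda row: 2.0 * row[0]

importance_used = permutation_importance(model, X, y, feature=0)
importance_unused = permutation_importance(model, X, y, feature=1)
```

The feature the model relies on receives a positive importance, while the ignored feature receives exactly zero because permuting it cannot change any prediction.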

Lookalike Disambiguation: Improving Face Identification Performance at Top Ranks

Thomas Swearingen, Arun Ross

Auto-TLDR; Lookalike Face Identification Using a Disambiguator for Lookalike Images

A face identification system compares an unknown input probe image to a gallery of face images labeled with identities in order to determine the identity of the probe image. The result of identification is a ranked match list with the most similar gallery face image at the top (rank 1) and the least similar gallery face image at the bottom. In many systems, the top-ranked gallery images may look very similar to the probe image as well as to each other, which can sometimes result in the misidentification of the probe image. Such similar-looking faces pertaining to different identities are referred to as lookalike faces. We hypothesize that a matcher specifically trained to disambiguate lookalike face images and combined with a regular face matcher may improve overall identification performance. This work proposes re-ranking the initial ranked match list using a disambiguator designed especially for lookalike face pairs. This work also evaluates schemes to select the gallery images in the initial ranked match list that should be re-ranked. Experiments on the challenging TinyFace dataset show that the proposed approach improves the closed-set identification accuracy of a state-of-the-art face matcher.
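The re-ranking idea can be sketched as follows: only the top of the initial match list is reordered, using a second score from a disambiguator. Selecting a fixed top-k and ordering purely by the disambiguator score are simplifying assumptions; the paper evaluates its own selection schemes and combines the disambiguator with the regular matcher. All names and scores below are hypothetical.

```python
def rerank_top_k(ranked, disambiguator, k=3):
    """Re-order the first k entries of a ranked match list.

    ranked: list of (gallery_id, matcher_score), sorted best-first.
    disambiguator: callable returning a similarity score for a gallery_id.
    """
    head, tail = ranked[:k], ranked[k:]
    # Only the selected head is re-scored; the tail keeps its order.
    head = sorted(head, key=lambda item: disambiguator(item[0]), reverse=True)
    return head + tail

# Toy match list in which "A" and "B" are near-tied lookalikes.
initial = [("A", 0.91), ("B", 0.90), ("C", 0.60), ("D", 0.20)]
disambiguator_scores = {"A": 0.30, "B": 0.80, "C": 0.10, "D": 0.99}

reranked = rerank_top_k(initial, disambiguator_scores.get, k=2)
top_identity = reranked[0][0]
```

Here the two lookalikes swap places, while entries outside the selected range are left untouched even if the disambiguator would score them highly.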

Age Gap Reducer-GAN for Recognizing Age-Separated Faces

Daksha Yadav, Naman Kohli, Mayank Vatsa, Richa Singh, Afzel Noore

Auto-TLDR; Generative Adversarial Network for Age-separated Face Recognition

In this paper, we propose a novel algorithm for matching faces with temporal variations caused by age progression. The proposed generative adversarial network algorithm is a unified framework which combines facial age estimation and age-separated face verification. The key idea of this approach is to learn the age variations across time by conditioning the input image on the subject's gender and the target age group to which the face needs to be progressed. The loss function accounts for reducing the age gap between the original image and the generated face image as well as preserving the identity. Both visual fidelity and quantitative evaluations demonstrate the efficacy of the proposed architecture on different facial age databases for age-separated face recognition.

One Step Clustering Based on A-Contrario Framework for Detection of Alterations in Historical Violins

Alireza Rezaei, Sylvie Le Hégarat-Mascle, Emanuel Aldea, Piercarlo Dondi, Marco Malagodi

Auto-TLDR; A-Contrario Clustering for the Detection of Altered Violins using UVIFL Images

Preventive conservation is an important practice in Cultural Heritage. The constant monitoring of the state of conservation of an artwork helps us reduce the risk of damage and the number of interventions necessary. In this work, we propose a probabilistic approach for the detection of alterations on the surface of historical violins based on an a-contrario framework. Our method is a one-step NFA clustering solution which considers grey-level and spatial density information in one background model. The proposed method is robust to noise and avoids parameter tuning and any assumption about the quantity of the worn-out areas. We have used UV induced fluorescence (UVIFL) images as input in order to consider details not perceivable with visible light. Tests were conducted on image sequences included in the "Violins UVIFL imagery" dataset. Results illustrate the ability of the algorithm to distinguish the worn area from the surrounding regions. Comparisons with state-of-the-art clustering methods show improved overall precision and recall.

A Local Descriptor with Physiological Characteristic for Finger Vein Recognition

Liping Zhang, Weijun Li, Ning Xin

Auto-TLDR; A finger vein-specific local feature descriptor based on physiological characteristics of finger vein patterns

Local feature descriptors exhibit great superiority in finger vein recognition due to their stability and robustness against local changes in images. However, most of these methods use general-purpose descriptors that do not consider finger vein-specific features. In this work, we propose a finger vein-specific local feature descriptor based on physiological characteristics of finger vein patterns, i.e., the histogram of oriented physiological Gabor responses (HOPGR), for finger vein recognition. First, a prior of the directional characteristics of finger vein patterns is obtained in an unsupervised manner. Then, physiological Gabor filter banks are set up based on the prior information to extract the physiological responses and orientation. Finally, to make the feature robust against local changes in images, a histogram is generated as output by dividing the image into non-overlapping cells and overlapping blocks. Extensive experimental results on several databases clearly demonstrate that the proposed method outperforms most current state-of-the-art finger vein recognition methods.

Identifying Missing Children: Face Age-Progression Via Deep Feature Aging

Debayan Deb, Divyansh Aggarwal, Anil Jain

Auto-TLDR; Aging Face Features for Missing Children Identification

Given a face image of a recovered child at probe-age, we search a gallery of missing children with known identities and gallery-ages at which they were either lost or stolen in an attempt to unite the recovered child with his family. We propose a feature aging module that can age-progress deep face features output by a face matcher to improve the recognition accuracy of age-separated child face images. In addition, the feature aging module guides age-progression in the image space such that synthesized aged gallery faces can be utilized to further enhance cross-age face matching accuracy of any commodity face matcher. For time lapses larger than 10 years (the missing child is recovered after 10 or more years), the proposed age-progression module improves the closed-set identification accuracy of CosFace from 60.72% to 66.12% on a child celebrity dataset, namely ITWCC. The proposed method also outperforms state-of-the-art approaches with a rank-1 identification rate of 95.91%, compared to 94.91%, on a public aging dataset, FG-NET, and 99.58%, compared to 99.50%, on CACD-VS. These results suggest that aging face features enhances the ability to identify young children who are possible victims of child trafficking or abduction.

Cancelable Biometrics Vault: A Secure Key-Binding Biometric Cryptosystem Based on Chaffing and Winnowing

Osama Ouda, Karthik Nandakumar, Arun Ross

Auto-TLDR; Cancelable Biometrics Vault for Key-binding Biometric Cryptosystem Framework

Existing key-binding biometric cryptosystems, such as the Fuzzy Vault Scheme (FVS) and Fuzzy Commitment Scheme (FCS), employ Error Correcting Codes (ECC) to handle intra-user variations in biometric data. As a result, a trade-off exists between the key length and matching accuracy. Moreover, these systems are vulnerable to privacy leakage, i.e., it is trivial to recover the original biometric template given the secure sketch and its associated cryptographic key. In this work, we propose a novel key-binding biometric cryptosystem framework, referred to as Cancelable Biometrics Vault (CBV), to address the above two limitations. The CBV framework is inspired by the cryptographic principle of chaffing and winnowing. It utilizes the concept of cancelable biometrics (CB) to generate secure biometric templates, which in turn are used to encode bits in a cryptographic key. While the CBV framework is generic and does not rely on a specific biometric representation, it does assume the availability of a suitable (satisfying the requirements of accuracy preservation, non-invertibility, and non-linkability) CB scheme for the given representation. To demonstrate the usefulness of the proposed CBV framework, we implement this approach using an extended BioEncoding scheme, which is a CB scheme appropriate for bit strings such as iris-codes. Unlike the baseline BioEncoding scheme, the extended version proposed in this work fulfills all the three requirements of a CB construct. Experiments show that the decoding accuracy of the proposed CBV framework is comparable to the recognition accuracy of the underlying CB construct, namely, the extended BioEncoding scheme, regardless of the cryptographic key size.

Feasibility Study of Using MyoBand for Learning Electronic Keyboard

Sharmila Mani, Madhav Rao

Auto-TLDR; Autonomous Finger-Based Music Instrument Learning using Electromyography Using MyoBand and Machine Learning

Learning a musical instrument such as the piano or electronic keyboard takes about a decade on average. Currently, musical instrument learning requires continuous supervision from a tutor, and reaching expert level through self-learning is considered impossible. On the other hand, it is often unrealistic to stay connected with a music tutor for a long time, and many learners stop halfway. To address this issue, online distance learning platforms have been implemented for music learning, yet they do not support self-learning, remain tutor dependent, and do not scale. In addition, these platforms have no way to verify whether the user pressed a key with the intended finger, which is significant for learning finger-based musical instruments. To overcome this, an autonomous system is proposed that evaluates and guides the learning process by continuously tracking finger movements via a non-camera-based solution. A finger press triggers muscle movements, which are detected at the surface of the forearm in the form of surface electromyography (sEMG) signals. The paper proposes tracking finger presses on an electronic keyboard using the MyoBand [1] wearable device, which provides 8 channels of sEMG signals. A machine learning (ML) approach was considered with eleven time- and frequency-domain features of the sEMG signals to classify the musical note played on the instrument by the corresponding finger press. The feature set was standardized using a standard scaler, and the vector dimensions were reduced by Linear Discriminant Analysis (LDA). The reduced-dimension data was fed to a Random Forest (RF) classifier to obtain the best classification accuracy for our application. For training the RF model, several trials of 10-second sEMG signals were collected using the wearable MyoBand device. Experiments involved single finger presses to render a note on the musical instrument, and multiple finger presses to define a chord sequence on an electronic musical keyboard. Further analysis was performed to maximize the classification accuracy over the number of trials and to optimize the position of the electrodes for successful identification of the musical note played. The proposed method achieves a classification accuracy of 74.25% for 5 musical notes played on an electronic keyboard instrument with 4 MyoBand electrodes, and an accuracy of 95.83% with one electrode for distinguishing between four musical events, including two major chords and two musical notes.

On the Minimal Recognizable Image Patch

Mark Fonaryov, Michael Lindenbaum

Auto-TLDR; MIRC: A Deep Neural Network for Minimal Recognition on Partially Occluded Images

In contrast to human vision, common recognition algorithms often fail on partially occluded images. We propose characterizing, empirically, the algorithmic limits by finding a minimal recognizable patch (MRP) that is by itself sufficient to recognize the image. A specialized deep network allows us to find the most informative patches of a given size, and serves as an experimental tool. A human vision study recently characterized related (but different) minimally recognizable configurations (MIRCs) [1], for which we specify computational analogues (denoted cMIRCs). The drop in human decision accuracy associated with size reduction of these MIRCs is substantial and sharp. Interestingly, such sharp reductions were also found for the computational versions we specified.

Using Machine Learning to Refer Patients with Chronic Kidney Disease to Secondary Care

Lee Au-Yeung, Xianghua Xie, Timothy Marcus Scale, James Anthony Chess

Auto-TLDR; A Machine Learning Approach for Chronic Kidney Disease Prediction using Blood Test Data

There has been growing interest recently in using machine learning techniques as an aid in clinical medicine. Machine learning offers a range of classification algorithms which can be applied to medical data to aid in making clinical predictions. Recent studies have demonstrated the high predictive accuracy of various classification algorithms applied to clinical data. Several studies have already been conducted on diagnosing or predicting chronic kidney disease at various stages using different sets of variables. In this study we investigate the use of machine learning techniques with blood test data. Such a system could aid renal teams in making recommendations to primary care general practitioners to refer patients to secondary care, where patients may benefit from earlier specialist assessment and medical intervention. We achieve an overall accuracy of 88.48% using logistic regression, 87.12% using an artificial neural network (ANN) and 85.29% using a support vector machine (SVM). ANNs achieved the highest sensitivity at 89.74%, compared to 86.67% for logistic regression and 85.51% for SVM.

3D Dental Biometrics: Automatic Pose-Invariant Dental Arch Extraction and Matching

Zhong Xin, Zhiyuan Zhang

Auto-TLDR; Automatic Dental Arch Extraction and Matching for 3D Dental Identification using Laser-Scanned Plasters

A novel automatic pose-invariant dental arch extraction and matching framework is developed for 3D dental identification using laser-scanned dental plasters. In our previous work [1-5], 3D point-based algorithms were developed and showed a few advantages over existing 2D dental identification methods. This study continues that effort by developing arch-based algorithms to extract and match dental arch features in an automatic and pose-invariant way. To the best of our knowledge, this is the first attempt at automatic dental arch extraction and matching for 3D dental identification. A Radial Ray Algorithm (RRA) is proposed that projects the dental arch shape from 3D to 2D. This algorithm is fully automatic and fast. A preliminary identification result is obtained by matching 11 postmortem (PM) samples against 200 ante-mortem (AM) samples. 72.7% of the samples achieved top 5% accuracy, 90.9% achieved top 10% accuracy, and all 11 samples (100%) achieved top 15.5% accuracy out of the 200-rank list. In addition, the time for identifying a single subject from 200 subjects has been reduced from 45 minutes to 5 minutes by matching the extracted 2D dental arch. Although the extracted 2D arch feature is not as accurate and discriminative as the full 3D arch, it may serve as an important filter feature to improve the identification speed in future investigations.

3CS Algorithm for Efficient Gaussian Process Model Retrieval

Fabian Berns, Kjeld Schmidt, Ingolf Bracht, Christian Beecks

Auto-TLDR; Efficient retrieval of Gaussian Process Models for large-scale data using divide-&-conquer-based approach

Gaussian Process Models (GPMs) have been applied to various pattern recognition tasks due to their analytical tractability, their ability to quantify the uncertainty of their own results, and their ability to subsume other prominent regression techniques. Despite these promising prospects, their super-quadratic computation time complexity for model selection and evaluation impedes their broader application to more than a few thousand data points. Although there have been many proposals towards Gaussian Processes for large-scale data, those only offer a linearly scaling improvement to a cubically scaling problem. In particular, solutions like the Nyström approximation or sparse matrices take only fractions of the given data into account and consequently lead to inaccurate models. In this paper, we thus propose a divide-and-conquer-based approach that allows GPMs to be retrieved efficiently for large-scale data. The resulting model is composed of independent pattern representations for non-overlapping segments of the given data and consequently reduces computation time significantly. Our performance analysis indicates that our proposal is able to outperform state-of-the-art algorithms for GPM retrieval with respect to efficiency and accuracy.

Computational Data Analysis for First Quantization Estimation on JPEG Double Compressed Images

Sebastiano Battiato, Oliver Giudice, Francesco Guarnera, Giovanni Puglisi

Auto-TLDR; Exploiting Discrete Cosine Transform Coefficients for Multimedia Forensics

The work of Multimedia Forensics experts consists in providing answers about the integrity of a specific media content and where it comes from. Exploiting traces from JPEG double compressed images is often one of the main investigative paths used for these purposes. Thus it is fundamental to have tools and algorithms able to reliably estimate the first quantization matrix in order to proceed with camera model identification and related tasks. In this paper, a technique based on extensive simulation is proposed, with the aim of inferring the first quantization for a certain number of Discrete Cosine Transform (DCT) coefficients by exploiting local image statistics without using any a-priori knowledge. The method also provides a reliable confidence value for the estimation, which is of great importance for forensic purposes. Experimental results compared with the state of the art demonstrate the effectiveness of the proposed technique in terms of both precision and overall reliability.

DR2S: Deep Regression with Region Selection for Camera Quality Evaluation

Marcelin Tworski, Stéphane Lathuiliere, Salim Belkarfa, Attilio Fiandrotti, Marco Cagnazzo

Auto-TLDR; Texture Quality Estimation Using Deep Learning

In this work, we tackle the problem of estimating a camera's capability to preserve fine texture details under a given lighting condition. Importantly, our texture preservation measurement should coincide with human perception. Consequently, we formulate our problem as a regression task and introduce a deep convolutional network to estimate a texture quality score. At training time, we use ground-truth quality scores provided by expert human annotators in order to obtain a subjective quality measure. In addition, we propose a region selection method to identify the image regions that are better suited to measuring perceptual quality. Finally, our experimental evaluation shows that our learning-based approach outperforms existing methods and that our region selection algorithm consistently improves the quality estimation.

A Low-Complexity R-Peak Detection Algorithm with Adaptive Thresholding for Wearable Devices

Tiago Rodrigues, Hugo Plácido Da Silva, Ana Luisa Nobre Fred, Sirisack Samoutphonh

Auto-TLDR; Real-Time and Low-Complexity R-peak Detection for Single Lead ECG Signals

A reliable detection of the R-peaks in an electrocardiogram (ECG) time series is a fundamental step for further rhythmic and heart rate variability (HRV) analysis, biometric recognition techniques and additional ECG waveform-based analysis. In this paper, a novel real-time and low-complexity R-peak detection algorithm is presented for single-lead ECG signals. The detection algorithm is divided into two stages. In the first, pre-processing stage, the QRS complex is enhanced by taking the double derivative, squaring and moving-window integration. In the second stage, the R-peak is detected using a finite state machine approach. The detection threshold is dynamically adapted and follows an exponential decay after each detection, making it suitable for R-peak detection under fast heart rate and R-wave amplitude changes with no additional search back. The proposed algorithm was evaluated on a private single-lead ECG database acquired using a FieldWiz wearable device. The database comprises five recordings from four different subjects, recorded during dynamic conditions: running, trail running and gym sessions. The raw ECG signals were annotated for the R-peaks and used to benchmark common QRS detectors and the proposed method. The combined acquisition setup and presented approach resulted in an R-peak detection Sensitivity (Se) of 99.77% and a Positive Predictive Value (PPV) of 99.18%, comparable to state-of-the-art real-time QRS detectors. Due to its low computational complexity, this method can be implemented in embedded wearable systems, making it suited to cardiovascular tracking devices and R-peak detection in dynamic use cases.
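The two stages described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the window length, the decay constant `tau_s`, the tracking duration `track_s` and the rule for the initial threshold are all assumptions chosen for illustration.

```python
import math

def preprocess(ecg, window):
    """Enhance QRS complexes: double derivative, squaring, moving-window integration."""
    # The second difference approximates the double derivative.
    d2 = [ecg[i + 1] - 2 * ecg[i] + ecg[i - 1] for i in range(1, len(ecg) - 1)]
    squared = [v * v for v in d2]
    # Causal moving-window sum over the last `window` samples.
    out, running = [], 0.0
    for i, v in enumerate(squared):
        running += v
        if i >= window:
            running -= squared[i - window]
        out.append(running)
    return out

def detect_r_peaks(feature, fs, tau_s=0.4, track_s=0.15, floor=1e-6):
    """Two-state machine: SEARCH for a threshold crossing, then TRACK the local maximum.

    After each detection the threshold is reset to the peak amplitude and
    decays exponentially with time constant `tau_s` (an assumed value).
    """
    if not feature:
        return []
    peaks = []
    # Assumed start-up rule: half of the maximum seen in the first second.
    threshold = 0.5 * max(feature[: int(fs)])
    decay = math.exp(-1.0 / (tau_s * fs))  # per-sample decay factor
    track_len = max(1, int(track_s * fs))
    state, cand, start = "search", 0, 0
    for i, v in enumerate(feature):
        if state == "search":
            threshold = max(threshold * decay, floor)
            if v > threshold:
                state, cand, start = "track", i, i
        else:
            if v > feature[cand]:
                cand = i  # keep the highest sample seen while tracking
            if i - start >= track_len:
                peaks.append(cand)
                threshold = feature[cand]  # adapt to the new peak amplitude
                state = "search"
    return peaks
```

The returned indices refer to the pre-processed signal, which lags the raw ECG by a few samples because of the differencing and integration. The decaying threshold lets the detector follow amplitude changes without a search-back step.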

A Flatter Loss for Bias Mitigation in Cross-Dataset Facial Age Estimation

Ali Akbari, Muhammad Awais, Zhenhua Feng, Ammarah Farooq, Josef Kittler

Auto-TLDR; Cross-dataset Age Estimation for Neural Network Training

Existing studies in facial age estimation have mostly focused on intra-dataset protocols that assume training and test images are captured under similar conditions. However, this is rarely valid in practical applications, where training and test sets usually have different characteristics. In this paper, we advocate a cross-dataset protocol for age estimation benchmarking. In order to improve the cross-dataset age estimation performance, we mitigate the inherent bias caused by the learning algorithm. To this end, we propose a novel loss function that is more effective for neural network training. The relative smoothness of the proposed loss function is its advantage with regard to the optimisation process performed by stochastic gradient descent. Its lower gradient, compared with existing loss functions, facilitates the discovery of and convergence to a better optimum, and consequently better generalisation. The cross-dataset experimental results demonstrate the superiority of the proposed method over the state-of-the-art algorithms in terms of accuracy and generalisation capability.

Predicting Chemical Properties Using Self-Attention Multi-Task Learning Based on SMILES Representation

Sangrak Lim, Yong Oh Lee

Auto-TLDR; Self-attention based Transformer-Variant Model for Chemical Compound Properties Prediction

In the computational prediction of chemical compound properties, molecular descriptors and fingerprints encoded as low-dimensional vectors are used. The selection of proper molecular descriptors and fingerprints is both important and challenging, as the performance of such models is highly dependent on the descriptors. To overcome this challenge, natural language processing models that use the simplified molecular input line entry system (SMILES) as input were studied, and several transformer-variant models achieved superior results when compared with conventional methods. In this study, we explored the structural differences of the transformer-variant model and proposed a new self-attention based model. The representation learning performance of the self-attention module was evaluated in a multi-task learning environment using imbalanced chemical datasets. The experimental results showed that our model achieved competitive outcomes on several benchmark datasets. The source code of our experiment is available at https://github.com/arwhirang/sa-mtl and the dataset is available from the same URL.

Wireless Localisation in WiFi Using Novel Deep Architectures

Peizheng Li, Han Cui, Aftab Khan, Usman Raza, Robert Piechocki, Angela Doufexi, Tim Farnham

Auto-TLDR; Deep Neural Network for Indoor Localisation of WiFi Devices in Indoor Environments

This paper studies the indoor localisation of WiFi devices based on a commodity chipset and standard channel sounding. First, we present a novel shallow neural network (SNN) in which features are extracted from the channel state information (CSI) corresponding to WiFi subcarriers received on different antennas and used to train the model. The single-layer architecture of this localisation neural network makes it lightweight and easy to deploy on devices with stringent constraints on computational resources. We further investigate the use of deep learning models for localisation and design novel architectures for a convolutional neural network (CNN) and long short-term memory (LSTM). We extensively evaluate these localisation algorithms for continuous tracking in indoor environments. Experimental results show that even an SNN model, after careful handcrafted feature extraction, can achieve accurate localisation. Meanwhile, using a well-organised architecture, the neural network models can be trained directly with raw CSI data, and localisation features can be extracted automatically to achieve accurate position estimates. We also found that the performance of neural network-based methods is directly affected by the number of anchor access points (APs), regardless of their structure. With three APs, all neural network models proposed in this paper can obtain a localisation accuracy of around 0.5 metres. In addition, the proposed deep NN architecture reduces the data pre-processing time by 6.5 hours compared with a shallow NN using the data collected in our testbed. In the deployment phase, the inference time is also significantly reduced to 0.1 ms per sample. We also demonstrate the generalisation capability of the proposed method by evaluating models on target movement characteristics different from those on which they were trained.

XGBoost to Interpret the Opioid Patients’ State Based on Cognitive and Physiological Measures

Arash Shokouhmand, Omid Dehzangi, Jad Ramadan, Victor Finomore, Nasser M. Nasarabadi, Ali Rezai

Auto-TLDR; Predicting the Wellness of Opioid Addictions Using Multi-modal Sensor Data

Dealing with opioid addiction and its long-term consequences is of great importance, as addiction to opioids emerges gradually and becomes strongly established in a given patient's body. Based on recent research, quitting opioids requires clinicians to arrange a gradual plan for patients who struggle to overcome addiction. This, in turn, necessitates observing the patients' wellness periodically, which is conventionally done through clinical appointments. However, this approach runs the risk of relapse, as there is no monitoring between clinical sessions. Increasing the number of clinical appointments for opioid patients would help, but is not feasible due to the high financial costs and the patients' limited forbearance. With the advent of wearable sensors, however, continuous patient monitoring becomes possible. The data collected through the sensors is pervasively noisy, and using sensors with different sampling frequencies complicates the data processing. In this work, we handle this problem by using 12-hour resolution data from cognitive tests, along with heart rate (HR) and heart rate variability (HRV), sampled every 15 and 180 seconds, respectively. The proposed recipe enables us to interpret the multi-modal sensor data as a feature space in which we can predict the wellness of opioid patients by employing extreme gradient boosting (XGBoost), which yields 96.12% average prediction accuracy as the best achieved performance.

Joint Learning Multiple Curvature Descriptor for 3D Palmprint Recognition

Lunke Fei, Bob Zhang, Jie Wen, Chunwei Tian, Peng Liu, Shuping Zhao

Auto-TLDR; Joint Feature Learning for 3D palmprint recognition using curvature data vectors

3D palmprint-based biometric recognition has drawn growing research attention due to its several merits over its 2D counterpart, such as robust structural measurement of a palm surface and high anti-counterfeiting capability. However, most existing 3D palmprint descriptors are hand-crafted and usually extract stationary features from 3D palmprint images. In this paper, we propose a feature learning method to jointly learn a compact curvature feature descriptor for 3D palmprint recognition. We first form multiple curvature data vectors to completely sample the intrinsic curvature information of 3D palmprint images. Then, we jointly learn a feature projection function that projects curvature data vectors into binary feature codes, which have the maximum inter-class variance and minimum intra-class distance so that they are discriminative. Moreover, we learn the collaborative binary representation of the multiple curvature feature codes by minimizing the information loss between the final representation and the multiple curvature features, so that the proposed method is more compact in feature representation and efficient in matching. Experimental results on the baseline 3D palmprint database demonstrate the superiority of the proposed method in terms of recognition performance in comparison with state-of-the-art 3D palmprint descriptors.

GazeMAE: General Representations of Eye Movements Using a Micro-Macro Autoencoder

Louise Gillian C. Bautista, Prospero Naval

Auto-TLDR; Fast and Slow Eye Movement Representations for Sentiment-agnostic Eye Tracking

Eye movements are intricate and dynamic events that contain a wealth of information about the subject and the stimuli. We propose an abstract representation of eye movements that preserves the important nuances in gaze behavior while being stimuli-agnostic. We consider eye movements as raw position and velocity signals and train a deep temporal convolutional autoencoder to learn micro-scale and macro-scale representations corresponding to the fast and slow features of eye movements. These joint representations are evaluated by fitting a linear classifier on various tasks and outperform other works in biometrics and stimuli classification. Further experiments highlight the validity and generalizability of this method, bringing eye tracking research closer to real-world applications.

Total Whitening for Online Signature Verification Based on Deep Representation

Xiaomeng Wu, Akisato Kimura, Kunio Kashino, Seiichi Uchida

Auto-TLDR; Total Whitening for Online Signature Verification

In deep metric learning targeted at time series, the correlation between feature activations may be easily enlarged through highly nonlinear neural networks, leading to suboptimal embedding effectiveness. An effective solution to this problem is whitening. For example, in online signature verification, whitening can be derived for three individual Gaussian distributions, namely the distributions of local features at all temporal positions 1) for all signatures of all subjects, 2) for all signatures of each particular subject, and 3) for each particular signature of each particular subject. This study proposes a unified method called total whitening that integrates these individual Gaussians. Total whitening rectifies the layout of multiple individual Gaussians to resemble a standard normal distribution, improving the balance between intraclass invariance and interclass discriminative power. Experimental results demonstrate that total whitening achieves state-of-the-art accuracy when tested on online signature verification benchmarks.

Uncertainty Guided Recognition of Tiny Craters on the Moon

Thorsten Wilhelm, Christian Wöhler

Auto-TLDR; Accurately Detecting Tiny Craters in Remote Sensed Images Using Deep Neural Networks

Accurately detecting craters in remotely sensed images is an important task when analysing the properties of planetary bodies. Commonly, only large craters in the range of several kilometres are detected. In this work we provide the first example of automatically detecting tiny craters in the range of several metres with the help of a deep neural network, using only a small set of annotated craters. Additionally, we propose a novel way to group overlapping detections and replace the commonly used non-maximum suppression with a probabilistic treatment. As a result, we obtain valuable uncertainty estimates of the detections, and the aggregated detections are shown to be vastly superior.

Recovery of 2D and 3D Layout Information through an Advanced Image Stitching Algorithm Using Scanning Electron Microscope Images

Aayush Singla, Bernhard Lippmann, Helmut Graeb

Auto-TLDR; Image Stitching for True Geometrical Layout Recovery in Nanoscale Dimension

Image stitching describes the process of reconstructing a high-resolution image by combining multiple images. Using a scanning electron microscope as the image source, individual images show patterns at nm scale, whereas the combined image may cover an area of several mm². The recovery of the physical layout of modern semiconductor products manufactured in advanced technology nodes down to 22 nm requires a perfect stitching process with no deviation from the original design data, as any stitching error will result in failures during the reconstruction of the electrical design. In addition, the recovery of the complete design requires the acquisition of all individual layers of a semiconductor device, which represent a 3D structure with interconnections that define limits on the stitching error for each individual scanned image mosaic. An advanced stitching and alignment process is presented that enables true geometrical layout recovery in nanoscale dimensions and is also applied to and evaluated on other use cases from biological applications.

Location Prediction in Real Homes of Older Adults based on K-Means in Low-Resolution Depth Videos

Simon Simonsson, Flávia Dias Casagrande, Evi Zouganeli

Auto-TLDR; Semi-supervised Learning for Location Recognition and Prediction in Smart Homes using Depth Video Cameras

In this paper we propose a novel method for location recognition and prediction in smart homes based on semi-supervised learning. We use data collected from low-resolution depth video cameras installed in four apartments with older adults over 70 years of age, and collected during a period of one to seven weeks. The location of the person in the depth images is detected by a person detection algorithm adapted from YOLO (You Only Look Once). The locations extracted from the videos are then clustered using K-means clustering. Sequence prediction algorithms are used to predict the next cluster (location) based on the previous clusters (locations). The accuracy of predicting the next location is up to 91%, a significant improvement compared to the case where binary sensors are placed in the apartment based on human intuition. The paper presents an analysis on the effect of the memory length (i.e. the number of previous clusters used to predict the next one), and on the amount of recorded data required to converge.
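The clustering and prediction steps of this abstract can be sketched in plain Python. This is a hedged illustration only: the abstract does not say which sequence prediction algorithms were used, so a simple frequency table over the last `memory` clusters stands in for them. The function names, the naive centroid initialisation and the toy 2D points are assumptions.

```python
from collections import Counter, defaultdict

def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])))

def kmeans(points, k, iters=20):
    """Plain K-means; centroids start at the first k points (a naive, assumed choice)."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids

def fit_predictor(labels, memory=2):
    """Count which cluster follows each run of `memory` previous clusters."""
    table = defaultdict(Counter)
    for i in range(memory, len(labels)):
        table[tuple(labels[i - memory:i])][labels[i]] += 1
    return table

def predict_next(table, recent):
    """Most frequent successor of the recent clusters, or None if the context is unseen."""
    counts = table.get(tuple(recent))
    return counts.most_common(1)[0][0] if counts else None

# Toy usage: detected positions alternate between two areas of a room.
points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
centroids = kmeans(points, k=2)
labels = [nearest(p, centroids) for p in points]
table = fit_predictor(labels, memory=1)
```

Here `memory` plays the role of the memory length analysed in the paper: a larger value uses more previous locations as context but needs more recorded data before every context has been observed.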

Electroencephalography Signal Processing Based on Textural Features for Monitoring the Driver’s State by a Brain-Computer Interface

Giulia Orrù, Marco Micheletto, Fabio Terranova, Gian Luca Marcialis

Auto-TLDR; One-dimensional Local Binary Pattern Algorithm for Estimating Driver Vigilance in a Brain-Computer Interface System

In this study we investigate a textural processing method for electroencephalography (EEG) signals as an indicator to estimate the driver's vigilance in a hypothetical Brain-Computer Interface (BCI) system. The novelty of the proposed solution lies in employing the one-dimensional Local Binary Pattern (1D-LBP) algorithm for feature extraction from pre-processed EEG data. From the resulting feature vector, classification is performed according to three vigilance classes: awake, tired and drowsy. The claim is that the class transitions can be detected by describing the variations of the micro-patterns' occurrences along the EEG signal. The 1D-LBP is able to describe them by encoding mutual variations of temporally "close" signal samples as a short bit-code. Our analysis allows us to conclude that adopting the 1D-LBP has led to a significant performance improvement. Moreover, capturing the class transitions from the EEG signal is effective, although the overall performance is not yet good enough to develop a BCI for assessing the driver's vigilance in real environments.
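A minimal Python sketch of a common 1D-LBP formulation may make the bit-code idea concrete. The neighbourhood size, the bit ordering and the `>=` comparison are assumptions, and the paper's exact variant may differ.

```python
def lbp_1d(signal, half_width=4):
    """One code per sample: compare `half_width` neighbours on each side to the centre."""
    signal = list(signal)
    codes = []
    for c in range(half_width, len(signal) - half_width):
        neighbours = signal[c - half_width:c] + signal[c + 1:c + half_width + 1]
        code = 0
        for bit, value in enumerate(neighbours):
            if value >= signal[c]:  # neighbour at least as large as the centre -> bit set
                code |= 1 << bit
        codes.append(code)
    return codes

def lbp_histogram(codes, n_bits):
    """Normalised histogram of codes, usable as a feature vector for a classifier."""
    hist = [0] * (1 << n_bits)
    for code in codes:
        hist[code] += 1
    total = len(codes) or 1
    return [h / total for h in hist]

# Toy usage with one neighbour on each side (2-bit codes).
codes = lbp_1d([1, 2, 3, 2, 1], half_width=1)
features = lbp_histogram(codes, n_bits=2)
```

Each code summarises the local shape of the signal around one sample, and the histogram counts how often each micro-pattern occurs in a window. A classifier would then operate on that histogram rather than on the raw EEG samples.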

Quantified Facial Temporal-Expressiveness Dynamics for Affect Analysis

Md Taufeeq Uddin, Shaun Canavan

Auto-TLDR; quantified facial Temporal-expressiveness Dynamics for quantified affect analysis

The quantification of visual affect data (e.g. face images) is essential to build and monitor automated affect modeling systems efficiently. Considering this, this work proposes quantified facial Temporal-expressiveness Dynamics (TED) to quantify the expressiveness of human faces. The proposed algorithm leverages multimodal facial features by incorporating static and dynamic information to enable accurate measurements of facial expressiveness. We show that TED can be used for high-level tasks such as summarization of unstructured visual data, expectation from and interpretation of automated affect recognition models. To evaluate the positive impact of using TED, a case study was conducted on spontaneous pain using the UNBC-McMaster spontaneous shoulder pain dataset. Experimental results show the efficacy of using TED for quantified affect analysis.

Deep Transformation Models: Tackling Complex Regression Problems with Neural Network Based Transformation Models

Beate Sick, Torsten Hothorn, Oliver Dürr

Auto-TLDR; A Deep Transformation Model for Probabilistic Regression

We present a deep transformation model for probabilistic regression. Deep learning is known for outstandingly accurate predictions on complex data, but in regression tasks it is predominantly used to predict just a single number. This ignores the non-deterministic character of most tasks. Especially if crucial decisions are based on the predictions, as in medical applications, it is essential to quantify the prediction uncertainty. The presented deep learning transformation model estimates the whole conditional probability distribution, which is the most thorough way to capture uncertainty about the outcome. We combine ideas from a statistical transformation model (most likely transformation) with recent transformation models from deep learning (normalizing flows) to predict complex outcome distributions. The core of the method is a parameterized transformation function which can be trained with the usual maximum likelihood framework using gradient descent. The method can be combined with existing deep learning architectures. For small machine learning benchmark datasets, we report state-of-the-art performance on most datasets and in some cases even surpass it. Our method works for complex input data, which we demonstrate by employing a CNN architecture on image data.

A Cross Domain Multi-Modal Dataset for Robust Face Anti-Spoofing

Qiaobin Ji, Shugong Xu, Xudong Chen, Shan Cao, Shunqing Zhang

Auto-TLDR; Cross-domain multi-modal FAS dataset GREAT-FASD and several evaluation protocols for the academic community

Face Anti-spoofing (FAS) is a challenging problem due to the complex serving scenario and diverse face presentation attack patterns. Single-modal images, which are usually captured with RGB cameras, cannot cope with the former because of serious overfitting problems. The existing multi-modal FAS datasets rarely pay attention to cross-domain problems: training an FAS system on these data leads to inconsistencies and low generalization capabilities in deployment, since imaging principles (structured light, TOF, etc.) and pre-processing methods vary between devices. We explore the subtle fine-grained differences between multi-modal cameras and propose a cross-domain multi-modal FAS dataset, GREAT-FASD, along with several evaluation protocols for the academic community. Furthermore, we incorporate multiplicative attention and center loss to enhance the representative power of the CNN by seeking out complementary information, providing a powerful baseline. In addition, extensive experiments have been conducted on the proposed dataset to analyze the robustness in distinguishing spoof faces from bona fide faces. Experimental results show the effectiveness of the proposed method, which achieves results competitive with the state of the art. Finally, we visualize our feature distribution in hidden space and observe that the proposed method leads the network to generate a large margin for the face anti-spoofing task.

Approach for Document Detection by Contours and Contrasts

Daniil Tropin, Sergey Ilyuhin, Dmitry Nikolaev, Vladimir V. Arlazarov

Auto-TLDR; A contour-based method for arbitrary document detection on a mobile device

This paper considers the task of arbitrary document detection performed on a mobile device. The classical contour-based approach often mishandles cases with occlusion, complex background, or blur. The region-based approach, which relies on the contrast between object and background, does not have such limitations; however, its known implementations are highly resource-consuming. We propose a modification of a contour-based method in which the competing hypotheses of the contour location are ranked according to the contrast between the areas inside and outside the border. In the performed experiments, this modification leads to a 40% decrease in alternative-ordering errors and a 10% decrease in the overall number of detection errors. We improved on the state-of-the-art performance on the open MIDV-500 dataset and demonstrated results competitive with the state of the art on the SmartDoc dataset.
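
The ranking step can be illustrated with a heavily simplified sketch. It assumes axis-aligned rectangular hypotheses on a small grayscale grid and uses the absolute difference of mean intensities as the contrast measure; the paper works with document quadrilaterals and does not specify its contrast score in the abstract, so `region_contrast`, `rank_hypotheses`, and the rectangle encoding are hypothetical.

```python
def region_contrast(image, rect):
    """Absolute difference between mean intensity inside and outside `rect`.

    `rect` is (top, left, bottom, right) with bottom and right exclusive.
    Assumption: this simple mean-difference score stands in for the
    contrast measure, which the abstract does not define.
    """
    top, left, bottom, right = rect
    inside, outside = [], []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if top <= r < bottom and left <= c < right:
                inside.append(value)
            else:
                outside.append(value)
    if not inside or not outside:
        return 0.0
    return abs(sum(inside) / len(inside) - sum(outside) / len(outside))


def rank_hypotheses(image, rects):
    """Order candidate borders from highest to lowest inside/outside contrast."""
    return sorted(rects, key=lambda rect: region_contrast(image, rect), reverse=True)


# Toy 4x4 image: a bright 2x2 "document" on a dark background.
toy_image = [
    [0, 0, 0, 0],
    [0, 10, 10, 0],
    [0, 10, 10, 0],
    [0, 0, 0, 0],
]
candidates = [(0, 0, 2, 2), (1, 1, 3, 3)]
ranked = rank_hypotheses(toy_image, candidates)
```

In this toy case the hypothesis that matches the bright region is ranked first, while the misplaced one mixes bright and dark pixels on both sides and scores lower.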

Quality-Based Representation for Unconstrained Face Recognition

Nelson Méndez-Llanes, Katy Castillo-Rosado, Heydi Mendez-Vazquez, Massimo Tistarelli

Auto-TLDR; Activation map for face recognition in unconstrained environments

Significant advances have been achieved in face recognition in the last decade thanks to the development of deep learning methods. However, recognizing faces captured in uncontrolled environments is still a challenging problem for the scientific community. In these scenarios, the performance of most existing deep learning based methods drops abruptly due to the poor quality of the face images. In this work, we propose to use an activation map to represent the quality information in a face image. Different face regions are analyzed to determine their quality, and only those regions with good quality are then used to perform the recognition with a given deep face model. For the experimental evaluation, three challenging databases with different variations in appearance were selected to simulate unconstrained environments: the Labeled Faces in the Wild Database, the Celebrities in Frontal-Profile in the Wild Database, and the AR Database. Three deep face models were used to evaluate the proposal on these databases, and in all cases the proposed activation map improves the recognition rates obtained by the original models by between 0.3% and 31%. The results demonstrate experimentally that the proposal is able to select the face areas with higher discriminative power and sufficient identifying information, while ignoring those with spurious information.