XGBoost to Interpret the Opioid Patients’ State Based on Cognitive and Physiological Measures

Arash Shokouhmand, Omid Dehzangi, Jad Ramadan, Victor Finomore, Nasser M. Nasarabadi, Ali Rezai

Auto-TLDR; Predicting the Wellness of Opioid Addictions Using Multi-modal Sensor Data

Dealing with opioid addiction and its long-term consequences is of great importance, as addiction to opioids emerges gradually and becomes strongly established in a patient's body. According to recent research, quitting opioids requires clinicians to arrange a gradual plan for patients who struggle to overcome addiction. This, in turn, necessitates observing the patients' wellness periodically, which is conventionally done through clinical appointments. However, this approach runs the risk of relapse, as patients are not monitored between clinical sessions. Increasing the number of clinical appointments for opioid patients would address this, but it is not feasible because of the high financial costs and the patients' limited forbearance. With the advent of wearable sensors, however, continuous patient monitoring becomes possible. Still, the data collected through the sensors is pervasively noisy, and using sensors with different sampling frequencies complicates the data processing. In this work, we handle this problem by using 12-hour resolution data from cognitive tests, along with heart rate (HR) and heart rate variability (HRV), sampled every 15 and 180 seconds, respectively. The proposed recipe enables us to interpret the multi-modal sensor data as a feature space, in which we can predict the wellness of opioid patients by employing extreme gradient boosting (XGBoost), which yields a best average prediction accuracy of 96.12%.
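One way to picture how streams sampled at different rates could be brought to a common 12-hour resolution is windowed averaging. The sketch below is illustrative only: the function names, the choice of the mean as the aggregate, and the toy data are assumptions, not the authors' pipeline. The resulting rows are the kind of feature table a classifier such as XGBoost could consume.

```python
from collections import defaultdict

WINDOW_SECONDS = 12 * 60 * 60  # 12-hour resolution, as stated in the abstract


def window_means(samples, window=WINDOW_SECONDS):
    """Average (timestamp_seconds, value) samples within fixed-length windows."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t, value in samples:
        idx = int(t // window)  # index of the window this sample falls into
        sums[idx] += value
        counts[idx] += 1
    return {idx: sums[idx] / counts[idx] for idx in sums}


def build_feature_rows(hr, hrv):
    """Join per-window HR and HRV means; keep only windows present in both."""
    hr_means = window_means(hr)
    hrv_means = window_means(hrv)
    shared = sorted(set(hr_means) & set(hrv_means))
    return [(idx, hr_means[idx], hrv_means[idx]) for idx in shared]


# Hypothetical toy data: HR every 15 s and HRV every 180 s over 24 hours.
hr = [(t, 70.0 if t < WINDOW_SECONDS else 80.0)
      for t in range(0, 2 * WINDOW_SECONDS, 15)]
hrv = [(t, 50.0 if t < WINDOW_SECONDS else 40.0)
       for t in range(0, 2 * WINDOW_SECONDS, 180)]
rows = build_feature_rows(hr, hrv)
```

Each row holds a window index followed by one aggregated value per modality, so sensors with different sampling frequencies end up aligned on the same time grid.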

Similar papers

Deep Learning Based Sepsis Intervention: The Modelling and Prediction of Severe Sepsis Onset

Gavin Tsang, Xianghua Xie

Auto-TLDR; Predicting Sepsis onset by up to six hours prior using a boosted cascading training methodology and adjustable margin hinge loss function

Sepsis presents a significant challenge to healthcare providers during critical care scenarios such as within an intensive care unit. The onset of severe septic shock results in significant increases in mortality rate, length of stay and readmission rates. Continual advancements in health informatics data allow machine learning applications to predict sepsis onset in a timely manner, enabling effective preventative intervention before severe septic shock. A novel deep learning application is proposed to provide effective prediction of sepsis onset up to six hours in advance, involving the use of novel concepts such as a boosted cascading training methodology and an adjustable margin hinge loss function. The proposed methodology provides statistically significant improvements over current machine learning based modelling applications built on the Physionet Computing in Cardiology 2019 challenge. Results show test F1 scores of 0.420, a significant improvement of 0.281 as compared to the next best challenger results.

Electroencephalography Signal Processing Based on Textural Features for Monitoring the Driver’s State by a Brain-Computer Interface

Giulia Orrù, Marco Micheletto, Fabio Terranova, Gian Luca Marcialis

Auto-TLDR; One-dimensional Local Binary Pattern Algorithm for Estimating Driver Vigilance in a Brain-Computer Interface System

In this study we investigate a textural processing method for electroencephalography (EEG) signals as an indicator to estimate the driver's vigilance in a hypothetical Brain-Computer Interface (BCI) system. The novelty of the proposed solution lies in employing the one-dimensional Local Binary Pattern (1D-LBP) algorithm for feature extraction from pre-processed EEG data. From the resulting feature vector, the classification is done according to three vigilance classes: awake, tired and drowsy. The claim is that the class transitions can be detected by describing the variations of the micro-patterns' occurrences along the EEG signal. The 1D-LBP is able to describe them by encoding mutual variations of temporally close signal samples as a short bit-code. Our analysis allows us to conclude that adopting the 1D-LBP leads to a significant performance improvement. Moreover, capturing the class transitions from the EEG signal is effective, although the overall performance is not yet good enough to develop a BCI for assessing the driver's vigilance in real environments.
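The bit-code idea can be sketched as follows. This is a generic 1D-LBP formulation: each sample is compared with its temporal neighbours and the comparison results are packed into an integer code, whose histogram serves as the feature vector. The radius, the `>=` comparison rule, and the bit ordering are assumptions and may differ from the variant used in the paper.

```python
def lbp_1d_codes(signal, radius=4):
    """Return one LBP code per sample that has `radius` neighbours on each side."""
    signal = list(signal)
    codes = []
    for i in range(radius, len(signal) - radius):
        center = signal[i]
        # Neighbours before and after the centre sample, in temporal order.
        neighbours = signal[i - radius:i] + signal[i + 1:i + radius + 1]
        code = 0
        for bit, value in enumerate(neighbours):
            if value >= center:  # set the bit when the neighbour is not below the centre
                code |= 1 << bit
        codes.append(code)
    return codes


def lbp_1d_histogram(signal, radius=4):
    """Count how often each of the 2**(2*radius) possible codes occurs."""
    hist = [0] * (2 ** (2 * radius))
    for code in lbp_1d_codes(signal, radius):
        hist[code] += 1
    return hist
```

For example, `lbp_1d_codes([1, 2, 3, 2, 1], radius=1)` compares each interior sample with its two direct neighbours, and the histogram over such codes summarises how often each micro-pattern occurs along the signal.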

Video Analytics Gait Trend Measurement for Fall Prevention and Health Monitoring

Lawrence O'Gorman, Xinyi Liu, Md Imran Sarker, Mariofanna Milanova

Auto-TLDR; Towards Health Monitoring of Gait with Deep Learning

We design a video analytics system to measure gait over time and detect trend and outliers in the data. The purpose is for health monitoring, the thesis being that trend especially can lead to early detection of declining health and be used to prevent accidents such as falls in the elderly. We use the OpenPose deep learning tool for recognizing the back and neck angle features of walking people, and measure speed as well. Trend and outlier statistics are calculated upon time series of these features. A challenge in this work is lack of testing data of decaying gait. We first designed experiments to measure consistency of the system on a healthy population, then analytically altered this real data to simulate gait decay. Results on about 4000 gait samples of 50 people over 3 months showed good separation of healthy gait subjects from those with trend or outliers, and furthermore the trend measurement was able to detect subtle decay in gait not easily discerned by the human eye.

Weight Estimation from an RGB-D Camera in Top-View Configuration

Marco Mameli, Marina Paolanti, Nicola Conci, Filippo Tessaro, Emanuele Frontoni, Primo Zingaretti

Auto-TLDR; Top-View Weight Estimation using Deep Neural Networks

The development of so-called soft-biometrics aims at providing information related to the physical and behavioural characteristics of a person. This paper focuses on bodyweight estimation based on the observation from a top-view RGB-D camera. In fact, the capability to estimate the weight of a person can be of help in many different applications, from health-related scenarios to business intelligence and retail analytics. To deal with this issue, a TVWE (Top-View Weight Estimation) framework is proposed with the aim of predicting the weight. The approach relies on the adoption of Deep Neural Networks (DNNs) that have been trained on depth data. Each network has also been modified in its top section to replace classification with prediction inference. The performance of five state-of-the-art DNNs has been compared, namely VGG16, ResNet, Inception, DenseNet and Efficient-Net. In addition, a convolutional auto-encoder has also been included for completeness. Considering the limited literature in this domain, the TVWE framework has been evaluated on a new publicly available dataset, the “VRAI Weight estimation Dataset”, which also collects, for each subject, labels related to weight, gender, and height. The experimental results have demonstrated that the proposed methods are suitable for this task, bringing different and significant insights for the application of the solution in different domains.

Quantified Facial Temporal-Expressiveness Dynamics for Affect Analysis

Md Taufeeq Uddin, Shaun Canavan

Auto-TLDR; quantified facial Temporal-expressiveness Dynamics for quantified affect analysis

The quantification of visual affect data (e.g. face images) is essential to build and monitor automated affect modeling systems efficiently. Considering this, this work proposes quantified facial Temporal-expressiveness Dynamics (TED) to quantify the expressiveness of human faces. The proposed algorithm leverages multimodal facial features by incorporating static and dynamic information to enable accurate measurements of facial expressiveness. We show that TED can be used for high-level tasks such as summarization of unstructured visual data, expectation from and interpretation of automated affect recognition models. To evaluate the positive impact of using TED, a case study was conducted on spontaneous pain using the UNBC-McMaster spontaneous shoulder pain dataset. Experimental results show the efficacy of using TED for quantified affect analysis.

Location Prediction in Real Homes of Older Adults based on K-Means in Low-Resolution Depth Videos

Simon Simonsson, Flávia Dias Casagrande, Evi Zouganeli

Auto-TLDR; Semi-supervised Learning for Location Recognition and Prediction in Smart Homes using Depth Video Cameras

In this paper we propose a novel method for location recognition and prediction in smart homes based on semi-supervised learning. We use data collected from low-resolution depth video cameras installed in four apartments with older adults over 70 years of age, and collected during a period of one to seven weeks. The location of the person in the depth images is detected by a person detection algorithm adapted from YOLO (You Only Look Once). The locations extracted from the videos are then clustered using K-means clustering. Sequence prediction algorithms are used to predict the next cluster (location) based on the previous clusters (locations). The accuracy of predicting the next location is up to 91%, a significant improvement compared to the case where binary sensors are placed in the apartment based on human intuition. The paper presents an analysis on the effect of the memory length (i.e. the number of previous clusters used to predict the next one), and on the amount of recorded data required to converge.
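The role of the memory length can be illustrated with a simple frequency-based predictor over cluster labels. The abstract does not name the sequence prediction algorithms used, so the Markov-style counting scheme below, along with its function names, is an assumption chosen only to make the idea concrete.

```python
from collections import Counter, defaultdict


def fit_next_location(sequence, memory=2):
    """Count which cluster follows each context of `memory` previous clusters."""
    table = defaultdict(Counter)
    for i in range(memory, len(sequence)):
        context = tuple(sequence[i - memory:i])
        table[context][sequence[i]] += 1
    return table


def predict_next(table, recent, memory=2):
    """Return the most frequent successor of the last `memory` clusters, or None."""
    context = tuple(recent[-memory:])
    if context not in table:
        return None  # context never observed during fitting
    return table[context].most_common(1)[0][0]
```

A longer memory makes contexts more specific but also rarer, which is one reason the amount of recorded data needed to converge depends on the memory length.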

A General End-To-End Method for Characterizing Neuropsychiatric Disorders Using Free-Viewing Visual Scanning Tasks

Hong Yue Sean Liu, Jonathan Chung, Moshe Eizenman

Auto-TLDR; A general, data-driven, end-to-end framework that extracts relevant features of attentional bias from visual scanning behaviour and uses these features

The growing availability of eye-gaze tracking technology has allowed for its employment in a wide variety of applications, one of which is the objective diagnosis and monitoring of neuropsychiatric disorders from features of attentional bias extracted from visual scanning patterns. Current techniques in this field are largely comprised of non-generalizable methodologies that rely on domain expertise and study-specific assumptions. In this paper, we present a general, data-driven, end-to-end framework that extracts relevant features of attentional bias from visual scanning behaviour and uses these features to classify between subject groups with standard machine learning techniques. During the free-viewing task, subjects view sets of slides with thematic images while their visual scanning patterns (sets of ordered fixations) are monitored by an eye-tracking system. We encode fixations into relative visual attention maps (RVAMs) to describe measurement errors, and two data-driven methods are proposed to segment regions of interests from RVAMs: 1) using group average RVAMs, and 2) using difference of group average RVAMs. Relative fixation times within regions of interest are calculated and used as input features for a vanilla multilayered perceptron to classify between patient groups. The methods were evaluated on data from an anorexia nervosa (AN) study with 37 subjects and a bipolar/major depressive disorder (BD-MDD) study with 73 subjects. Using leave-one-subject-out cross validation, our technique is able to achieve an area under the receiver operating curve (AUROC) score of 0.935 for the AN study and 0.888 for the BD-MDD study, the latter of which exceeds the performance of the state-of-the-art analysis model designed specifically for the BD-MDD study, which had an AUROC of 0.879. The results validate the proposed methods' efficacy as generalizable, standard baselines for analyzing visual scanning data.

A Low-Complexity R-Peak Detection Algorithm with Adaptive Thresholding for Wearable Devices

Tiago Rodrigues, Hugo Plácido Da Silva, Ana Luisa Nobre Fred, Sirisack Samoutphonh

Auto-TLDR; Real-Time and Low-Complexity R-peak Detection for Single Lead ECG Signals

Reliable detection of the R-peaks in an electrocardiogram (ECG) time series is a fundamental step for further rhythmic and heart rate variability (HRV) analysis, biometric recognition techniques and additional ECG waveform based analysis. In this paper, a novel real-time and low-complexity R-peak detection algorithm is presented for single lead ECG signals. The detection algorithm is divided into two stages. In the first, pre-processing stage, the QRS complex is enhanced by taking the double derivative, squaring and moving window integration. In the second, the detection of the R-peak is achieved based on a finite state machine approach. The detection threshold is dynamically adapted and follows an exponential decay after each detection, making it suitable for R-peak detection under fast heart rate and R-wave amplitude changes with no additional search back. The proposed algorithm was evaluated on a private single lead ECG database acquired using a FieldWiz wearable device. The database comprises five recordings from four different subjects, recorded during dynamic conditions: running, trail running and gym sessions. The raw ECG signals were annotated for R-peaks, and these annotations were used to benchmark common QRS detectors and the proposed method. The combined acquisition setup and presented approach resulted in an R-peak detection Sensitivity (Se) of 99.77% and a Positive Predictive Value (PPV) of 99.18%, comparable to state-of-the-art real-time QRS detectors. Due to its low computational complexity, this method can be implemented in embedded wearable systems, suited for cardiovascular tracking devices in dynamic use cases and R-peak detection.
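The two stages can be sketched roughly as below. This is a simplified loop, not the authors' finite state machine: the window length, refractory period, decay time constant, threshold fraction, and initial threshold are all hypothetical parameters chosen for illustration.

```python
import math


def preprocess(ecg, window=5):
    """Second difference, squaring, then moving-window integration."""
    second_diff = [ecg[i + 1] - 2 * ecg[i] + ecg[i - 1]
                   for i in range(1, len(ecg) - 1)]
    squared = [v * v for v in second_diff]
    integrated = []
    running = 0.0
    for i, v in enumerate(squared):
        running += v
        if i >= window:
            running -= squared[i - window]  # drop the sample leaving the window
        integrated.append(running / window)
    return integrated


def detect_peaks(feature, fs, init_threshold,
                 refractory_s=0.2, tau_s=0.5, fraction=0.5):
    """Detect peaks with a threshold that decays exponentially between detections."""
    refractory = max(1, int(refractory_s * fs))
    decay = math.exp(-1.0 / (tau_s * fs))  # per-sample decay factor
    threshold = init_threshold
    peaks = []
    i = 0
    while i < len(feature):
        if feature[i] > threshold:
            # Take the largest sample in the next refractory window as the peak.
            end = min(i + refractory, len(feature))
            peak = max(range(i, end), key=feature.__getitem__)
            peaks.append(peak)
            threshold = fraction * feature[peak]  # re-arm relative to this peak
            i = peak + refractory  # skip the refractory period
        else:
            threshold *= decay  # let the threshold relax toward smaller peaks
            i += 1
    return peaks
```

Because the threshold keeps decaying until the next crossing, a smaller beat following a large one can still be caught without searching back through earlier samples.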

Tensor Factorization of Brain Structural Graph for Unsupervised Classification in Multiple Sclerosis

Berardino Barile, Marzullo Aldo, Claudio Stamile, Françoise Durand-Dubief, Dominique Sappey-Marinier

Auto-TLDR; A Fully Automated Tensor-based Algorithm for Multiple Sclerosis Classification based on Structural Connectivity Graph of the White Matter Network

Analysis of longitudinal changes in brain diseases is essential for a better characterization of pathological processes and evaluation of the prognosis. This is particularly important in Multiple Sclerosis (MS) which is the first traumatic disease in young adults, with unknown etiology and characterized by complex inflammatory and degenerative processes leading to different clinical courses. In this work, we propose a fully automated tensor-based algorithm for the classification of MS clinical forms based on the structural connectivity graph of the white matter (WM) network. Using non-negative tensor factorization (NTF), we first focused on the detection of pathological patterns of the brain WM network affected by significant longitudinal variations. Second, we performed unsupervised classification of different MS phenotypes based on these longitudinal patterns, and finally, we used the latent factors obtained by the factorization algorithm to identify the most affected brain regions.

Appliance Identification Using a Histogram Post-Processing of 2D Local Binary Patterns for Smart Grid Applications

Yassine Himeur, Abdullah Alsalemi, Faycal Bensaali, Abbes Amira

Auto-TLDR; LBP-BEVM based Local Binary Patterns for Appliances Identification in the Smart Grid

Identifying domestic appliances in the smart grid leads to better power usage management and further helps in detecting appliance-level abnormalities. Efficient identification can be achieved only if a robust feature extraction scheme is developed with a high ability to discriminate between different appliances on the smart grid. Accordingly, we propose in this paper a novel method to extract electrical power signatures after transforming the power signal to 2D space, which has more encoding possibilities. An improved local binary pattern (LBP) descriptor is then proposed, which improves the discriminative ability of conventional LBP using a post-processing stage. A binarized eigenvalue map (BEVM) is extracted from the 2D power matrix and then used to post-process the generated LBP representation. Next, two histograms are constructed, namely up and down histograms, and are then concatenated to form the global histogram. A comprehensive performance evaluation is performed on two different datasets, namely GREEND and WITHED, in which power data were collected at 1 Hz and 44000 Hz sampling rates, respectively. The obtained results revealed the superiority of the proposed LBP-BEVM based system in terms of identification performance versus other 2D descriptors and existing identification frameworks.

Assessing the Severity of Health States Based on Social Media Posts

Shweta Yadav, Joy Prakash Sain, Amit Sheth, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya

Auto-TLDR; A Multiview Learning Framework for Assessment of Health State in Online Health Communities

The unprecedented growth of Internet users has resulted in an abundance of unstructured information on social media including health forums, where patients request health-related information or opinions from other users. Previous studies have shown that online peer support has limited effectiveness without expert intervention. Therefore, a system capable of assessing the severity of health state from the patients' social media posts can help health professionals (HP) in prioritizing the user’s post. In this study, we inspect the efficacy of different aspects of Natural Language Understanding (NLU) to identify the severity of the user’s health state in relation to two perspectives (tasks): (a) Medical Condition (i.e., Recover, Exist, Deteriorate, Other) and (b) Medication (i.e., Effective, Ineffective, Serious Adverse Effect, Other) in online health communities. We propose a multiview learning framework that models both the textual content and the contextual information to assess the severity of the user’s health state. Specifically, our model utilizes NLU views such as sentiment, emotions, personality, and use of figurative language to extract the contextual information. The diverse NLU views demonstrate their effectiveness on both tasks, as well as on individual diseases, in assessing a user’s health.

Using Machine Learning to Refer Patients with Chronic Kidney Disease to Secondary Care

Lee Au-Yeung, Xianghua Xie, Timothy Marcus Scale, James Anthony Chess

Auto-TLDR; A Machine Learning Approach for Chronic Kidney Disease Prediction using Blood Test Data

There has been growing interest recently in using machine learning techniques as an aid in clinical medicine. Machine learning offers a range of classification algorithms which can be applied to medical data to aid in making clinical predictions. Recent studies have demonstrated the high predictive accuracy of various classification algorithms applied to clinical data. Several studies have already been conducted in diagnosing or predicting chronic kidney disease at various stages using different sets of variables. In this study we investigate the use of machine learning techniques with blood test data. Such a system could aid renal teams in making recommendations to primary care general practitioners to refer patients to secondary care where patients may benefit from earlier specialist assessment and medical intervention. We are able to achieve an overall accuracy of 88.48% using logistic regression, 87.12% using ANN and 85.29% using SVM. ANNs performed with the highest sensitivity at 89.74% compared to 86.67% for logistic regression and 85.51% for SVM.

Epileptic Seizure Prediction: A Semi-Dilated Convolutional Neural Network Architecture

Ramy Hussein, Rabab K. Ward, Soojin Lee, Martin Mckeown

Auto-TLDR; Semi-Dilated Convolutional Network for Seizure Prediction using EEG Scalograms

Despite many recent advances in machine learning and time-series classification, accurate prediction of seizures remains elusive. In this work, we develop a convolutional network module that uses Electroencephalogram (EEG) scalograms to distinguish between pre-seizure and normal brain activities. Since the EEG scalogram takes a rectangular image format with many more temporal bins than spectral bins, the presented module uses "semi-dilated convolutions" to also create a proportional non-square receptive field. The proposed semi-dilated convolutions support exponential expansion of the receptive field over the long dimension (image width, i.e., time) while maintaining high resolution over the short dimension (image height, i.e., frequency). The proposed architecture comprises a set of co-operative semi-dilated convolutional blocks, each of which has a stack of parallel semi-dilated convolutional modules with different dilation rates. Results show that our proposed seizure prediction solution outperforms the state-of-the-art methods, achieving a seizure prediction sensitivity of 88.45% and 89.52% for the American Epilepsy Society and Melbourne University EEG datasets, respectively.
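The non-square receptive field can be checked with a small calculation. For stride-1 convolutions stacked in sequence, the receptive field along one axis is 1 plus the sum of (kernel size minus 1) times the dilation of each layer. The sketch assumes dilations that double per layer along the width only and a plain sequential stack; the paper's actual dilation rates and its parallel module layout may differ.

```python
def receptive_field(kernel, dilations):
    """Receptive field along one axis for stacked stride-1 convolutions."""
    return 1 + sum((kernel - 1) * d for d in dilations)


def semi_dilated_receptive_field(kernel, num_layers):
    """Return (height, width) receptive field when only the width is dilated."""
    width_dilations = [2 ** layer for layer in range(num_layers)]  # 1, 2, 4, ...
    height_dilations = [1] * num_layers  # no dilation along frequency
    return (receptive_field(kernel, height_dilations),
            receptive_field(kernel, width_dilations))
```

With a kernel size of 3 and four layers this gives a 9 by 31 receptive field: the width grows exponentially with depth while the height grows only linearly, matching the elongated shape of the scalogram.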

Can Reinforcement Learning Lead to Healthy Life?: Simulation Study Based on User Activity Logs

Masami Takahashi, Masahiro Kohjima, Takeshi Kurashima, Hiroyuki Toda

Auto-TLDR; Reinforcement Learning for Healthy Daily Life

The importance of developing an application based on intervention technology that leads to a healthier life is widely recognized. A challenging part of realizing the application is the need for planning, i.e., considering a user's health goal (e.g., sleep at 10:00 p.m. to get enough sleep), providing intervention at the appropriate timing to help the user achieve the goal. The reinforcement learning (RL) approach is well suited to this type of problem since it is a methodology for planning; RL finds the optimal strategy as that which maximizes future expected profit. The purpose of this study is to clarify the effects of intervention based on RL to support healthy daily life. Therefore, we (i) collect real daily activity data from participants, (ii) generate a user model that imitates the user's response to system interventions, (iii) examine valuable goals and design them as rewards in RL and (iv) obtain optimal intervention strategies by RL via simulations given a user model and goals. We evaluate a generated user model and verify by simulations whether our method could successfully achieve the goal. In addition, we analyze the cases that demonstrated higher probability of achieving the goal and report the features.

Algorithm Recommendation for Data Streams

Jáder Martins Camboim De Sá, Andre Luis Debiaso Rossi, Gustavo Enrique De Almeida Prado Alves Batista, Luís Paulo Faina Garcia

Auto-TLDR; Meta-Learning for Algorithm Selection in Time-Changing Data Streams

In recent decades, many companies have taken advantage of massive data generation at high frequencies, using knowledge discovery to identify valuable information. Machine learning techniques can be employed for knowledge discovery, since they are able to extract patterns from data and induce models to predict future events. However, dynamic and evolving environments generate streams of data that usually are non-stationary. Models induced in these scenarios may degrade over time due to seasonality or concept drift. Periodic retraining could help, but the fixed hypothesis space of the algorithm may no longer be appropriate. An alternative solution is to use meta-learning for periodic algorithm selection in time-changing environments, choosing the bias that best suits the current data. In this paper, we present an enhanced framework for data stream algorithm selection based on MetaStream. Our approach uses meta-learning and incremental learning to actively select the best algorithm for the current concept in a time-changing environment. Unlike previous works, it uses a set of cutting-edge meta-features and an incremental learning approach at the meta-level based on LightGBM. The results show that this new strategy recommends the best algorithm more accurately in time-changing data.

Automatic Classification of Human Granulosa Cells in Assisted Reproductive Technology Using Vibrational Spectroscopy Imaging

Marina Paolanti, Emanuele Frontoni, Giorgia Gioacchini, Giorgini Elisabetta, Notarstefano Valentina, Zacà Carlotta, Carnevali Oliana, Andrea Borini, Marco Mameli

Auto-TLDR; Predicting Oocyte Quality in Assisted Reproductive Technology Using Machine Learning Techniques

In the field of reproductive technology, the biochemical composition of female gametes has been successfully investigated with the use of vibrational spectroscopy. Currently, in Assisted Reproductive Technology (ART), there are no shared criteria for the choice of oocyte, and automatic classification methods for the best quality oocytes have not yet been applied. In this paper, considering this lack of criteria, we use Machine Learning (ML) techniques to predict oocyte quality for a successful pregnancy. To improve the chances of successful implantation and minimize any complications during the pregnancy, Fourier transform infrared microspectroscopy (FTIRM) analysis has been applied to granulosa cells (GCs) collected along with the oocytes during oocyte aspiration, as is routinely done in ART, and specific spectral biomarkers were selected by multivariate statistical analysis. A proprietary biological reference dataset (BRD) was successfully collected to predict the best oocyte for a successful pregnancy. Personal health information is stored, maintained and backed up using a cloud computing service. Using a user-friendly interface, the user can evaluate whether or not the selected oocyte will have a positive result. This interface includes a dashboard for retrospective analysis, reporting, real-time processing, and statistical analysis. The experimental results are promising and confirm the efficiency of the method in terms of classification metrics: precision, recall, and F1-score (F1) measures.

EEG-Based Cognitive State Assessment Using Deep Ensemble Model and Filter Bank Common Spatial Pattern

Debashis Das Chakladar, Shubhashis Dey, Partha Pratim Roy, Masakazu Iwamura

Auto-TLDR; A Deep Ensemble Model for Cognitive State Assessment using EEG-based Cognitive State Analysis

Electroencephalography (EEG) is the most widely used physiological measure for evaluating the cognitive state of a user efficiently. As EEG inherently suffers from poor spatial resolution, features extracted from each EEG channel may not be efficiently used for cognitive state assessment. In this paper, EEG-based cognitive state assessment is performed during a mental arithmetic experiment, which includes two cognitive states (task and rest) of a user. To capture the temporal as well as the spatial resolution of the EEG signal, we combined the Filter Bank Common Spatial Pattern (FBCSP) method and a Long Short-Term Memory (LSTM)-based deep ensemble model for classifying the cognitive state of a user. Subject-wise data distribution was performed because a large volume of data had to be processed in a low computing environment. In the FBCSP method, the input EEG is decomposed into multiple equal-sized frequency bands, and spatial features of each frequency band are extracted using the Common Spatial Pattern (CSP) algorithm. Next, a feature selection algorithm is applied to identify the most informative features for classification. The proposed deep ensemble model consists of multiple similarly structured LSTM networks that work in parallel. The output of the ensemble model (i.e., the cognitive state of a user) is computed using the weighted average of the individual model predictions. The proposed model achieves 87% classification accuracy, and it can also effectively estimate the cognitive state of a user in a low computing environment.
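The final combination step can be sketched as a weighted average over per-model class probabilities. How the weights are chosen is not stated in the abstract, so the weights and the function name below are hypothetical.

```python
def weighted_ensemble(probabilities, weights):
    """Combine per-model class probabilities by a weighted average.

    probabilities: one list of class probabilities per model.
    weights: one non-negative weight per model (hypothetical values).
    Returns the winning class index and the combined probabilities.
    """
    total = sum(weights)
    n_classes = len(probabilities[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=combined.__getitem__)
    return label, combined
```

With two classes (for instance rest and task), a model given a larger weight pulls the combined probability toward its own prediction, while equal weights reduce to a plain average.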

Deep Transfer Learning for Alzheimer’s Disease Detection

Nicole Cilia, Claudio De Stefano, Francesco Fontanella, Claudio Marrocco, Mario Molinara, Alessandra Scotto Di Freca

Auto-TLDR; Automatic Detection of Handwriting Alterations for Alzheimer's Disease Diagnosis using Dynamic Features

Early detection of Alzheimer’s Disease (AD) is essential in order to initiate therapies that can reduce the effects of the disease, improving both the quality of life and the life expectancy of patients. Among all the activities carried out in our daily life, handwriting seems to be one of the first to be affected by the onset of neurodegenerative diseases. For this reason, the analysis of handwriting and the study of its alterations have become of great interest in this research field in order to make a diagnosis as early as possible. In recent years, many studies have tried to use classification algorithms applied to handwriting to implement decision support systems for AD diagnosis. A key issue for the use of these techniques is the detection of effective features that allow the system to distinguish the natural handwriting alterations due to age from those caused by neurodegenerative disorders. In this context, many interesting results have been published in the literature in which the features have typically been selected by hand, generally considering the dynamics of the handwriting process in order to detect motor disorders closely related to AD. Features directly derived from handwriting generation models can also be very helpful for AD diagnosis. It should be remarked, however, that the above features do not consider changes in the shape of handwritten traces, which may occur as a consequence of neurodegenerative diseases, nor the correlation between shape alterations and changes in the dynamics of the handwriting process. Moving from these considerations, the aim of this study is to verify whether the combined use of both shape and dynamic features allows a decision support system to improve performance for AD diagnosis.
To this purpose, starting from a database of on-line handwriting samples, we generated for each of them a synthetic off-line colour image, where the colour of each elementary trait encodes, in the three RGB channels, the dynamic information associated with that trait. Finally, we exploited the capability of Deep Neural Networks (DNN) to automatically extract features from raw images. The experimental comparison of the results obtained by using standard features and features extracted according to the above procedure confirmed the effectiveness of our approach.

Fingerprints, Forever Young?

Roman Kessler, Olaf Henniger, Christoph Busch

Auto-TLDR; Mated Similarity Scores for Fingerprint Recognition: A Hierarchical Linear Model

In the present study we analyzed longitudinal fingerprint data of 20 data subjects, acquired over a time span of up to 12 years. Using hierarchical linear modeling, we aimed to delineate mated similarity scores as a function of fingerprint quality and of the time interval between reference and probe images. Our results did not reveal effects on mated similarity scores caused by an increasing time interval across subjects, but rather individual effects on mated similarity scores. The results are in line with the general assumption that the fingerprint as a biometric characteristic and the features extracted from it do not change over the adult life span. However, they contradict several related studies that reported noticeable template ageing effects. We discuss why different findings regarding ageing of references in fingerprint recognition systems were made.

Prediction of Obstructive Coronary Artery Disease from Myocardial Perfusion Scintigraphy using Deep Neural Networks

Ida Arvidsson, Niels Christian Overgaard, Miguel Ochoa Figueroa, Jeronimo Rose, Anette Davidsson, Kalle Åström, Anders Heyden

Auto-TLDR; A Deep Learning Algorithm for Multi-label Classification of Myocardial Perfusion Scintigraphy for Stable Ischemic Heart Disease

For diagnosis and risk assessment in patients with stable ischemic heart disease, myocardial perfusion scintigraphy is one of the most common cardiological examinations performed today. There are, however, several reasons why an artificial intelligence algorithm could provide useful input to this task, for example to reduce subjectivity and to save time for the nuclear medicine physicians working on this time-consuming task. In this work we have developed a deep learning algorithm for multi-label classification based on a modified convolutional neural network to estimate the probability of obstructive coronary artery disease in the left anterior artery, left circumflex artery and right coronary artery. The prediction is based on data from myocardial perfusion scintigraphy studies conducted in a dedicated Cadmium-Zinc-Telluride cardio camera (D-SPECT Spectrum Dynamics). Data from 588 patients were available, with stress images in both upright and supine position, as well as a number of auxiliary parameters such as angina symptoms and BMI. The data were used to train and evaluate the algorithm using 5-fold cross-validation. We achieve state-of-the-art results for this task, with an average area under the receiver operating characteristic curve of 0.89 on the per-vessel level and 0.94 on the per-patient level.

Dealing with Scarce Labelled Data: Semi-Supervised Deep Learning with Mix Match for Covid-19 Detection Using Chest X-Ray Images

Saúl Calderón Ramirez, Raghvendra Giri, Shengxiang Yang, Armaghan Moemeni, Mario Umaña, David Elizondo, Jordina Torrents-Barrena, Miguel A. Molina-Cabello

Auto-TLDR; Semi-supervised Deep Learning for Covid-19 Detection using Chest X-rays

Coronavirus (Covid-19) is spreading fast, infecting people through contact in various forms including droplets from sneezing and coughing. Therefore, detecting infected subjects early, quickly and cheaply is urgent. Currently available tests are scarce and limited to people in danger of serious illness. The application of deep learning to chest X-ray images for Covid-19 detection is an attractive approach. However, this technology usually relies on the availability of large labelled datasets, a requirement that is hard to meet in the context of a virus outbreak. To overcome this challenge, a semi-supervised deep learning model using both labelled and unlabelled data is proposed. We developed and tested a semi-supervised deep learning framework based on the Mix Match architecture to classify chest X-rays into Covid-19, pneumonia and healthy cases. The presented approach was calibrated using two publicly available datasets. The results show an accuracy increase of around 15% when the ratio of labelled to unlabelled data is low. This indicates that our semi-supervised framework can help improve performance in Covid-19 detection when the amount of high-quality labelled data is scarce. We also introduce a semi-supervised deep learning boost coefficient, which is meant to ease the scalability of our approach and performance comparison.

Classifying Eye-Tracking Data Using Saliency Maps

Shafin Rahman, Sejuti Rahman, Omar Shahid, Md. Tahmeed Abdullah, Jubair Ahmed Sourov

Auto-TLDR; Saliency-based Feature Extraction for Automatic Classification of Eye-tracking Data

A plethora of research in the literature shows how human eye fixation patterns vary depending on different factors, including genetics, age, social functioning, cognitive functioning, and so on. Analysis of these variations in visual attention has already opened two potential research avenues: 1) determining the physiological or psychological state of the subject and 2) predicting the tasks associated with the act of viewing from the recorded eye-fixation data. To this end, this paper proposes a novel visual saliency-based feature extraction method for automatic and quantitative classification of eye-tracking data, which is applicable to both research directions. Instead of directly extracting features from the fixation data, this method employs several well-known computational models of visual attention to predict eye fixation locations as saliency maps. Comparing the saliency amplitudes, similarity and dissimilarity of saliency maps with the corresponding eye fixation maps gives an extra dimension of information, which is used to generate discriminative features to classify the eye-tracking data. Extensive experimentation using the Saliency4ASD [1], Age Prediction [2], and Visual Perceptual Task [3] datasets shows that our saliency-based features can achieve superior performance, outperforming the previous state-of-the-art methods [2], [4], [5] by a considerable margin. Moreover, unlike the existing application-specific solutions, our method demonstrates performance improvement across three distinct real-life problems: Autism Spectrum Disorder screening, toddler age prediction, and human visual perceptual task classification, providing a general paradigm that utilizes the extra information inherent in saliency maps for more accurate classification.

Personalized Models in Human Activity Recognition Using Deep Learning

Hamza Amrani, Daniela Micucci, Paolo Napoletano

Auto-TLDR; Incremental Learning for Personalized Human Activity Recognition

Current sensor-based human activity recognition techniques that rely on a user-independent model struggle to generalize to new users and to changes that a person may make over time in his or her way of carrying out activities. Incremental learning is a technique that makes it possible to obtain personalized models, which may improve the performance of the classifiers thanks to continuous learning based on user data. Moreover, deep learning techniques have proven to be more effective than traditional ones in the generation of user-independent models. The aim of our work is therefore to combine deep learning techniques with incremental learning in order to obtain personalized models that perform better than both user-independent models and personalized models obtained using traditional machine learning techniques. The experiments compared the results obtained by a state-of-the-art technique with those obtained by two neural networks (ResNet and a simplified CNN) on three datasets. They showed that neural networks adapt faster to a new user than the baseline.

Fall Detection by Human Pose Estimation and Kinematic Theory

Vincenzo Dentamaro, Donato Impedovo, Giuseppe Pirlo

Auto-TLDR; A Decision Support System for Automatic Fall Detection on Le2i and URFD Datasets

In an ageing society, understanding human falls is of paramount importance. This paper presents a Decision Support System whose pipeline is designed to extract and compute physical-domain features, achieving state-of-the-art accuracy on the Le2i and UR fall detection datasets. The paper uses the Kinematic Theory of Rapid Human Movement and its sigma-lognormal model, together with classic physical features, to achieve 98% and 99% accuracy in automatic fall detection on the Le2i and URFD datasets, respectively. The design effort in this work is directed toward recognizing falls by using physical models whose laws are clear and understandable.

Extracting and Interpreting Unknown Factors with Classifier for Foot Strike Types in Running

Chanjin Seo, Masato Sabanai, Yuta Goto, Koji Tagami, Hiroyuki Ogata, Kazuyuki Kanosue, Jun Ohya

Auto-TLDR; Deep Learning for Foot Strike Classification using Accelerometer Data

This paper proposes a method that classifies foot strike types using a deep learning model and extracts unknown factors using the contribution degree of input values (CDIV), which makes it possible to evaluate running motions without being influenced by the biases of sports experts. Accelerometers are attached to the runner’s body, and while the runner runs, a fixed camera records a video sequence synchronously with the accelerometers. To train a deep learning model for classifying foot strikes, we objectively annotate the foot strike acceleration data as RFS (rearfoot strike) or non-RFS by watching the video. To interpret the unknown factors extracted from the learned model, we calculate two CDIVs: the contributions of the resampling time and of the accelerometer value to the output (foot strike type). Experiments on classifying unknown runners’ foot strikes were conducted. Consistent with common findings in sports science, the CDIVs contribute highly at the time of the right foot strike, and the sensor values corresponding to the right and left tibias contribute highly to classifying the foot strikes. The experimental results show that the right tibia is important for classifying foot strikes, because much of the training data exhibits the difference between the two foot strike types in the right tibia. In conclusion, our proposed method could extract unknown factors from the classifier and interpret them; these factors contain knowledge similar to the prior knowledge of experts, as well as new findings that are not included in conventional knowledge.

Using Meta Labels for the Training of Weighting Models in a Sample-Specific Late Fusion Classification Architecture

Peter Bellmann, Patrick Thiam, Friedhelm Schwenker

Auto-TLDR; A Late Fusion Architecture for Multiple Classifier Systems

The performance of multiple classifier systems can be significantly improved by the use of intelligent classifier combination approaches. In this study, we introduce a novel late fusion architecture, which can be interpreted as a combination of the well-known mixture of experts and stacked generalization methods. Our proposed method aggregates the outputs of classification models and corresponding sample-specific weighting models. A special feature of our proposed architecture is that each weighting model is trained on an individual set of meta labels. Using individual sets of meta labels allows each weighting model to separate regions in which the predictions of the corresponding classification model can be associated with an estimated confidence value. We test our proposed architecture on a set of publicly available databases, including different benchmark data sets. The experimental evaluation shows the effectiveness and potential of our proposed method. Moreover, we discuss different approaches for further improvement of our proposed architecture.

Detecting Rare Cell Populations in Flow Cytometry Data Using UMAP

Lisa Weijler, Markus Diem, Michael Reiter

Auto-TLDR; Unsupervised Manifold Approximation and Projection for Small Cell Population Detection in Flow cytometry Data

We present an approach for detecting small cell populations in flow cytometry (FCM) samples based on the combination of unsupervised manifold embedding and supervised random forest classification. Each sample consists of hundreds of thousands to a few million cells, where each cell typically corresponds to a measurement vector with 10 to 50 dimensions. The difficulty of the task is that clusters of measurement vectors formed in the data space according to standard clustering criteria often do not correspond to biologically meaningful sub-populations of cells, due to strong variations in the shape and size of their distributions. In many cases the relevant population consists of fewer than 100 scattered events out of millions of events, where supervised approaches perform better than unsupervised clustering. The aim of this paper is to demonstrate that the performance of the standard supervised classifier can be improved significantly by combining it with a preceding unsupervised learning step involving the Uniform Manifold Approximation and Projection (UMAP). We present an experimental evaluation on FCM data from children suffering from Acute Lymphoblastic Leukemia (ALL), showing that the improvement occurs particularly in difficult samples where the relevant population of leukemic cells is small in relation to other sub-populations. Further, the experiments indicate that on such samples the algorithm also outperforms other baseline methods based on Gaussian Mixture Models.

A Lumen Segmentation Method in Ureteroscopy Images Based on a Deep Residual U-Net Architecture

Jorge Lazo, Marzullo Aldo, Sara Moccia, Michele Catellani, Benoit Rosa, Elena De Momi, Michel De Mathelin, Francesco Calimeri

Auto-TLDR; A Deep Neural Network for Ureteroscopy with Residual Units

Ureteroscopy is becoming the first surgical treatment option for the majority of urinary conditions. This procedure is carried out using an endoscope, which provides the surgeon with the visual and spatial information necessary to navigate inside the urinary tract. With a view to developing surgical assistance systems that could enhance the surgeon's performance, lumen segmentation is a fundamental task, since the lumen is the visual reference that marks the path the endoscope should follow. This task has not been analyzed in ureteroscopy data before, and it presents several challenges given the image quality and the conditions of ureteroscopy procedures themselves. In this paper, we study the implementation of a Deep Neural Network that exploits the advantages of residual units in an architecture based on U-Net. For the training of these networks, we analyze the use of two different color spaces: gray-scale and RGB images. We found that training on gray-scale images gives the best results, obtaining mean values of Dice Score, Precision, and Recall of 0.73, 0.58, and 0.92, respectively. The results show that a residual U-Net could be a suitable model for the further development of a computer-aided system for navigation and guidance through the urinary system.
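
For readers unfamiliar with the three segmentation metrics quoted above, the following minimal sketch shows how Dice score, precision and recall are computed from a predicted and a reference binary mask. The function name `segmentation_scores` and the toy masks are illustrative and are not taken from the paper.

```python
def segmentation_scores(pred, truth):
    """Return (dice, precision, recall) for two flat binary masks (0/1 lists)."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)      # lumen found
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)  # false lumen
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)  # lumen missed
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return dice, precision, recall

# Toy example: the prediction covers the whole lumen but over-segments.
pred = [1, 1, 1, 1, 0, 0]
truth = [1, 1, 0, 0, 0, 0]
dice, precision, recall = segmentation_scores(pred, truth)
```

In this toy case recall is perfect while precision is only one half, the same qualitative pattern as the reported values (high recall, lower precision), which corresponds to a model that tends to over-segment.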

Attack-Agnostic Adversarial Detection on Medical Data Using Explainable Machine Learning

Matthew Watson, Noura Al Moubayed

Auto-TLDR; Explainability-based Detection of Adversarial Samples on EHR and Chest X-Ray Data

Explainable machine learning has become increasingly prevalent, especially in healthcare, where explainable models are vital for ethical and trusted automated decision making. Work on the susceptibility of deep learning models to adversarial attacks has shown how easy it is to design samples that mislead a model into making incorrect predictions. In this work, we propose an explainability-based method for the accurate detection of adversarial samples on two datasets with different complexity and properties: Electronic Health Record (EHR) and chest X-ray (CXR) data. On the MIMIC-III and Henan-Renmin EHR datasets, we report a detection accuracy of 77% against the Longitudinal Adversarial Attack. On the MIMIC-CXR dataset, we achieve an accuracy of 88%, significantly improving on the state of the art of adversarial detection in both datasets by over 10% in all settings. We propose an anomaly detection-based method that uses explainability techniques to detect adversarial samples and is able to generalise to different attack methods without the need for retraining.

EasiECG: A Novel Inter-Patient Arrhythmia Classification Method Using ECG Waves

Chuanqi Han, Ruoran Huang, Fang Yu, Xi Huang, Li Cui

Auto-TLDR; EasiECG: Attention-based Convolution Factorization Machines for Arrhythmia Classification

In an ECG record, the PQRST waves are of great medical significance and provide ample information reflecting heartbeat activities. In this paper, we propose a novel arrhythmia classification method, namely EasiECG, characterized by simplicity and accuracy. Compared with other works, EasiECG takes the configuration of these five key waves into account and does not require complicated feature engineering. Meanwhile, an additional encoding of the extracted features makes EasiECG applicable even to samples with missing waves. To automatically capture the interactions among the processed features that contribute to the classification, a novel adapted classification model named Attention-based Convolution Factorization Machines (ACFM) is proposed. In detail, the ACFM can learn both linear and high-order interactions, from linear regression and from convolution on outer-product feature interaction maps, respectively. An attention mechanism implemented in the model then assigns different importance to these interactions when predicting certain types of heartbeats. To validate the effectiveness and practicability of EasiECG, extensive experiments under the inter-patient paradigm are conducted on the benchmark MIT-BIH arrhythmia database. To tackle the imbalanced sample problem in this dataset, the focal loss function is adopted during training. The experimental results show that our method is competitive with other state-of-the-art methods, especially in classifying supraventricular ectopic beats. In addition, EasiECG achieves an overall accuracy of 87.6% on samples with a missing wave in the related experiment, demonstrating the robustness of our proposed method.
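
The focal loss mentioned above down-weights samples that the model already classifies well, which is why it helps with imbalanced heartbeat classes. A minimal per-sample sketch of the standard form, FL(p) = -alpha * (1 - p)^gamma * log(p), is given below; the values gamma = 2 and alpha = 1 are common defaults assumed here, not settings reported in the abstract.

```python
import math

def cross_entropy(p_true):
    """Plain cross-entropy for one sample, given the probability of its true class."""
    return -math.log(min(max(p_true, 1e-12), 1.0))

def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample; gamma=0 and alpha=1 recover cross-entropy."""
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)

# An easy sample (p = 0.9) is scaled by (1 - 0.9)^2 = 0.01, while a hard
# sample (p = 0.3) keeps about half of its cross-entropy: (1 - 0.3)^2 = 0.49.
easy_ratio = focal_loss(0.9) / cross_entropy(0.9)
hard_ratio = focal_loss(0.3) / cross_entropy(0.3)
```

Because frequent, easy beats contribute little to the total, the gradient is dominated by the rare or hard classes.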

Exploring Spatial-Temporal Representations for fNIRS-based Intimacy Detection via an Attention-enhanced Cascade Convolutional Recurrent Neural Network

Chao Li, Qian Zhang, Ziping Zhao

Auto-TLDR; Intimate Relationship Prediction by Attention-enhanced Cascade Convolutional Recurrent Neural Network Using Functional Near-Infrared Spectroscopy

The detection of intimacy plays a crucial role in the improvement of intimate relationships, which helps to promote family and social harmony. Previous studies have shown that different degrees of intimacy produce significant differences in brain imaging. Recently, a few works have emerged that recognise intimacy automatically using machine learning techniques. However, considering the temporal dynamics of the neural mechanisms underlying intimate relationships, how to effectively model spatio-temporal dynamics for intimacy prediction is still a challenge. In this paper, we propose a novel method to explore deep spatial-temporal representations for intimacy prediction using an Attention-enhanced Cascade Convolutional Recurrent Neural Network (ACCRNN). Given its advantages in time-frequency resolution for the analysis of complex neuronal activity, this paper utilizes functional near-infrared spectroscopy (fNIRS) to analyse and infer intimate relationships. We collect an fNIRS-based dataset for the analysis of intimate relationships. Forty-two-channel fNIRS signals are recorded from the prefrontal cortex of 44 subjects while they watch a total of 18 photos of lovers, friends and strangers for 30 seconds per photo. The experimental results show that our proposed method outperforms the other methods in terms of accuracy, with a precision of 96.5%. To the best of our knowledge, this is the first time that such a hybrid deep architecture has been employed for fNIRS-based intimacy prediction.

Wireless Localisation in WiFi Using Novel Deep Architectures

Peizheng Li, Han Cui, Aftab Khan, Usman Raza, Robert Piechocki, Angela Doufexi, Tim Farnham

Auto-TLDR; Deep Neural Network for Indoor Localisation of WiFi Devices in Indoor Environments

This paper studies the indoor localisation of WiFi devices based on a commodity chipset and standard channel sounding. First, we present a novel shallow neural network (SNN) in which features are extracted from the channel state information (CSI) corresponding to WiFi subcarriers received on different antennas and used to train the model. The single-layer architecture of this localisation neural network makes it lightweight and easy to deploy on devices with stringent constraints on computational resources. We further investigate the use of deep learning models for localisation and design novel architectures for a convolutional neural network (CNN) and a long short-term memory (LSTM) network. We extensively evaluate these localisation algorithms for continuous tracking in indoor environments. Experimental results show that even an SNN model, after careful handcrafted feature extraction, can achieve accurate localisation. Meanwhile, using a well-organised architecture, the neural network models can be trained directly with raw CSI data, and localisation features can be automatically extracted to achieve accurate position estimates. We also found that the performance of neural network-based methods is directly affected by the number of anchor access points (APs), regardless of their structure. With three APs, all neural network models proposed in this paper can obtain a localisation accuracy of around 0.5 metres. In addition, the proposed deep NN architecture reduces the data pre-processing time by 6.5 hours compared with a shallow NN on the data collected in our testbed. In the deployment phase, the inference time is also significantly reduced to 0.1 ms per sample. We also demonstrate the generalisation capability of the proposed method by evaluating models on target movement characteristics different from those on which they were trained.

VR Sickness Assessment with Perception Prior and Hybrid Temporal Features

Po-Chen Kuo, Li-Chung Chuang, Dong-Yi Lin, Ming-Sui Lee

Auto-TLDR; A novel content-based VR sickness assessment method which considers both the perception prior and hybrid temporal features

Virtual reality (VR) sickness is one of the obstacles hindering the growth of the VR market. Different VR content may cause different degrees of sickness. If the degree of sickness can be estimated objectively, it adds great value and helps in designing VR content. To address this problem, a novel content-based VR sickness assessment method which considers both the perception prior and hybrid temporal features is proposed. Based on the perception prior, which assumes that the user’s field of view becomes narrower while watching videos, a Gaussian-weighted optical flow is calculated with a specified aspect ratio. In order to capture the dynamic characteristics, hybrid temporal features including horizontal motion, vertical motion and the proposed motion anisotropy are adopted. In addition, a new dataset is compiled with one hundred VR sickness test samples, each of which comes with the Dizziness Scores (DS) answered by the user and a Simulator Sickness Questionnaire (SSQ) collected at the end of the test. A random forest regressor is then trained on this dataset by feeding it the hybrid temporal features of both the present and the previous minute. Extensive experiments are conducted on the VRSA dataset, and the results demonstrate that the proposed method is comparable to the state-of-the-art method in terms of effectiveness and efficiency.
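
The idea of a Gaussian-weighted optical flow with a specified aspect ratio can be sketched as follows. The abstract does not give the weighting formula, so this is only one plausible reading: a 2-D Gaussian centred on the frame whose horizontal spread is `aspect_ratio` times its vertical spread, used to average per-pixel flow magnitudes. The function names and parameter values are hypothetical.

```python
import math

def gaussian_weights(height, width, sigma_y, aspect_ratio):
    """2-D Gaussian weight map centred on the frame (assumed form).

    sigma_x = sigma_y * aspect_ratio, so a ratio above 1 gives a map that is
    wider than it is tall.
    """
    sigma_x = sigma_y * aspect_ratio
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    return [
        [
            math.exp(-0.5 * (((x - cx) / sigma_x) ** 2 + ((y - cy) / sigma_y) ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]

def weighted_mean_flow(flow_mag, weights):
    """Weighted average of per-pixel optical-flow magnitudes."""
    num = sum(w * m for wr, mr in zip(weights, flow_mag) for w, m in zip(wr, mr))
    den = sum(w for wr in weights for w in wr)
    return num / den

weights = gaussian_weights(height=3, width=5, sigma_y=1.0, aspect_ratio=2.0)

# The same unit of motion counts more at the centre than in a corner.
centre_motion = [[0.0] * 5 for _ in range(3)]
centre_motion[1][2] = 1.0
corner_motion = [[0.0] * 5 for _ in range(3)]
corner_motion[0][0] = 1.0
centre_score = weighted_mean_flow(centre_motion, weights)
corner_score = weighted_mean_flow(corner_motion, weights)
```

Under this assumed weighting, motion near the centre of the view contributes more to the feature than motion at the periphery, which matches the narrowing field of view stated as the perception prior.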

A Systematic Investigation on Deep Architectures for Automatic Skin Lesions Classification

Pierluigi Carcagni, Marco Leo, Andrea Cuna, Giuseppe Celeste, Cosimo Distante

Auto-TLDR; RegNet: Deep Investigation of Convolutional Neural Networks for Automatic Classification of Skin Lesions

Computer vision-based techniques are increasingly employed in healthcare and medicine, principally to support experienced medical staff in making a quick and correct diagnosis. One of the hot topics in this area is the automatic classification of skin lesions. Several promising works exist, mainly leveraging Convolutional Neural Networks (CNNs), but the proposed pipelines mainly rely on complex data preprocessing, and there is no systematic investigation of whether available deep models can actually reach the accuracy needed for real applications. To overcome these drawbacks, this work introduces an end-to-end pipeline, in which some of the most recent CNN architectures are included and compared on the largest common benchmark dataset recently introduced. To this aim, for the first time in this application context, a new network design paradigm, namely RegNet, has been exploited to get the best models among a population of configurations. The paper makes a threefold contribution with respect to the previous literature: an in-depth investigation of several CNN architectures, leading to a consistent improvement in lesion recognition accuracy; the exploitation of a new network design paradigm able to study the behavior of populations of models; and an in-depth discussion of the pros and cons of each analyzed method, paving the way towards new research lines.

How to Define a Rejection Class Based on Model Learning?

Sarah Laroui, Xavier Descombes, Aurelia Vernay, Florent Villiers, Francois Villalba, Eric Debreuve

Auto-TLDR; An innovative learning strategy for supervised classification that is able, by design, to reject a sample as not belonging to any of the known classes

In supervised classification, the learning process typically trains a classifier to optimize the accuracy of classifying data into the classes that appear in the learning set, and only those. While this framework fits many use cases, there are situations where the learning process is knowingly performed using a learning set that only represents the data that have been observed so far among a virtually unconstrained variety of possible samples. It is then crucial to define a classifier which has the ability to reject a sample, i.e., to classify it into a rejection class that has not yet been defined. Although obvious solutions can add this ability a posteriori to a classifier that has been learned classically, a better approach seems to be to account for this requirement directly in the classifier design. In this paper, we propose an innovative learning strategy for supervised classification that is able, by design, to reject a sample as not belonging to any of the known classes. For that, we rely on modeling each class as the combination of a probability density function (PDF) and a threshold that is computed with respect to the other classes. Several alternatives are proposed and compared in this framework. A comparison with straightforward approaches is also provided.
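
To make the "PDF plus threshold" idea concrete, here is a minimal one-dimensional sketch. It is not the authors' method: the Gaussian density, the threshold rule (the highest density a class assigns to any training sample of another class) and all names are assumptions chosen for illustration, and the sketch needs at least two training classes.

```python
import math

def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def fit_models(train):
    """train: dict label -> list of 1-D samples (at least two classes).

    Returns dict label -> (mean, std, threshold).
    """
    params = {}
    for label, xs in train.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        params[label] = (mean, math.sqrt(var))
    models = {}
    for label, (mean, std) in params.items():
        # Assumed rule: a sample must look more typical of this class than
        # any training sample of the other classes does.
        others = [x for other, xs in train.items() if other != label for x in xs]
        threshold = max(gaussian_pdf(x, mean, std) for x in others)
        models[label] = (mean, std, threshold)
    return models

def classify(x, models, reject_label="rejected"):
    """Return the best class whose density exceeds its threshold, else reject."""
    best_label, best_density = reject_label, 0.0
    for label, (mean, std, threshold) in models.items():
        density = gaussian_pdf(x, mean, std)
        if density > threshold and density > best_density:
            best_label, best_density = label, density
    return best_label

train = {"a": [0.0, 0.5, 1.0, 1.5, 2.0], "b": [8.0, 8.5, 9.0, 9.5, 10.0]}
models = fit_models(train)
```

With these toy classes, `classify(1.2, models)` and `classify(9.1, models)` fall into the known classes, while a far-away sample such as `50.0` passes no class threshold and is rejected.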

Influence of Event Duration on Automatic Wheeze Classification

Bruno M Rocha, Diogo Pessoa, Alda Marques, Paulo Carvalho, Rui Pedro Paiva

Auto-TLDR; Experimental Design of the Non-wheeze Class for Wheeze Classification

Patients with respiratory conditions typically exhibit adventitious respiratory sounds, such as wheezes. Wheeze events have variable duration. In this work we studied the influence of event duration on wheeze classification, namely how the creation of the non-wheeze class affected the classifiers' performance. First, we evaluated several classifiers on an open access respiratory sound database, with the best one reaching sensitivity and specificity values of 98% and 95%, respectively. Then, by changing one parameter in the design of the non-wheeze class, i.e., event duration, the best classifier only reached sensitivity and specificity values of 53% and 75%, respectively. These results demonstrate the importance of experimental design on the assessment of wheeze classification algorithms' performance.

Deep Convolutional Embedding for Digitized Painting Clustering

Giovanna Castellano, Gennaro Vessio

Auto-TLDR; A Deep Convolutional Embedding Model for Clustering Artworks

Clustering artworks is difficult for several reasons. On the one hand, recognizing meaningful patterns in accordance with domain knowledge and visual perception is extremely hard. On the other hand, the application of traditional clustering and feature reduction techniques to the high-dimensional pixel space can be ineffective. To address these issues, we propose to use a deep convolutional embedding model for digitized painting clustering, in which the task of mapping the raw input data to an abstract latent space is jointly optimized with the task of finding a set of cluster centroids in this latent feature space. Quantitative and qualitative experimental results show the effectiveness of the proposed method. The model also outperforms other state-of-the-art deep clustering approaches to the same problem. The proposed method may be beneficial to several art-related tasks, particularly visual link retrieval and historical knowledge discovery in painting datasets.

Explainable Online Validation of Machine Learning Models for Practical Applications

Wolfgang Fuhl, Yao Rong, Thomas Motz, Michael Scheidt, Andreas Markus Hartel, Andreas Koch, Enkelejda Kasneci

Auto-TLDR; A Reformulation of Regression and Classification for Machine Learning Algorithm Validation

We present a reformulation of regression and classification that aims to validate the result of a machine learning algorithm. Our reformulation simplifies the original problem and validates the result of the machine learning algorithm using the training data. Since the validation of machine learning algorithms must always be explainable, we perform our experiments with the kNN algorithm as well as with an algorithm based on conditional probabilities, which is proposed in this work. For the evaluation of our approach, three publicly available data sets were used, and three classification and two regression problems were evaluated. The presented algorithm based on conditional probabilities can also run online and requires only a fraction of the memory needed by the kNN algorithm.

A Deep Learning-Based Method for Predicting Volumes of Nasopharyngeal Carcinoma for Adaptive Radiation Therapy Treatment

Bilel Daoud, Ken'Ichi Morooka, Shoko Miyauchi, Ryo Kurazume, Wafa Mnejja, Leila Farhat, Jamel Daoud

Auto-TLDR; TEP-Net: Tumor Evolution Prediction of Nasopharyngeal Carcinoma and Organ-at-risks Using CT Images

This paper presents a new system for predicting the spatial change of nasopharyngeal carcinoma (NPC) and organs-at-risk (OARs) volumes over the course of radiation therapy (RT) treatment, in order to facilitate the workflow of adaptive radiotherapy. The proposed system, called "Tumor Evolution Prediction (TEP-Net)", predicts the spatial distributions of NPC and 5 OARs, separately, in response to RT in the coming week, week n. Here, TEP-Net has (n-1) inputs, namely the axial, coronal or sagittal CT images from week 1 to week n-1, each acquired once the patient completes the planned RT treatment of the corresponding week. As a result, three predictions of each target region are obtained from the three-view CT images. To determine the final prediction of NPC and the 5 OARs, two integration methods are introduced: weighted fully connected layers and weighted voting. In experiments using weekly CT images of 140 NPC patients, our proposed system achieves the best performance for predicting NPC and OARs compared with conventional methods.
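
The weighted voting step that merges the axial, coronal and sagittal predictions can be illustrated with a small sketch. The abstract does not state the weights or the decision rule, so the per-view weights, the 0.5 threshold and the function name `weighted_vote` below are assumptions.

```python
def weighted_vote(view_masks, weights, threshold=0.5):
    """Fuse per-view binary masks (flat 0/1 lists of equal length).

    A voxel is kept when the weight-normalised sum of the votes reaches
    the threshold.
    """
    total = sum(weights)
    fused = []
    for votes in zip(*view_masks):
        score = sum(w * v for w, v in zip(weights, votes)) / total
        fused.append(1 if score >= threshold else 0)
    return fused

# Hypothetical predictions for four voxels from the three views.
axial = [1, 1, 0, 0]
coronal = [1, 0, 1, 0]
sagittal = [1, 0, 0, 1]

# Trusting the axial view most lets it decide the second voxel on its own.
fused_weighted = weighted_vote([axial, coronal, sagittal], [0.6, 0.25, 0.15])
# With equal weights, a voxel needs at least two of the three views.
fused_equal = weighted_vote([axial, coronal, sagittal], [1, 1, 1])
```

The weights thus control how much a single view can override the other two, which is the design choice a weighted scheme adds over plain majority voting.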

Estimation of Clinical Tremor Using Spatio-Temporal Adversarial AutoEncoder

Li Zhang, Vidya Koesmahargyo, Isaac Galatzer-Levy

Auto-TLDR; ST-AAE: Spatio-temporal Adversarial Autoencoder for Clinical Assessment of Hand Tremor Frequency and Severity

Collecting sufficient well-labeled training data is a challenging task in many clinical applications. Besides the tremendous effort required for data collection, clinical assessments are also affected by inter-rater variability, which may be significant even among experienced clinicians. The high demand for reproducible and scalable data-driven approaches in these areas necessitates research on learning with limited data. In this work, we propose a spatio-temporal adversarial autoencoder (ST-AAE) for clinical assessment of hand tremor frequency and severity. The ST-AAE integrates spatial and temporal information simultaneously into the original AAE, taking optical flows as inputs. Using only optical flows, irrelevant background or static objects from RGB frames are largely eliminated, so that the AAE is directed to effectively learn key feature representations of the latent space from tremor movements. The ST-AAE was evaluated with both volunteer and clinical data. The volunteer results showed that the ST-AAE significantly improved model performance, with a 15% increase in accuracy. Leave-one-out (on subjects) cross validation was used to evaluate the accuracy for all 3068 video segments from 28 volunteers. The weighted average of the AUCs of the ROCs is 0.97. The results demonstrated that the ST-AAE model, trained with a small number of subjects, generalizes well to different subjects. In addition, the model trained only on volunteer data was also evaluated with 32 clinical videos from 9 essential tremor patients. The model predictions correlate well with the clinical ratings: correlation coefficient r = 0.91 and 0.98 for in-person ratings and video watching ratings, respectively.
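Leave-one-out cross validation on subjects means that all segments from one person are held out together, so no subject appears in both the training and the test split. The sketch below illustrates only that splitting logic. The helper name `leave_one_subject_out` and the toy subject IDs are hypothetical and do not come from the paper.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids: list giving the subject that each sample (e.g. each video
    segment) belongs to. All samples of the held-out subject form the test set.
    """
    for held_out in sorted(set(subject_ids)):
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train_idx, test_idx


# Toy example: 6 segments from 3 hypothetical volunteers.
segment_subjects = ["v01", "v01", "v02", "v03", "v03", "v03"]
folds = list(leave_one_subject_out(segment_subjects))
for subject, train_idx, test_idx in folds:
    print(subject, "train:", train_idx, "test:", test_idx)
```

Splitting by segment instead of by subject would let segments from the same person leak into both sets, which tends to overstate how well a model generalizes to unseen subjects.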

PIF: Anomaly detection via preference embedding

Filippo Leveni, Luca Magri, Giacomo Boracchi, Cesare Alippi

Auto-TLDR; PIF: Anomaly Detection with Preference Embedding for Structured Patterns

We address the problem of detecting anomalies with respect to structured patterns. To this end, we conceive a novel anomaly detection method called PIF, which combines the advantages of adaptive isolation methods with the flexibility of preference embedding. Specifically, we propose to embed the data in a high-dimensional space where an efficient tree-based method, PI-FOREST, is employed to compute an anomaly score. Experiments on synthetic and real datasets demonstrate that PIF compares favorably with state-of-the-art anomaly detection techniques, and confirm that PI-FOREST is better at measuring arbitrary distances and isolating points in the preference space.

Inferring Functional Properties from Fluid Dynamics Features

Andrea Schillaci, Maurizio Quadrio, Carlotta Pipolo, Marcello Restelli, Giacomo Boracchi

Auto-TLDR; Exploiting Convective Properties of Computational Fluid Dynamics for Medical Diagnosis

In a wide range of applied problems involving fluid flows, Computational Fluid Dynamics (CFD) provides detailed quantitative information on the flow field, at various levels of fidelity and computational cost. However, CFD alone cannot predict high-level functional properties of the system that are not easily obtained from the equations of fluid motion. In this work, we present a data-driven framework to extract additional information, such as medical diagnostic output, from CFD solutions. The task is made difficult by the huge dimensionality of CFD data, together with the limited amount of training data implied by its high computational cost. By pursuing a traditional ML pipeline of pre-processing, feature extraction, and model training, we demonstrate that informative features can be extracted from CFD data. Two experiments, pertaining to different application domains, support the claim that the convective properties implicit in a CFD solution can be leveraged to retrieve functional information for which an analytical definition is missing. Despite the preliminary nature of our study and the relative simplicity of both the geometrical and CFD models, we demonstrate for the first time that the combination of ML and CFD can diagnose a complex system in terms of high-level functional information.

A Versatile Crack Inspection Portable System Based on Classifier Ensemble and Controlled Illumination

Milind Gajanan Padalkar, Carlos Beltran-Gonzalez, Matteo Bustreo, Alessio Del Bue, Vittorio Murino

Auto-TLDR; Lighting Conditions for Crack Detection in Ceramic Tile

This paper presents a novel setup for automatic visual inspection of cracks in ceramic tile and studies the effect of various classifiers and height-varying illumination conditions on this task. The intuition behind this setup is that cracks are easier to see under some lighting conditions than others. Our setup, which is designed for field work with constraints on its maximum dimensions, can acquire images for crack detection under multiple lighting conditions using illumination sources placed at multiple heights. Crack detection is then performed by classifying patches extracted from the acquired images in a sliding window fashion. We study the effect of lights placed at various heights by training classifiers on both customized and state-of-the-art architectures and evaluate their performance at both patch level and image level, demonstrating the effectiveness of our setup. More importantly, ours is the first study that demonstrates how height-varying illumination conditions can affect crack detection with existing state-of-the-art classifiers. We provide insight into the illumination conditions that can help improve crack detection in a challenging real-world industrial environment.

Creating Classifier Ensembles through Meta-Heuristic Algorithms for Aerial Scene Classification

Álvaro Roberto Ferreira Jr., Gustavo Henrique De Rosa, Joao Paulo Papa, Gustavo Carneiro, Fabio Augusto Faria

Auto-TLDR; Univariate Marginal Distribution Algorithm for Aerial Scene Classification Using Meta-Heuristic Optimization

Aerial scene classification is a challenging task in the remote sensing area, and deep learning approaches, such as Convolutional Neural Networks (CNN), are widely employed to address it. Nevertheless, it is not straightforward to find a single CNN model that can solve all aerial scene classification tasks, which motivates a better alternative: fusing CNN-based classifiers into an ensemble. However, an appropriate choice of the classifiers that will belong to the ensemble is a critical factor, as it is infeasible to employ all the possible classifiers in the literature. Therefore, this work proposes a novel framework based on meta-heuristic optimization for creating optimized ensembles in the context of aerial scene classification. Experiments were performed across nine meta-heuristic algorithms and three aerial scene datasets from the literature, with comparisons in terms of effectiveness (accuracy), efficiency (execution time), and behavioral performance in different scenarios. Finally, one can observe that the Univariate Marginal Distribution Algorithm (UMDA) outperformed popular meta-heuristic algorithms from the literature, such as Genetic Programming and Particle Swarm Optimization, under the criteria adopted in the experiments.
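To make the idea of selecting ensemble members with UMDA concrete, here is a minimal sketch of a generic binary UMDA. It is not the framework from the paper. The function names, the population settings, the toy correctness table, and the simplified majority-vote fitness are all assumptions chosen for illustration.

```python
import random


def umda_select(fitness, n_bits, pop_size=40, n_elite=10, generations=30, seed=0):
    """Generic binary UMDA: returns (best_mask, best_score).

    Each bit says whether the corresponding classifier joins the ensemble.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # independent (univariate) marginal per bit
    best_mask, best_score = None, float("-inf")
    for _ in range(generations):
        # Sample a population from the current marginals.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        population.sort(key=fitness, reverse=True)
        elite = population[:n_elite]
        score = fitness(elite[0])
        if score > best_score:
            best_mask, best_score = elite[0][:], score
        # Re-estimate each marginal from the elite, clamped to keep diversity.
        probs = [min(0.95, max(0.05, sum(ind[j] for ind in elite) / n_elite))
                 for j in range(n_bits)]
    return best_mask, best_score


# Hypothetical validation results: correct[c][s] is 1 if classifier c
# labels validation sample s correctly (toy data, not from the paper).
correct = [
    [1, 1, 0, 1, 0, 1, 1, 0],
    [0, 1, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 1, 0, 1],
    [1, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 1, 1],
]


def ensemble_accuracy(mask):
    """Simplified fitness: a sample counts as correct when a strict majority
    of the selected classifiers is correct on it."""
    chosen = [row for row, bit in zip(correct, mask) if bit]
    if not chosen:
        return 0.0
    n_samples = len(chosen[0])
    hits = sum(1 for s in range(n_samples)
               if sum(row[s] for row in chosen) * 2 > len(chosen))
    return hits / n_samples


mask, score = umda_select(ensemble_accuracy, n_bits=len(correct))
print("selected classifiers:", mask, "validation accuracy:", score)
```

UMDA differs from a genetic algorithm in that it has no crossover or mutation operators: each generation is sampled from per-bit probabilities estimated from the best individuals of the previous one.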

Real-Time Driver Drowsiness Detection Using Facial Action Units

Malaika Vijay, Nandagopal Netrakanti Vinayak, Maanvi Nunna, Subramanyam Natarajan

Auto-TLDR; Real-Time Detection of Driver Drowsiness from Facial Action Units Using Extreme Gradient Boosting

This paper presents a two-stage, vision-based pipeline for the real-time detection of driver drowsiness using Facial Action Units (FAUs). FAUs capture movements in groups of muscles in the face like widening of the eyes or dropping of the jaw. The first stage of the pipeline employs a Convolutional Neural Network (CNN) trained to detect FAUs. The output of the penultimate layer of this network serves as an image embedding that captures features relevant to FAU detection. These embeddings are then used to predict drowsiness using an Extreme Gradient Boosting (XGBoost) classifier. A separate XGBoost model is trained for each user of the system so that behavior specific to each user can be modeled into the drowsiness classifier. We show that user-specific classifiers require very little data and low training time to yield high prediction accuracies in real-time.

BAT Optimized CNN Model Identifies Water Stress in Chickpea Plant Shoot Images

Shiva Azimi, Taranjit Kaur, Tapan Gandhi

Auto-TLDR; BAT Optimized ResNet-18 for Stress Classification of chickpea shoot images under water deficiency

Stress due to water deficiency in plants can significantly lower the agricultural yield. It can affect many visible plant traits such as size and surface area, the number of leaves and their color, etc. In recent years, computer vision-based plant phenomics has emerged as a promising tool for plant research and management. Such techniques have the advantage of being non-destructive, non-invasive, and fast, and they offer high levels of automation. Pulses like chickpeas play an important role in ensuring food security in poor countries owing to their high protein and nutrition content. In the present work, we have built a dataset comprising shoot images of two varieties of chickpea plant under different moisture stress conditions. Specifically, we propose a BAT optimized ResNet-18 model for classifying stress induced by water deficiency using chickpea shoot images. The BAT algorithm identifies the optimal value of the mini-batch size to be used for training, rather than relying on the traditional manual approach of trial and error. Experimentation on two crop varieties (JG and Pusa) reveals that the BAT optimized approach achieves accuracies of 96% and 91% for the JG and Pusa varieties, respectively, which is better than the traditional method by 4%. The experimental results are also compared with state-of-the-art CNN models like AlexNet, GoogLeNet, and ResNet-50. The comparison demonstrates that the proposed BAT optimized ResNet-18 model achieves higher performance than its counterparts.

Dual-Memory Model for Incremental Learning: The Handwriting Recognition Use Case

Mélanie Piot, Bérangère Bourdoulous, Aurelia Deshayes, Lionel Prevost

Auto-TLDR; A dual memory model for handwriting recognition

In this paper, we propose a dual memory model inspired by neuroscience. Short-term memory processes the data stream before integrating it into long-term memory, which generalizes. The use case is learning the ability to recognize handwriting. This begins with the learning of prototypical letters. It continues throughout life and gives the individual the ability to recognize increasingly varied handwriting. This second task is achieved by incrementally training our dual-memory model. We used a convolutional network for encoding and random forests as the memory model. Indeed, the latter have the advantage of being easily enhanced to integrate new data and new classes. Performance on the MNIST database is very encouraging since it exceeds 95% and the complexity of the model remains reasonable.

Handwritten Signature and Text Based User Verification Using Smartwatch

Raghavendra Ramachandra, Sushma Venkatesh, Raja Kiran, Christoph Busch

Auto-TLDR; A novel technique for user verification using a smartwatch based on writing pattern or signing pattern

Wrist-wearable devices such as smartwatches have gained popularity as they provide quick access to various information and easy access to multiple applications. Among the various applications of the smartwatch, user verification based on handwriting has recently been investigated. In this paper, we present a novel technique for user verification using a smartwatch based on writing pattern or signing pattern. The proposed technique leverages accelerometer data captured from the smartwatch, which are represented using a 2D Continuous Wavelet Transform (CWT), with deep features then extracted using the pre-trained ResNet50. The comparison is performed using an ensemble of classifiers. Extensive experiments are carried out on a newly captured dataset using two different smartwatches with three different writing scenarios (or activities). The article provides key insights and analysis of the results in such a verification scenario.