Attack-Agnostic Adversarial Detection on Medical Data Using Explainable Machine Learning

Matthew Watson, Noura Al Moubayed

Responsive image

Auto-TLDR; Explainability-based Detection of Adversarial Samples on EHR and Chest X-Ray Data

Slides Poster

Explainable machine learning has become increasingly prevalent, especially in healthcare where explainable models are vital for ethical and trusted automated decision making. Work on the susceptibility of deep learning models to adversarial attacks has shown the ease of designing samples to mislead a model into making incorrect predictions. In this work, we propose an explainability-based method for the accurate detection of adversarial samples on two datasets with different complexity and properties: Electronic Health Record (EHR) and chest X-ray (CXR) data. On the MIMIC-III and Henan-Renmin EHR datasets, we report a detection accuracy of 77% against the Longitudinal Adversarial Attack. On the MIMIC-CXR dataset, we achieve an accuracy of 88%; significantly improving on the state of the art of adversarial detection in both datasets by over 10% in all settings. We propose an anomaly detection based method using explainability techniques to detect adversarial samples which is able to generalise to different attack methods without a need for retraining.

Similar papers

Accuracy-Perturbation Curves for Evaluation of Adversarial Attack and Defence Methods

Jaka Šircelj, Danijel Skocaj

Responsive image

Auto-TLDR; Accuracy-perturbation Curve for Robustness Evaluation of Adversarial Examples

Slides Poster Similar

With more research published on adversarial examples, we face a growing need for strong and insightful methods for evaluating the robustness of machine learning solutions against their adversarial threats. Previous work contains problematic and overly simplified evaluation methods, where different methods for generating adversarial examples are compared, even though they produce adversarial examples of differing perturbation magnitudes. This creates a biased evaluation environment, as higher perturbations yield naturally stronger adversarial examples. We propose a novel "accuracy-perturbation curve" that visualizes a classifiers classification accuracy response to adversarial examples of different perturbations. To demonstrate the utility of the curve we perform evaluation of responses of different image classifier architectures to four popular adversarial example methods. We also show how adversarial training improves the robustness of a classifier using the "accuracy-perturbation curve".

Optimal Transport As a Defense against Adversarial Attacks

Quentin Bouniot, Romaric Audigier, Angélique Loesch

Responsive image

Auto-TLDR; Sinkhorn Adversarial Training with Optimal Transport Theory

Slides Poster Similar

Deep learning classifiers are now known to have flaws in the representations of their class. Adversarial attacks can find a human-imperceptible perturbation for a given image that will mislead a trained model. The most effective methods to defend against such attacks trains on generated adversarial examples to learn their distribution. Previous work aimed to align original and adversarial image representations in the same way as domain adaptation to improve robustness. Yet, they partially align the representations using approaches that do not reflect the geometry of space and distribution. In addition, it is difficult to accurately compare robustness between defended models. Until now, they have been evaluated using a fixed perturbation size. However, defended models may react differently to variations of this perturbation size. In this paper, the analogy of domain adaptation is taken a step further by exploiting optimal transport theory. We propose to use a loss between distributions that faithfully reflect the ground distance. This leads to SAT (Sinkhorn Adversarial Training), a more robust defense against adversarial attacks. Then, we propose to quantify more precisely the robustness of a model to adversarial attacks over a wide range of perturbation sizes using a different metric, the Area Under the Accuracy Curve (AUAC). We perform extensive experiments on both CIFAR-10 and CIFAR-100 datasets and show that our defense is globally more robust than the state-of-the-art.

Defense Mechanism against Adversarial Attacks Using Density-Based Representation of Images

Yen-Ting Huang, Wen-Hung Liao, Chen-Wei Huang

Responsive image

Auto-TLDR; Adversarial Attacks Reduction Using Input Recharacterization

Slides Poster Similar

Adversarial examples are slightly modified inputs devised to cause erroneous inference of deep learning models. Protection against the intervention of adversarial examples is a fundamental issue that needs to be addressed before the wide adoption of deep-learning based intelligent systems. In this research, we utilize the method known as input recharacterization to effectively eliminate the perturbations found in the adversarial examples. By converting images from the intensity domain into density-based representation using halftoning operation, performance of the classifier can be properly maintained. With adversarial attacks generated using FGSM, I-FGSM, and PGD, the top-5 accuracy of the hybrid model can still achieve 80.97%, 78.77%, 81.56%, respectively. Although the accuracy has been slightly affected, the influence of adversarial examples is significantly discounted. The average improvement over existing input transform defense mechanisms is approximately 10%.

Variational Inference with Latent Space Quantization for Adversarial Resilience

Vinay Kyatham, Deepak Mishra, Prathosh A.P.

Responsive image

Auto-TLDR; A Generalized Defense Mechanism for Adversarial Attacks on Data Manifolds

Slides Poster Similar

Despite their tremendous success in modelling highdimensional data manifolds, deep neural networks suffer from the threat of adversarial attacks - Existence of perceptually valid input-like samples obtained through careful perturbation that lead to degradation in the performance of the underlying model. Major concerns with existing defense mechanisms include non-generalizability across different attacks, models and large inference time. In this paper, we propose a generalized defense mechanism capitalizing on the expressive power of regularized latent space based generative models. We design an adversarial filter, devoid of access to classifier and adversaries, which makes it usable in tandem with any classifier. The basic idea is to learn a Lipschitz constrained mapping from the data manifold, incorporating adversarial perturbations, to a quantized latent space and re-map it to the true data manifold. Specifically, we simultaneously auto-encode the data manifold and its perturbations implicitly through the perturbations of the regularized and quantized generative latent space, realized using variational inference. We demonstrate the efficacy of the proposed formulation in providing resilience against multiple attack types (black and white box) and methods, while being almost real-time. Our experiments show that the proposed method surpasses the stateof-the-art techniques in several cases.

Verifying the Causes of Adversarial Examples

Honglin Li, Yifei Fan, Frieder Ganz, Tony Yezzi, Payam Barnaghi

Responsive image

Auto-TLDR; Exploring the Causes of Adversarial Examples in Neural Networks

Slides Poster Similar

The robustness of neural networks is challenged by adversarial examples that contain almost imperceptible perturbations to inputs which mislead a classifier to incorrect outputs in high confidence. Limited by the extreme difficulty in examining a high-dimensional image space thoroughly, research on explaining and justifying the causes of adversarial examples falls behind studies on attacks and defenses. In this paper, we present a collection of potential causes of adversarial examples and verify (or partially verify) them through carefully-designed controlled experiments. The major causes of adversarial examples include model linearity, one-sum constraint, and geometry of the categories. To control the effect of those causes, multiple techniques are applied such as $L_2$ normalization, replacement of loss functions, construction of reference datasets, and novel models using multi-layer perceptron probabilistic neural networks (MLP-PNN) and density estimation (DE). Our experiment results show that geometric factors tend to be more direct causes and statistical factors magnify the phenomenon, especially for assigning high prediction confidence. We hope this paper will inspire more studies to rigorously investigate the root causes of adversarial examples, which in turn provide useful guidance on designing more robust models.

Dealing with Scarce Labelled Data: Semi-Supervised Deep Learning with Mix Match for Covid-19 Detection Using Chest X-Ray Images

Saúl Calderón Ramirez, Raghvendra Giri, Shengxiang Yang, Armaghan Moemeni, Mario Umaña, David Elizondo, Jordina Torrents-Barrena, Miguel A. Molina-Cabello

Responsive image

Auto-TLDR; Semi-supervised Deep Learning for Covid-19 Detection using Chest X-rays

Slides Poster Similar

Coronavirus (Covid-19) is spreading fast, infecting people through contact in various forms including droplets from sneezing and coughing. Therefore, the detection of infected subjects in an early, quick and cheap manner is urgent. Currently available tests are scarce and limited to people in danger of serious illness. The application of deep learning to chest X-ray images for Covid-19 detection is an attractive approach. However, this technology usually relies on the availability of large labelled datasets, a requirement hard to meet in the context of a virus outbreak. To overcome this challenge, a semi-supervised deep learning model using both labelled and unlabelled data is proposed. We developed and tested a semi-supervised deep learning framework based on the Mix Match architecture to classify chest X-rays into Covid-19, pneumonia and healthy cases. The presented approach was calibrated using two publicly available datasets. The results show an accuracy increase of around $15\%$ under low labelled / unlabelled data ratio. This indicates that our semi-supervised framework can help improve performance levels towards Covid-19 detection when the amount of high-quality labelled data is scarce. Also, we introduce a semi-supervised deep learning boost coefficient which is meant to ease the scalability of our approach and performance comparison.

A Delayed Elastic-Net Approach for Performing Adversarial Attacks

Brais Cancela, Veronica Bolon-Canedo, Amparo Alonso-Betanzos

Responsive image

Auto-TLDR; Robustness of ImageNet Pretrained Models against Adversarial Attacks

Slides Poster Similar

With the rise of the so-called Adversarial Attacks, there is an increased concern on model security. In this paper we present two different contributions: novel measures of robustness (based on adversarial attacks) and a novel adversarial attack. The key idea behind these metrics is to obtain a measure that could compare different architectures, with independence of how the input is preprocessed (robustness against different input sizes and value ranges). To do so, a novel adversarial attack is presented, performing a delayed elastic-net adversarial attack (constraints are only used whenever a successful adversarial attack is obtained). Experimental results show that our approach obtains state-of-the-art adversarial samples, in terms of minimal perturbation distance. Finally, a benchmark of ImageNet pretrained models is used to conduct experiments aiming to shed some light about which model should be selected whenever security is a role factor.

Beyond Cross-Entropy: Learning Highly Separable Feature Distributions for Robust and Accurate Classification

Arslan Ali, Andrea Migliorati, Tiziano Bianchi, Enrico Magli

Responsive image

Auto-TLDR; Gaussian class-conditional simplex loss for adversarial robust multiclass classifiers

Slides Poster Similar

Deep learning has shown outstanding performance in several applications including image classification. However, deep classifiers are known to be highly vulnerable to adversarial attacks, in that a minor perturbation of the input can easily lead to an error. Providing robustness to adversarial attacks is a very challenging task especially in problems involving a large number of classes, as it typically comes at the expense of an accuracy decrease. In this work, we propose the Gaussian class-conditional simplex (GCCS) loss: a novel approach for training deep robust multiclass classifiers that provides adversarial robustness while at the same time achieving or even surpassing the classification accuracy of state-of-the-art methods. Differently from other frameworks, the proposed method learns a mapping of the input classes onto target distributions in a latent space such that the classes are linearly separable. Instead of maximizing the likelihood of target labels for individual samples, our objective function pushes the network to produce feature distributions yielding high inter-class separation. The mean values of the distributions are centered on the vertices of a simplex such that each class is at the same distance from every other class. We show that the regularization of the latent space based on our approach yields excellent classification accuracy and inherently provides robustness to multiple adversarial attacks, both targeted and untargeted, outperforming state-of-the-art approaches over challenging datasets.

On-Manifold Adversarial Data Augmentation Improves Uncertainty Calibration

Kanil Patel, William Beluch, Dan Zhang, Michael Pfeiffer, Bin Yang

Responsive image

Auto-TLDR; On-Manifold Adversarial Data Augmentation for Uncertainty Estimation

Slides Similar

Uncertainty estimates help to identify ambiguous, novel, or anomalous inputs, but the reliable quantification of uncertainty has proven to be challenging for modern deep networks. To improve uncertainty estimation, we propose On-Manifold Adversarial Data Augmentation or OMADA, which specifically attempts to generate challenging examples by following an on-manifold adversarial attack path in the latent space of an autoencoder that closely approximates the decision boundaries between classes. On a variety of datasets and for multiple network architectures, OMADA consistently yields more accurate and better calibrated classifiers than baseline models, and outperforms competing approaches such as Mixup, as well as achieving similar performance to (at times better than) post-processing calibration methods such as temperature scaling. Variants of OMADA can employ different sampling schemes for ambiguous on-manifold examples based on the entropy of their estimated soft labels, which exhibit specific strengths for generalization, calibration of predicted uncertainty, or detection of out-of-distribution inputs.

Modeling the Distribution of Normal Data in Pre-Trained Deep Features for Anomaly Detection

Oliver Rippel, Patrick Mertens, Dorit Merhof

Responsive image

Auto-TLDR; Deep Feature Representations for Anomaly Detection in Images

Slides Poster Similar

Anomaly Detection (AD) in images is a fundamental computer vision problem and refers to identifying images and/or image substructures that deviate significantly from the norm. Popular AD algorithms commonly try to learn a model of normality from scratch using task specific datasets, but are limited to semi-supervised approaches employing mostly normal data due to the inaccessibility of anomalies on a large scale combined with the ambiguous nature of anomaly appearance. We follow an alternative approach and demonstrate that deep feature representations learned by discriminative models on large natural image datasets are well suited to describe normality and detect even subtle anomalies. Our model of normality is established by fitting a multivariate Gaussian to deep feature representations of classification networks trained on ImageNet using normal data only in a transfer learning setting. By subsequently applying the Mahalanobis distance as the anomaly score we outperform the current state of the art on the public MVTec AD dataset, achieving an Area Under the Receiver Operating Characteristic curve of 95.8 +- 1.2 % (mean +- SEM) over all 15 classes. We further investigate why the learned representations are discriminative to the AD task using Principal Component Analysis. We find that the principal components containing little variance in normal data are the ones crucial for discriminating between normal and anomalous instances. This gives a possible explanation to the often sub-par performance of AD approaches trained from scratch using normal data only. By selectively fitting a multivariate Gaussian to these most relevant components only, we are able to further reduce model complexity while retaining AD performance. We also investigate setting the working point by selecting acceptable False Positive Rate thresholds based on the multivariate Gaussian assumption.

Explain2Attack: Text Adversarial Attacks via Cross-Domain Interpretability

Mahmoud Hossam, Le Trung, He Zhao, Dinh Phung

Responsive image

Auto-TLDR; Transfer2Attack: A Black-box Adversarial Attack on Text Classification

Slides Poster Similar

Training robust deep learning models is a critical challenge for downstream tasks. Research has shown that common down-stream models can be easily fooled with adversarial inputs that look like the training data, but slightly perturbed, in a way imperceptible to humans. Understanding the behavior of natural language models under these attacks is crucial to better defend these models against such attacks. In the black-box attack setting, where no access to model parameters is available, the attacker can only query the output information from the targeted model to craft a successful attack. Current black-box state-of-the-art models are costly in both computational complexity and number of queries needed to craft successful adversarial examples. For real world scenarios, the number of queries is critical, where less queries are desired to avoid suspicion towards an attacking agent. In this paper, we propose Transfer2Attack, a black-box adversarial attack on text classification task, that employs cross-domain interpretability to reduce target model queries during attack. We show that our framework either achieves or out-performs attack rates of the state-of-the-art models, yet with lower queries cost and higher efficiency.

Discriminative Multi-Level Reconstruction under Compact Latent Space for One-Class Novelty Detection

Jaewoo Park, Yoon Gyo Jung, Andrew Teoh

Responsive image

Auto-TLDR; Discriminative Compact AE for One-Class novelty detection and Adversarial Example Detection

Slides Similar

In one-class novelty detection, a model learns solely on the in-class data to single out out-class instances. Autoencoder (AE) variants aim to compactly model the in-class data to reconstruct it exclusively, thus differentiating the in-class from out-class by the reconstruction error. However, compact modeling in an improper way might collapse the latent representations of the in-class data and thus their reconstruction, which would lead to performance deterioration. Moreover, to properly measure the reconstruction error of high-dimensional data, a metric is required that captures high-level semantics of the data. To this end, we propose Discriminative Compact AE (DCAE) that learns both compact and collapse-free latent representations of the in-class data, thereby reconstructing them both finely and exclusively. In DCAE, (a) we force a compact latent space to bijectively represent the in-class data by reconstructing them through internal discriminative layers of generative adversarial nets. (b) Based on the deep encoder's vulnerability to open set risk, out-class instances are encoded into the same compact latent space and reconstructed poorly without sacrificing the quality of in-class data reconstruction. (c) In inference, the reconstruction error is measured by a novel metric that computes the dissimilarity between a query and its reconstruction based on the class semantics captured by the internal discriminator. Extensive experiments on public image datasets validate the effectiveness of our proposed model on both novelty and adversarial example detection, delivering state-of-the-art performance.

Adaptive Noise Injection for Training Stochastic Student Networks from Deterministic Teachers

Yi Xiang Marcus Tan, Yuval Elovici, Alexander Binder

Responsive image

Auto-TLDR; Adaptive Stochastic Networks for Adversarial Attacks

Slides Similar

Adversarial attacks have been a prevalent problem causing misclassification in machine learning models, with stochasticity being a promising direction towards greater robustness. However, stochastic networks frequently underperform compared to deterministic deep networks. In this work, we present a conceptually clear adaptive noise injection mechanism in combination with teacher-initialisation, which adjusts its degree of randomness dynamically through the computation of mini-batch statistics. This mechanism is embedded within a simple framework to obtain stochastic networks from existing deterministic networks. Our experiments show that our method is able to outperform prior baselines under white-box settings, exemplified through CIFAR-10 and CIFAR-100. Following which, we perform in-depth analysis on varying different components of training with our approach on the effects of robustness and accuracy, through the study of the evolution of decision boundary and trend curves of clean accuracy/attack success over differing degrees of stochasticity. We also shed light on the effects of adversarial training on a pre-trained network, through the lens of decision boundaries.

Adversarially Training for Audio Classifiers

Raymel Alfonso Sallo, Mohammad Esmaeilpour, Patrick Cardinal

Responsive image

Auto-TLDR; Adversarially Training for Robust Neural Networks against Adversarial Attacks

Slides Poster Similar

In this paper, we investigate the potential effect of the adversarially training on the robustness of six advanced deep neural networks against a variety of targeted and non-targeted adversarial attacks. We firstly show that, the ResNet-56 model trained on the 2D representation of the discrete wavelet transform appended with the tonnetz chromagram outperforms other models in terms of recognition accuracy. Then we demonstrate the positive impact of adversarially training on this model as well as other deep architectures against six types of attack algorithms (white and black-box) with the cost of the reduced recognition accuracy and limited adversarial perturbation. We run our experiments on two benchmarking environmental sound datasets and show that without any imposed limitations on the budget allocations for the adversary, the fooling rate of the adversarially trained models can exceed 90%. In other words, adversarial attacks exist in any scales, but they might require higher adversarial perturbations compared to non-adversarially trained models.

Attack Agnostic Adversarial Defense via Visual Imperceptible Bound

Saheb Chhabra, Akshay Agarwal, Richa Singh, Mayank Vatsa

Responsive image

Auto-TLDR; Robust Adversarial Defense with Visual Imperceptible Bound

Slides Poster Similar

High susceptibility of deep learning algorithms against structured and unstructured perturbations has motivated the development of efficient adversarial defense algorithms. However, the lack of generalizability of existing defense algorithms and the high variability in the performance of the attack algorithms for different databases raises several questions on the effectiveness of the defense algorithms. In this research, we aim to design a defense model that is robust within the certain bound against both seen and unseen adversarial attacks. This bound is related to the visual appearance of an image and we termed it as \textit{Visual Imperceptible Bound (VIB)}. To compute this bound, we propose a novel method that uses the database characteristics. The VIB is further used to compute the effectiveness of attack algorithms. In order to design a defense model, we propose a defense algorithm which makes the model robust within the VIB against both seen and unseen attacks. The performance of the proposed defense algorithm and the method to compute VIB are evaluated on MNIST, CIFAR-10, and Tiny ImageNet databases on multiple attacks including C\&W ($l_2$) and DeepFool. The proposed defense algorithm is not only able to increase the robustness against several attacks but also retain or improve the classification accuracy on an original clean test set. Experimentally, it is demonstrated that the proposed defense is better than existing strong defense algorithms based on adversarial retraining. We have additionally performed the PGD attack in white box settings and compared the results with the existing algorithms. The proposed defense is independent of the target model and adversarial attacks, and therefore can be utilized against any attack.

Confidence Calibration for Deep Renal Biopsy Immunofluorescence Image Classification

Federico Pollastri, Juan Maroñas, Federico Bolelli, Giulia Ligabue, Roberto Paredes, Riccardo Magistroni, Costantino Grana

Responsive image

Auto-TLDR; A Probabilistic Convolutional Neural Network for Immunofluorescence Classification in Renal Biopsy

Slides Poster Similar

With this work we tackle immunofluorescence classification in renal biopsy, employing state-of-the-art Convolutional Neural Networks. In this setting, the aim of the probabilistic model is to assist an expert practitioner towards identifying the location pattern of antibody deposits within a glomerulus. Since modern neural networks often provide overconfident outputs, we stress the importance of having a reliable prediction, demonstrating that Temperature Scaling, a recently introduced re-calibration technique, can be successfully applied to immunofluorescence classification in renal biopsy. Experimental results demonstrate that the designed model yields good accuracy on the specific task, and that Temperature Scaling is able to provide reliable probabilities, which are highly valuable for such a task given the low inter-rater agreement.

A Joint Representation Learning and Feature Modeling Approach for One-Class Recognition

Pramuditha Perera, Vishal Patel

Responsive image

Auto-TLDR; Combining Generative Features and One-Class Classification for Effective One-class Recognition

Slides Poster Similar

One-class recognition is traditionally approached either as a representation learning problem or a feature modelling problem. In this work, we argue that both of these approaches have their own limitations; and a more effective solution can be obtained by combining the two. The proposed approach is based on the combination of a generative framework and a one-class classification method. First, we learn generative features using the one-class data with a generative framework. We augment the learned features with the corresponding reconstruction errors to obtain augmented features. Then, we qualitatively identify a suitable feature distribution that reduces the redundancy in the chosen classifier space. Finally, we force the augmented features to take the form of this distribution using an adversarial framework. We test the effectiveness of the proposed method on three one-class classification tasks and obtain state-of-the-art results.

Combining Similarity and Adversarial Learning to Generate Visual Explanation: Application to Medical Image Classification

Martin Charachon, Roberto Roberto Ardon, Celine Hudelot, Paul-Henry Cournède, Camille Ruppli

Responsive image

Auto-TLDR; Explaining Black-Box Machine Learning Models with Visual Explanation

Slides Poster Similar

Recently, due to their success and increasing applications, explaining the decision of black-box machine learning models has become a critical task. It is particularly the case in sensitive domains such as medical image interpretation. Various explanation approaches have been proposed in the literature, among which perturbation based approaches are very promising. Within this class of methods, we leverage a learning framework to produce our visual explanations method. From a given classifier, we train two generators to produce from an input image the so called similar and adversarial images. The similar (resp. adversarial) image shall be classified as (resp. not as) the input image. We show that visual explanation, outperforming state of the art methods, can be derived from these. Our method is model-agnostic and, at test time, only requires a single forward pass to generate explanation. Therefore, the proposed approach is adapted for real-time systems such as medical image analysis. Finally, we show that random geometric augmentations applied on the original image acts as a regularization that improves all state of the art explanation methods. We validate our approach on a large chest X-ray database.

Boundary Optimised Samples Training for Detecting Out-Of-Distribution Images

Luca Marson, Vladimir Li, Atsuto Maki

Responsive image

Auto-TLDR; Boundary Optimised Samples for Out-of-Distribution Input Detection in Deep Convolutional Networks

Slides Poster Similar

This paper presents a new approach to the problem of detecting out-of-distribution (OOD) inputs in image classifications with deep convolutional networks. We leverage so-called boundary samples to enforce low confidence (maximum softmax probabilities) for inputs far away from the training data. In particular, we propose the boundary optimised samples (named BoS) training algorithm for generating them. Unlike existing approaches, it does not require extra generative adversarial network, but achieves the goal by simply back propagating the gradient of an appropriately designed loss function to the input samples. At the end of the BoS training, all the boundary samples are in principle located on a specific level hypersurface with respect to the designed loss. Our contributions are i) the BoS training as an efficient alternative to generate boundary samples, ii) a robust algorithm therewith to enforce low confidence for OOD samples, and iii) experiments demonstrating improved OOD detection over the baseline. We show the performance using standard datasets for training and different test sets including Fashion MNIST, EMNIST, SVHN, and CIFAR-100, preceded by evaluations with a synthetic 2-dimensional dataset that provide an insight for the new procedure.

Bridging the Gap between Natural and Medical Images through Deep Colorization

Lia Morra, Luca Piano, Fabrizio Lamberti, Tatiana Tommasi

Responsive image

Auto-TLDR; Transfer Learning for Diagnosis on X-ray Images Using Color Adaptation

Slides Poster Similar

Deep learning has thrived by training on large-scale datasets. However, in many applications, as for medical image diagnosis, getting massive amount of data is still prohibitive due to privacy, lack of acquisition homogeneity and annotation cost. In this scenario transfer learning from natural image collections is a standard practice that attempts to tackle shape, texture and color discrepancy all at once through pretrained model fine-tuning. In this work we propose to disentangle those challenges and design a dedicated network module that focuses on color adaptation. We combine learning from scratch of the color module with transfer learning of different classification backbones obtaining an end-to-end, easy-to-train architecture for diagnostic image recognition on X-ray images. Extensive experiments show how our approach is particularly efficient in case of data scarcity and provides a new path for further transferring the learned color information across multiple medical datasets.

Killing Four Birds with One Gaussian Process: The Relation between Different Test-Time Attacks

Kathrin Grosse, Michael Thomas Smith, Michael Backes

Responsive image

Auto-TLDR; Security of Gaussian Process Classifiers against Attack Algorithms

Slides Poster Similar

In machine learning (ML) security, attacks like evasion, model stealing or membership inference are generally studied in individually. Previous work has also shown a relationship between some attacks and decision function curvature of the targeted model. Consequently, we study an ML model allowing direct control over the decision surface curvature: Gaussian Process classifiers (GPCs). For evasion, we find that changing GPC's curvature to be robust against one attack algorithm boils down to enabling a different norm or attack algorithm to succeed. This is backed up by our formal analysis showing that static security guarantees are opposed to learning. Concerning intellectual property, we show formally that lazy learning does not necessarily leak all information when applied. In practice, often a seemingly secure curvature can be found. For example, we are able to secure GPC against empirical membership inference by proper configuration. In this configuration, however, the GPC's hyper-parameters are leaked, e.g. model reverse engineering succeeds. We conclude that attacks on classification should not be studied in isolation, but in relation to each other.

Automatic Tuberculosis Detection Using Chest X-Ray Analysis with Position Enhanced Structural Information

Hermann Jepdjio Nkouanga, Szilard Vajda

Responsive image

Auto-TLDR; Automatic Chest X-ray Screening for Tuberculosis in Rural Population using Localized Region on Interest

Slides Poster Similar

For Tuberculosis (TB) detection beside the more expensive diagnosis solutions such as culture or sputum smear analysis one could consider the automatic analysis of the chest X-ray (CXR). This could mimic the lung region reading by the radiologist and it could provide a cheap solution to analyze and diagnose pulmonary abnormalities such as TB which often co- occurs with HIV. This software based pulmonary screening can be a reliable and affordable solution for rural population in different parts of the world such as India, Africa, etc. Our fully automatic system is processing the incoming CXR image by applying image processing techniques to detect the region on interest (ROI) followed by a computationally cheap feature extraction involving edge detection using Laplacian of Gaussian which we enrich by counting the local distribution of the intensities. The choice to ”zoom in” the ROI and look for abnormalities locally is motivated by the fact that some pulmonary abnormalities are localized in specific regions of the lungs. Later on the classifiers can decide about the normal or abnormal nature of each lung X-ray. Our goal is to find a simple feature, instead of a combination of several ones, -proposed and promoted in recent years’ literature, which can properly describe the different pathological alterations in the lungs. Our experiments report results on two publicly available data collections1, namely the Shenzhen and the Montgomery collection. For performance evaluation, measures such as area under the curve (AUC), and accuracy (ACC) were considered, achieving AUC = 0.81 (ACC = 83.33%) and AUC = 0.96 (ACC = 96.35%) for the Montgomery and Schenzen collections, respectively. Several comparisons are also provided to other state- of-the-art systems reported recently in the field.

Towards Explaining Adversarial Examples Phenomenon in Artificial Neural Networks

Ramin Barati, Reza Safabakhsh, Mohammad Rahmati

Responsive image

Auto-TLDR; Convolutional Neural Networks and Adversarial Training from the Perspective of convergence

Slides Poster Similar

In this paper, we study the adversarial examples existence and adversarial training from the standpoint of convergence and provide evidence that pointwise convergence in ANNs can explain these observations. The main contribution of our proposal is that it relates the objective of the evasion attacks and adversarial training with concepts already defined in learning theory. Also, we extend and unify some of the other proposals in the literature and provide alternative explanations on the observations made in those proposals. Through different experiments, we demonstrate that the framework is valuable in the study of the phenomenon and is applicable to real-world problems.

Polynomial Universal Adversarial Perturbations for Person Re-Identification

Wenjie Ding, Xing Wei, Rongrong Ji, Xiaopeng Hong, Yihong Gong

Responsive image

Auto-TLDR; Polynomial Universal Adversarial Perturbation for Re-identification Methods

Slides Poster Similar

In this paper, we focus on Universal Adversarial Perturbations (UAP) attack on state-of-the-art person re-identification (Re-ID) methods. Existing UAP methods usually compute a perturbation image and add it to the images of interest. Such a simple constant form greatly limits the attack power. To address this problem, we extend the formulation of UAP to a polynomial form and propose the Polynomial Universal Adversarial Perturbation (PUAP). Unlike traditional UAP methods which only rely on the additive perturbation signal, the proposed PUAP consists of both an additive perturbation and a multiplicative modulation factor. The additive perturbation produces the fundamental component of the signal, while the multiplicative factor modulates the perturbation signal in line with the unit impulse pattern of the input image. Moreover, we design a Pearson correlation coefficient loss to generate universal perturbations, for disrupting the outputs of person Re-ID methods. Extensive experiments on DukeMTMC-ReID, Market-1501, and MARS show that the proposed method can efficiently improve the attack performance, especially when the magnitude of UAP is constrained to a small value.

Task-based Focal Loss for Adversarially Robust Meta-Learning

Yufan Hou, Lixin Zou, Weidong Liu

Responsive image

Auto-TLDR; Task-based Adversarial Focal Loss for Few-shot Meta-Learner

Slides Poster Similar

Adversarial robustness of machine learning has been widely studied in recent years, and a series of effective methods are proposed to resist adversarial attacks. However, less attention is paid to few-shot meta-learners which are much more vulnerable due to the lack of training samples. In this paper, we propose Task-based Adversarial Focal Loss (TAFL) to handle this tough challenge on a typical meta-learner called MAML. More concretely, we regard few-shot classification tasks as normal samples in learning models and apply focal loss mechanism on them. Our proposed method focuses more on adversarially fragile tasks, leading to improvement on overall model robustness. Results of extensive experiments on several benchmarks demonstrate that TAFL can effectively promote the performance of the meta-learner on adversarial examples with elaborately designed perturbations.

MINT: Deep Network Compression Via Mutual Information-Based Neuron Trimming

Madan Ravi Ganesh, Jason Corso, Salimeh Yasaei Sekeh

Responsive image

Auto-TLDR; Mutual Information-based Neuron Trimming for Deep Compression via Pruning

Slides Poster Similar

Most approaches to deep neural network compression via pruning either evaluate a filter’s importance using its weights or optimize an alternative objective function with sparsity constraints. While these methods offer a useful way to approximate contributions from similar filters, they often either ignore the dependency between layers or solve a more difficult optimization objective than standard cross-entropy. Our method, Mutual Information-based Neuron Trimming (MINT), approaches deep compression via pruning by enforcing sparsity based on the strength of the relationship between filters of adjacent layers, across every pair of layers. The relationship is calculated using conditional geometric mutual information which evaluates the amount of similar information exchanged between the filters using a graph-based criterion. When pruning a network, we ensure that retained filters contribute the majority of the information towards succeeding layers which ensures high performance. Our novel approach outperforms existing state-of-the-art compression-via-pruning methods on the standard benchmarks for this task: MNIST, CIFAR-10, and ILSVRC2012, across a variety of network architectures. In addition, we discuss our observations of a common denominator between our pruning methodology’s response to adversarial attacks and calibration statistics when compared to the original network.

Transformer-Encoder Detector Module: Using Context to Improve Robustness to Adversarial Attacks on Object Detection

Faisal Alamri, Sinan Kalkan, Nicolas Pugeault

Responsive image

Auto-TLDR; Context Module for Robust Object Detection with Transformer-Encoder Detector Module

Slides Poster Similar

Deep neural network approaches have demonstrated high performance in object recognition (CNN) and detection (Faster-RCNN) tasks, but experiments have shown that such architectures are vulnerable to adversarial attacks (FFF, UAP): low amplitude perturbations, barely perceptible by the human eye, can lead to a drastic reduction in labelling performance. This article proposes a new context module, called Transformer-Encoder Detector Module, that can be applied to an object detector to (i) improve the labelling of object instances; and (ii) improve the detector's robustness to adversarial attacks. The proposed model achieves higher mAP, F1 scores and AUC average score of up to 13\% compared to the baseline Faster-RCNN detector, and an mAP score 8 points higher on images subjected to FFF or UAP attacks. The result demonstrates that a simple ad-hoc context module can improve the reliability of object detectors significantly

Video Anomaly Detection by Estimating Likelihood of Representations

Yuqi Ouyang, Victor Sanchez

Responsive image

Auto-TLDR; Video Anomaly Detection in the latent feature space using a deep probabilistic model

Slides Poster Similar

Video anomaly detection is a challenging task not only because it involves solving many sub-tasks such as motion representation, object localization and action recognition, but also because it is commonly considered as an unsupervised learning problem that involves detecting outliers. Traditionally, solutions to this task have focused on the mapping between video frames and their low-dimensional features, while ignoring the spatial connections of those features. Recent solutions focus on analyzing these spatial connections by using hard clustering techniques, such as K-Means, or applying neural networks to map latent features to a general understanding, such as action attributes. In order to solve video anomaly in the latent feature space, we propose a deep probabilistic model to transfer this task into a density estimation problem where latent manifolds are generated by a deep denoising autoencoder and clustered by expectation maximization. Evaluations on several benchmarks datasets show the strengths of our model, achieving outstanding performance on challenging datasets.

Combining GANs and AutoEncoders for Efficient Anomaly Detection

Fabio Carrara, Giuseppe Amato, Luca Brombin, Fabrizio Falchi, Claudio Gennaro

Responsive image

Auto-TLDR; CBIGAN: Anomaly Detection in Images with Consistency Constrained BiGAN

Slides Poster Similar

In this work, we propose CBiGAN --- a novel method for anomaly detection in images, where a consistency constraint is introduced as a regularization term in both the encoder and decoder of a BiGAN. Our model exhibits fairly good modeling power and reconstruction consistency capability. We evaluate the proposed method on MVTec AD --- a real-world benchmark for unsupervised anomaly detection on high-resolution images --- and compare against standard baselines and state-of-the-art approaches. Experiments show that the proposed method improves the performance of BiGAN formulations by a large margin and performs comparably to expensive state-of-the-art iterative methods while reducing the computational cost. We also observe that our model is particularly effective in texture-type anomaly detection, as it sets a new state of the art in this category. The code will be publicly released.

Separation of Aleatoric and Epistemic Uncertainty in Deterministic Deep Neural Networks

Denis Huseljic, Bernhard Sick, Marek Herde, Daniel Kottke

Responsive image

Auto-TLDR; AE-DNN: Modeling Uncertainty in Deep Neural Networks

Slides Poster Similar

Despite the success of deep neural networks (DNN) in many applications, their ability to model uncertainty is still significantly limited. For example, in safety-critical applications such as autonomous driving, it is crucial to obtain a prediction that reflects different types of uncertainty to address life-threatening situations appropriately. In such cases, it is essential to be aware of the risk (i.e., aleatoric uncertainty) and the reliability (i.e., epistemic uncertainty) that comes with a prediction. We present AE-DNN, a model allowing the separation of aleatoric and epistemic uncertainty while maintaining a proper generalization capability. AE-DNN is based on deterministic DNN, which can determine the respective uncertainty measures in a single forward pass. In analyses with synthetic and image data, we show that our method improves the modeling of epistemic uncertainty while providing an intuitively understandable separation of risk and reliability.

Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes

Andre Mendes, Julian Togelius, Leandro Dos Santos Coelho

Responsive image

Auto-TLDR; Multi-Task Learning and Semi-Supervised Learning for Multi-Stage Processes

Similar

In multi-stage processes, decisions occur in an ordered sequence of stages. Early stages usually have more observations with general information (easier/cheaper to collect), while later stages have fewer observations but more specific data. This situation can be represented by a dual funnel structure, in which the sample size decreases from one stage to the other while the information increases. Training classifiers in this scenario is challenging since information in the early stages may not contain distinct patterns to learn (underfitting). In contrast, the small sample size in later stages can cause overfitting. We address both cases by introducing a framework that combines adversarial autoencoders (AAE), multi-task learning (MTL), and multi-label semi-supervised learning (MLSSL). We improve the decoder of the AAE with an MTL component so it can jointly reconstruct the original input and use feature nets to predict the features for the next stages. We also introduce a sequence constraint in the output of an MLSSL classifier to guarantee the sequential pattern in the predictions. Using real-world data from different domains (selection process, medical diagnosis), we show that our approach outperforms other state-of-the-art methods.

Pretraining Image Encoders without Reconstruction Via Feature Prediction Loss

Gustav Grund Pihlgren, Fredrik Sandin, Marcus Liwicki

Responsive image

Auto-TLDR; Feature Prediction Loss for Autoencoder-based Pretraining of Image Encoders

Similar

This work investigates three methods for calculating loss for autoencoder-based pretraining of image encoders: The commonly used reconstruction loss, the more recently introduced deep perceptual similarity loss, and a feature prediction loss proposed here; the latter turning out to be the most efficient choice. Standard auto-encoder pretraining for deep learning tasks is done by comparing the input image and the reconstructed image. Recent work shows that predictions based on embeddings generated by image autoencoders can be improved by training with perceptual loss, i.e., by adding a loss network after the decoding step. So far the autoencoders trained with loss networks implemented an explicit comparison of the original and reconstructed images using the loss network. However, given such a loss network we show that there is no need for the time-consuming task of decoding the entire image. Instead, we propose to decode the features of the loss network, hence the name ``feature prediction loss''. To evaluate this method we perform experiments on three standard publicly available datasets (LunarLander-v2, STL-10, and SVHN) and compare six different procedures for training image encoders (pixel-wise, perceptual similarity, and feature prediction losses; combined with two variations of image and feature encoding/decoding). The embedding-based prediction results show that encoders trained with feature prediction loss is as good or better than those trained with the other two losses. Additionally, the encoder is significantly faster to train using feature prediction loss in comparison to the other losses. The method implementation used in this work is available online: https://github.com/guspih/Perceptual-Autoencoders

Deep Learning Based Sepsis Intervention: The Modelling and Prediction of Severe Sepsis Onset

Gavin Tsang, Xianghua Xie

Responsive image

Auto-TLDR; Predicting Sepsis onset by up to six hours prior using a boosted cascading training methodology and adjustable margin hinge loss function

Slides Poster Similar

Sepsis presents a significant challenge to healthcare providers during critical care scenarios such as within an intensive care unit. The prognosis of the onset of severe septic shock results in significant increases in mortality rate, length of stay and readmission rates. Continual advancements in health informatics data allows for applications within the machine learning field to predict sepsis onset in a timely manner, allowing for effective preventative intervention of severe septic shock. A novel deep learning application is proposed to provide effective prediction of sepsis onset by up to six hours prior, involving the use of novel concepts such as a boosted cascading training methodology and adjustable margin hinge loss function. The proposed methodology provides statistically significant improvements to that of current machine learning based modelling applications based off the Physionet Computing in Cardiology 2019 challenge. Results show test F1 scores of 0.420, a significant improvement of 0.281 as compared to the next best challenger results.

The eXPose Approach to Crosslier Detection

Antonio Barata, Frank Takes, Hendrik Van Den Herik, Cor Veenman

Responsive image

Auto-TLDR; EXPose: Crosslier Detection Based on Supervised Category Modeling

Slides Poster Similar

Transit of wasteful materials within the European Union is highly regulated through a system of permits. Waste processing costs vary greatly depending on the waste category of a permit. Therefore, companies may have a financial incentive to allege transporting waste with erroneous categorisation. Our goal is to assist inspectors in selecting potentially manipulated permits for further investigation, making their task more effective and efficient. Due to data limitations, a supervised learning approach based on historical cases is not possible. Standard unsupervised approaches, such as outlier detection and data quality-assurance techniques, are not suited since we are interested in targeting non-random modifications in both category and category-correlated features. For this purpose we (1) introduce the concept of crosslier: an anomalous instance of a category which lies across other categories; (2) propose eXPose: a novel approach to crosslier detection based on supervised category modelling; and (3) present the crosslier diagram: a visualisation tool specifically designed for domain experts to easily assess crossliers. We compare eXPose against traditional outlier detection methods in various benchmark datasets with synthetic crossliers and show the superior performance of our method in targeting these instances.

Multi-Scale and Attention Based ResNet for Heartbeat Classification

Haojie Zhang, Gongping Yang, Yuwen Huang, Feng Yuan, Yilong Yin

Responsive image

Auto-TLDR; A Multi-Scale and Attention based ResNet for ECG heartbeat classification in intra-patient and inter-patient paradigms

Slides Poster Similar

This paper presents a novel deep learning framework for the electrocardiogram (ECG) heartbeat classification. Although there have been some studies with excellent overall accuracy, these studies have not been very accurate in the diagnosis of arrhythmia classes especially such as supraventricular ectopic beat (SVEB) and ventricular ectopic beat (VEB). In our work, we propose a Multi-Scale and Attention based ResNet for heartbeat classification in intra-patient and inter-patient paradigms respectively. Firstly, we extract shallow features from a convolutional layer. Secondly, the shallow features are sent into three branches with different convolution kernels in order to combine receptive fields of different sizes. Finally, fully connected layers are used to classify the heartbeat. Besides, we design a new attention mechanism based on the characteristics of heartbeat data. At last, extensive experiments on benchmark dataset demonstrate the effectiveness of our proposed model.

Explainable Feature Embedding Using Convolutional Neural Networks for Pathological Image Analysis

Kazuki Uehara, Masahiro Murakawa, Hirokazu Nosato, Hidenori Sakanashi

Responsive image

Auto-TLDR; Explainable Diagnosis Using Convolutional Neural Networks for Pathological Image Analysis

Slides Poster Similar

The development of computer-assisted diagnosis (CAD) algorithms for pathological image analysis constitutes an important research topic. Recently, convolutional neural networks (CNNs) have been used in several studies for the development of CAD algorithms. Such systems are required to be not only accurate but also explainable for their decisions, to ensure reliability. However, a limitation of using CNNs is that the basis of the decisions made by them are incomprehensible to humans. Thus, in this paper, we present an explainable diagnosis method, which comprises of two CNNs for different rolls. This method allows us to interpret the basis of the decisions made by CNN from two perspectives, namely statistics and visualization. For the statistical explanation, the method constructs a dictionary of representative pathological features. It performs diagnoses based on the occurrence and importance of learned features referred from its dictionary. To construct the dictionary, we introduce a vector quantization scheme for CNN. For the visual interpretation, the method provides images of learned features embedded in a high-dimensional feature space as an index of the dictionary by generating them using a conditional autoregressive model. The experimental results showed that the proposed network learned pathological features, which contributed to the diagnosis and yielded an area under the receiver operating curve (AUC) of approximately 0.93 for detecting atypical tissues in pathological images of the uterine cervix. Moreover, the proposed method demonstrated that it could provide visually interpretable images to show the rationales behind its decisions. Thus, the proposed method can serve as a valuable tool for pathological image analysis in terms of both its accuracy and explainability.

F-Mixup: Attack CNNs from Fourier Perspective

Xiu-Chuan Li, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu

Responsive image

Auto-TLDR; F-Mixup: A novel black-box attack in frequency domain for deep neural networks

Slides Poster Similar

Recent research has revealed that deep neural networks are highly vulnerable to adversarial examples. In this paper, different from most adversarial attacks which directly modify pixels in spatial domain, we propose a novel black-box attack in frequency domain, named as f-mixup, based on the property of natural images and perception disparity between human-visual system (HVS) and convolutional neural networks (CNNs): First, natural images tend to have the bulk of their Fourier spectrums concentrated on the low frequency domain; Second, HVS is much less sensitive to high frequencies while CNNs can utilize both low and high frequency information to make predictions. Extensive experiments are conducted and show that deeper CNNs tend to concentrate more on the high frequency domain, which may explain the contradiction between robustness and accuracy. In addition, we compared f-mixup with existing attack methods and observed that our approach possesses great advantages. Finally, we show that f-mixup can be also incorporated in training to make deep CNNs defensible against a kind of perturbations effectively.

Adversarial Training for Aspect-Based Sentiment Analysis with BERT

Akbar Karimi, Andrea Prati, Leonardo Rossi

Responsive image

Auto-TLDR; Adversarial Training of BERT for Aspect-Based Sentiment Analysis

Slides Poster Similar

Aspect-Based Sentiment Analysis (ABSA) studies the extraction of sentiments and their targets. Collecting labeled data for this task in order to help neural networks generalize better can be laborious and time-consuming. As an alternative, similar data to the real-world examples can be produced artificially through an adversarial process which is carried out in the embedding space. Although these examples are not real sentences, they have been shown to act as a regularization method which can make neural networks more robust. In this work, we fine-tune the general purpose BERT and domain specific post-trained BERT (BERT-PT) using adversarial training. After improving the results of post-trained BERT with different hyperparameters, we propose a novel architecture called BERT Adversarial Training (BAT) to utilize adversarial training for the two major tasks of Aspect Extraction and Aspect Sentiment Classification in sentiment analysis. The proposed model outperforms the general BERT as well as the in-domain post-trained BERT in both tasks. To the best of our knowledge, this is the first study on the application of adversarial training in ABSA. The code is publicly available on a GitHub repository at https://github.com/IMPLabUniPr/Adversarial-Training-fo r-ABSA

Automatic Detection of Stationary Waves in the Venus’ Atmosphere Using Deep Generative Models

Minori Narita, Daiki Kimura, Takeshi Imamura

Responsive image

Auto-TLDR; Anomaly Detection of Large Bow-shaped Structures on the Venus Clouds using Variational Auto-encoder and Attention Maps

Slides Poster Similar

Various anomaly detection methods utilizing different types of images have recently been proposed. However, anomaly detection in the field of planetary science is still done predominantly by the human eye because explainability is crucial in the physical sciences and most of today's anomaly detection methods based on deep learning cannot offer enough. Moreover, preparing a large number of images required for fully utilizing anomaly detection is not always feasible. In this work, we propose a new framework that automatically detects large bow-shaped structures~(stationary waves) appearing on the surface of the Venus clouds by applying a variational auto-encoder~(VAE) and attention maps to anomaly detection. We also discuss the advantages of using image augmentation. Experiments show that our approach can achieve higher accuracy than the state-of-the-art methods even when the anomaly images are scarce. On the basis of this finding, we discuss anomaly detection frameworks particularly suited to physical science domains.

Self-Supervised Learning for Astronomical Image Classification

Ana Martinazzo, Mateus Espadoto, Nina S. T. Hirata

Responsive image

Auto-TLDR; Unlabeled Astronomical Images for Deep Neural Network Pre-training

Slides Poster Similar

In Astronomy, a huge amount of image data is generated daily by photometric surveys, which scan the sky to collect data from stars, galaxies and other celestial objects. In this paper, we propose a technique to leverage unlabeled astronomical images to pre-train deep convolutional neural networks, in order to learn a domain-specific feature extractor which improves the results of machine learning techniques in setups with small amounts of labeled data available. We show that our technique produces results which are in many cases better than using ImageNet pre-training.

Using Machine Learning to Refer Patients with Chronic Kidney Disease to Secondary Care

Lee Au-Yeung, Xianghua Xie, Timothy Marcus Scale, James Anthony Chess

Responsive image

Auto-TLDR; A Machine Learning Approach for Chronic Kidney Disease Prediction using Blood Test Data

Slides Poster Similar

There has been growing interest recently in using machine learning techniques as an aid in clinical medicine. Machine learning offers a range of classification algorithms which can be applied to medical data to aid in making clinical predictions. Recent studies have demonstrated the high predictive accuracy of various classification algorithms applied to clinical data. Several studies have already been conducted in diagnosing or predicting chronic kidney disease at various stages using different sets of variables. In this study we are investigating the use machine learning techniques with blood test data. Such a system could aid renal teams in making recommendations to primary care general practitioners to refer patients to secondary care where patients may benefit from earlier specialist assessment and medical intervention. We are able to achieve an overall accuracy of 88.48\% using logistic regression, 87.12\% using ANN and 85.29\% using SVM. ANNs performed with the highest sensitivity at 89.74\% compared to 86.67\% for logistic regression and 85.51\% for SVM.

Real Time Fencing Move Classification and Detection at Touch Time During a Fencing Match

Cem Ekin Sunal, Chris G. Willcocks, Boguslaw Obara

Responsive image

Auto-TLDR; Fencing Body Move Classification and Detection Using Deep Learning

Slides Similar

Fencing is a fast-paced sport played with swords which are Epee, Foil, and Saber. However, such fast-pace can cause referees to make wrong decisions. Review of slow-motion camera footage in tournaments helps referees’ decision making, but it interrupts the match and may not be available for every organization. Motivated by the need for better decision making, analysis, and availability, we introduce the first fully-automated deep learning classification and detection system for fencing body moves at the moment a touch is made. This is an important step towards creating a fencing analysis system, with player profiling and decision tools that will benefit the fencing community. The proposed architecture combines You Only Look Once version three (YOLOv3) with a ResNet-34 classifier, trained on ImageNet settings to obtain 83.0\% test accuracy on the fencing moves. These results are exciting development in the sport, providing immediate feedback and analysis along with accessibility, hence making it a valuable tool for trainers and fencing match referees.

Fine-Tuning Convolutional Neural Networks: A Comprehensive Guide and Benchmark Analysis for Glaucoma Screening

Amed Mvoulana, Rostom Kachouri, Mohamed Akil

Responsive image

Auto-TLDR; Fine-tuning Convolutional Neural Networks for Glaucoma Screening

Slides Poster Similar

This work aimed at giving a comprehensive and in-detailed guide on the route to fine-tuning Convolutional Neural Networks (CNNs) for glaucoma screening. Transfer learning consists in a promising alternative to train CNNs from stratch, to avoid the huge data and resources requirements. After a thorough study of five state-of-the-art CNNs architectures, a complete and well-explained strategy for fine-tuning these networks is proposed, using hyperparameter grid-searching and two-phase training approach. Excellent performance is reached on model evaluation, with a 0.9772 AUROC validation rate, giving arise to reliable glaucoma diagosis-help systems. Also, a benchmark analysis is conducted across all fine-tuned models, studying them according to performance indices such as model complexity and size, AUROC density and inference time. This in-depth analysis allows a rigorous comparison between model characteristics, and is useful for giving practioners important trademarks for prospective applications and deployments.

Kernel-Based LIME with Feature Dependency Sampling

Sheng Shi, Yangzhou Du, Fan Wei

Responsive image

Auto-TLDR; Local Interpretable Model-agnostic Explanation with Feature Dependency Sampling

Slides Poster Similar

While deep learning makes significant achievements in Artificial Intelligence (AI), the lack of transparency has limited its broad application in various vertical domains. Explainability is not only a gateway between AI and society, but also a powerful feature to detect flaw of the models and bias of the data. Local Interpretable Model-agnostic Explanation (LIME) is a widely-accepted technique that explains the predictions of any classifier faithfully by learning an interpretable model locally around the predicted instance. However, the sampling operation in the standard implementation of LIME is defective. Perturbed samples are generated from a uniform distribution, ignoring the complicated correlation between features. Moreover, as the local decision boundary is non-linear for most complex networks, linear approximation may produce serious errors. This paper proposes an high-interpretability and high-fidelity local explanation method, known as Kernel-based LIME with Feature Dependency Sampling (KLFDS). Given an instance being explained, KLFDS enhances interpretability by feature sampling with intrinsic dependency. Besides, KLFDS improves the local explanation fidelity by approximating nonlinear boundary of local decision. We evaluate our method with image classification tasks and results show that KLFDS's explanation of the back-box model achieves much better performance than original LIME in terms of interpretability and fidelity.

Evaluation of Anomaly Detection Algorithms for the Real-World Applications

Marija Ivanovska, Domen Tabernik, Danijel Skocaj, Janez Pers

Responsive image

Auto-TLDR; Evaluating Anomaly Detection Algorithms for Practical Applications

Slides Poster Similar

Anomaly detection in complex data structures is oneof the most challenging problems in computer vision. In manyreal-world problems, for example in the quality control in modernmanufacturing, the anomalous samples are usually rare, resultingin (highly) imbalanced datasets. However, in current researchpractice, these scenarios are rarely modeled, and as a conse-quence, evaluation of anomaly detection algorithms often do notreproduce results that are useful for practical applications. First,even in case of highly unbalanced input data, anomaly detectionalgorithms are expected to significantly reduce the proportionof anomalous samples, detecting ”almost all” anomalous samples(with exact specifications depending on the target customer). Thisplaces high importance on only the small part of the ROC curve,possibly rendering the standard metrics such as AUC (AreaUnder Curve) and AP (Average Precision) useless. Second, thetarget of automatic anomaly detection in practical applicationsis significant reduction in manual work required, and standardmetrics are poor predictor of this feature. Finally, the evaluationmay produce erratic results for different randomly initializedtraining runs of the neural network, producing evaluation resultsthat may not reproduce well in practice. In this paper, we presentan evaluation methodology that avoids these pitfalls.

MixNet for Generalized Face Presentation Attack Detection

Nilay Sanghvi, Sushant Singh, Akshay Agarwal, Mayank Vatsa, Richa Singh

Responsive image

Auto-TLDR; MixNet: A Deep Learning-based Network for Detection of Presentation Attacks in Cross-Database and Unseen Setting

Slides Poster Similar

The non-intrusive nature and high accuracy of face recognition algorithms have led to their successful deployment across multiple applications ranging from border access to mobile unlocking and digital payments. However, their vulnerability against sophisticated and cost-effective presentation attack mediums raises essential questions regarding its reliability. Several presentation attack detection algorithms are presented; however, they are still far behind from reality. The major problem with the existing work is the generalizability against multiple attacks both in the seen and unseen setting. The algorithms which are useful for one kind of attack (such as print) fail miserably for another type of attack (such as silicone masks). In this research, we have proposed a deep learning-based network called MixNet to detect presentation attacks in cross-database and unseen attack settings. The proposed algorithm utilizes state-of-the-art convolutional neural network architectures and learns the feature mapping for each attack category. Experiments are performed using multiple challenging face presentation attack databases such as Silicone Mask Attack Database (SMAD) and Spoof In the Wild with Multiple Attack (SiW-M). Extensive experiments and comparison with the existing state of the art algorithms show the effectiveness of the proposed algorithm.

Cost-Effective Adversarial Attacks against Scene Text Recognition

Mingkun Yang, Haitian Zheng, Xiang Bai, Jiebo Luo

Responsive image

Auto-TLDR; Adversarial Attacks on Scene Text Recognition

Slides Poster Similar

Scene text recognition is a challenging task due to the diversity in text appearance and complexity of natural scenes. Thanks to the development of deep learning and the large volume of training data, scene text recognition has made impressive progress in recent years. However, recent research on adversarial examples has shown that deep learning models are vulnerable to adversarial input with imperceptible changes. As one of the most practical tasks in computer vision, scene text recognition is also facing huge security risks. To our best knowledge, there has been no work on adversarial attacks against scene text recognition. To investigate its effects on scene text recognition, we make the first attempt to attack the state-of-the-art scene text recognizer, i.e., attention-based recognizer. To that end, we first adjust the objective function designed for non-sequential tasks, such as image classification, semantic segmentation and image retrieval, to the sequential form. We then propose a novel and effective objective function to further reduce the amount of perturbation while achieving a higher attack success rate. Comprehensive experiments on several standard benchmarks clearly demonstrate effective adversarial effects on scene text recognition by the proposed attacks.

Classify Breast Histopathology Images with Ductal Instance-Oriented Pipeline

Beibin Li, Ezgi Mercan, Sachin Mehta, Stevan Knezevich, Corey Arnold, Donald Weaver, Joann Elmore, Linda Shapiro

Responsive image

Auto-TLDR; DIOP: Ductal Instance-Oriented Pipeline for Diagnostic Classification

Slides Poster Similar

In this study, we propose the Ductal Instance-Oriented Pipeline (DIOP) that contains a duct-level instance segmentation model, a tissue-level semantic segmentation model, and three-levels of features for diagnostic classification. Based on recent advancements in instance segmentation and the Mask R-CNN model, our duct-level segmenter tries to identify each ductal individual inside a microscopic image; then, it extracts tissue-level information from the identified ductal instances. Leveraging three levels of information obtained from these ductal instances and also the histopathology image, the proposed DIOP outperforms previous approaches (both feature-based and CNN-based) in all diagnostic tasks; for the four-way classification task, the DIOP achieves comparable performance to general pathologists in this unique dataset. The proposed DIOP only takes a few seconds to run in the inference time, which could be used interactively on most modern computers. More clinical explorations are needed to study the robustness and generalizability of this system in the future.