Active Sampling for Pairwise Comparisons via Approximate Message Passing and Information Gain Maximization

Aliaksei Mikhailiuk, Clifford Wilmot, Maria Perez-Ortiz, Dingcheng Yue, Rafal Mantiuk

Auto-TLDR; ASAP: An Active Sampling Algorithm for Pairwise Comparison Data

Pairwise comparison data arise in many domains with subjective assessment experiments, for example in image and video quality assessment. In these experiments observers are asked to express a preference between two conditions. However, many pairwise comparison protocols require a large number of comparisons to infer accurate scores, which may be infeasible when each comparison is time-consuming (e.g. videos) or expensive (e.g. medical imaging). This motivates the use of an active sampling algorithm that chooses only the most informative pairs for comparison. In this paper we propose ASAP, an active sampling algorithm based on approximate message passing and expected information gain maximization. Unlike most existing methods, which rely on partial updates of the posterior distribution, we are able to perform full updates and therefore substantially improve the accuracy of the inferred scores. The algorithm relies on three techniques for reducing computational cost: inference based on approximate message passing, selective evaluation of the information gain, and selecting pairs in a batch that forms a minimum spanning tree of the inverse of information gain. We demonstrate, with real and synthetic data, that ASAP offers the highest accuracy of inferred scores compared to existing methods. We also provide an open-source GPU implementation of ASAP for large-scale experiments.
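
The batch-selection idea mentioned in the abstract (pairs forming a minimum spanning tree of the inverse of information gain) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `batch_from_gain`, the use of Prim's algorithm, and the small gain matrix are assumptions made for illustration, and the sketch assumes every off-diagonal gain is strictly positive.

```python
import heapq

def batch_from_gain(gain):
    """Return n-1 pairs forming a minimum spanning tree of 1/gain (Prim's algorithm).

    `gain` is a symmetric matrix of expected information gains; all
    off-diagonal entries are assumed to be strictly positive.
    """
    n = len(gain)
    visited = {0}
    # Heap entries are (weight, i, j), where weight is the inverse information gain.
    heap = [(1.0 / gain[0][j], 0, j) for j in range(1, n)]
    heapq.heapify(heap)
    pairs = []
    while heap and len(visited) < n:
        _, i, j = heapq.heappop(heap)
        if j in visited:
            continue  # this edge would close a cycle
        visited.add(j)
        pairs.append((min(i, j), max(i, j)))
        for k in range(n):
            if k not in visited:
                heapq.heappush(heap, (1.0 / gain[j][k], j, k))
    return pairs

# Hypothetical expected information gains for 4 conditions (illustrative values only).
gain = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.8, 0.3],
    [0.2, 0.8, 0.0, 0.7],
    [0.1, 0.3, 0.7, 0.0],
]
batch = batch_from_gain(gain)
print(batch)
```

A spanning tree connects every condition with only n-1 comparisons, and minimising the inverse gain favours the most informative pairs.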

Similar papers

Factor Screening Using Bayesian Active Learning and Gaussian Process Meta-Modelling

Cheng Li, Santu Rana, Andrew William Gill, Dang Nguyen, Sunil Kumar Gupta, Svetha Venkatesh

Auto-TLDR; Data-Efficient Bayesian Active Learning for Factor Screening in Combat Simulations

In this paper we propose a data-efficient Bayesian active learning framework for factor screening, which is important when dealing with systems that are expensive to evaluate, such as combat simulations. We use Gaussian Process meta-modelling with the Automatic Relevance Determination covariance kernel, which measures the importance of each factor by the inverse of its associated length-scale in the kernel. This importance measures the degree of non-linearity in the simulation response with respect to the corresponding factor. We initially place a prior over the length-scale values, then use the estimated posterior to select the next datum to simulate, namely the one that maximises the mutual entropy between the length-scales and the unknown simulation response. Our goal-driven Bayesian active learning strategy ensures that we are data-efficient in discovering the correct values of the length-scales compared to either a random-sampling or an uncertainty-sampling based approach. We apply our method to an expensive combat simulation and demonstrate the superiority of our approach.
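
To make the role of the length-scales concrete, here is a minimal sketch of an Automatic Relevance Determination kernel, assuming the common squared-exponential form (the abstract does not state which form the authors use). The function names and the numeric values are hypothetical.

```python
import math

def ard_rbf_kernel(x, y, length_scales, variance=1.0):
    """Squared-exponential kernel with one length-scale per factor (ARD, assumed form)."""
    sq = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales))
    return variance * math.exp(-0.5 * sq)

def factor_importance(length_scales):
    """Importance of each factor, taken as the inverse of its length-scale."""
    return [1.0 / l for l in length_scales]

# Hypothetical length-scales: factor 0 is short (important), factor 1 is long.
length_scales = [0.5, 10.0]
origin = [0.0, 0.0]
k_move_factor0 = ard_rbf_kernel(origin, [1.0, 0.0], length_scales)
k_move_factor1 = ard_rbf_kernel(origin, [0.0, 1.0], length_scales)
importance = factor_importance(length_scales)
print(k_move_factor0, k_move_factor1, importance)
```

Moving along the factor with the short length-scale reduces the kernel value far more than moving the same distance along the other factor, which is why the inverse length-scale serves as an importance measure.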

Sketch-Based Community Detection Via Representative Node Sampling

Mahlagha Sedghi, Andre Beckus, George Atia

Auto-TLDR; Sketch-based Clustering of Community Detection Using a Small Sketch

This paper proposes a sketch-based approach to the community detection problem which clusters the full graph through the use of an informative and concise sketch. The reduced sketch is built through an effective sampling approach which selects a few nodes that best represent the complete graph and operates on a pairwise node similarity measure based on the average commute time. After sampling, the proposed algorithm clusters the nodes in the sketch, and then infers the cluster membership of the remaining nodes in the full graph based on their aggregate similarity to nodes in the partitioned sketch. By sampling nodes with strong representation power, our approach can improve the success rates over full graph clustering. In challenging cases with large node degree variation, our approach not only maintains competitive accuracy with full graph clustering despite using a small sketch, but also outperforms existing sampling methods. The use of a small sketch allows considerable storage savings, and computational and timing improvements for further analysis such as clustering and visualization. We provide numerical results on synthetic data based on the homogeneous, heterogeneous and degree-corrected versions of the stochastic block model, as well as experimental results on real-world data.

Probabilistic Latent Factor Model for Collaborative Filtering with Bayesian Inference

Jiansheng Fang, Xiaoqing Zhang, Yan Hu, Yanwu Xu, Ming Yang, Jiang Liu

Auto-TLDR; Bayesian Latent Factor Model for Collaborative Filtering

The Latent Factor Model (LFM) is one of the most successful methods for collaborative filtering (CF) in recommendation systems, in which both users and items are projected into a joint latent factor space. Based on matrix factorization, which is commonly applied in pattern recognition, LFM models user-item interactions as inner products of the factor vectors of user and item in that space and can be efficiently solved by least squares methods with optimal estimation. However, such optimal estimation methods are prone to overfitting due to the extreme sparsity of user-item interactions. In this paper, we propose a Bayesian treatment for LFM, named Bayesian Latent Factor Model (BLFM). Based on observed user-item interactions, we build a probabilistic factor model in which regularization is introduced by placing prior constraints on the latent factors, and the likelihood function is established over observations and parameters. Then we draw samples of latent factors from the posterior distribution with Variational Inference (VI) to predict the expected value. We further extend BLFM into BLFMBias, which incorporates user-dependent and item-dependent biases into the model to enhance performance. Extensive experiments on the movie rating dataset show the effectiveness of our proposed models in comparison with several strong baselines.

Multi-annotator Probabilistic Active Learning

Marek Herde, Daniel Kottke, Denis Huseljic, Bernhard Sick

Auto-TLDR; MaPAL: Multi-annotator Probabilistic Active Learning

Classifiers require annotations of instances, i.e., class labels, for training. An annotation process is often costly due to its manual execution through human annotators. Active learning (AL) aims at reducing the annotation costs by selecting instances from which the classifier is expected to learn the most. Many AL strategies assume the availability of a single omniscient annotator. In this article, we overcome this limitation by considering multiple error-prone annotators. We propose a novel AL strategy, multi-annotator probabilistic active learning (MaPAL). Due to the nature of learning with error-prone annotators, it must select not only instances but also annotators. MaPAL builds on a decision-theoretic framework and selects instance-annotator pairs maximizing the classifier's expected performance. Experiments on a variety of data sets demonstrate MaPAL's superior performance compared to five related AL strategies.

Aggregating Dependent Gaussian Experts in Local Approximation

Hamed Jalali, Gjergji Kasneci

Auto-TLDR; A novel approach for aggregating the Gaussian experts by detecting strong violations of conditional independence

Distributed Gaussian processes (DGPs) are prominent local approximation methods to scale Gaussian processes (GPs) to large datasets. Instead of a global estimation, they train local experts by dividing the training set into subsets, thus reducing the time complexity. This strategy is based on the conditional independence assumption, which basically means that there is a perfect diversity between the local experts. In practice, however, this assumption is often violated, and the aggregation of experts leads to sub-optimal and inconsistent solutions. In this paper, we propose a novel approach for aggregating the Gaussian experts by detecting strong violations of conditional independence. The dependency between experts is determined by using a Gaussian graphical model, which yields the precision matrix. The precision matrix encodes conditional dependencies between experts and is used to detect strongly dependent experts and construct an improved aggregation. Using both synthetic and real datasets, our experimental evaluations illustrate that our new method outperforms other state-of-the-art (SOTA) DGP approaches while being substantially more time-efficient than SOTA approaches, which build on independent experts.

Bayesian Active Learning for Maximal Information Gain on Model Parameters

Kasra Arnavaz, Aasa Feragen, Oswin Krause, Marco Loog

Auto-TLDR; Bayesian assumptions for Bayesian classification

The fact that machine learning models, despite their advancements, are still trained on randomly gathered data is proof that a lasting solution to the problem of optimal data gathering has not yet been found. In this paper, we investigate whether a Bayesian approach to the classification problem can provide assumptions under which one is guaranteed to perform at least as well as random sampling. For a logistic regression model, we show that maximal expected information gain on model parameters is a promising criterion for selecting samples, assuming that our classification model is well-matched to the data. Our derived criterion is closely related to the maximum model change. We experiment with data sets which satisfy this assumption to varying degrees to see how sensitive our performance is to violations of the assumption in practice.

Budgeted Batch Mode Active Learning with Generalized Cost and Utility Functions

Arvind Agarwal, Shashank Mujumdar, Nitin Gupta, Sameep Mehta

Auto-TLDR; Active Learning Based on Utility and Cost Functions

Active learning reduces the labeling cost by actively querying labels for the most valuable data points. Typical active learning methods select the most informative examples one at a time; batch variants also exist, in which a set of the most informative points is selected. These points are selected in such a way that when added to the training data along with their labels, they provide maximum benefit to the underlying model. In this paper, we present a learning framework that actively selects an optimal set of examples (in a batch) within a given budget, based on given utility and cost functions. The framework is generic enough to incorporate any utility and any cost function defined on a set of examples. Furthermore, we propose a novel utility function based on the Facility Location problem that considers three important characteristics of utility, i.e., diversity, density and point utility. We also propose a novel cost function, by formulating the cost computation problem as an optimization problem, the solution to which turns out to be the minimum spanning tree. Thus, our framework provides the optimal batch of points within the given budget based on the cost and utility functions. We evaluate our method on several data sets and show its superior performance over baseline methods.

3CS Algorithm for Efficient Gaussian Process Model Retrieval

Fabian Berns, Kjeld Schmidt, Ingolf Bracht, Christian Beecks

Auto-TLDR; Efficient retrieval of Gaussian Process Models for large-scale data using divide-&-conquer-based approach

Gaussian Process Models (GPMs) have been applied to various pattern recognition tasks due to their analytical tractability, their ability to quantify uncertainty for their own results, and their ability to subsume other prominent regression techniques. Despite these promising prospects, their super-quadratic computation time complexity for model selection and evaluation impedes their broader application to more than a few thousand data points. Although there have been many proposals towards Gaussian Processes for large-scale data, those only offer a linearly scaling improvement to a cubically scaling problem. In particular, solutions like the Nystrom approximation or sparse matrices take only fractions of the given data into account and subsequently lead to inaccurate models. In this paper, we thus propose a divide-&-conquer-based approach that allows efficient retrieval of GPMs for large-scale data. The resulting model is composed of independent pattern representations for non-overlapping segments of the given data and consequently reduces computation time significantly. Our performance analysis indicates that our proposal is able to outperform state-of-the-art algorithms for GPM retrieval with respect to efficiency and accuracy.

Adaptive Sampling of Pareto Frontiers with Binary Constraints Using Regression and Classification

Raoul Heese, Michael Bortz

Auto-TLDR; Adaptive Optimization for Black-Box Multi-Objective Optimizing Problems with Binary Constraints

We present a novel adaptive optimization algorithm for black-box multi-objective optimization problems with binary constraints, built on Bayesian optimization. Our method is based on probabilistic regression and classification models, which act as a surrogate for the optimization goals and allow us to suggest multiple design points at once in each iteration. The proposed acquisition function is intuitively understandable and can be tuned to the demands of the problem at hand. We also present a novel ellipsoid truncation method to speed up the expected hypervolume calculation in a straightforward way for regression models with a normal probability density. We benchmark our approach against an evolutionary algorithm on multiple test problems.

DR2S: Deep Regression with Region Selection for Camera Quality Evaluation

Marcelin Tworski, Stéphane Lathuiliere, Salim Belkarfa, Attilio Fiandrotti, Marco Cagnazzo

Auto-TLDR; Texture Quality Estimation Using Deep Learning

In this work, we tackle the problem of estimating a camera's capability to preserve fine texture details at a given lighting condition. Importantly, our texture preservation measurement should coincide with human perception. Consequently, we formulate our problem as a regression problem and introduce a deep convolutional network to estimate a texture quality score. At training time, we use ground-truth quality scores provided by expert human annotators in order to obtain a subjective quality measure. In addition, we propose a region selection method to identify the image regions that are better suited to measuring perceptual quality. Finally, our experimental evaluation shows that our learning-based approach outperforms existing methods and that our region selection algorithm consistently improves the quality estimation.

Learning Parameter Distributions to Detect Concept Drift in Data Streams

Johannes Haug, Gjergji Kasneci

Auto-TLDR; A novel framework for the detection of concept drift in streaming environments

Data distributions in streaming environments are usually not stationary. In order to maintain a high predictive quality at all times, online learning models need to adapt to distributional changes, which are known as concept drift. The timely and robust identification of concept drift can be difficult, as we never have access to the true distribution of streaming data. In this work, we propose a novel framework for the detection of real concept drift, called ERICS. By treating the parameters of a predictive model as random variables, we show that concept drift corresponds to a change in the distribution of optimal parameters. To this end, we adopt common measures from information theory. The proposed framework is completely model-agnostic. By choosing an appropriate base model, ERICS is also capable of detecting concept drift at the input level, which is a significant advantage over existing approaches. An evaluation on several synthetic and real-world data sets suggests that the proposed framework identifies concept drift more effectively and precisely than various existing works.

Rank-Based Ordinal Classification

Joan Serrat, Idoia Ruiz

Auto-TLDR; Ordinal Classification with Order

Unlike the regular classification task, in ordinal classification there is an order in the classes. As a consequence, not all classification errors matter the same: predicting a class close to the groundtruth one is better than predicting a class farther away. To account for this, most previous works employ loss functions based on the absolute difference between the predicted and groundtruth class labels. We argue that there are many cases in ordinal classification where label values are arbitrary (for instance 1, ..., C, where C is the number of classes) and thus such loss functions may not be the best choice. We instead propose a network architecture that produces not a single class prediction but an ordered vector, or ranking, of all the possible classes from most to least likely. This is thanks to a loss function that compares groundtruth and predicted rankings of these class labels, not the labels themselves. Another advantage of this new formulation is that we can enforce consistency in the predictions, namely, predicted rankings come from some unimodal vector of scores with mode at the groundtruth class. We compare with state-of-the-art ordinal classification methods, showing that ours attains equal or better performance, as measured by common ordinal classification metrics, on three benchmark datasets. Furthermore, it is also suitable for a new task in image aesthetics assessment, i.e., most voted score prediction. Finally, we also apply it to building damage assessment from satellite images, providing an analysis of its performance depending on the degree of imbalance of the dataset.

Temporal Pattern Detection in Time-Varying Graphical Models

Federico Tomasi, Veronica Tozzo, Annalisa Barla

Auto-TLDR; A dynamical network inference model that leverages on kernels to consider general temporal patterns

Graphical models make it possible to describe the interplay among variables of a system through a compact representation, suitable when relations evolve over time. For example, in a biological setting, genes interact differently depending on external environmental or metabolic factors. To incorporate these dynamics, a viable strategy is to estimate a sequence of temporally related graphs assuming similarity among samples at different time points. While adjacent time points may direct the analysis towards a robust estimate of the underlying graph, the resulting model will not incorporate long-term or recurrent temporal relationships. In this work we propose a dynamical network inference model that leverages kernels to consider general temporal patterns (such as circadian rhythms or seasonality). We show how our approach may also be exploited when the recurrent patterns are unknown, by coupling the network inference with a clustering procedure that detects possibly non-consecutive similar networks. Such clusters are then used to build similarity kernels. The convexity of the functional is determined by whether we impose or infer the kernel. In the first case, the optimisation algorithm efficiently exploits proximity operators with closed-form solutions. In the other case, we resort to an alternating minimisation procedure which jointly learns the temporal kernel and the underlying network. Extensive analysis on synthetic data shows the efficacy of our models compared to state-of-the-art methods. Finally, we apply our approach to two real-world applications to show how considering long-term patterns is fundamental to gaining insight into the behaviour of a complex system.

The eXPose Approach to Crosslier Detection

Antonio Barata, Frank Takes, Hendrik Van Den Herik, Cor Veenman

Auto-TLDR; EXPose: Crosslier Detection Based on Supervised Category Modeling

Transit of waste materials within the European Union is highly regulated through a system of permits. Waste processing costs vary greatly depending on the waste category of a permit. Therefore, companies may have a financial incentive to declare the waste they transport under an erroneous category. Our goal is to assist inspectors in selecting potentially manipulated permits for further investigation, making their task more effective and efficient. Due to data limitations, a supervised learning approach based on historical cases is not possible. Standard unsupervised approaches, such as outlier detection and data quality-assurance techniques, are not suited since we are interested in targeting non-random modifications in both category and category-correlated features. For this purpose we (1) introduce the concept of crosslier: an anomalous instance of a category which lies across other categories; (2) propose eXPose: a novel approach to crosslier detection based on supervised category modelling; and (3) present the crosslier diagram: a visualisation tool specifically designed for domain experts to easily assess crossliers. We compare eXPose against traditional outlier detection methods on various benchmark datasets with synthetic crossliers and show the superior performance of our method in targeting these instances.

Automatically Mining Relevant Variable Interactions Via Sparse Bayesian Learning

Ryoichiro Yafune, Daisuke Sakuma, Yasuo Tabei, Noritaka Saito, Hiroto Saigo

Auto-TLDR; Sparse Bayes for Interpretable Non-linear Prediction

With the rapid increase in the availability of large amounts of data, prediction is becoming increasingly popular and has spread throughout our daily life. However, powerful non-linear prediction methods such as deep learning and SVM suffer from an interpretability problem, making them hard to use in domains where the reason for a decision is required. In this paper, we develop an interpretable non-linear model called itemset Sparse Bayes (iSB), which builds a Bayesian probabilistic model while simultaneously considering variable interactions. In order to suppress the resulting large number of variables, sparsity is imposed on the regression weights by a sparsity-inducing prior. As a subroutine to search for variable interactions, an itemset enumeration algorithm is employed with a novel bounding condition. In computational experiments using a real-world dataset, the proposed method performed better than a decision tree by 10% in terms of R-squared. We also demonstrated the advantage of our method in a Bayesian optimization setting, in which the proposed approach could successfully find the maximum of an unknown function faster than a Gaussian process. The interpretability of iSB carries over naturally to Bayesian optimization, thereby giving us a clue to understand which variable interactions are important in optimizing an unknown function.

An Intransitivity Model for Matchup and Pairwise Comparison

Yan Gu, Jiuding Duan, Hisashi Kashima

Auto-TLDR; Blade-Chest: A Low-Rank Matrix Approach for Probabilistic Ranking of Players

Ranking is a ubiquitous problem appearing in many real-world applications. The superior players or objects are oftentimes determined by a matchup or pairwise comparison. Various models have been developed to integrate the matchup results into a single ranking list of players and to further predict the results of future matchups. Amongst them, the Bradley-Terry model is a mainstream model that achieves these goals by constructing an explicit probabilistic interpretation. However, the model suffers from its strong assumption of transitive relationships and becomes vulnerable in practice where intransitive relationships exist. The Blade-Chest model is an alternative solution to this intransitivity challenge, allowing a multi-dimensional representation of players. In this paper, we propose a low-rank matrix approach to characterize all players and generalize the related works by introducing a unified framework. Our experimental results on synthetic datasets and real-world datasets show that the proposed model is stably competitive with the standard models in terms of the consistency of probabilistic model interpretation and the predictive performance in out-of-sample tests.
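
The transitivity assumption of the Bradley-Terry model mentioned above can be seen in a minimal sketch. It uses the standard logistic form of the model with one scalar score per player; the function name `bt_win_prob` and the example scores are hypothetical.

```python
import math

def bt_win_prob(score_i, score_j):
    """Bradley-Terry probability that player i beats player j (scores on a log scale)."""
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))

# Hypothetical scores for three players.
scores = {"a": 1.0, "b": 0.5, "c": 0.0}
p_ab = bt_win_prob(scores["a"], scores["b"])
p_bc = bt_win_prob(scores["b"], scores["c"])
p_ac = bt_win_prob(scores["a"], scores["c"])
print(p_ab, p_bc, p_ac)
```

Because each player is described by a single scalar, a favoured over b and b favoured over c always implies a favoured over c, so a rock-paper-scissors cycle cannot be represented. Multi-dimensional representations such as Blade-Chest are designed to relax this restriction.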

Assortative-Constrained Stochastic Block Models

Daniel Gribel, Thibaut Vidal, Michel Gendreau

Auto-TLDR; Constrained Stochastic Block Models for Assortative Communities in Neural Networks

Stochastic block models (SBMs) are often used to find assortative community structures in networks, such that the probability of connections within communities is higher than between communities. However, classic SBMs are not limited to assortative structures. In this study, we discuss the implications of this model-inherent indifference towards assortativity or disassortativity, and show that it can lead to undesirable outcomes in datasets which are known to be assortative but which contain a reduced amount of information. To circumvent these issues, we propose a constrained SBM that imposes strong assortativity constraints, along with efficient algorithmic solutions. These constraints significantly boost community-detection capabilities in regimes which are close to the detectability threshold. They also make it possible to identify structurally different communities in networks representing cerebral-cortex activity regions.

Quantifying Model Uncertainty in Inverse Problems Via Bayesian Deep Gradient Descent

Riccardo Barbano, Chen Zhang, Simon Arridge, Bangti Jin

Auto-TLDR; Bayesian Neural Networks for Inverse Reconstruction via Bayesian Knowledge-Aided Computation

Recent advances in reconstruction methods for inverse problems leverage powerful data-driven models, e.g., deep neural networks. These techniques have demonstrated state-of-the-art performances for several imaging tasks, but they often do not provide uncertainty on the obtained reconstructions. In this work, we develop a novel scalable data-driven knowledge-aided computational framework to quantify the model uncertainty via Bayesian neural networks. The approach builds on and extends deep gradient descent, a recently developed greedy iterative training scheme, and recasts it within a probabilistic framework. Scalability is achieved by being hybrid in the architecture: only the last layer of each block is Bayesian, while the others remain deterministic, and by being greedy in training. The framework is showcased on one representative medical imaging modality, viz. computed tomography with either sparse view or limited view data, and exhibits competitive performance with respect to state-of-the-art benchmarks, e.g., total variation, deep gradient descent and learned primal-dual.

Learning to Rank for Active Learning: A Listwise Approach

Minghan Li, Xialei Liu, Joost Van De Weijer, Bogdan Raducanu

Auto-TLDR; Learning Loss for Active Learning

Active learning emerged as an alternative to alleviate the effort of labeling huge amounts of data for data-hungry applications (such as image/video indexing and retrieval, autonomous driving, etc.). The goal of active learning is to automatically select a number of unlabeled samples for annotation (according to a budget), based on an acquisition function, which indicates how valuable a sample is for training the model. The learning loss method is a task-agnostic approach which attaches a module that learns to predict the target loss of unlabeled data, and selects the data with the highest loss for labeling. In this work, we follow this strategy but define the acquisition function as a learning-to-rank problem and rethink the structure of the loss prediction module, using a simple but effective listwise approach. Experimental results on four datasets demonstrate that our method outperforms recent state-of-the-art active learning approaches for both image classification and regression tasks.

GPSRL: Learning Semi-Parametric Bayesian Survival Rule Lists from Heterogeneous Patient Data

Ameer Hamza Shakur, Xiaoning Qian, Zhangyang Wang, Bobak Mortazavi, Shuai Huang

Auto-TLDR; Semi-parametric Bayesian Survival Rule List Model for Heterogeneous Survival Data

Survival data is often collected in medical applications from a heterogeneous population of patients. While in the past, popular survival models focused on modeling the average effect of the co-variates on survival outcomes, rapidly advancing sensing and information technologies have provided opportunities to further model the heterogeneity of the population as well as the non-linearity of the survival risk. With this motivation, we propose a new semi-parametric Bayesian Survival Rule List model in this paper. Our model derives a rule-based decision-making approach, while within the regime defined by each rule, survival risk is modelled via a Gaussian process latent variable model. Markov Chain Monte Carlo with a nested Laplace approximation for the latent variable model is used to search over the posterior of the rule lists efficiently. The use of ordered rule lists enables us to model heterogeneity while keeping the model complexity in check. Performance evaluations on a synthetic heterogeneous survival dataset and a real world sepsis survival dataset demonstrate the effectiveness of our model.

Graph Discovery for Visual Test Generation

Neil Hallonquist, Laurent Younes, Donald Geman

Auto-TLDR; Visual Question Answering over Graphs: A Probabilistic Framework for VQA

We consider the problem of uncovering an unknown attributed graph, where both its edges and vertices are hidden from view, through a sequence of binary questions about it. In order to select questions efficiently, we define a probability distribution over graphs, with randomness not just over edges, but over vertices as well. We then sequentially select questions so as to (1) minimize the expected entropy of the random graph, given the answers to the previous questions in the sequence, and (2) instantiate the vertices that compose the graph. We propose some basic question spaces, from which to select questions, that vary in their capacity. We apply this framework to the problem of test generation in Visual Question Answering (VQA), where semantic questions are used to evaluate vision systems over rich image representations. To do this, we use a restricted question vocabulary, resulting in image representations that take the form of scene graphs; by defining a distribution over them, a consistent set of probabilities is associated with the questions, and used in their selection.

A Multilinear Sampling Algorithm to Estimate Shapley Values

Ramin Okhrati, Aldo Lipani

Auto-TLDR; A sampling method for Shapley values for multilayer Perceptrons

Shapley values are great analytical tools in game theory to measure the importance of a player in a game. Due to their axiomatic and desirable properties such as efficiency, they have become popular for feature importance analysis in data science and machine learning. However, the time complexity of computing Shapley values from the original formula is exponential, and as the number of features increases, this becomes infeasible. Castro et al. [1] developed a sampling algorithm to estimate Shapley values. In this work, we propose a new sampling method based on a multilinear extension technique as applied in game theory. The aim is to provide a more efficient (sampling) method for estimating Shapley values. Our method is applicable to any machine learning model, in particular for either multiclass classification or regression problems. We apply the method to estimate Shapley values for multilayer perceptrons (MLPs) and, through experimentation on two datasets, we demonstrate that our method provides more accurate estimates of the Shapley values by reducing the variance of the sampling statistics.
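
For orientation, here is a minimal sketch of a generic permutation-sampling estimator of Shapley values, the kind of baseline that sampling approaches start from. It is not the multilinear-extension method proposed in the paper and is not claimed to reproduce the algorithm of Castro et al. in detail. The function name, the toy additive game, and the sample count are assumptions for illustration.

```python
import random

def shapley_permutation_estimate(value, n_players, n_samples, seed=0):
    """Estimate Shapley values by averaging marginal contributions over random orderings."""
    rng = random.Random(seed)
    phi = [0.0] * n_players
    players = list(range(n_players))
    for _ in range(n_samples):
        rng.shuffle(players)  # one random ordering of the players
        coalition = set()
        prev = value(coalition)
        for p in players:
            coalition.add(p)
            cur = value(coalition)
            phi[p] += cur - prev  # marginal contribution of p in this ordering
            prev = cur
    return [x / n_samples for x in phi]

# Hypothetical additive game: a coalition is worth the sum of its members' weights,
# so the exact Shapley value of each player equals its weight.
weights = [1.0, 2.0, 3.0]

def additive_game(coalition):
    return sum(weights[p] for p in coalition)

estimate = shapley_permutation_estimate(additive_game, 3, 200)
print(estimate)
```

Each sampled ordering costs one pass over the players instead of an enumeration of all coalitions. For non-additive games the estimate has sampling variance, which is the quantity the paper aims to reduce.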

Generic Merging of Structure from Motion Maps with a Low Memory Footprint

Gabrielle Flood, David Gillsjö, Patrik Persson, Anders Heyden, Kalle Åström

Auto-TLDR; A Low-Memory Footprint Representation for Robust Map Merge

With the development of cheap image sensors, the amount of available image data has increased enormously, and the possibility of using crowdsourced collection methods has emerged. This calls for development of ways to handle all these data. In this paper, we present new tools that will enable efficient, flexible and robust map merging. Assuming that separate optimisations have been performed for the individual maps, we show how only relevant data can be stored in a low memory footprint representation. We use these representations to perform map merging so that the algorithm is invariant to the merging order and independent of the choice of coordinate system. The result is a robust algorithm that can be applied to several maps simultaneously. The result of a merge can also be represented with the same type of low-memory footprint format, which enables further merging and updating of the map in a hierarchical way. Furthermore, the method can perform loop closing and also detect changes in the scene between the capture of the different image sequences. Using both simulated and real data, from both a handheld mobile phone and a drone, we verify the performance of the proposed method.

Watermelon: A Novel Feature Selection Method Based on Bayes Error Rate Estimation and a New Interpretation of Feature Relevance and Redundancy

Xiang Xie, Wilhelm Stork

Auto-TLDR; Feature Selection Using Bayes Error Rate Estimation for Dynamic Feature Selection

Feature selection has become a crucial part of many classification problems in which high-dimensional datasets may contain tens of thousands of features. In this paper, we propose a novel feature selection method that scores features by estimating the Bayes error rate based on kernel density estimation. Additionally, we update the scores of features dynamically by quantitatively interpreting the effects of feature relevance and redundancy in a new way. In contrast to the common heuristic applied by many feature selection methods, which prefers choosing features that are not relevant to each other, our approach penalizes only monotonically correlated features and rewards any other kind of relevance among features based on Spearman’s rank correlation coefficient and normalized mutual information. We conduct extensive experiments on seventeen diverse classification benchmarks; the results show that our approach outperforms seventeen other popular state-of-the-art feature selection methods in most cases.

Equation Attention Relationship Network (EARN): A Geometric Deep Metric Framework for Learning Similar Math Expression Embedding

Saleem Ahmed, Kenny Davila, Srirangaraj Setlur, Venu Govindaraju

Auto-TLDR; Representational Learning for Similarity Based Retrieval of Mathematical Expressions

Representational learning in the form of high-dimensional embeddings has been used for multiple pattern recognition applications. There has been significant interest in building embedding-based systems for learning representations in the mathematical domain. At the same time, retrieval of structured information such as mathematical expressions is an important need for modern IR systems. In this work, our motivation is to introduce a robust framework for learning representations for similarity-based retrieval of mathematical expressions. Given a query by example, the embedding can find the closest matching expression as a function of the Euclidean distance between them. We leverage recent advancements in image-based and graph-based deep learning algorithms to learn our similarity embeddings. We do this first by using uni-modal encoders in graph space and image space, and then by using a multi-modal combination of the same. To overcome the lack of training data, we force the networks to learn a deep metric using triplets generated with a heuristic scoring function. We also adopt a custom strategy for mining hard samples to train our neural networks. Our system produces rankings similar to those generated by the original scoring function, but using only a fraction of the time. Our results establish the viability of using such a multi-modal embedding for this task.

A Heuristic-Based Decision Tree for Connected Components Labeling of 3D Volumes

Maximilian Söchting, Stefano Allegretti, Federico Bolelli, Costantino Grana

Auto-TLDR; Entropy Partitioning Decision Tree for Connected Components Labeling

Connected Components Labeling represents a fundamental step for many Computer Vision and Image Processing pipelines. Since the first appearance of the task in the sixties, many algorithmic solutions to optimize the computational load needed to label an image have been proposed. Among them, block-based scan approaches and decision trees have proved to be some of the most valuable strategies. However, due to the cost of the manual construction of optimal decision trees and the computational limitations of automatic strategies employed in the past, the application of blocks and decision trees has been restricted to small masks, and thus to 2D algorithms. With this paper we present a novel heuristic algorithm based on decision tree learning methodology, called Entropy Partitioning Decision Tree (EPDT). It allows near-optimal decision trees to be computed for large scan masks. Experimental results demonstrate that algorithms based on the generated decision trees outperform state-of-the-art competitors.

MINT: Deep Network Compression Via Mutual Information-Based Neuron Trimming

Madan Ravi Ganesh, Jason Corso, Salimeh Yasaei Sekeh

Auto-TLDR; Mutual Information-based Neuron Trimming for Deep Compression via Pruning

Most approaches to deep neural network compression via pruning either evaluate a filter’s importance using its weights or optimize an alternative objective function with sparsity constraints. While these methods offer a useful way to approximate contributions from similar filters, they often either ignore the dependency between layers or solve a more difficult optimization objective than standard cross-entropy. Our method, Mutual Information-based Neuron Trimming (MINT), approaches deep compression via pruning by enforcing sparsity based on the strength of the relationship between filters of adjacent layers, across every pair of layers. The relationship is calculated using conditional geometric mutual information which evaluates the amount of similar information exchanged between the filters using a graph-based criterion. When pruning a network, we ensure that retained filters contribute the majority of the information towards succeeding layers which ensures high performance. Our novel approach outperforms existing state-of-the-art compression-via-pruning methods on the standard benchmarks for this task: MNIST, CIFAR-10, and ILSVRC2012, across a variety of network architectures. In addition, we discuss our observations of a common denominator between our pruning methodology’s response to adversarial attacks and calibration statistics when compared to the original network.

Leveraging Sequential Pattern Information for Active Learning from Sequential Data

Raul Fidalgo-Merino, Lorenzo Gabrielli, Enrico Checchi

Auto-TLDR; Sequential Pattern Information for Active Learning

This paper presents a novel active learning technique aimed at the selection of sequences for manual annotation from a database of unlabelled sequences. Supervised machine learning algorithms can employ these sequences to build better models than those based on using random sequences for training. The main contribution of the proposed method is the use of sequential pattern information contained in the database to select representative and diverse sequences for annotation. These two characteristics ensure proper coverage of the instance space of sequences and, at the same time, avoid over-fitting the trained model. The approach, called SPIAL (Sequential Pattern Information for Active Learning), uses sequential pattern mining algorithms to extract frequently occurring sub-sequences from the database and evaluates how representative and diverse each sequence is, based on this information. The output is a list of sequences for annotation sorted by representativeness and diversity. The algorithm is modular and, unlike current techniques, independent of the features taken into account by the machine learning algorithm that trains the model. Experiments done on well-known benchmarks involving sequential data show that the models trained using SPIAL increase their convergence speed while reducing manual effort by selecting small sets of very informative sequences for annotation. In addition, the computation cost using SPIAL is much lower than for the state-of-the-art algorithms evaluated.

Hierarchical Routing Mixture of Experts

Wenbo Zhao, Yang Gao, Shahan Ali Memon, Bhiksha Raj, Rita Singh

Auto-TLDR; A Binary Tree-structured Hierarchical Routing Mixture of Experts for Regression

In regression tasks the distribution of the data is often too complex to be fitted by a single model. In contrast, partition-based models are developed where data is divided and fitted by local models. These models partition the input space and do not leverage the input-output dependency of multimodal-distributed data, and strong local models are needed to make good predictions. Addressing these problems, we propose a binary tree-structured hierarchical routing mixture of experts (HRME) model that has classifiers as non-leaf node experts and simple regression models as leaf node experts. The classifier nodes jointly soft-partition the input-output space based on the natural separateness of multimodal data. This enables simple leaf experts to be effective for prediction. Further, we develop a probabilistic framework for the HRME model, and propose a recursive Expectation-Maximization (EM) based algorithm to learn both the tree structure and the expert models. Experiments on a collection of regression tasks validate the effectiveness of our method compared to a variety of other regression models.

On Learning Random Forests for Random Forest Clustering

Manuele Bicego, Francisco Escolano

Auto-TLDR; Learning Random Forests for Clustering

In this paper we study the poorly investigated problem of learning Random Forests for distance-based Random Forest clustering. We studied both classic schemes as well as alternative approaches, novel in this context. In particular, we investigated the suitability of Gaussian Density Forests, Random Forests specifically designed for density estimation. Further, we introduce a novel variant of Random Forest, based on an effective non parametric by-pass estimator of the Renyi entropy, which can be useful when the parametric assumption is too strict. An empirical evaluation involving different datasets and different RF-clustering strategies confirms that the learning step is crucial for RF-clustering. We also present a set of practical guidelines useful to determine the most suitable variant of RF-clustering according to the problem under examination.

Naturally Constrained Online Expectation Maximization

Daniela Pamplona, Antoine Manzanera

Auto-TLDR; Constrained Online Expectation-Maximization for Probabilistic Principal Components Analysis

With the rise of big data sets, learning algorithms must be adapted to piece-wise mechanisms in order to tackle time and memory costs of large scale calculations. Furthermore, for most learning embedded systems the input data are fed in a sequential and contingent manner: one by one, and possibly class by class. Thus, learning algorithms should not only run online but cope with time-varying, non-independent, and non-balanced training data for the system's entire life. Online Expectation-Maximization is a well-known algorithm for learning probabilistic models in real-time, due to its simplicity and convergence properties. However, these properties are only valid in the case of large, independent and identically distributed (iid) samples. In this paper, we propose to constrain the online Expectation-Maximization on the Fisher distance between the parameters. After the presentation of the algorithm, we make a thorough study of its use in Probabilistic Principal Components Analysis. First, we derive the update rules, then we analyse the effect of the constraint on major problems of online and sequential learning: convergence, forgetting and interference. Furthermore, we use several algorithmic protocols: iid vs. sequential data, and constraint parameters updated step-wise vs. class-wise. Our results show that this constraint increases the convergence rate of online Expectation-Maximization, decreases forgetting and slightly introduces transfer learning.

A Novel Adaptive Minority Oversampling Technique for Improved Classification in Data Imbalanced Scenarios

Ayush Tripathi, Rupayan Chakraborty, Sunil Kumar Kopparapu

Auto-TLDR; Synthetic Minority OverSampling Technique for Imbalanced Data

Imbalance in the proportion of training samples belonging to different classes often degrades the performance of conventional classifiers. This is primarily due to the tendency of the classifier to be biased towards the majority classes in the imbalanced dataset. In this paper, we propose a novel three-step technique to address imbalanced data. As a first step, we significantly oversample the minority class distribution by employing the traditional Synthetic Minority OverSampling Technique (SMOTE) algorithm using the neighborhood of the minority class samples, and in the next step we partition the generated samples using a Gaussian-Mixture Model based clustering algorithm. In the final step, synthetic data samples are chosen based on the weight associated with the cluster, the weight itself being determined by the distribution of the majority class samples. Extensive experiments on several standard datasets from diverse domains show the usefulness of the proposed technique in comparison with the original SMOTE and its state-of-the-art variants.
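The first step described above relies on standard SMOTE, which creates each synthetic sample by interpolating between a minority point and one of its nearest minority neighbours. A minimal sketch of that interpolation step alone follows. The function name, parameters, and toy points are assumptions, and the Gaussian-mixture clustering and weighted selection steps of the proposed technique are not shown.

```python
import random

def smote(minority, num_synthetic, k=2, seed=0):
    """Generate synthetic minority samples by linear interpolation between
    a randomly chosen minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(num_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of base among the other minority points
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical 2D minority class occupying the corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote(minority, num_synthetic=5)
```

Because every synthetic point lies on a segment joining two minority points, the generated samples stay inside the region already spanned by the minority class.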

Algorithm Recommendation for Data Streams

Jáder Martins Camboim De Sá, Andre Luis Debiaso Rossi, Gustavo Enrique De Almeida Prado Alves Batista, Luís Paulo Faina Garcia

Auto-TLDR; Meta-Learning for Algorithm Selection in Time-Changing Data Streams

In recent decades, many companies have taken advantage of massive data generation at high frequencies through knowledge discovery to identify valuable information. Machine learning techniques can be employed for knowledge discovery, since they are able to extract patterns from data and induce models to predict future events. However, dynamic and evolving environments generate streams of data that are usually non-stationary. Models induced in these scenarios may perish over time due to seasonality or concept drift. Periodic retraining could help, but the fixed algorithm's hypothesis space may no longer be appropriate. An alternative solution is to use meta-learning for periodic algorithm selection in time-changing environments, choosing the bias that best suits the current data. In this paper, we present an enhanced framework for data streams algorithm selection based on MetaStream. Our approach uses meta-learning and incremental learning to actively select the best algorithm for the current concept in a time-changing environment. Unlike previous works, a set of cutting-edge meta-features and an incremental learning approach in the meta-level based on LightGBM are used. The results show that this new strategy can recommend the best algorithm more accurately in time-changing data.

Graph-Based Image Decoding for Multiplexed in Situ RNA Detection

Gabriele Partel, Carolina Wahlby

Auto-TLDR; A Graph-based Decoding Approach for Multiplexed In situ RNA Detection

Image-based multiplexed in situ RNA detection makes it possible to map the spatial gene expression of hundreds to thousands of genes in parallel, and thus discern at the same time a large number of different cell types to better understand tissue development, heterogeneity, and disease. Fluorescent signals are detected over multiple fluorescent channels and imaging rounds and decoded in order to identify RNA molecules in their morphological context. Here we present a graph-based decoding approach that models the decoding process as a network flow problem jointly optimizing observation likelihoods and distances of signal detections, thus achieving robustness with respect to noise and spatial jitter of the fluorescent signals. We evaluated our method on synthetic data generated at different experimental conditions, and on real data of in situ RNA sequencing, comparing results with respect to alternative and gold standard image decoding pipelines.

Quantifying the Use of Domain Randomization

Mohammad Ani, Hector Basevi, Ales Leonardis

Auto-TLDR; Evaluating Domain Randomization for Synthetic Image Generation by directly measuring the difference between realistic and synthetic data distributions

Synthetic image generation provides the ability to efficiently produce large quantities of labeled data, which addresses both the data volume requirements of state-of-the-art vision systems and the expense of manually labeling data. However, systems trained on synthetic data typically under-perform systems trained on realistic data due to mismatch between the synthetic and realistic data distributions. Domain Randomization (DR) is a method of broadening a synthetic data distribution to encompass a realistic data distribution, and so provide better performance, when the exact characteristics of the realistic data distribution are not known or cannot be simulated. However, there is no consensus in the literature on the best method of performing DR. We propose a novel method of ranking DR methods by directly measuring the difference between realistic and DR data distributions. This avoids the need to measure task-specific performance and the associated expense of training and evaluation. We compare different methods for measuring distribution differences including the Wasserstein and Fréchet Inception distances. We also examine the effect of performing this evaluation directly on images, and on features generated by an image classification backbone. Finally, we show that the ranking generated by our method is reflected in actual task performance.
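As a small illustration of measuring a distribution difference directly, the 1-Wasserstein distance between two equal-size one-dimensional samples reduces to the mean absolute difference of their sorted values. The sketch below is a deliberately simplified 1D case with made-up numbers. The abstract concerns images and high-dimensional features, which need more involved estimators.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1D empirical samples
    with uniform weights: match sorted values and average the gaps."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Hypothetical scalar features from a "realistic" and a "randomized" dataset.
realistic = [0.0, 1.0, 2.0]
randomized = [3.0, 1.0, 2.0]
distance = wasserstein_1d(realistic, randomized)
```

A smaller distance indicates a closer match between the two samples, and a quantity of this kind could be used to rank candidate synthetic distributions without training a task model.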

Minority Class Oriented Active Learning for Imbalanced Datasets

Umang Aggarwal, Adrian Popescu, Celine Hudelot

Auto-TLDR; Active Learning for Imbalanced Datasets

Active learning aims to optimize the dataset annotation process when resources are constrained. Most existing methods are designed for balanced datasets. Their practical applicability is limited by the fact that a majority of real-life datasets are actually imbalanced. Here, we introduce a new active learning method which is designed for imbalanced datasets. It favors samples likely to be in minority classes so as to reduce the imbalance of the labeled subset and create a better representation for these classes. We also compare two training schemes for active learning: (1) the one commonly deployed in deep active learning using model fine tuning for each iteration and (2) a scheme which is inspired by transfer learning and exploits generic pre-trained models and trains shallow classifiers for each iteration. Evaluation is run with three imbalanced datasets. Results show that the proposed active learning method outperforms competitive baselines. Equally interesting, they also indicate that the transfer learning training scheme outperforms model fine tuning if features are transferable from the generic dataset to the unlabeled one. This last result is surprising and should encourage the community to explore the design of deep active learning methods.

Can Reinforcement Learning Lead to Healthy Life?: Simulation Study Based on User Activity Logs

Masami Takahashi, Masahiro Kohjima, Takeshi Kurashima, Hiroyuki Toda

Auto-TLDR; Reinforcement Learning for Healthy Daily Life

The importance of developing an application based on intervention technology that leads to a healthier life is widely recognized. A challenging part of realizing the application is the need for planning, i.e., considering a user's health goal (e.g., sleep at 10:00 p.m. to get enough sleep), providing intervention at the appropriate timing to help the user achieve the goal. The reinforcement learning (RL) approach is well suited to this type of problem since it is a methodology for planning; RL finds the optimal strategy as that which maximizes future expected profit. The purpose of this study is to clarify the effects of intervention based on RL to support healthy daily life. Therefore, we (i) collect real daily activity data from participants, (ii) generate a user model that imitates the user's response to system interventions, (iii) examine valuable goals and design them as rewards in RL and (iv) obtain optimal intervention strategies by RL via simulations given a user model and goals. We evaluate a generated user model and verify by simulations whether our method could successfully achieve the goal. In addition, we analyze the cases that demonstrated higher probability of achieving the goal and report the features.

Low-Cost Lipschitz-Independent Adaptive Importance Sampling of Stochastic Gradients

Huikang Liu, Xiaolu Wang, Jiajin Li, Man-Cho Anthony So

Auto-TLDR; Adaptive Importance Sampling for Stochastic Gradient Descent

Stochastic gradient descent (SGD) usually samples training data based on the uniform distribution, which may not be a good choice because of the high variance of its stochastic gradient. Thus, importance sampling methods are considered in the literature to improve the performance. Most previous work on SGD-based methods with importance sampling requires the knowledge of Lipschitz constants of all component gradients, which are in general difficult to estimate. In this paper, we study an adaptive importance sampling method for common SGD-based methods by exploiting the local first-order information without knowing any Lipschitz constants. In particular, we periodically change the sampling distribution by only utilizing the gradient norms in the past few iterations. We prove that our adaptive importance sampling non-asymptotically reduces the variance of the stochastic gradients in SGD, and thus better convergence bounds than that for vanilla SGD can be obtained. We extend this sampling method to several other widely used stochastic gradient algorithms including SGD with momentum and ADAM. Experiments on common convex learning problems and deep neural networks illustrate notably enhanced performance using the adaptive sampling strategy.
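The general idea behind importance-sampled SGD can be sketched with scalar gradients: sample example i with probability proportional to its gradient norm, then reweight by 1/(n p_i) so the estimator stays unbiased. This is a generic sketch, not the authors' update schedule. The gradient values and function names are hypothetical, and here the norms are assumed to be known exactly rather than taken from past iterations.

```python
def sampling_distribution(grad_norms, floor=1e-12):
    """Probabilities proportional to gradient norms (floored to stay positive)."""
    clipped = [max(g, floor) for g in grad_norms]
    total = sum(clipped)
    return [g / total for g in clipped]

def reweighted_gradient(grad, prob, n):
    """Importance-weighted single-sample gradient estimate."""
    return grad / (n * prob)

def second_moment(grads, probs):
    """E[estimate^2] under the given sampling distribution."""
    n = len(grads)
    return sum(p * reweighted_gradient(g, p, n) ** 2 for g, p in zip(grads, probs))

# Hypothetical per-example scalar gradients at the current iterate.
grads = [0.1, -0.2, 4.0, 0.5]
n = len(grads)
probs = sampling_distribution([abs(g) for g in grads])
uniform = [1.0 / n] * n

expected = sum(p * reweighted_gradient(g, p, n) for g, p in zip(grads, probs))
full_gradient = sum(grads) / n
```

Both samplers have the same mean, so comparing `second_moment(grads, probs)` with `second_moment(grads, uniform)` compares their variances. In this toy case the norm-proportional sampler gives the smaller value.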

Unveiling Groups of Related Tasks in Multi-Task Learning

Jordan Frecon, Saverio Salzo, Massimiliano Pontil

Auto-TLDR; Continuous Bilevel Optimization for Multi-Task Learning

A common approach in multi-task learning is to encourage the tasks to share a low dimensional representation. This has led to the popular method of trace norm regularization, which has proved effective in many applications. In this paper, we extend this approach by allowing the tasks to partition into different groups, within which trace norm regularization is separately applied. We propose a continuous bilevel optimization framework to simultaneously identify groups of related tasks and learn a low dimensional representation within each group. Hinging on recent results on the derivative of generalized matrix functions, we devise a smooth approximation of the upper-level objective via a dual forward-backward algorithm with Bregman distances. This allows us to solve the bilevel problem by a gradient-based scheme. Numerical experiments on synthetic and benchmark datasets support the effectiveness of the proposed method.

Explainable Online Validation of Machine Learning Models for Practical Applications

Wolfgang Fuhl, Yao Rong, Thomas Motz, Michael Scheidt, Andreas Markus Hartel, Andreas Koch, Enkelejda Kasneci

Auto-TLDR; A Reformulation of Regression and Classification for Machine Learning Algorithm Validation

We present a reformulation of regression and classification, which aims to validate the result of a machine learning algorithm. Our reformulation simplifies the original problem and validates the result of the machine learning algorithm using the training data. Since the validation of machine learning algorithms must always be explainable, we perform our experiments with the kNN algorithm as well as with an algorithm based on conditional probabilities, which is proposed in this work. For the evaluation of our approach, three publicly available data sets were used and three classification and two regression problems were evaluated. The presented algorithm based on conditional probabilities is also online capable and requires only a fraction of memory compared to the kNN algorithm.

Uniform and Non-Uniform Sampling Methods for Sub-Linear Time K-Means Clustering

Yuanhang Ren, Ye Du

Auto-TLDR; Sub-linear Time Clustering with Constant Approximation Ratio for K-Means Problem

The $k$-means problem is arguably the most well-known clustering problem in machine learning, and many approximation algorithms have been proposed for it. However, many of these algorithms may become infeasible when data is huge. Sub-linear time algorithms with constant approximation ratios are desirable in this scenario. In this paper, we first improve the analysis of the algorithm proposed by \cite{Mohan:2017:BNA:3172077.3172235} by sharpening the approximation ratio from $4(\alpha+\beta)$ to $\alpha+\beta$. Moreover, under mild assumptions on the data, a constant approximation ratio can be achieved in poly-logarithmic time by the algorithm. Furthermore, we propose a novel sub-linear time clustering algorithm called Double-K-MC$^2$ sampling as well. Experiments on the data clustering task and the image segmentation task have validated the effectiveness of our algorithms.

Sparse-Dense Subspace Clustering

Shuai Yang, Wenqi Zhu, Yuesheng Zhu

Auto-TLDR; Sparse-Dense Subspace Clustering with Piecewise Correlation Estimation

Subspace clustering refers to the problem of clustering high-dimensional data into a union of low-dimensional subspaces. Current subspace clustering approaches are usually based on a two-stage framework. In the first stage, an affinity matrix is generated from data. In the second one, spectral clustering is applied on the affinity matrix. However, the affinity matrix produced by two-stage methods cannot fully reveal the similarity between data points from the same subspace, resulting in inaccurate clustering. Besides, most approaches fail to solve large-scale clustering problems due to poor efficiency. In this paper, we first propose a new scalable sparse method called Iterative Maximum Correlation (IMC) to learn the affinity matrix from data. Then we develop Piecewise Correlation Estimation (PCE) to densify the intra-subspace similarity produced by IMC. Finally we extend our work into a Sparse-Dense Subspace Clustering (SDSC) framework with a dense stage to optimize the affinity matrix for two-stage methods. We show that IMC is efficient for large-scale tasks, and PCE ensures better performance for IMC. We show the universality of our SDSC framework for current two-stage methods as well. Experiments on benchmark data sets demonstrate the effectiveness of our approaches.

Categorizing the Feature Space for Two-Class Imbalance Learning

Rosa Sicilia, Ermanno Cordelli, Paolo Soda

Auto-TLDR; Efficient Ensemble of Classifiers for Minority Class Inference

Class imbalance limits the performance of most learning algorithms, resulting in a low recognition rate for samples belonging to the minority class. Although there are different strategies to address this problem, methods that generate ensemble of classifiers have proven to be effective in several applications. This paper presents a new strategy to construct the training set of each classifier in the ensemble by exploiting information in the feature space that can give rise to unreliable classifications, which are determined by a novel algorithm here introduced. The performance of our proposal is compared against multiple standard ensemble approaches on 25 publicly available datasets, showing promising results.

Adaptive Matching of Kernel Means

Miao Cheng, Xinge You

Auto-TLDR; Adaptive Matching of Kernel Means for Knowledge Discovery and Feature Learning

Data analysis and feature learning can be improved if a suitable pattern matching mechanism is available. One feasible solution is to estimate the importance of instances, and consequently kernel mean matching (KMM) has become an important method for knowledge discovery and novelty detection in general. Furthermore, the existing KMM methods have focused on concrete learning frameworks. In this work, a novel approach to adaptive matching of kernel means is proposed, and selected data with high importance are adopted to achieve calculation efficiency with optimization. In addition, scalable learning can be conducted in the proposed method as a generalized solution with appended data. The experimental results on a wide variety of real-world data sets demonstrate that the proposed method gives outstanding performance compared with several state-of-the-art methods, while calculation efficiency is preserved.

Classification and Feature Selection Using a Primal-Dual Method and Projections on Structured Constraints

Michel Barlaud, Antonin Chambolle, Jean-Baptiste Caillau

Auto-TLDR; A Constrained Primal-dual Method for Structured Feature Selection on High Dimensional Data

This paper deals with feature selection using supervised classification on high dimensional datasets. A classical approach is to project data on a low dimensional space and classify by minimizing an appropriate quadratic cost. Our first contribution is to introduce a matrix of centers in the definition of this cost. Moreover, as quadratic costs are not robust to outliers, we propose to use an $\ell_1$ cost instead (or Huber loss to mitigate overfitting issues). While control on sparsity is commonly obtained by adding an $\ell_1$ constraint on the vectorized matrix of weights used for projecting the data, our second contribution is to enforce structured sparsity. To this end we propose constraints that take into account the matrix structure of the data, based either on the nuclear norm, on the $\ell_{2,1}$ norm, or on the $\ell_{1,2}$ norm for which we provide a new projection algorithm. We optimize simultaneously the projection matrix and the matrix of centers thanks to a new tailored constrained primal-dual method. The primal-dual framework is general enough to encompass the various robust losses and structured constraints we use, and allows a convergence analysis. We demonstrate the effectiveness of the approach on three biological datasets. Our primal-dual method with robust losses, adaptive centers and structured constraints does significantly better than classical methods, both in terms of accuracy and computational time.
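To make the notion of an $\ell_1$ constraint concrete, the sketch below implements the well-known sort-based Euclidean projection of a vector onto the $\ell_1$ ball, which yields sparse solutions by soft-thresholding. This is the standard textbook routine and not the new $\ell_{1,2}$ projection algorithm announced in the abstract. The function name and example vector are assumptions.

```python
import math

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto {w : sum(|w_i|) <= radius}
    using the standard sort-and-threshold procedure."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    if sum(abs(x) for x in v) <= radius:
        return list(v)  # already feasible, nothing to do
    magnitudes = sorted((abs(x) for x in v), reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u in enumerate(magnitudes, start=1):
        cumulative += u
        candidate = (cumulative - radius) / j
        if u - candidate > 0:
            theta = candidate  # keep the threshold from the last valid index
    # Soft-threshold every coordinate by theta, preserving its sign.
    return [math.copysign(max(abs(x) - theta, 0.0), x) for x in v]

# Hypothetical weight vector projected onto the l1 ball of radius 2.
weights = [3.0, -1.0, 0.5]
projected = project_l1_ball(weights, radius=2.0)
```

The projection zeroes out small coordinates entirely, which is how an $\ell_1$ constraint on the weights produces feature selection.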

Scalable Direction-Search-Based Approach to Subspace Clustering

Yicong He, George Atia

Auto-TLDR; Fast Direction-Search-Based Subspace Clustering

Subspace clustering finds a multi-subspace representation that best fits a high-dimensional dataset. The computational and storage complexities of existing algorithms limit their usefulness for large-scale data. In this paper, we develop a novel scalable approach to subspace clustering termed Fast Direction-Search-Based Subspace Clustering (Fast DiSC). In sharp contrast to existing scalable solutions, which are mostly based on the self-expressiveness property of the data, Fast DiSC rests upon a new representation obtained from projections on computed data-dependent directions. These directions are derived from a convex formulation for optimal direction search to gauge hidden similarity relations. The computational complexity is significantly reduced by performing direction search in partitions of sampled data, followed by a retrieval step to cluster out-of-sample data using projections on the computed directions. A theoretical analysis underscores the ability of the proposed formulation to construct local similarity relations for the different data points. Experiments on both synthetic and real data demonstrate that the proposed algorithm can often outperform state-of-the-art clustering methods.

Motion Segmentation with Pairwise Matches and Unknown Number of Motions

Federica Arrigoni, Tomas Pajdla, Luca Magri

Auto-TLDR; Motion Segmentation Using Multi-Model Fitting and Permutation Synchronization

In this paper we address motion segmentation, that is, the problem of clustering points in multiple images according to the moving objects they belong to. Two-frame correspondences are assumed as input, without prior knowledge about trajectories. Our method is based on principles from "multi-model fitting" and "permutation synchronization" and, unlike previous techniques working under the same assumptions, it can handle an unknown number of motions. The proposed approach is validated on standard datasets, showing that it can correctly estimate the number of motions while maintaining comparable or better accuracy than the state of the art.

Leveraging Quadratic Spherical Mutual Information Hashing for Fast Image Retrieval

Nikolaos Passalis, Anastasios Tefas

Auto-TLDR; Quadratic Mutual Information for Large-Scale Hashing and Information Retrieval

Several deep supervised hashing techniques have been proposed to allow for querying large image databases. However, it is often overlooked that the process of information retrieval can be modeled using information-theoretic metrics, so existing methods end up optimizing various proxies for the problem at hand instead. In contrast, we propose a deep supervised hashing algorithm that optimizes the learned codes using an information-theoretic measure, the Quadratic Mutual Information (QMI). The proposed method is adapted to the needs of large-scale hashing and information retrieval, leading to a novel information-theoretic measure, the Quadratic Spherical Mutual Information (QSMI), that is inspired by QMI but leads to significantly better retrieval precision. The effectiveness of the proposed method is demonstrated under several different scenarios, using different datasets and network architectures, outperforming existing deep supervised image hashing techniques.
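
To give a feel for the measure named in this abstract, the sketch below computes a discrete, Euclidean-distance form of quadratic mutual information: the sum of squared differences between a joint distribution and the product of its marginals. This is a simplified illustration under assumptions. The abstract does not describe the estimator used for learned hash codes, and the QSMI variant is not reproduced here; the function name `quadratic_mutual_information` and the table-based input are hypothetical.

```python
def quadratic_mutual_information(joint):
    """Discrete QMI: sum over (x, y) of (p(x, y) - p(x) * p(y)) ** 2.

    `joint` is a rectangular table of non-negative counts or probabilities
    (rows index x, columns index y). Hypothetical helper for illustration.
    """
    total = sum(sum(row) for row in joint)
    if total <= 0:
        raise ValueError("joint table must have positive total mass")
    # Normalize so the table sums to one.
    p = [[value / total for value in row] for row in joint]
    n_rows, n_cols = len(p), len(p[0])
    # Marginal distributions of x (rows) and y (columns).
    p_x = [sum(row) for row in p]
    p_y = [sum(p[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # Squared distance between the joint and the product of marginals.
    return sum(
        (p[i][j] - p_x[i] * p_y[j]) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )


if __name__ == "__main__":
    independent = [[0.25, 0.25], [0.25, 0.25]]
    dependent = [[0.5, 0.0], [0.0, 0.5]]
    print(quadratic_mutual_information(independent))
    print(quadratic_mutual_information(dependent))
```

The value is zero when the two variables are independent and grows as the joint distribution departs from the product of its marginals. Unlike Shannon mutual information, it involves no logarithms, which keeps it simple to evaluate.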