Low Rank Representation on Product Grassmann Manifolds for Multi-viewSubspace Clustering

Jipeng Guo, Yanfeng Sun, Junbin Gao, Yongli Hu, Baocai Yin

Responsive image

Auto-TLDR; Low Rank Representation on Product Grassmann Manifold for Multi-View Data Clustering

Slides Poster

Clustering high dimension multi-view data with complex intrinsic properties and nonlinear manifold structure is a challenging task since these data are always embedded in low dimension manifolds. Inspired by Low Rank Representation (LRR), some researchers extended classic LRR on Grassmann manifold or Product Grassmann manifold to represent data with non-linear metrics. However, most of these methods utilized convex nuclear norm to leverage a low-rank structure, which was over-relaxation of true rank and would lead to the results deviated from the true underlying ones. And, the computational complexity of singular value decomposition of matrix is high for nuclear norm minimization. In this paper, we propose a new low rank model for high-dimension multi-view data clustering on Product Grassmann Manifold with the matrix tri-factorization which is used to control the upper bound of true rank of representation matrix. And, the original problem can be transformed into the nuclear norm minimization with smaller scale matrices. An effective solution and theoretical analysis are also provided. The experimental results show that the proposed method obviously outperforms other state-of-the-art methods on several multi-source human/crowd action video datasets.

Similar papers

Double Manifolds Regularized Non-Negative Matrix Factorization for Data Representation

Jipeng Guo, Shuai Yin, Yanfeng Sun, Yongli Hu

Responsive image

Auto-TLDR; Double Manifolds Regularized Non-negative Matrix Factorization for Clustering

Slides Poster Similar

Non-negative matrix factorization (NMF) is an important method in learning latent data representation. The local geometrical structure can make the learned representation more effectively and significantly improve the performance of NMF. However, most of existing graph-based learning methods are determined by a predefined similarity graph which may be not optimal for specific tasks. To solve the above the problem, we propose the Double Manifolds Regularized NMF (DMR-NMF) model which jointly learns an adaptive affinity matrix with the non-negative matrix factorization. The learned affinity matrix can guide the NMF to fit the clustering task. Moreover, we develop the iterative updating optimization schemes for DMR-NMF, and provide the strict convergence proof of our optimization strategy. Empirical experiments on four different real-world data sets demonstrate the state-of-the-art performance of DMR-NMF in comparison with the other related algorithms.

A Spectral Clustering on Grassmann Manifold Via Double Low Rank Constraint

Xinglin Piao, Yongli Hu, Junbin Gao, Yanfeng Sun, Xin Yang, Baocai Yin

Responsive image

Auto-TLDR; Double Low Rank Representation for High-Dimensional Data Clustering on Grassmann Manifold

Slides Similar

High-dimension data clustering is a fundamental topic in machine learning and data mining areas. In recent year, researchers have proposed a series of effective methods based on Low Rank Representation (LRR) which could explore low-dimension subspace structure embedded in original data effectively. The traditional LRR methods usually treat original data as samples in Euclidean space. They generally adopt linear metric to measure the distance between two data. However, high-dimension data (such as video clip or imageset) are always considered as non-linear manifold data such as Grassmann manifold. Therefore, the traditional linear Euclidean metric would be no longer suitable for these special data. In addition, traditional LRR clustering method always adopt nuclear norm as low rank constraint which would lead to suboptimal solution and decrease the clustering accuracy. In this paper, we proposed a new low rank method on Grassmann manifold for high-dimension data clustering task. In the proposed method, a double low rank representation approach is proposed by combining the nuclear norm and bilinear representation for better construct the representation matrix. The experimental results on several public datasets show that the proposed method outperforms the state-of-the-art clustering methods.

Embedding Shared Low-Rank and Feature Correlation for Multi-View Data Analysis

Zhan Wang, Lizhi Wang, Hua Huang

Responsive image

Auto-TLDR; embedding shared low-rank and feature correlation for multi-view data analysis

Slides Poster Similar

The diversity of multimedia data in the real-world usually forms multi-view features. How to explore the structure information and correlations among multi-view features is still an open problem. In this paper, we propose a novel multi-view subspace learning method, named embedding shared low-rank and feature correlation (ESLRFC), for multi-view data analysis. First, in the embedding subspace, we propose a robust low-rank model on each feature set and enforce a shared low-rank constraint to characterize the common structure information of multiple feature data. Second, we develop an enhanced correlation analysis in the embedding subspace for simultaneously removing the redundancy of each feature set and exploring the correlations of multiple feature data. Finally, we incorporate the low-rank model and the correlation analysis into a unified framework. The shared low-rank constraint not only depicts the data distribution consistency among multiple feature data, but also assists robust subspace learning. Experimental results on recognition tasks demonstrate the superior performance and noise robustness of the proposed method.

Fast Subspace Clustering Based on the Kronecker Product

Lei Zhou, Xiao Bai, Liang Zhang, Jun Zhou, Edwin Hancock

Responsive image

Auto-TLDR; Subspace Clustering with Kronecker Product for Large Scale Datasets

Slides Poster Similar

Subspace clustering is a useful technique for many computer vision applications in which the intrinsic dimension of high-dimensional data is often smaller than the ambient dimension. Spectral clustering, as one of the main approaches to subspace clustering, often takes on a sparse representation or a low-rank representation to learn a block diagonal self-representation matrix for subspace generation. However, existing methods require solving a large scale convex optimization problem with a large set of data, with computational complexity reaches O(N^3) for N data points. Therefore, the efficiency and scalability of traditional spectral clustering methods can not be guaranteed for large scale datasets. In this paper, we propose a subspace clustering model based on the Kronecker product. Due to the property that the Kronecker product of a block diagonal matrix with any other matrix is still a block diagonal matrix, we can efficiently learn the representation matrix which is formed by the Kronecker product of k smaller matrices. By doing so, our model significantly reduces the computational complexity to O(kN^{3/k}). Furthermore, our model is general in nature, and can be adapted to different regularization based subspace clustering methods. Experimental results on two public datasets show that our model significantly improves the efficiency compared with several state-of-the-art methods. Moreover, we have conducted experiments on synthetic data to verify the scalability of our model for large scale datasets.

Subspace Clustering Via Joint Unsupervised Feature Selection

Wenhua Dong, Xiaojun Wu, Hui Li, Zhenhua Feng, Josef Kittler

Responsive image

Auto-TLDR; Unsupervised Feature Selection for Subspace Clustering

Poster Similar

Any high-dimensional data arising from practical applications usually contains irrelevant features, which may impact on the performance of existing subspace clustering methods. This paper proposes a novel subspace clustering method, which reconstructs the feature matrix by the means of unsupervised feature selection (UFS) to achieve a better dictionary for subspace clustering (SC). Different from most existing clustering methods, the proposed approach uses a reconstructed feature matrix as the dictionary rather than the original data matrix. As the feature matrix reconstructed by representative features is more discriminative and closer to the ground-truth, it results in improved performance. The corresponding non-convex optimization problem is effectively solved using the half-quadratic and augmented Lagrange multiplier methods. Extensive experiments on four real datasets demonstrate the effectiveness of the proposed method.

Sparse-Dense Subspace Clustering

Shuai Yang, Wenqi Zhu, Yuesheng Zhu

Responsive image

Auto-TLDR; Sparse-Dense Subspace Clustering with Piecewise Correlation Estimation

Slides Poster Similar

Subspace clustering refers to the problem of clustering high-dimensional data into a union of low-dimensional subspaces. Current subspace clustering approaches are usually based on a two-stage framework. In the first stage, an affinity matrix is generated from data. In the second one, spectral clustering is applied on the affinity matrix. However, the affinity matrix produced by two-stage methods cannot fully reveal the similarity between data points from the same subspace, resulting in inaccurate clustering. Besides, most approaches fail to solve large-scale clustering problems due to poor efficiency. In this paper, we first propose a new scalable sparse method called Iterative Maximum Correlation (IMC) to learn the affinity matrix from data. Then we develop Piecewise Correlation Estimation (PCE) to densify the intra-subspace similarity produced by IMC. Finally we extend our work into a Sparse-Dense Subspace Clustering (SDSC) framework with a dense stage to optimize the affinity matrix for two-stage methods. We show that IMC is efficient for large-scale tasks, and PCE ensures better performance for IMC. We show the universality of our SDSC framework for current two-stage methods as well. Experiments on benchmark data sets demonstrate the effectiveness of our approaches.

T-SVD Based Non-Convex Tensor Completion and Robust Principal Component Analysis

Tao Li, Jinwen Ma

Responsive image

Auto-TLDR; Non-Convex tensor rank surrogate function and non-convex sparsity measure for tensor recovery

Slides Poster Similar

In this paper, we propose a novel non-convex tensor rank surrogate function and a novel non-convex sparsity measure. The basic idea is to sidestep the bias of $\ell_1-$norm by introducing the concavity. Furthermore, we employ this non-convex penalty in tensor recovery problems such as tensor completion and tensor robust principal component analysis. Due to the concavity, the parameters of these models are difficult to solve. To tackle this problem, we devise a majorization minimization algorithm that can optimize the upper bound of the original function in each iteration, and every sub-problem is solved by the alternating direction multiplier method. We also analyze the theoretical properties of the proposed algorithm. Finally, the experimental results on natural and hyperspectral images demonstrate the efficacy and efficiency of the proposed method.

Subspace Clustering for Action Recognition with Covariance Representations and Temporal Pruning

Giancarlo Paoletti, Jacopo Cavazza, Cigdem Beyan, Alessio Del Bue

Responsive image

Auto-TLDR; Unsupervised Learning for Human Action Recognition from Skeletal Data

Slides Similar

This paper tackles the problem of human action recognition, defined as classifying which action is displayed in a trimmed sequence, from skeletal data. Albeit state-of-the-art approaches designed for this application are all supervised, in this paper we pursue a more challenging direction: Solving the problem with unsupervised learning. To this end, we propose a novel subspace clustering method, which exploits covariance matrix to enhance the action’s discriminability and a timestamp pruning approach that allow us to better handle the temporal dimension of the data. Through a broad experimental validation, we show that our computational pipeline surpasses existing unsupervised approaches but also can result in favorable performances as compared to supervised methods.

Feature Extraction by Joint Robust Discriminant Analysis and Inter-Class Sparsity

Fadi Dornaika, Ahmad Khoder

Responsive image

Auto-TLDR; Robust Discriminant Analysis with Feature Selection and Inter-class Sparsity (RDA_FSIS)

Slides Similar

Feature extraction methods have been successfully applied to many real-world applications. The classical Linear Discriminant Analysis (LDA) and its variants are widely used as feature extraction methods. Although they have been used for different classification tasks, these methods have some shortcomings. The main one is that the projection axes obtained are not informative about the relevance of original features. In this paper, we propose a linear embedding method that merges two interesting properties: Robust LDA and inter-class sparsity. Furthermore, the targeted projection transformation focuses on the most discriminant original features. The proposed method is called Robust Discriminant Analysis with Feature Selection and Inter-class Sparsity (RDA_FSIS). Two kinds of sparsity are explicitly included in the proposed model. The first kind is obtained by imposing the $\ell_{2,1}$ constraint on the projection matrix in order to perform feature ranking. The second kind is obtained by imposing the inter-class sparsity constraint used for getting a common sparsity structure in each class. Comprehensive experiments on five real-world image datasets demonstrate the effectiveness and advantages of our framework over existing linear methods.

Scalable Direction-Search-Based Approach to Subspace Clustering

Yicong He, George Atia

Responsive image

Auto-TLDR; Fast Direction-Search-Based Subspace Clustering

Slides Similar

Subspace clustering finds a multi-subspace representation that best fits a high-dimensional dataset. The computational and storage complexities of existing algorithms limit their usefulness for large scale data. In this paper, we develop a novel scalable approach to subspace clustering termed Fast Direction-Search-Based Subspace Clustering (Fast DiSC). In sharp contrast to existing scalable solutions which are mostly based on the self-expressiveness property of the data, Fast DiSC rests upon a new representation obtained from projections on computed data-dependent directions. These directions are derived from a convex formulation for optimal direction search to gauge hidden similarity relations. The computational complexity is significantly reduced by performing direction search in partitions of sampled data, followed by a retrieval step to cluster out-of-sample data using projections on the computed directions. A theoretical analysis underscores the ability of the proposed formulation to construct local similarity relations for the different data points. Experiments on both synthetic and real data demonstrate that the proposed algorithm can often outperform the state-of-the-art clustering methods.

GraphBGS: Background Subtraction Via Recovery of Graph Signals

Jhony Heriberto Giraldo Zuluaga, Thierry Bouwmans

Responsive image

Auto-TLDR; Graph BackGround Subtraction using Graph Signals

Slides Poster Similar

Background subtraction is a fundamental pre-processing task in computer vision. This task becomes challenging in real scenarios due to variations in the background for both static and moving camera sequences. Several deep learning methods for background subtraction have been proposed in the literature with competitive performances. However, these models show performance degradation when tested on unseen videos; and they require huge amount of data to avoid overfitting. Recently, graph-based algorithms have been successful approaching unsupervised and semi-supervised learning problems. Furthermore, the theory of graph signal processing and semi-supervised learning have been combined leading to new insights in the field of machine learning. In this paper, concepts of recovery of graph signals are introduced in the problem of background subtraction. We propose a new algorithm called Graph BackGround Subtraction (GraphBGS), which is composed of: instance segmentation, background initialization, graph construction, graph sampling, and a semi-supervised algorithm inspired from the theory of recovery of graph signals. Our algorithm has the advantage of requiring less data than deep learning methods while having competitive results on both: static and moving camera videos. GraphBGS outperforms unsupervised and supervised methods in several challenging conditions on the publicly available Change Detection (CDNet2014), and UCSD background subtraction databases.

Sketch-Based Community Detection Via Representative Node Sampling

Mahlagha Sedghi, Andre Beckus, George Atia

Responsive image

Auto-TLDR; Sketch-based Clustering of Community Detection Using a Small Sketch

Slides Poster Similar

This paper proposes a sketch-based approach to the community detection problem which clusters the full graph through the use of an informative and concise sketch. The reduced sketch is built through an effective sampling approach which selects few nodes that best represent the complete graph and operates on a pairwise node similarity measure based on the average commute time. After sampling, the proposed algorithm clusters the nodes in the sketch, and then infers the cluster membership of the remaining nodes in the full graph based on their aggregate similarity to nodes in the partitioned sketch. By sampling nodes with strong representation power, our approach can improve the success rates over full graph clustering. In challenging cases with large node degree variation, our approach not only maintains competitive accuracy with full graph clustering despite using a small sketch, but also outperforms existing sampling methods. The use of a small sketch allows considerable storage savings, and computational and timing improvements for further analysis such as clustering and visualization. We provide numerical results on synthetic data based on the homogeneous, heterogeneous and degree corrected versions of the stochastic block model, as well as experimental results on real-world data.

Soft Label and Discriminant Embedding Estimation for Semi-Supervised Classification

Fadi Dornaika, Abdullah Baradaaji, Youssof El Traboulsi

Responsive image

Auto-TLDR; Semi-supervised Semi-Supervised Learning for Linear Feature Extraction and Label Propagation

Slides Poster Similar

In recent times, graph-based semi-supervised learning proved to be a powerful paradigm for processing and mining large datasets. The main advantage relies on the fact that these methods can be useful in propagating a small set of known labels to a large set of unlabeled data. The scarcity of labeled data may affect the performance of the semi-learning. This paper introduces a new semi-supervised framework for simultaneous linear feature extraction and label propagation. The proposed method simultaneously estimates a discriminant transformation and the unknown label by exploiting both labeled and unlabeled data. In addition, the unknowns of the learning model are estimated by integrating two types of graph-based smoothness constraints. The resulting semi-supervised model is expected to learn more discriminative information. Experiments are conducted on six public image datasets. These experimental results show that the performance of the proposed method can be better than that of many state-of-the-art graph-based semi-supervised algorithms.

Fast Discrete Cross-Modal Hashing Based on Label Relaxation and Matrix Factorization

Donglin Zhang, Xiaojun Wu, Zhen Liu, Jun Yu, Josef Kittler

Responsive image

Auto-TLDR; LRMF: Label Relaxation and Discrete Matrix Factorization for Cross-Modal Retrieval

Poster Similar

In recent years, cross-media retrieval has drawn considerable attention due to the exponential growth of multimedia data. Many hashing approaches have been proposed for the cross-media search task. However, there are still open problems that warrant investigation. For example, most existing supervised hashing approaches employ a binary label matrix, which achieves small margins between wrong labels (0) and true labels (1). This may affect the retrieval performance by generating many false negatives and false positives. In addition, some methods adopt a relaxation scheme to solve the binary constraints, which may cause large quantization errors. There are also some discrete hashing methods that have been presented, but most of them are time-consuming. To conquer these problems, we present a label relaxation and discrete matrix factorization method (LRMF) for cross-modal retrieval. It offers a number of innovations. First of all, the proposed approach employs a novel label relaxation scheme to control the margins adaptively, which has the benefit of reducing the quantization error. Second, by virtue of the proposed discrete matrix factorization method designed to learn the binary codes, large quantization errors caused by relaxation can be avoided. The experimental results obtained on two widely-used databases demonstrate that LRMF outperforms state-of-the-art cross-media methods.

Label Self-Adaption Hashing for Image Retrieval

Jianglin Lu, Zhihui Lai, Hailing Wang, Jie Zhou

Responsive image

Auto-TLDR; Label Self-Adaption Hashing for Large-Scale Image Retrieval

Slides Poster Similar

Hashing has attracted widespread attention in image retrieval because of its fast retrieval speed and low storage cost. Compared with supervised methods, unsupervised hashing methods are more reasonable and suitable for large-scale image retrieval since it is always difficult and expensive to collect true labels of the massive data. Without label information, however, unsupervised hashing methods can not guarantee the quality of learned binary codes. To resolve this dilemma, this paper proposes a novel unsupervised hashing method called Label Self-Adaption Hashing (LSAH), which contains effective hashing function learning part and self-adaption label generation part. In the first part, we utilize anchor graph to keep the local structure of the data and introduce joint sparsity into the model to extract effective features for high-quality binary code learning. In the second part, a self-adaptive cluster label matrix is learned from the data under the assumption that the nearest neighbor points should have a large probability to be in the same cluster. Therefore, the proposed LSAH can make full use of the potential discriminative information of the data to guide the learning of binary code. It is worth noting that LSAH can learn effective binary codes, hashing function and cluster labels simultaneously in a unified optimization framework. To solve the resulting optimization problem, an Augmented Lagrange Multiplier based iterative algorithm is elaborately designed. Extensive experiments on three large-scale data sets indicate the promising performance of the proposed LSAH.

Constrained Spectral Clustering Network with Self-Training

Xinyue Liu, Shichong Yang, Linlin Zong

Responsive image

Auto-TLDR; Constrained Spectral Clustering Network: A Constrained Deep spectral clustering network

Slides Poster Similar

Deep spectral clustering networks have shown their superiorities due to the integration of feature learning and cluster assignment, and the ability to deal with non-convex clusters. Nevertheless, deep spectral clustering is still an ill-posed problem. Specifically, the affinity learned by the most remarkable SpectralNet is not guaranteed to be consistent with local invariance and thus hurts the final clustering performance. In this paper, we propose a novel framework of Constrained Spectral Clustering Network (CSCN) by incorporating pairwise constraints and clustering oriented fine-tuning to deal with the ill-posedness. To the best of our knowledge, this is the first constrained deep spectral clustering method. Another advantage of CSCN over existing constrained deep clustering networks is that it propagates pairwise constraints throughout the entire dataset. In addition, we design a clustering oriented loss by self-training to simultaneously finetune feature representations and perform cluster assignments, which further improve the quality of clustering. Extensive experiments on benchmark datasets demonstrate that our approach outperforms the state-of-the-art clustering methods.

Wasserstein k-Means with Sparse Simplex Projection

Takumi Fukunaga, Hiroyuki Kasai

Responsive image

Auto-TLDR; SSPW $k$-means: Sparse Simplex Projection-based Wasserstein $ k$-Means Algorithm

Slides Poster Similar

This paper presents a proposal of a faster Wasserstein $k$-means algorithm for histogram data by reducing Wasserstein distance computations exploiting sparse simplex projection. We shrink data samples, centroids and ground cost matrix, which enables significant reduction of the computations to solve optimal transport problems without loss of clustering quality. Furthermore, we dynamically reduce computational complexity by removing lower-valued data samples harnessing sparse simplex projection while keeping degradation of clustering quality lower. We designate this proposed algorithm as sparse simplex projection-based Wasserstein $k$-means, for short, SSPW $k$-means. Numerical evaluations against Wasserstein $k$-means algorithm demonstrate the effectiveness of the proposed SSPW $k$-means on real-world datasets.

Ultrasound Image Restoration Using Weighted Nuclear Norm Minimization

Hanmei Yang, Ye Luo, Jianwei Lu, Jian Lu

Responsive image

Auto-TLDR; A Nonconvex Low-Rank Matrix Approximation Model for Ultrasound Images Restoration

Poster Similar

Ultrasound images are often contaminated by speckle noise during the acquisition process, which influences the performance of subsequent application. The paper introduces a nonconvex low-rank matrix approximation model for ultrasound images restoration, which integrates the weighted unclear norm minimization (WNNM) and data fidelity term. WNNM can adaptively assign weights on differnt singular values to preserve more details in restored images. The fidelity term about ultrasound images do not be utilized in existing low-rank ultrasound denoising methods. This optimization question can effectively solved by alternating direction method of multipliers (ADMM). The experimental results on simulated images and real medical ultrasound images demonstrate the excellent performance of the proposed method compared with other four state-of-the-art methods.

Feature Extraction and Selection Via Robust Discriminant Analysis and Class Sparsity

Ahmad Khoder, Fadi Dornaika

Responsive image

Auto-TLDR; Hybrid Linear Discriminant Embedding for supervised multi-class classification

Slides Poster Similar

The main goal of discriminant embedding is to extract features that can be compact and informative representations of the original set of features. This paper introduces a hybrid scheme for linear feature extraction for supervised multi-class classification. We introduce a unifying criterion that is able to retain the advantages of robust sparse LDA and Inter-class sparsity. Thus, the estimated transformation includes two types of discrimination which are the inter-class sparsity and robust Linear Discriminant Analysis with feature selection. In order to optimize the proposed objective function, we deploy an iterative alternating minimization scheme for estimating the linear transformation and the orthogonal matrix. The introduced scheme is generic in the sense that it can be used for combining and tuning many other linear embedding methods. In the lights of the experiments conducted on six image datasets including faces, objects, and digits, the proposed scheme was able to outperform competing methods in most of the cases.

Classification and Feature Selection Using a Primal-Dual Method and Projections on Structured Constraints

Michel Barlaud, Antonin Chambolle, Jean_Baptiste Caillau

Responsive image

Auto-TLDR; A Constrained Primal-dual Method for Structured Feature Selection on High Dimensional Data

Slides Poster Similar

This paper deals with feature selection using supervised classification on high dimensional datasets. A classical approach is to project data on a low dimensional space and classify by minimizing an appropriate quadratic cost. Our first contribution is to introduce a matrix of centers in the definition of this cost. Moreover, as quadratic costs are not robust to outliers, we propose to use an $\ell_1$ cost instead (or Huber loss to mitigate overfitting issues). While control on sparsity is commonly obtained by adding an $\ell_1$ constraint on the vectorized matrix of weights used for projecting the data, our second contribution is to enforce structured sparsity. To this end we propose constraints that take into account the matrix structure of the data, based either on the nuclear norm, on the $\ell_{2,1}$ norm, or on the $\ell_{1,2}$ norm for which we provide a new projection algorithm. We optimize simultaneously the projection matrix and the matrix of centers thanks to a new tailored constrained primal-dual method. The primal-dual framework is general enough to encompass the various robust losses and structured constraints we use, and allows a convergence analysis. We demonstrate the effectiveness of the approach on three biological datasets. Our primal-dual method with robust losses, adaptive centers and structured constraints does significantly better than classical methods, both in terms of accuracy and computational time.

A Multi-Task Multi-View Based Multi-Objective Clustering Algorithm

Sayantan Mitra, Sriparna Saha

Responsive image

Auto-TLDR; MTMV-MO: Multi-task multi-view multi-objective optimization for multi-task clustering

Slides Poster Similar

In recent years, multi-view multi-task clustering has received much attention. There are several real-life problems that involve both multi-view clustering and multi-task clustering, i.e., the tasks are closely related, and each task can be analyzed using multiple views. Traditional multi-task multi-view clustering algorithms use single-objective optimization-based approaches and cannot apply too-many regularization terms. However, these problems are inherently some multi-objective optimization problems because conflict may be between different views within a given task and also between different tasks, necessitating a trade-off. Based on these observations, in this paper, we have proposed a novel multi-task multi-view multi-objective optimization (MTMV-MO) algorithm which simultaneously optimizes three objectives, i.e., within-view task relation, within-task view relation and the quality of the clusters obtained. The proposed methodology (MTMV-MO) is evaluated on four different datasets and the results are compared with five state-of-the-art algorithms in terms of Adjusted Rand Index (ARI) and Classification Accuracy (%CoA). MTMV-MO illustrates an improvement of 1.5-2% in terms of ARI and 4-5% in terms of %CoA compared to the state-of-the-art algorithms.

Graph Spectral Feature Learning for Mixed Data of Categorical and Numerical Type

Saswata Sahoo, Souradip Chakraborty

Responsive image

Auto-TLDR; Feature Learning in Mixed Type of Variable by an undirected graph

Slides Poster Similar

Feature learning in the presence of a mixed type of variables, numerical and categorical types, is important for related modeling problems. In this work, we propose a novel strategy to explicitly model the probabilistic dependence structure among the mixed type of variables by an undirected graph. The dependence structure among different pairs of variables are encoded by a suitable mapping function to estimate the edges of the graph. Spectral decomposition of the graph Laplacian provides the desired feature transformation. We numerically validate the implications of the feature learning strategy on various datasets in terms of data clustering.

Variational Deep Embedding Clustering by Augmented Mutual Information Maximization

Qiang Ji, Yanfeng Sun, Yongli Hu, Baocai Yin

Responsive image

Auto-TLDR; Clustering by Augmented Mutual Information maximization for Deep Embedding

Slides Poster Similar

Clustering is a crucial but challenging task in pattern analysis and machine learning. Recent many deep clustering methods combining representation learning with cluster techniques emerged. These deep clustering methods mainly focus on the correlation among samples and ignore the relationship between samples and their representations. In this paper, we propose a novel end-to-end clustering framework, namely variational deep embedding clustering by augmented mutual information maximization (VCAMI). From the perspective of VAE, we prove that minimizing reconstruction loss is equivalent to maximizing the mutual information of the input and its latent representation. This provides a theoretical guarantee for us to directly maximize the mutual information instead of minimizing reconstruction loss. Therefore we proposed the augmented mutual information which highlights the uniqueness of the representations while discovering invariant information among similar samples. Extensive experiments on several challenging image datasets show that the VCAMI achieves good performance. we achieve state-of-the-art results for clustering on MNIST (99.5%) and CIFAR-10 (65.4%) to the best of our knowledge.

JECL: Joint Embedding and Cluster Learning for Image-Text Pairs

Sean Yang, Kuan-Hao Huang, Bill Howe

Responsive image

Auto-TLDR; JECL: Clustering Image-Caption Pairs with Parallel Encoders and Regularized Clusters

Poster Similar

We propose JECL, a method for clustering image-caption pairs by training parallel encoders with regularized clustering and alignment objectives, simultaneously learning both representations and cluster assignments. These image-caption pairs arise frequently in high-value applications where structured training data is expensive to produce, but free-text descriptions are common. JECL trains by minimizing the Kullback-Leibler divergence between the distribution of the images and text to that of a combined joint target distribution and optimizing the Jensen-Shannon divergence between the soft cluster assignments of the images and text. Regularizers are also applied to JECL to prevent trivial solutions. Experiments show that JECL outperforms both single-view and multi-view methods on large benchmark image-caption datasets, and is remarkably robust to missing captions and varying data sizes.

Deep Topic Modeling by Multilayer Bootstrap Network and Lasso

Jian-Yu Wang, Xiao-Lei Zhang

Responsive image

Auto-TLDR; Unsupervised Deep Topic Modeling with Multilayer Bootstrap Network and Lasso

Slides Poster Similar

Topic modeling is widely studied for the dimension reduction and analysis of documents. However, it is formulated as a difficult optimization problem. Current approximate solutions also suffer from inaccurate model- or data-assumptions. To deal with the above problems, we propose a polynomial-time deep topic model with no model and data assumptions. Specifically, we first apply multilayer bootstrap network (MBN), which is an unsupervised deep model, to reduce the dimension of documents, and then use the low-dimensional data representations or their clustering results as the target of supervised Lasso for topic word discovery. To our knowledge, this is the first time that MBN and Lasso are applied to unsupervised topic modeling. Experimental comparison results with five representative topic models on the 20-newsgroups and TDT2 corpora illustrate the effectiveness of the proposed algorithm.

A Distinct Discriminant Canonical Correlation Analysis Network Based Deep Information Quality Representation for Image Classification

Lei Gao, Zheng Guo, Ling Guan Ling Guan

Responsive image

Auto-TLDR; DDCCANet: Deep Information Quality Representation for Image Classification

Slides Poster Similar

In this paper, we present a distinct discriminant canonical correlation analysis network (DDCCANet) based deep information quality representation with application to image classification. Specifically, to explore the sufficient discriminant information between different data sets, the within-class and between-class correlation matrices are employed and optimized jointly. Moreover, different from the existing canonical correlation analysis network (CCANet) and related algorithms, an information theoretic descriptor, information quality (IQ), is adopted to generate the deep-level feature representation for image classification. Benefiting from the explored discriminant information and IQ descriptor, it is potential to gain a more effective deep-level representation from multi-view data sets, leading to improved performance in classification tasks. To demonstrate the effectiveness of the proposed DDCCANet, we conduct experiments on the Olivetti Research Lab (ORL) face database, ETH80 database and CIFAR10 database. Experimental results show the superiority of the proposed solution on image classification.

Discrete Semantic Matrix Factorization Hashing for Cross-Modal Retrieval

Jianyang Qin, Lunke Fei, Shaohua Teng, Wei Zhang, Genping Zhao, Haoliang Yuan

Responsive image

Auto-TLDR; Discrete Semantic Matrix Factorization Hashing for Cross-Modal Retrieval

Slides Poster Similar

Hashing has been widely studied for cross-modal retrieval due to its promising efficiency and effectiveness in massive data analysis. However, most existing supervised hashing has the limitations of inefficiency for very large-scale search and intractable discrete constraint for hash codes learning. In this paper, we propose a new supervised hashing method, namely, Discrete Semantic Matrix Factorization Hashing (DSMFH), for cross-modal retrieval. First, we conduct the matrix factorization via directly utilizing the available label information to obtain a latent representation, so that both the inter-modality and intra-modality similarities are well preserved. Then, we simultaneously learn the discriminative hash codes and corresponding hash functions by deriving the matrix factorization into a discrete optimization. Finally, we adopt an alternatively iterative procedure to efficiently optimize the matrix factorization and discrete learning. Extensive experimental results on three widely used image-tag databases demonstrate the superiority of the DSMFH over state-of-the-art cross-modal hashing methods.

Snapshot Hyperspectral Imaging Based on Weighted High-Order Singular Value Regularization

Hua Huang, Cheng Niankai, Lizhi Wang

Responsive image

Auto-TLDR; High-Order Tensor Optimization for Hyperspectral Imaging

Slides Poster Similar

Snapshot hyperspectral imaging can capture the 3D hyperspectral image (HSI) with a single 2D measurement and has attracted increasing attention recently. Recovering the underlying HSI from the compressive measurement is an ill-posed problem and exploiting the image prior is essential for solving this ill-posed problem. However, existing reconstruction methods always start from modeling image prior with the 1D vector or 2D matrix and cannot fully exploit the structurally spectral-spatial nature in 3D HSI, thus leading to a poor fidelity. In this paper, we propose an effective high-order tensor optimization based method to boost the reconstruction fidelity for snapshot hyperspectral imaging. We first build high-order tensors by exploiting the spatial-spectral correlation in HSI. Then, we propose a weight high-order singular value regularization (WHOSVR) based low-rank tensor recovery model to characterize the structure prior of HSI. By integrating the structure prior in WHOSVR with the system imaging process, we develop an optimization framework for HSI reconstruction, which is finally solved via the alternating minimization algorithm. Extensive experiments implemented on two representative systems demonstrate that our method outperforms state-of-the-art methods.

Unveiling Groups of Related Tasks in Multi-Task Learning

Jordan Frecon, Saverio Salzo, Massimiliano Pontil

Responsive image

Auto-TLDR; Continuous Bilevel Optimization for Multi-Task Learning

Slides Poster Similar

A common approach in multi-task learning is to encourage the tasks to share a low dimensional representation. This has led to the popular method of trace norm regularization, which has proved effective in many applications. In this paper, we extend this approach by allowing the tasks to partition into different groups, within which trace norm regularization is separately applied. We propose a continuous bilevel optimization framework to simultaneously identify groups of related tasks and learn a low dimensional representation within each group. Hinging on recent results on the derivative of generalized matrix functions, we devise a smooth approximation of the upper-level objective via a dual forward-backward algorithm with Bregman distances. This allows us to solve the bilevel problem by a gradient-based scheme. Numerical experiments on synthetic and benchmark datasets support the effectiveness of the proposed method.

Feature-Aware Unsupervised Learning with Joint Variational Attention and Automatic Clustering

Wang Ru, Lin Li, Peipei Wang, Liu Peiyu

Responsive image

Auto-TLDR; Deep Variational Attention Encoder-Decoder for Clustering

Slides Poster Similar

Deep clustering aims to cluster unlabeled real-world samples by mining deep feature representation. Most of existing methods remain challenging when handling high-dimensional data and simultaneously exploring the complementarity of deep feature representation and clustering. In this paper, we propose a novel Deep Variational Attention Encoder-decoder for Clustering (DVAEC). Our DVAEC improves the representation learning ability by fusing variational attention. Specifically, we design a feature-aware automatic clustering module to mitigate the unreliability of similarity calculation and guide network learning. Besides, to further boost the performance of deep clustering from a global perspective, we define a joint optimization objective to promote feature representation learning and automatic clustering synergistically. Extensive experimental results show the promising performance achieved by our DVAEC on six datasets comparing with several popular baseline clustering methods.

Exploiting Elasticity in Tensor Ranks for Compressing Neural Networks

Jie Ran, Rui Lin, Hayden Kwok-Hay So, Graziano Chesi, Ngai Wong

Responsive image

Auto-TLDR; Nuclear-Norm Rank Minimization Factorization for Deep Neural Networks

Slides Poster Similar

Elasticities in depth, width, kernel size and resolution have been explored in compressing deep neural networks (DNNs). Recognizing that the kernels in a convolutional neural network (CNN) are 4-way tensors, we further exploit a new elasticity dimension along the input-output channels. Specifically, a novel nuclear-norm rank minimization factorization (NRMF) approach is proposed to dynamically and globally search for the reduced tensor ranks during training. Correlation between tensor ranks across multiple layers is revealed, and a graceful tradeoff between model size and accuracy is obtained. Experiments then show the superiority of NRMF over the previous non-elastic variational Bayesian matrix factorization (VBMF) scheme.

Nonlinear Ranking Loss on Riemannian Potato Embedding

Byung Hyung Kim, Yoonje Suh, Honggu Lee, Sungho Jo

Responsive image

Auto-TLDR; Riemannian Potato for Rank-based Metric Learning

Slides Poster Similar

We propose a rank-based metric learning method by leveraging a concept of the Riemannian Potato for better separating non-linear data. By exploring the geometric properties of Riemannian manifolds, the proposed loss function optimizes the measure of dispersion using the distribution of Riemannian distances between a reference sample and neighbors and builds a ranked list according to the similarities. We show the proposed function can learn a hypersphere for each class, preserving the similarity structure inside it on Riemannian manifold. As a result, compared with Euclidean distance-based metric, our method can further jointly reduce the intra-class distances and enlarge the inter-class distances for learned features, consistently outperforming state-of-the-art methods on three widely used non-linear datasets.

Learning Sign-Constrained Support Vector Machines

Kenya Tajima, Kouhei Tsuchida, Esmeraldo Ronnie Rey Zara, Naoya Ohta, Tsuyoshi Kato

Responsive image

Auto-TLDR; Constrained Sign Constraints for Learning Linear Support Vector Machine

Poster Similar

Domain knowledge is useful to improve the generalization performance of learning machines. Sign constraints are a handy representation to combine domain knowledge with learning machine. In this paper, we consider constraining the signs of the weight coefficients in learning the linear support vector machine, and develop two optimization algorithms for minimizing the empirical risk under the sign constraints. One of the two algorithms is based on the projected gradient method, in which each iteration of the projected gradient method takes O(nd) computational cost and the sublinear convergence of the objective error is guaranteed. The second algorithm is based on the Frank-Wolfe method that also converges sublinearly and possesses a clear termination criterion. We show that each iteration of the Frank-Wolfe also requires O(nd) cost. Furthermore, we derive the explicit expression for the minimal iteration number to ensure an epsilon-accurate solution by analyzing the curvature of the objective function. Finally, we empirically demonstrate that the sign constraints are a promising technique when similarities to the training examples compose the feature vector.

Learning Connectivity with Graph Convolutional Networks

Hichem Sahbi

Responsive image

Auto-TLDR; Learning Graph Convolutional Networks Using Topological Properties of Graphs

Slides Poster Similar

Learning graph convolutional networks (GCNs) is an emerging field which aims at generalizing convolutional operations to arbitrary non-regular domains. In particular, GCNs operating on spatial domains show superior performances compared to spectral ones, however their success is highly dependent on how the topology of input graphs is defined. In this paper, we introduce a novel framework for graph convolutional networks that learns the topological properties of graphs. The design principle of our method is based on the optimization of a constrained objective function which learns not only the usual convolutional parameters in GCNs but also a transformation basis that conveys the most relevant topological relationships in these graphs. Experiments conducted on the challenging task of skeleton-based action recognition shows the superiority of the proposed method compared to handcrafted graph design as well as the related work.

Learning Sparse Deep Neural Networks Using Efficient Structured Projections on Convex Constraints for Green AI

Michel Barlaud, Frederic Guyard

Responsive image

Auto-TLDR; Constrained Deep Neural Network with Constrained Splitting Projection

Slides Poster Similar

In recent years, deep neural networks (DNN) have been applied to different domains and achieved dramatic performance improvements over state-of-the-art classical methods. These performances of DNNs were however often obtained with networks containing millions of parameters and which training required heavy computational power. In order to cope with this computational issue a huge literature deals with proximal regularization methods which are time consuming.\\ In this paper, we propose instead a constrained approach. We provide the general framework for our new splitting projection gradient method. Our splitting algorithm iterates a gradient step and a projection on convex sets. We study algorithms for different constraints: the classical $\ell_1$ unstructured constraint and structured constraints such as the nuclear norm, the $\ell_{2,1} $ constraint (Group LASSO). We propose a new $\ell_{1,1} $ structured constraint for which we provide a new projection algorithm We demonstrate the effectiveness of our method on three popular datasets (MNIST, Fashion MNIST and CIFAR). Experiments on these datasets show that our splitting projection method with our new $\ell_{1,1} $ structured constraint provides the best reduction of memory and computational power. Experiments show that fully connected linear DNN are more efficient for green AI.

Joint Learning Multiple Curvature Descriptor for 3D Palmprint Recognition

Lunke Fei, Bob Zhang, Jie Wen, Chunwei Tian, Peng Liu, Shuping Zhao

Responsive image

Auto-TLDR; Joint Feature Learning for 3D palmprint recognition using curvature data vectors

Slides Poster Similar

3D palmprint-based biometric recognition has drawn growing research attention due to its several merits over 2D counterpart such as robust structural measurement of a palm surface and high anti-counterfeiting capability. However, most existing 3D palmprint descriptors are hand-crafted that usually extract stationary features from 3D palmprint images. In this paper, we propose a feature learning method to jointly learn compact curvature feature descriptor for 3D palmprint recognition. We first form multiple curvature data vectors to completely sample the intrinsic curvature information of 3D palmprint images. Then, we jointly learn a feature projection function that project curvature data vectors into binary feature codes, which have the maximum inter-class variances and minimum intra-class distance so that they are discriminative. Moreover, we learn the collaborative binary representation of the multiple curvature feature codes by minimizing the information loss between the final representation and the multiple curvature features, so that the proposed method is more compact in feature representation and efficient in matching. Experimental results on the baseline 3D palmprint database demonstrate the superiority of the proposed method in terms of recognition performance in comparison with state-of-the-art 3D palmprint descriptors.

Webly Supervised Image-Text Embedding with Noisy Tag Refinement

Niluthpol Mithun, Ravdeep Pasricha, Evangelos Papalexakis, Amit Roy-Chowdhury

Responsive image

Auto-TLDR; Robust Joint Embedding for Image-Text Retrieval Using Web Images

Slides Similar

In this paper, we address the problem of utilizing web images in training robust joint embedding models for the image-text retrieval task. Prior webly supervised approaches directly leverage weakly annotated web images in the joint embedding learning framework. The objective of these approaches would suffer significantly when the ratio of noisy and missing tags associated with the web images is very high. In this regard, we propose a CP decomposition based tensor completion framework to refine the tags of web images by modeling observed ternary inter-relations between the sets of labeled images, tags, and web images as a tensor. To effectively deal with the high ratio of missing entries likely in our case, we incorporate intra-modal correlation as side information in the proposed framework. Our tag refinement approach combined with existing web supervised image-text embedding approaches provide a more principled way for learning the joint embedding models in the presence of significant noise from web data and limited clean labeled data. Experiments on benchmark datasets demonstrate that the proposed approach helps to achieve a significant performance gain in image-text retrieval.

N2D: (Not Too) Deep Clustering Via Clustering the Local Manifold of an Autoencoded Embedding

Ryan Mcconville, Raul Santos-Rodriguez, Robert Piechocki, Ian Craddock

Responsive image

Auto-TLDR; Local Manifold Learning for Deep Clustering on Autoencoded Embeddings

Slides Similar

Deep clustering has increasingly been demonstrating superiority over conventional shallow clustering algorithms. Deep clustering algorithms usually combine representation learning with deep neural networks to achieve this performance, typically optimizing a clustering and non-clustering loss. In such cases, an autoencoder is typically connected with a clustering network, and the final clustering is jointly learned by both the autoencoder and clustering network. Instead, we propose to learn an autoencoded embedding and then search this further for the underlying manifold. For simplicity, we then cluster this with a shallow clustering algorithm, rather than a deeper network. We study a number of local and global manifold learning methods on both the raw data and autoencoded embedding, concluding that UMAP in our framework is able to find the best clusterable manifold of the embedding. This suggests that local manifold learning on an autoencoded embedding is effective for discovering higher quality clusters. We quantitatively show across a range of image and time-series datasets that our method has competitive performance against the latest deep clustering algorithms, including out-performing current state-of-the-art on several. We postulate that these results show a promising research direction for deep clustering. The code can be found at https://github.com/rymc/n2d.

An Efficient Empirical Solver for Localized Multiple Kernel Learning Via DNNs

Ziming Zhang

Responsive image

Auto-TLDR; Localized Multiple Kernel Learning using LMKL-Net

Slides Poster Similar

In this paper we propose solving localized multiple kernel learning (LMKL) using LMKL-Net, a feedforward deep neural network (DNN). In contrast to previous works, as a learning principle we propose parameterizing the gating function for learning kernel combination weights and the multiclass classifier using an attentional network (AN) and a multilayer perceptron (MLP), respectively. Such interpretability helps us better understand how the network solves the problem. Thanks to stochastic gradient descent (SGD), our approach has {\em linear} computational complexity in training. Empirically on benchmark datasets we demonstrate that with comparable or better accuracy than the state-of-the-art, our LMKL-Net can be trained about {\bf two orders of magnitude} faster with about {\bf two orders of magnitude} smaller memory footprint for large-scale learning.

Low-Cost Lipschitz-Independent Adaptive Importance Sampling of Stochastic Gradients

Huikang Liu, Xiaolu Wang, Jiajin Li, Man-Cho Anthony So

Responsive image

Auto-TLDR; Adaptive Importance Sampling for Stochastic Gradient Descent

Slides Similar

Stochastic gradient descent (SGD) usually samples training data based on the uniform distribution, which may not be a good choice because of the high variance of its stochastic gradient. Thus, importance sampling methods are considered in the literature to improve the performance. Most previous work on SGD-based methods with importance sampling requires the knowledge of Lipschitz constants of all component gradients, which are in general difficult to estimate. In this paper, we study an adaptive importance sampling method for common SGD-based methods by exploiting the local first-order information without knowing any Lipschitz constants. In particular, we periodically changes the sampling distribution by only utilizing the gradient norms in the past few iterations. We prove that our adaptive importance sampling non-asymptotically reduces the variance of the stochastic gradients in SGD, and thus better convergence bounds than that for vanilla SGD can be obtained. We extend this sampling method to several other widely used stochastic gradient algorithms including SGD with momentum and ADAM. Experiments on common convex learning problems and deep neural networks illustrate notably enhanced performance using the adaptive sampling strategy.

Cross-spectrum Face Recognition Using Subspace Projection Hashing

Hanrui Wang, Xingbo Dong, Jin Zhe, Jean-Luc Dugelay, Massimo Tistarelli

Responsive image

Auto-TLDR; Subspace Projection Hashing for Cross-Spectrum Face Recognition

Slides Poster Similar

Cross-spectrum face recognition, e.g. visible to thermal matching, remains a challenging task due to the large variation originated from different domains. This paper proposed a subspace projection hashing (SPH) to enable the cross-spectrum face recognition task. The intrinsic idea behind SPH is to project the features from different domains onto a common subspace, where matching the faces from different domains can be accomplished. Notably, we proposed a new loss function that can (i) preserve both inter-domain and intra-domain similarity; (ii) regularize a scaled-up pairwise distance between hashed codes, to optimize projection matrix. Three datasets, Wiki, EURECOM VIS-TH paired face and TDFace are adopted to evaluate the proposed SPH. The experimental results indicate that the proposed SPH outperforms the original linear subspace ranking hashing (LSRH) in the benchmark dataset (Wiki) and demonstrates a reasonably good performance for visible-thermal, visible-near-infrared face recognition, therefore suggests the feasibility and effectiveness of the proposed SPH.

Randomized Transferable Machine

Pengfei Wei, Tze Yun Leong

Responsive image

Auto-TLDR; Randomized Transferable Machine for Suboptimal Feature-based Transfer Learning

Slides Poster Similar

Feature-based transfer method is one of the most effective methodologies for transfer learning. Existing works usually claim the learned new feature representation is truly \emph{domain-invariant}, and thus directly train a transfer model $\mathcal{M}$ on source domain. In this paper, we work on a more realistic scenario where the new feature representation is suboptimal where small divergence still exists across domains. We propose a new learning strategy and name the transfer model following the learning strategy as Randomized Transferable Machine (RTM). More specifically, we work on source data with the new feature representation learned from existing feature-based transfer methods. Our key idea is to enlarge source training data populations by randomly corrupting source data using some noises, and then train a transfer model $\widetilde{\mathcal{M}}$ performing well on all these corrupted source data populations. In principle, the more corruptions are made, the higher probability of the target data can be covered by the constructed source populations and thus a better transfer performance can be achieved by $\widetilde{\mathcal{M}}$. An ideal case is with infinite corruptions, which however is infeasible in reality. We instead develop a marginalized solution. With a marginalization trick, we can train an RTM that is equivalently trained using infinite source noisy populations without truly conducting any corruption. More importantly, such an RTM has a closed-form solution, which enables a super fast and efficient training. Extensive experiments on various real-world transfer tasks show that RTM is a very promising transfer model.

Probabilistic Word Embeddings in Kinematic Space

Adarsh Jamadandi, Rishabh Tigadoli, Ramesh Ashok Tabib, Uma Mudenagudi

Responsive image

Auto-TLDR; Kinematic Space for Hierarchical Representation Learning

Slides Poster Similar

In this paper, we propose a method for learning representations in the space of Gaussian-like distribution defined on a novel geometrical space called Kinematic space. The utility of non-Euclidean geometry for deep representation learning has gained traction, specifically different models of hyperbolic geometry such as Poincar\'{e} and Lorentz models have proven useful for learning hierarchical representations. Going beyond manifolds with constant curvature, albeit has better representation capacity might lead to unhanding of computationally tractable tools like Riemannian optimization methods. Here, we explore a pseudo-Riemannian auxiliary Lorentzian space called Kinematic space and provide a principled approach for constructing a Gaussian-like distribution, which is compatible with gradient-based learning methods, to formulate a probabilistic word embedding framework. Contrary to, mapping lexically distributed representations to a single point vector in Euclidean space, we advocate for mapping entities to density-based representations, as it provides explicit control over the uncertainty in representations. We test our framework by embedding WordNet-Noun hierarchy, a large lexical database, our experiments report strong consistent improvements in Mean Rank and Mean Average Precision (MAP) values compared to probabilistic word embedding frameworks defined on Euclidean and hyperbolic spaces. Our framework reports a significant 83.140\% improvement in Mean Rank compared to Euclidean version and an improvement of 79.416\% over hyperbolic version. Our work serves as an evidence for the utility of novel geometrical spaces for learning hierarchical representations.

More Correlations Better Performance: Fully Associative Networks for Multi-Label Image Classification

Yaning Li, Liu Yang

Responsive image

Auto-TLDR; Fully Associative Network for Fully Exploiting Correlation Information in Multi-Label Classification

Slides Poster Similar

Recent researches demonstrate that correlation modeling plays a key role in high-performance multi-label classification methods. However, existing methods do not take full advantage of correlation information, especially correlations in feature and label spaces of each image, which limits the performance of correlation-based multi-label classification methods. With more correlations considered, in this study, a Fully Associative Network (FAN) is proposed for fully exploiting correlation information, which involves both visual feature and label correlations. Specifically, FAN introduces a robust covariance pooling to summarize convolution features as global image representation for capturing feature correlation in the multi-label task. Moreover, it constructs an effective label correlation matrix based on a re-weighted scheme, which is fed into a graph convolution network for capturing label correlation. Then, correlation between covariance representations (i.e., feature correlation ) and the outputs of GCN (i.e., label correlation) are modeled for final prediction. Experimental results on two datasets illustrate the effectiveness and efficiency of our proposed FAN compared with state-of-the-art methods.

Object Classification of Remote Sensing Images Based on Optimized Projection Supervised Discrete Hashing

Qianqian Zhang, Yazhou Liu, Quansen Sun

Responsive image

Auto-TLDR; Optimized Projection Supervised Discrete Hashing for Large-Scale Remote Sensing Image Object Classification

Slides Poster Similar

Recently, with the increasing number of large-scale remote sensing images, the demand for large-scale remote sensing image object classification is growing and attracting the interest of many researchers. Hashing, because of its low memory requirements and high time efficiency, has been widely solve the problem of large-scale remote sensing image. Supervised hashing methods mainly leverage the label information of remote sensing image to learn hash function, however, the similarity of the original feature space cannot be well preserved, which can not meet the accurate requirements for object classification of remote sensing image. To solve the mentioned problem, we propose a novel method named Optimized Projection Supervised Discrete Hashing(OPSDH), which jointly learns a discrete binary codes generation and optimized projection constraint model. It uses an effective optimized projection method to further constraint the supervised hash learning and generated hash codes preserve the similarity based on the data label while retaining the similarity of the original feature space. The experimental results show that OPSDH reaches improved performance compared with the existing hash learning methods and demonstrate that the proposed method is more efficient for operational applications

Improved Time-Series Clustering with UMAP Dimension Reduction Method

Clément Pealat, Vincent Cheutet, Guillaume Bouleux

Responsive image

Auto-TLDR; Time Series Clustering with UMAP as a Pre-processing Step

Slides Poster Similar

Clustering is an unsupervised machine learning method giving insights on data without early knowledge. Classes of data are return by assembling similar elements together. Giving the increasing of the available data, this method is now applied in a lot of fields with various data types. Here, we propose to explore the case of time series clustering. Indeed, time series are one of the most classic data type, and are present in various fields such as medical or finance. This kind of data can be pre-processed by of dimension reduction methods, such as the recent UMAP algorithm. In this paper, a benchmark of time series clustering is created, comparing the results with and without UMAP as a pre-processing step. UMAP is used to enhance clustering results. For completeness, three different clustering algorithms and two different geometric representation for the time series (Classic Euclidean geometry, and Riemannian geometry on the Stiefel Manifold) are applied. The results are compared with and without UMAP as a pre-processing step on the databases available at UCR Time Series Classification Archive www.cs.ucr.edu/~eamonn/time_series_data/.

A Randomized Algorithm for Sparse Recovery

Huiyuan Yu, Maggie Cheng, Yingdong Lu

Responsive image

Auto-TLDR; A Constrained Graph Optimization Algorithm for Sparse Signal Recovery

Poster Similar

This paper considers the problem of sparse signal recovery where there is a structure in the signal. Efficient recovery schemes can be designed to leverage the signal structure. Following the model-based compressive sensing framework, we have developed an efficient algorithm for both head and tail approximations for the model-projection problem. The problem is modeled as a constrained graph optimization problem, which is an NP-hard optimization problem. Solving the NP-hard optimization program is then transformed to solving a linear program and finding a randomized algorithm to find an integral solution. The integral solution is optimal-in-expectation. The algorithm is proved to have the same geometric convergence as previous work. The algorithm has been tested on various compressing matrices. It worked well with the matrices with the Restricted Isometry Property (RIP), also worked well with some matrices that have not been shown to have RIP. The proposed algorithm demonstrated improved recoverability and used fewer number of iterations to recover the signal.

Tensorized Feature Spaces for Feature Explosion

Ravdeep Pasricha, Pravallika Devineni, Evangelos Papalexakis, Ramakrishnan Kannan

Responsive image

Auto-TLDR; Tensor Rank Decomposition for Hyperspectral Image Classification

Slides Poster Similar

In this paper, we present a novel framework that uses tensor factorization to generate richer feature spaces for pixel classification in hyperspectral images. In particular, we assess the performance of different tensor rank decomposition methods as compared to the traditional kernel-based approaches for the hyperspectral image classification problem. We propose ORION, which takes as input a hyperspectral image tensor and a rank and outputs an enhanced feature space from the factor matrices of the decomposed tensor. Our method is a feature explosion technique that inherently maps low dimensional input space in R^K to high dimensional space in R^R, where R >> K, say in the order of 1000x, like a kernel. We show how the proposed method exploits the multi-linear structure of hyperspectral three-dimensional tensor. We demonstrate the effectiveness of our method with experiments on three publicly available hyperspectral datasets with labeled pixels and compare their classification performance against traditional linear and non-linear supervised learning methods such as SVM with Linear, Polynomial, RBF kernels, and the Multi-Layer Perceptron model. Finally, we explore the relationship between the rank of the tensor decomposition and the classification accuracy using several hyperspectral datasets with ground truth.