Learning Graph Matching Substitution Weights Based on a Linear Regression

Shaima Algabli, Francesc Serratosa

Responsive image

Auto-TLDR; Learning the weights on local attributes of attributed graphs

Slides Poster

Attributed graphs are structures that are useful to represent objects through the information of their local parts and their relations. Each characteristic in the local parts is represented by different attributes on the nodes. In this context, the comparison between structured objects is performed through a distance between attributed graphs. If we want to correctly tune the distance and the node correspondence between graphs, we have to add some weights on the node attributes to gauge the importance of each local characteristic. In this paper, we present a method to learn the weights on each node attribute. It is based on building an embedded space and imposing the weights we want to learn to be the constants of the hyperplane deduced by a linear regression applied on a cloud of points. These points represent the node-to-node mappings.

Similar papers

Incorporating a Graph-Matching Algorithm into a Muscle Mechanics Model

Jose Luis Santacruz Muñoz, Francesc Serratosa

Responsive image

Auto-TLDR; Recomputing the Mesh Grid for Differential Models of the Muscle Mechanics

Slides Poster Similar

Differential models for the simulation of the muscle mechanics are based on iteratively updating a mesh grid and deducing its new state through a finite element model. Models usually assume that the mesh grid is almost regular, and this makes a degradation of the simulation accuracy in long simulation sequences, since the mesh tends to be less regular when the number of iterations increases. We present a model that has the aim of reducing this accuracy degradation. It is based on recomputing the mesh grid returned by the model in each iteration through the concept of graph matching. The new model is currently in use to analyse the dynamics of the human heart when some pressure is applied to it. The final goal of the project (which is not shown in this paper) is to deduce the optimal position and strength pressure applied to the heart that increases the chance of reviving it with the minimum tissue damage. Experimental validation shows our model returns a higher accuracy of the muscle position through some iterations than classical differential models with an insignificant increase of runtime. Thus, it is worth recomputing the mesh grid since the simulation accuracy drastically increases at the expense of a low runtime increase.

Matching of Matching-Graphs – a Novel Approach for Graph Classification

Mathias Fuchs, Kaspar Riesen

Responsive image

Auto-TLDR; Stable Graph Matching Information for Pattern Recognition

Slides Poster Similar

Due to fast developments in data acquisition, we observe rapidly increasing amounts of data available in diverse areas. Simultaneously, we observe that in many applications the underlying data is inherently complex, making graphs a very useful and adequate data structure for formal representation. A large amount of graph based methods for pattern recognition have been proposed. Many of these methods actually rely on graph matching. In the present paper a novel encoding of graph matching information is proposed. The idea of this encoding is to formalize the stable cores of specific classes by means of graphs. In an empirical evaluation we show that it can be highly beneficial to focus on these stable parts of graphs during graph classification.

Classification of Intestinal Gland Cell-Graphs Using Graph Neural Networks

Linda Studer, Jannis Wallau, Heather Dawson, Inti Zlobec, Andreas Fischer

Responsive image

Auto-TLDR; Graph Neural Networks for Classification of Dysplastic Gland Glands using Graph Neural Networks

Slides Poster Similar

We propose to classify intestinal glands as normal or dysplastic using cell-graphs and graph-based deep learning methods. Dysplastic intestinal glands can lead to colorectal cancer, which is one of the three most common cancer types in the world. In order to assess the cancer stage and thus the treatment of a patient, pathologists analyse tissue samples of affected patients. Among other factors, they look at the changes in morphology of different tissues, such as the intestinal glands. Cell-graphs have a high representational power and can describe topological and geometrical properties of intestinal glands. However, classical graph-based methods have a high computational complexity and there is only a limited range of machine learning methods available. In this paper, we propose Graph Neural Networks (GNNs) as an efficient learning-based approach to classify cell-graphs. We investigate different variants of so-called Message Passing Neural Networks and compare them with a classical graph-based approach based on approximated Graph Edit Distance and k-nearest neighbours classifier. A promising classification accuracy of 94.1% is achieved by the proposed method on the pT1 Gland Graph dataset, which is an increase of 11.5% over the baseline result.

Learning Embeddings for Image Clustering: An Empirical Study of Triplet Loss Approaches

Kalun Ho, Janis Keuper, Franz-Josef Pfreundt, Margret Keuper

Responsive image

Auto-TLDR; Clustering Objectives for K-means and Correlation Clustering Using Triplet Loss

Slides Poster Similar

In this work, we evaluate two different image clustering objectives, k-means clustering and correlation clustering, in the context of Triplet Loss induced feature space embeddings. Specifically, we train a convolutional neural network to learn discriminative features by optimizing two popular versions of the Triplet Loss in order to study their clustering properties under the assumption of noisy labels. Additionally, we propose a new, simple Triplet Loss formulation, which shows desirable properties with respect to formal clustering objectives and outperforms the existing methods. We evaluate all three Triplet loss formulations for K-means and correlation clustering on the CIFAR-10 image classification dataset.

Siamese Graph Convolution Network for Face Sketch Recognition

Liang Fan, Xianfang Sun, Paul Rosin

Responsive image

Auto-TLDR; A novel Siamese graph convolution network for face sketch recognition

Slides Poster Similar

In this paper, we present a novel Siamese graph convolution network (GCN) for face sketch recognition. To build a graph from an image, we utilize a deep learning method to detect the image edges, and then use a superpixel method to segment the edge image. Each segmented superpixel region is taken as a node, and each pair of adjacent regions forms an edge of the graph. Graphs from both a face sketch and a face photo are input into the Siamese GCN for recognition. A deep graph matching method is used to share messages between cross-modal graphs in this model. Experiments show that the GCN can obtain high performance on several face photo-sketch datasets, including seen and unseen face photo-sketch datasets. It is also shown that the model performance based on the graph structure representation of the data using the Siamese GCN is more stable than a Siamese CNN model.

Generalized Conics: Properties and Applications

Aysylu Gabdulkhakova, Walter Kropatsch

Responsive image

Auto-TLDR; A Generalized Conic Representation for Distance Fields

Slides Poster Similar

In this paper the properties of the generalized conics are used to create a unified framework for generating various types of the distance fields. The main concept behind this work is a metric that measures the distance from a point to a line segment according to the definition of the ellipse. The proposed representation provides a possibility to efficiently compute the proximity, arithmetic mean of the distances and a space tessellation with regard to the given set of polygonal objects, line segments and points. In addition, the weights can be introduced for objects, their parts and combinations. This fact leads to a hierarchical representation that can be efficiently obtained using the pixel-wise operations. The practical value of the proposed ideas is demonstrated on example of applications like skeletonization, smoothing, optimal location finding and clustering.

Map-Based Temporally Consistent Geolocalization through Learning Motion Trajectories

Bing Zha, Alper Yilmaz

Responsive image

Auto-TLDR; Exploiting Motion Trajectories for Geolocalization of Object on Topological Map using Recurrent Neural Network

Slides Poster Similar

In this paper, we propose a novel trajectory learning method that exploits motion trajectories on topological map using recurrent neural network for temporally consistent geolocalization of object. Inspired by human's ability to both be aware of distance and direction of self-motion in navigation, our trajectory learning method learns a pattern representation of trajectories encoded as a sequence of distances and turning angles to assist self-localization. We pose the learning process as a conditional sequence prediction problem in which each output locates the object on a traversable edge in a map. Considering the prediction sequence ought to be topologically connected in the graph-structured map, we adopt two different hypotheses generation and elimination strategies to eliminate disconnected sequence prediction. We demonstrate our approach on the KITTI stereo visual odometry dataset which is a city-scale environment. The key benefits of our approach to geolocalization are that 1) we take advantage of powerful sequence modeling ability of recurrent neural network and its robustness to noisy input, 2) only require a map in the form of a graph and 3) simply use an affordable sensor that generates motion trajectory. The experiments show that the motion trajectories can be learned by training an recurrent neural network, and temporally consistent geolocation can be predicted with both of the proposed strategies.

Anime Sketch Colorization by Component-Based Matching Using Deep Appearance Features and Graph Representation

Thien Do, Pham Van, Anh Nguyen, Trung Dang, Quoc Nguyen, Bach Hoang, Giao Nguyen

Responsive image

Auto-TLDR; Combining Deep Learning and Graph Representation for Sketch Colorization

Slides Poster Similar

Sketch colorization is usually expensive and time-consuming for artists, and automating this process can have many pragmatic applications in the animation, comic book, and video game industry. However, automatic image colorization faces many challenges, because sketches not only lack texture information but also potentially entail complicated objects that require acute coloring. These difficulties usually result in incorrect color assignments that can ruin the aesthetic appeal of the final output. In this paper, we present a novel component-based matching framework that combines deep learned features and quadratic programming {\color{red} with a new cost function} to solve this colorization problem. The proposed framework inputs a character's sketches as well as a colored image in the same cut of a movie, and outputs a high-quality sequence of colorized frames based on the color assignment in the reference colored image. To carry out this colorization task, we first utilize a pretrained ResNet-34 model to extract elementary components' features to match certain pairs of components (one component from the sketch and one from reference). Next, a graph representation is constructed in order to process and match the remaining components that could not be done in the first step. Since the first step has reduced the number of components to be matched by the graph, we can solve this graph problem in a short computing time even when there are hundreds of different components present in each sketch. We demonstrate the effectiveness of the proposed solution by conducting comprehensive experiments and producing aesthetically pleasing results. To the best of our knowledge, our framework is the first work that combines deep learning and graph representation to colorize anime and achieves a high pixel-level accuracy at a reasonable time cost.

Equation Attention Relationship Network (EARN) : A Geometric Deep Metric Framework for Learning Similar Math Expression Embedding

Saleem Ahmed, Kenny Davila, Srirangaraj Setlur, Venu Govindaraju

Responsive image

Auto-TLDR; Representational Learning for Similarity Based Retrieval of Mathematical Expressions

Slides Poster Similar

Representational Learning in the form of high dimensional embeddings have been used for multiple pattern recognition applications. There has been a significant interest in building embedding based systems for learning representationsin the mathematical domain. At the same time, retrieval of structured information such as mathematical expressions is an important need for modern IR systems. In this work, our motivation is to introduce a robust framework for learning representations for similarity based retrieval of mathematical expressions. Given a query by example, the embedding can find the closest matching expression as a function of euclidean distance between them. We leverage recent advancements in image-based and graph-based deep learning algorithms to learn our similarity embeddings. We do this first, by using uni-modal encoders in graph space and image space and then, a multi-modal combination of the same. To overcome the lack of training data, we force the networks to learn a deep metric using triplets generated with a heuristic scoring function. We also adopt a custom strategy for mining hard samples to train our neural networks. Our system produces rankings similar to those generated by the original scoring function, but using only a fraction of the time. Our results establish the viability of using such a multi-modal embedding for this task.

Hcore-Init: Neural Network Initialization Based on Graph Degeneracy

Stratis Limnios, George Dasoulas, Dimitrios Thilikos, Michalis Vazirgiannis

Responsive image

Auto-TLDR; K-hypercore: Graph Mining for Deep Neural Networks

Slides Poster Similar

Neural networks are the pinnacle of Artificial Intelligence, as in recent years we witnessed many novel architectures, learning and optimization techniques for deep learning. Capitalizing on the fact that neural networks inherently constitute multipartite graphs among neuron layers, we aim to analyze directly their structure to extract meaningful information that can improve the learning process. To our knowledge graph mining techniques for enhancing learning in neural networks have not been thoroughly investigated. In this paper we propose an adapted version of the k-core structure for the complete weighted multipartite graph extracted from a deep learning architecture. As a multipartite graph is a combination of bipartite graphs, that are in turn the incidence graphs of hypergraphs, we design k-hypercore decomposition, the hypergraph analogue of k-core degeneracy. We applied k-hypercore to several neural network architectures, more specifically to convolutional neural networks and multilayer perceptrons for image recognition tasks after a very short pretraining. Then we used the information provided by the hypercore numbers of the neurons to re-initialize the weights of the neural network, thus biasing the gradient optimization scheme. Extensive experiments proved that k-hypercore outperforms the state-of-the-art initialization methods.

Cluster-Size Constrained Network Partitioning

Maksim Mironov, Konstantin Avrachenkov

Responsive image

Auto-TLDR; Unsupervised Graph Clustering with Stochastic Block Model

Slides Poster Similar

In this paper we consider a graph clustering problem with a given number of clusters and approximate desired sizes of the clusters. One possible motivation for such task could be the problem of databases or servers allocation within several given large computational clusters, where we want related objects to share the same cluster in order to minimize latency and transaction costs. This task differs from the original community detection problem, though we adopt some ideas from Glauber Dynamics and Label Propagation Algorithm. At the same time we consider no additional information about node labels, so the task has nature of unsupervised learning. We propose an algorithm for the problem, show that it works well for a large set of parameters of Stochastic Block Model (SBM) and theoretically show its running time complexity for achieving almost exact recovery is of $O(n\cdot\deg_{av} \cdot \omega )$ for the mean-field SBM with $\omega$ tending to infinity arbitrary slow. Other significant advantage of the proposed approach is its local nature, which means it can be efficiently distributed with no scheduling or synchronization.

Region and Relations Based Multi Attention Network for Graph Classification

Manasvi Aggarwal, M. Narasimha Murty

Responsive image

Auto-TLDR; R2POOL: A Graph Pooling Layer for Non-euclidean Structures

Slides Poster Similar

Graphs are non-euclidean structures that can represent many relational data efficiently. Many studies have proposed the convolution and the pooling operators on the non-euclidean domain. The graph convolution operators have shown astounding performance on various tasks such as node representation and classification. For graph classification, different pooling techniques are introduced, but none of them has considered both neighborhood of the node and the long-range dependencies of the node. In this paper, we propose a novel graph pooling layer R2POOL, which balances the structure information around the node as well as the dependencies with far away nodes. Further, we propose a new training strategy to learn coarse to fine representations. We add supervision at only intermediate levels to generate predictions using only intermediate-level features. For this, we propose the concept of an alignment score. Moreover, each layer's prediction is controlled by our proposed branch training strategy. This complete training helps in learning dominant class features at each layer for representing graphs. We call the combined model by R2MAN. Experiments show that R2MAN the potential to improve the performance of graph classification on various datasets.

Graph Approximations to Geodesics on Metric Graphs

Robin Vandaele, Yvan Saeys, Tijl De Bie

Responsive image

Auto-TLDR; Topological Pattern Recognition of Metric Graphs Using Proximity Graphs

Slides Poster Similar

In machine learning, high-dimensional point clouds are often assumed to be sampled from a topological space of which the intrinsic dimension is significantly lower than the representation dimension. Proximity graphs, such as the Rips graph or kNN graph, are often used as an intermediate representation to learn or visualize topological and geometrical properties of this space. The key idea behind this approach is that distances on the graph preserve the geodesic distances on the unknown space well, and as such, can be used to infer local and global geometric patterns of this space. Prior results provide us with conditions under which these distances are well-preserved for geodesically convex, smooth, compact manifolds. Yet, proximity graphs are ideal representations for a much broader class of spaces, such as metric graphs, i.e., graphs embedded in the Euclidean space. It turns out—as proven in this paper—that these existing conditions cannot be straightforwardly adapted to these spaces. In this work, we provide novel, flexible, and insightful characteristics and results for topological pattern recognition of metric graphs to bridge this gap.

On Morphological Hierarchies for Image Sequences

Caglayan Tuna, Alain Giros, François Merciol, Sébastien Lefèvre

Responsive image

Auto-TLDR; Comparison of Hierarchies for Image Sequences

Slides Poster Similar

Morphological hierarchies form a popular framework aiming at emphasizing the multiscale structure of digital image by performing an unsupervised spatial partitioning of the data. These hierarchies have been recently extended to cope with image sequences, and different strategies have been proposed to allow their construction from spatio-temporal data. In this paper, we compare these hierarchical representation strategies for image sequences according to their structural properties. We introduce a projection method to make these representations comparable. Furthermore, we extend one of these recent strategies in order to obtain more efficient hierarchical representations for image sequences. Experiments were conducted on both synthetic and real datasets, the latter being made of satellite image time series. We show that building one hierarchy by using spatial and temporal information together is more efficient comparing to other existing strategies.

Sketch-Based Community Detection Via Representative Node Sampling

Mahlagha Sedghi, Andre Beckus, George Atia

Responsive image

Auto-TLDR; Sketch-based Clustering of Community Detection Using a Small Sketch

Slides Poster Similar

This paper proposes a sketch-based approach to the community detection problem which clusters the full graph through the use of an informative and concise sketch. The reduced sketch is built through an effective sampling approach which selects few nodes that best represent the complete graph and operates on a pairwise node similarity measure based on the average commute time. After sampling, the proposed algorithm clusters the nodes in the sketch, and then infers the cluster membership of the remaining nodes in the full graph based on their aggregate similarity to nodes in the partitioned sketch. By sampling nodes with strong representation power, our approach can improve the success rates over full graph clustering. In challenging cases with large node degree variation, our approach not only maintains competitive accuracy with full graph clustering despite using a small sketch, but also outperforms existing sampling methods. The use of a small sketch allows considerable storage savings, and computational and timing improvements for further analysis such as clustering and visualization. We provide numerical results on synthetic data based on the homogeneous, heterogeneous and degree corrected versions of the stochastic block model, as well as experimental results on real-world data.

Social Network Analysis Using Knowledge-Graph Embeddings and Convolution Operations

Bonaventure Chidube Molokwu, Shaon Bhatta Shuvo, Ziad Kobti, Narayan C. Kar

Responsive image

Auto-TLDR; RLVECO: Representation Learning via Knowledge- Graph Embeddings and Convolution Operations for Social Network Analysis

Slides Poster Similar

Link prediction and node classification tasks in Social Network Analysis (SNA) remain open research problems with respect to Artificial Intelligence (AI). Thus, the inherent representations about social network structures can be effectively harnessed for training AI models in a bid to predict ties as well as detect clusters via classification of actors with regard to a given social network structure. In this paper, we have proposed a special hybrid model comprising dual layers of Feature Learning (FL): Representation Learning via Knowledge- Graph Embeddings and Convolution Operations (RLVECO). The architecture of RLVECO is tailored towards analyzing and extracting meaningful representations from social network structures so as to aid in link prediction, node classification, and community detection tasks. RLVECO utilizes an edge sampling approach for exploiting features of the social graph via learning the context of each actor with respect to its neighboring actors.

Generic Merging of Structure from Motion Maps with a Low Memory Footprint

Gabrielle Flood, David Gillsjö, Patrik Persson, Anders Heyden, Kalle Åström

Responsive image

Auto-TLDR; A Low-Memory Footprint Representation for Robust Map Merge

Slides Poster Similar

With the development of cheap image sensors, the amount of available image data have increased enormously, and the possibility of using crowdsourced collection methods has emerged. This calls for development of ways to handle all these data. In this paper, we present new tools that will enable efficient, flexible and robust map merging. Assuming that separate optimisations have been performed for the individual maps, we show how only relevant data can be stored in a low memory footprint representation. We use these representations to perform map merging so that the algorithm is invariant to the merging order and independent of the choice of coordinate system. The result is a robust algorithm that can be applied to several maps simultaneously. The result of a merge can also be represented with the same type of low-memory footprint format, which enables further merging and updating of the map in a hierarchical way. Furthermore, the method can perform loop closing and also detect changes in the scene between the capture of the different image sequences. Using both simulated and real data — from both a hand held mobile phone and from a drone — we verify the performance of the proposed method.

Edge-Aware Graph Attention Network for Ratio of Edge-User Estimation in Mobile Networks

Jiehui Deng, Sheng Wan, Xiang Wang, Enmei Tu, Xiaolin Huang, Jie Yang, Chen Gong

Responsive image

Auto-TLDR; EAGAT: Edge-Aware Graph Attention Network for Automatic REU Estimation in Mobile Networks

Slides Poster Similar

Estimating the Ratio of Edge-Users (REU) is an important issue in mobile networks, as it helps the subsequent adjustment of loads in different cells. However, existing approaches usually determine the REU manually, which are experience-dependent and labor-intensive, and thus the estimated REU might be imprecise. Considering the inherited graph structure of mobile networks, in this paper, we utilize a graph-based deep learning method for automatic REU estimation, where the practical cells are deemed as nodes and the load switchings among them constitute edges. Concretely, Graph Attention Network (GAT) is employed as the backbone of our method due to its impressive generalizability in dealing with networked data. Nevertheless, conventional GAT cannot make full use of the information in mobile networks, since it only incorporates node features to infer the pairwise importance and conduct graph convolutions, while the edge features that are actually critical in our problem are disregarded. To accommodate this issue, we propose an Edge-Aware Graph Attention Network (EAGAT), which is able to fuse the node features and edge features for REU estimation. Extensive experimental results on two real-world mobile network datasets demonstrate the superiority of our EAGAT approach to several state-of-the-art methods.

A Novel Random Forest Dissimilarity Measure for Multi-View Learning

Hongliu Cao, Simon Bernard, Robert Sabourin, Laurent Heutte

Responsive image

Auto-TLDR; Multi-view Learning with Random Forest Relation Measure and Instance Hardness

Slides Poster Similar

Multi-view learning is a learning task in which data is described by several concurrent representations. Its main challenge is most often to exploit the complementarities between these representations to help solve a classification/regression task. This is a challenge that can be met nowadays if there is a large amount of data available for learning. However, this is not necessarily true for all real-world problems, where data are sometimes scarce (e.g. problems related to the medical environment). In these situations, an effective strategy is to use intermediate representations based on the dissimilarities between instances. This work presents new ways of constructing these dissimilarity representations, learning them from data with Random Forest classifiers. More precisely, two methods are proposed, which modify the Random Forest proximity measure, to adapt it to the context of High Dimension Low Sample Size (HDLSS) multi-view classification problems. The second method, based on an Instance Hardness measurement, is significantly more accurate than other state-of-the-art measurements including the original RF Proximity measurement and the Large Margin Nearest Neighbor (LMNN) metric learning measurement.

Heterogeneous Graph-Based Knowledge Transfer for Generalized Zero-Shot Learning

Junjie Wang, Xiangfeng Wang, Bo Jin, Junchi Yan, Wenjie Zhang, Hongyuan Zha

Responsive image

Auto-TLDR; Heterogeneous Graph-based Knowledge Transfer for Generalized Zero-Shot Learning

Slides Poster Similar

Generalized zero-shot learning (GZSL) tackles the problem of learning to classify instances involving both seen classes and unseen ones. The key issue is how to effectively transfer the model learned from seen classes to unseen classes. Existing works in GZSL usually assume that some prior information about unseen classes are available. However, such an assumption is unrealistic when new unseen classes appear dynamically. To this end, we propose a novel heterogeneous graph-based knowledge transfer method (HGKT) for GZSL, agnostic to unseen classes and instances, by leveraging graph neural network. Specifically, a structured heterogeneous graph is constructed with high-level representative nodes for seen classes, which are chosen through Wasserstein barycenter in order to simultaneously capture inter-class and intra-class relationship. The aggregation and embedding functions can be learned throughgraph neural network, which can be used to compute the embeddings of unseen classes by transferring the knowledge from their neighbors. Extensive experiments on public benchmark datasets show that our method achieves state-of-the-art results.

3D Facial Matching by Spiral Convolutional Metric Learning and a Biometric Fusion-Net of Demographic Properties

Soha Sadat Mahdi, Nele Nauwelaers, Philip Joris, Giorgos Bouritsas, Imperial London, Sergiy Bokhnyak, Susan Walsh, Mark Shriver, Michael Bronstein, Peter Claes

Responsive image

Auto-TLDR; Multi-biometric Fusion for Biometric Verification using 3D Facial Mesures

Slides Similar

Face recognition is a widely accepted biometric verification tool, as the face contains a lot of information about the identity of a person. In this study, a 2-step neural-based pipeline is presented for matching 3D facial shape to multiple DNA-related properties (sex, age, BMI and genomic background). The first step consists of a triplet loss-based metric learner that compresses facial shape into a lower dimensional embedding while preserving information about the property of interest. Most studies in the field of metric learning have only focused on Euclidean data. In this work, geometric deep learning is employed to learn directly from 3D facial meshes. To this end, spiral convolutions are used along with a novel mesh-sampling scheme that retains uniformly sampled 3D points at different levels of resolution. The second step is a multi-biometric fusion by a fully connected neural network. The network takes an ensemble of embeddings and property labels as input and returns genuine and imposter scores. Since embeddings are accepted as an input, there is no need to train classifiers for the different properties and available data can be used more efficiently. Results obtained by a 10-fold cross-validation for biometric verification show that combining multiple properties leads to stronger biometric systems. Furthermore, the proposed neural-based pipeline outperforms a linear baseline, which consists of principal component analysis, followed by classification with linear support vector machines and a Naïve Bayes-based score-fuser.

Robust Skeletonization for Plant Root Structure Reconstruction from MRI

Jannis Horn

Responsive image

Auto-TLDR; Structural reconstruction of plant roots from MRI using semantic root vs shoot segmentation and 3D skeletonization

Slides Poster Similar

Structural reconstruction of plant roots from MRI is challenging, because of low resolution and low signal-to-noise ratio of the 3D measurements which may lead to disconnectivities and wrongly connected roots. We propose a two-stage approach for this task. The first stage is based on semantic root vs. soil segmentation and finds lowest-cost paths from any root voxel to the shoot. The second stage takes the largest fully connected component generated in the first stage and uses 3D skeletonization to extract a graph structure. We evaluate our method on 22 MRI scans and compare to human expert reconstructions.

A Multi-Head Self-Relation Network for Scene Text Recognition

Zhou Junwei, Hongchao Gao, Jiao Dai, Dongqin Liu, Jizhong Han

Responsive image

Auto-TLDR; Multi-head Self-relation Network for Scene Text Recognition

Slides Poster Similar

The text embedded in scene images can be seen everywhere in our lives. However, recognizing text from natural scene images is still a challenge because of its diverse shapes and distorted patterns. Recently, advanced recognition networks generally treat scene text recognition as a sequence prediction task. Although achieving excellent performance, these recognition networks consider the feature map cells as independent individuals and update cells state without utilizing the information of their neighboring cells. And the local receptive field of traditional convolutional neural network (CNN) makes a single cell that cannot cover the whole text region in an image. Due to these issues, the existing recognition networks cannot extract the global context in a visual scene. To deal with the above problems, we propose a Multi-head Self-relation Network(MSRN) for scene text recognition in this paper. The MSRN consists of several multi-head self-relation layers, which is designed for extracting the global context of a visual scene, so that transforms a cell into a new cell that fuses the information of the related cells. Furthermore, experiments over several public datasets demonstrate that our proposed recognition network achieves superior performance on several benchmark datasets including IC03, IC13, IC15, SVT-Perspective.

Assortative-Constrained Stochastic Block Models

Daniel Gribel, Thibaut Vidal, Michel Gendreau

Responsive image

Auto-TLDR; Constrained Stochastic Block Models for Assortative Communities in Neural Networks

Slides Poster Similar

Stochastic block models (SBMs) are often used to find assortative community structures in networks, such that the probability of connections within communities is higher than in between communities. However, classic SBMs are not limited to assortative structures. In this study, we discuss the implications of this model-inherent indifference towards assortativity or disassortativity, and show that it can lead to undesirable outcomes in datasets which are known to be assortative but which contain a reduced amount of information. To circumvent these issues, we propose a constrained SBM that imposes strong assortativity constraints, along with efficient algorithmic solutions. These constraints significantly boost community-detection capabilities in regimes which are close to the detectability threshold. They also permit to identify structurally-different communities in networks representing cerebral-cortex activity regions.

Low-Cost Lipschitz-Independent Adaptive Importance Sampling of Stochastic Gradients

Huikang Liu, Xiaolu Wang, Jiajin Li, Man-Cho Anthony So

Responsive image

Auto-TLDR; Adaptive Importance Sampling for Stochastic Gradient Descent

Slides Similar

Stochastic gradient descent (SGD) usually samples training data based on the uniform distribution, which may not be a good choice because of the high variance of its stochastic gradient. Thus, importance sampling methods are considered in the literature to improve the performance. Most previous work on SGD-based methods with importance sampling requires the knowledge of Lipschitz constants of all component gradients, which are in general difficult to estimate. In this paper, we study an adaptive importance sampling method for common SGD-based methods by exploiting the local first-order information without knowing any Lipschitz constants. In particular, we periodically changes the sampling distribution by only utilizing the gradient norms in the past few iterations. We prove that our adaptive importance sampling non-asymptotically reduces the variance of the stochastic gradients in SGD, and thus better convergence bounds than that for vanilla SGD can be obtained. We extend this sampling method to several other widely used stochastic gradient algorithms including SGD with momentum and ADAM. Experiments on common convex learning problems and deep neural networks illustrate notably enhanced performance using the adaptive sampling strategy.

Aggregating Dependent Gaussian Experts in Local Approximation

Hamed Jalali, Gjergji Kasneci

Responsive image

Auto-TLDR; A novel approach for aggregating the Gaussian experts by detecting strong violations of conditional independence

Slides Poster Similar

Distributed Gaussian processes (DGPs) are prominent local approximation methods to scale Gaussian processes (GPs) to large datasets. Instead of a global estimation, they train local experts by dividing the training set into subsets, thus reducing the time complexity. This strategy is based on the conditional independence assumption, which basically means that there is a perfect diversity between the local experts. In practice, however, this assumption is often violated, and the aggregation of experts leads to sub-optimal and inconsistent solutions. In this paper, we propose a novel approach for aggregating the Gaussian experts by detecting strong violations of conditional independence. The dependency between experts is determined by using a Gaussian graphical model, which yields the precision matrix. The precision matrix encodes conditional dependencies between experts and is used to detect strongly dependent experts and construct an improved aggregation. Using both synthetic and real datasets, our experimental evaluations illustrate that our new method outperforms other state-of-the-art (SOTA) DGP approaches while being substantially more time-efficient than SOTA approaches, which build on independent experts.

3D Pots Configuration System by Optimizing Over Geometric Constraints

Jae Eun Kim, Muhammad Zeeshan Arshad, Seong Jong Yoo, Je Hyeong Hong, Jinwook Kim, Young Min Kim

Responsive image

Auto-TLDR; Optimizing 3D Configurations for Stable Pottery Restoration from irregular and noisy evidence

Slides Poster Similar

While potteries are common artifacts excavated in archaeological sites, the restoration process relies on the manual cleaning and reassembling shattered pieces. Since the number of possible 3D configurations is considerably large, the exhaustive manual trial may result in an abrasion on fractured surfaces and even failure to find the correct matches. As a result, many recent works suggest virtual reassembly from 3D scans of the fragments. The problem is challenging in the view of the conventional 3D geometric analysis, as it is hard to extract reliable shape features from the thin break lines. We propose to optimize the global configuration by combining geometric constraints with information from noisy shape features. Specifically, we enforce bijection and continuity of sequence of correspondences given estimates of corners and pair-wise matching scores between multiple break lines. We demonstrate that our pipeline greatly increases the accuracy of correspondences, resulting in the stable restoration of 3D configurations from irregular and noisy evidence.

A Plane-Based Approach for Indoor Point Clouds Registration

Ketty Favre, Muriel Pressigout, Luce Morin, Eric Marchand

Responsive image

Auto-TLDR; A plane-based registration approach for indoor environments based on LiDAR data

Slides Poster Similar

Iterative Closest Point (ICP) is one of the mostly used algorithms for 3D point clouds registration. This classical approach can be impacted by the large number of points contained in a point cloud. Planar structures, which are less numerous than points, can be used in well-structured man-made environment. In this paper we propose a registration method inspired by the ICP algorithm in a plane-based registration approach for indoor environments. This method is based solely on data acquired with a LiDAR sensor. A new metric based on plane characteristics is introduced to find the best plane correspondences. The optimal transformation is estimated through a two-step minimization approach, successively performing robust plane-to-plane minimization and non-linear robust point-to-plane registration. Experiments on the Autonomous Systems Lab (ASL) dataset show that the proposed method enables to successfully register 100% of the scans from the three indoor sequences. Experiments also show that the proposed method is more robust in large motion scenarios than other state-of-the-art algorithms.

Recognizing Bengali Word Images - A Zero-Shot Learning Perspective

Sukalpa Chanda, Daniël Arjen Willem Haitink, Prashant Kumar Prasad, Jochem Baas, Umapada Pal, Lambert Schomaker

Responsive image

Auto-TLDR; Zero-Shot Learning for Word Recognition in Bengali Script

Slides Poster Similar

Zero-Shot Learning(ZSL) techniques could classify a completely unseen class, which it has never seen before during training. Thus, making it more apt for any real-life classification problem, where it is not possible to train a system with annotated data for all possible class types. This work investigates recognition of word images written in Bengali Script in a ZSL framework. The proposed approach performs Zero-Shot word recognition by coupling deep learned features procured from VGG16 architecture along with 13 basic shapes/stroke primitives commonly observed in Bengali script characters. As per the notion of ZSL framework those 13 basic shapes are termed as “Signature Attributes”. The obtained results are promising while evaluation was carried out in a Five-Fold cross-validation setup dealing with samples from 250 word classes.

Privacy Attributes-Aware Message Passing Neural Network for Visual Privacy Attributes Classification

Hanbin Hong, Wentao Bao, Yuan Hong, Yu Kong

Responsive image

Auto-TLDR; Privacy Attributes-Aware Message Passing Neural Network for Visual Privacy Attribute Classification

Slides Poster Similar

Visual Privacy Attribute Classification (VPAC) identifies privacy information leakage via social media images. These images containing privacy attributes such as skin color, face or gender are classified into multiple privacy attribute categories in VPAC. With limited works in this task, current methods often extract features from images and simply classify the extracted feature into multiple privacy attribute classes. The dependencies between privacy attributes, e.g., skin color and face typically co-exist in the same image, are usually ignored in classification, which causes performance degradation in VPAC. In this paper, we propose a novel end-to-end Privacy Attributes-aware Message Passing Neural Network (PA-MPNN) to address VPAC. Privacy attributes are considered as nodes on a graph and an MPNN is introduced to model the privacy attribute dependencies. To generate representative features for privacy attribute nodes, a class-wise encoder-decoder is proposed to learn a latent space for each attribute. An attention mechanism with multiple correlation matrices is also introduced in MPNN to learn the privacy attributes graph automatically. Experimental results on the Privacy Attribute Dataset demonstrate that our framework achieves better performance than state-of-the-art methods on visual privacy attributes classification.

A General Model for Learning Node and Graph Representations Jointly

Chaofan Chen

Responsive image

Auto-TLDR; Joint Community Detection/Dynamic Routing for Graph Classification

Slides Poster Similar

This paper focuses on two fundamental graph recognition tasks: node classification and graph classification. Existing methods usually learn the node and graph representations for these two tasks separately, and ignore modeling the relations between the local and global structures. In this paper, we propose a general approach to learn the local and global features collaboratively: (1) in order to characterize the correlation among nodes and communities (a set of nodes), we employ the joint community detection/dynamic routing modules to generate the clustering assignment matrices at first and then utilize these matrices to cluster nodes to capture the global information of graphs (locally relevant graph representations). Inspired by the success of spectral clustering, we minimize the ratiocut loss to help optimize the learned assignment matrices. (2) We maximize the mutual information between local and global representations to help learn the globally relevant node representations. Experimental results on a variety of node and graph classification benchmarks show that our model can achieve superior performance over the state-of-the-art approaches.

Tensor Factorization of Brain Structural Graph for Unsupervised Classification in Multiple Sclerosis

Berardino Barile, Marzullo Aldo, Claudio Stamile, Françoise Durand-Dubief, Dominique Sappey-Marinier

Responsive image

Auto-TLDR; A Fully Automated Tensor-based Algorithm for Multiple Sclerosis Classification based on Structural Connectivity Graph of the White Matter Network

Slides Poster Similar

Analysis of longitudinal changes in brain diseases is essential for a better characterization of pathological processes and evaluation of the prognosis. This is particularly important in Multiple Sclerosis (MS) which is the first traumatic disease in young adults, with unknown etiology and characterized by complex inflammatory and degenerative processes leading to different clinical courses. In this work, we propose a fully automated tensor-based algorithm for the classification of MS clinical forms based on the structural connectivity graph of the white matter (WM) network. Using non-negative tensor factorization (NTF), we first focused on the detection of pathological patterns of the brain WM network affected by significant longitudinal variations. Second, we performed unsupervised classification of different MS phenotypes based on these longitudinal patterns, and finally, we used the latent factors obtained by the factorization algorithm to identify the most affected brain regions.

Cost Volume Refinement for Depth Prediction

João L. Cardoso, Nuno Goncalves, Michael Wimmer

Responsive image

Auto-TLDR; Refining the Cost Volume for Depth Prediction from Light Field Cameras

Slides Poster Similar

Light-field cameras are becoming more popular in the consumer market. Their data redundancy allows, in theory, to accurately refocus images after acquisition and to predict the depth of each point visible from the camera. Combined, these two features allow for the generation of full-focus images, which is impossible in traditional cameras. Multiple methods for depth prediction from light fields (or stereo) have been proposed over the years. A large subset of these methods relies on cost-volume estimates -- 3D objects where each layer represents a heuristic of whether each point in the image is at a certain distance from the camera. Generally, this volume is used to regress a disparity map, which is then refined for better results. In this paper, we argue that refining the cost volumes is superior to refining the disparity maps in order to further increase the accuracy of depth predictions. We propose a set of cost-volume refinement algorithms and show their effectiveness.

Recovery of 2D and 3D Layout Information through an Advanced Image Stitching Algorithm Using Scanning Electron Microscope Images

Aayush Singla, Bernhard Lippmann, Helmut Graeb

Responsive image

Auto-TLDR; Image Stitching for True Geometrical Layout Recovery in Nanoscale Dimension

Slides Poster Similar

Image stitching describes the process of reconstruction of a high resolution image from combining multiple images. Using a scanning electron microscope as the image source, individual images will show patterns in a nm dimension whereas the combined image may cover an area of several mm2. The recovery of the physical layout of modern semiconductor products manufactured in advanced technologies nodes down to 22 nm requires a perfect stitching process with no deviation with respect to the original design data, as any stitching error will result in failures during the reconstruction of the electrical design. In addition, the recovery of the complete design requires the acquisition of all individual layers of a semiconductor device which represent a 3D structure with interconnections defining error limits on the stitching error for each individual scanned image mosaic. An advanced stitching and alignment process is presented enabling a true geometrical layout recovery in nanoscale dimensions which is also applied and evaluated on other use cases from biological applications.

Using Scene Graphs for Detecting Visual Relationships

Anurag Tripathi, Siddharth Srivastava, Brejesh Lall, Santanu Chaudhury

Responsive image

Auto-TLDR; Relationship Detection using Context Aligned Scene Graph Embeddings

Slides Poster Similar

In this paper we solve the problem of detecting relationships between pairs of objects in an image. We develop spatially aware word embeddings using scene graphs and use joint feature representations containing visual, spatial and semantic embeddings from the input images to train a deep network on the task of relationship detection. Further, we propose to utilize context aligned scene graph embeddings from the train set, without requiring explicit availability of scene graphs at test time. We show that the proposed method outperforms the state-of-the-art methods for predicate detection and provides competing results on relationship detection. We also show the generalization ability of the proposed method by performing predictions under zero shot settings. Further, we also provide an exhaustive empirical evaluation on each component of the proposed network.

Deep Convolutional Embedding for Digitized Painting Clustering

Giovanna Castellano, Gennaro Vessio

Responsive image

Auto-TLDR; A Deep Convolutional Embedding Model for Clustering Artworks

Slides Poster Similar

Clustering artworks is difficult because of several reasons. On one hand, recognizing meaningful patterns in accordance with domain knowledge and visual perception is extremely hard. On the other hand, the application of traditional clustering and feature reduction techniques to the highly dimensional pixel space can be ineffective. To address these issues, we propose to use a deep convolutional embedding model for digitized painting clustering, in which the task of mapping the input raw data to an abstract, latent space is jointly optimized with the task of finding a set of cluster centroids in this latent feature space. Quantitative and qualitative experimental results show the effectiveness of the proposed method. The model is also able to outperform other state-of-the-art deep clustering approaches to the same problem. The proposed method may be beneficial to several art-related tasks, particularly visual link retrieval and historical knowledge discovery in painting datasets.

2D Discrete Mirror Transform for Image Non-Linear Approximation

Alessandro Gnutti, Fabrizio Guerrini, Riccardo Leonardi

Responsive image

Auto-TLDR; Discrete Mirror Transform (DMT)

Slides Poster Similar

In this paper, a new 2D transform named Discrete Mirror Transform (DMT) is presented. The DMT is computed by decomposing a signal into its even and odd parts around an optimal location in a given direction so that the signal energy is maximally split between the two components. After minimizing the information required to regenerate the original signal by removing redundant structures, the process is iterated leading the signal energy to distribute into a continuously smaller set of coefficients. The DMT can be displayed as a binary tree, where each node represents the single (even or odd) signal derived from the decomposition in the previous level. An optimized version of the DMT (ODMT) is also introduced, by exploiting the possibility to choose different directions at which performing the decomposition. Experimental simulations have been carried out in order to test the sparsity properties of the DMT and ODMT when applied on images: referring to both transforms, the results show a superior performance with respect to the popular Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT) in terms of non-linear approximation.

Adaptive Word Embedding Module for Semantic Reasoning in Large-Scale Detection

Yu Zhang, Xiaoyu Wu, Ruolin Zhu

Responsive image

Auto-TLDR; Adaptive Word Embedding Module for Object Detection

Slides Poster Similar

In recent years, convolutional neural networks have achieved rapid development in the field of object detection. However, due to the imbalance of data, high costs in labor and uneven level of data labeling, the overall performance of the previous detection network has dropped sharply when dataset extended to the large-scale with hundreds and thousands categories. We present the Adaptive Word Embedding Module, extracting the adaptive semantic knowledge graph to reach semantic consistency within one image. Our method endows the ability to infer global semantic of detection networks without other attribute or relationship annotations. Compared with Faster RCNN, the algorithm on the MSCOCO dataset was significantly improved by 4.1%, and the mAP value has reached 32.8%. On the VG1000 dataset, it increased by 0.9% to 6.7% compared with Faster RCNN. Adaptive Word Embedding Module is lightweight, general-purpose and can be plugged into diverse detection networks. Code will be made available.

Improved Time-Series Clustering with UMAP Dimension Reduction Method

Clément Pealat, Vincent Cheutet, Guillaume Bouleux

Responsive image

Auto-TLDR; Time Series Clustering with UMAP as a Pre-processing Step

Slides Poster Similar

Clustering is an unsupervised machine learning method giving insights on data without early knowledge. Classes of data are return by assembling similar elements together. Giving the increasing of the available data, this method is now applied in a lot of fields with various data types. Here, we propose to explore the case of time series clustering. Indeed, time series are one of the most classic data type, and are present in various fields such as medical or finance. This kind of data can be pre-processed by of dimension reduction methods, such as the recent UMAP algorithm. In this paper, a benchmark of time series clustering is created, comparing the results with and without UMAP as a pre-processing step. UMAP is used to enhance clustering results. For completeness, three different clustering algorithms and two different geometric representation for the time series (Classic Euclidean geometry, and Riemannian geometry on the Stiefel Manifold) are applied. The results are compared with and without UMAP as a pre-processing step on the databases available at UCR Time Series Classification Archive www.cs.ucr.edu/~eamonn/time_series_data/.

Kernel-based Graph Convolutional Networks

Hichem Sahbi

Responsive image

Auto-TLDR; Spatial Graph Convolutional Networks in Recurrent Kernel Hilbert Space

Slides Poster Similar

Learning graph convolutional networks (GCNs) is an emerging field which aims at generalizing deep learning to arbitrary non-regular domains. Most of the existing GCNs follow a neighborhood aggregation scheme, where the representation of a node is recursively obtained by aggregating its neighboring node representations using averaging or sorting operations. However, these operations are either ill-posed or weak to be discriminant or increase the number of training parameters and thereby the computational complexity and the risk of overfitting. In this paper, we introduce a novel GCN framework that achieves spatial graph convolution in a reproducing kernel Hilbert space. The latter makes it possible to design, via implicit kernel representations, convolutional graph filters in a high dimensional and more discriminating space without increasing the number of training parameters. The particularity of our GCN model also resides in its ability to achieve convolutions without explicitly realigning nodes in the receptive fields of the learned graph filters with those of the input graphs, thereby making convolutions permutation agnostic and well defined. Experiments conducted on the challenging task of skeleton-based action recognition show the superiority of the proposed method against different baselines as well as the related work.

TreeRNN: Topology-Preserving Deep Graph Embedding and Learning

Yecheng Lyu, Ming Li, Xinming Huang, Ulkuhan Guler, Patrick Schaumont, Ziming Zhang

Responsive image

Auto-TLDR; TreeRNN: Recurrent Neural Network for General Graph Classification

Slides Poster Similar

General graphs are difficult for learning due to their irregular structures. Existing works employ message passing along graph edges to extract local patterns using customized graph kernels, but few of them are effective for the integration of such local patterns into global features. In contrast, in this paper we study the methods to transfer the graphs into trees so that explicit orders are learned to direct the feature integration from local to global. To this end, we apply the breadth first search (BFS) to construct trees from the graphs, which adds direction to the graph edges from the center node to the peripheral nodes. In addition, we proposed a novel projection scheme that transfer the trees to image representations, which is suitable for conventional convolution neural networks (CNNs) and recurrent neural networks (RNNs). To best learn the patterns from the graph-tree-images, we propose TreeRNN, a 2D RNN architecture that recurrently integrates the image pixels by rows and columns to help classify the graph categories. We evaluate the proposed method on several graph classification datasets, and manage to demonstrate comparable accuracy with the state-of-the-art on MUTAG, PTC-MR and NCI1 datasets.

Joint Learning Multiple Curvature Descriptor for 3D Palmprint Recognition

Lunke Fei, Bob Zhang, Jie Wen, Chunwei Tian, Peng Liu, Shuping Zhao

Responsive image

Auto-TLDR; Joint Feature Learning for 3D palmprint recognition using curvature data vectors

Slides Poster Similar

3D palmprint-based biometric recognition has drawn growing research attention due to its several merits over 2D counterpart such as robust structural measurement of a palm surface and high anti-counterfeiting capability. However, most existing 3D palmprint descriptors are hand-crafted that usually extract stationary features from 3D palmprint images. In this paper, we propose a feature learning method to jointly learn compact curvature feature descriptor for 3D palmprint recognition. We first form multiple curvature data vectors to completely sample the intrinsic curvature information of 3D palmprint images. Then, we jointly learn a feature projection function that project curvature data vectors into binary feature codes, which have the maximum inter-class variances and minimum intra-class distance so that they are discriminative. Moreover, we learn the collaborative binary representation of the multiple curvature feature codes by minimizing the information loss between the final representation and the multiple curvature features, so that the proposed method is more compact in feature representation and efficient in matching. Experimental results on the baseline 3D palmprint database demonstrate the superiority of the proposed method in terms of recognition performance in comparison with state-of-the-art 3D palmprint descriptors.

GraphBGS: Background Subtraction Via Recovery of Graph Signals

Jhony Heriberto Giraldo Zuluaga, Thierry Bouwmans

Responsive image

Auto-TLDR; Graph BackGround Subtraction using Graph Signals

Slides Poster Similar

Background subtraction is a fundamental pre-processing task in computer vision. This task becomes challenging in real scenarios due to variations in the background for both static and moving camera sequences. Several deep learning methods for background subtraction have been proposed in the literature with competitive performances. However, these models show performance degradation when tested on unseen videos; and they require huge amount of data to avoid overfitting. Recently, graph-based algorithms have been successful approaching unsupervised and semi-supervised learning problems. Furthermore, the theory of graph signal processing and semi-supervised learning have been combined leading to new insights in the field of machine learning. In this paper, concepts of recovery of graph signals are introduced in the problem of background subtraction. We propose a new algorithm called Graph BackGround Subtraction (GraphBGS), which is composed of: instance segmentation, background initialization, graph construction, graph sampling, and a semi-supervised algorithm inspired from the theory of recovery of graph signals. Our algorithm has the advantage of requiring less data than deep learning methods while having competitive results on both: static and moving camera videos. GraphBGS outperforms unsupervised and supervised methods in several challenging conditions on the publicly available Change Detection (CDNet2014), and UCSD background subtraction databases.

Directional Graph Networks with Hard Weight Assignments

Miguel Dominguez, Raymond Ptucha

Responsive image

Auto-TLDR; Hard Directional Graph Networks for Point Cloud Analysis

Slides Poster Similar

Point cloud analysis is an important field for 3D scene understanding. It has applications in self driving cars and robotics (via LIDAR sensors), 3D graphics, and computer-aided design. Neural networks have recently achieved strong results on point cloud analysis problems such as classification and segmentation. Each point cloud network has the challenge of defining a convolution that can learn useful features on unstructured points. Some recent point cloud convolutions create separate weight matrices for separate directions like a CNN, but apply every weight matrix to every neighbor with soft assignments. This increases computational complexity and makes relatively small neighborhood aggregations expensive to compute. We propose Hard Directional Graph Networks (HDGN), a point cloud model that both learns directional weight matrices and assigns a single matrix to each neighbor, achieving directional convolutions at lower computational cost. HDGN's directional modeling achieves state-of-the-art results on multiple point cloud vision benchmarks.

How to Define a Rejection Class Based on Model Learning?

Sarah Laroui, Xavier Descombes, Aurelia Vernay, Florent Villiers, Francois Villalba, Eric Debreuve

Responsive image

Auto-TLDR; An innovative learning strategy for supervised classification that is able, by design, to reject a sample as not belonging to any of the known classes

Slides Poster Similar

In supervised classification, the learning process typically trains a classifier to optimize the accuracy of classifying data into the classes that appear in the learning set, and only them. While this framework fits many use cases, there are situations where the learning process is knowingly performed using a learning set that only represents the data that have been observed so far among a virtually unconstrained variety of possible samples. It is then crucial to define a classifier which has the ability to reject a sample, i.e., to classify it into a rejection class that has not been yet defined. Although obvious solutions can add this ability a posteriori to a classifier that has been learned classically, a better approach seems to directly account for this requirement in the classifier design. In this paper, we propose an innovative learning strategy for supervised classification that is able, by design, to reject a sample as not belonging to any of the known classes. For that, we rely on modeling each class as the combination of a probability density function (PDF) and a threshold that is computed with respect to the other classes. Several alternatives are proposed and compared in this framework. A comparison with straightforward approaches is also provided.

Graph Convolutional Neural Networks for Power Line Outage Identification

Jia He, Maggie Cheng

Responsive image

Auto-TLDR; Graph Convolutional Networks for Power Line Outage Identification

Poster Similar

In this paper, we consider the power line outage identification problem as a graph signal classification problem, where the signal at each vertex is given as a time series. We propose graph convolutional networks (GCNs) for the task of classifying signals supported on graphs. An important element of the GCN design is filter design. We consider filtering signals in either the vertex (spatial) domain, or the frequency (spectral) domain. Two basic architectures are proposed. In the spatial GCN architecture, the GCN uses a graph shift operator as the basic building block to incorporate the underlying graph structure into the convolution layer. The spatial filter directly utilizes the graph connectivity information. It defines the filter to be a polynomial in the graph shift operator to obtain the convolved features that aggregate neighborhood information of each node. In the spectral GCN architecture, a frequency filter is used instead. A graph Fourier transform operator first transforms the raw graph signal from the vertex domain to the frequency domain, and then a filter is defined using the graph's spectral parameters. The spectral GCN then uses the output from the graph Fourier transform to compute the convolved features. There are additional challenges to classify the time-evolving graph signal as the signal value at each vertex changes over time. The GCNs are designed to recognize different spatiotemporal patterns from high-dimensional data defined on a graph. The application of the proposed methods to power line outage identification shows that these GCN architectures can successfully classify abnormal signal patterns and identify the outage location.

Supervised Feature Embedding for Classification by Learning Rank-Based Neighborhoods

Ghazaal Sheikhi, Hakan Altincay

Responsive image

Auto-TLDR; Supervised Feature Embedding with Representation Learning of Rank-based Neighborhoods

Slides Similar

In feature embedding, the recovery of associated discriminative information in the reduced subspace is critical for downstream classifiers. In this study, a supervised feature embedding method is proposed inspired by the well-known word embedding technique, word2vec. Proposed embedding method is implemented as representative learning of rank-based neighborhoods. The notion of context words in word2vec is extended into neighboring instances within a given window. Neighborship is defined using ranks of instances rather than their values so that regions with different densities are captured properly. Each sample is represented by a unique one-hot vector whereas its neighbors are encoded by several two-hot vectors. The two-hot vectors are identical for neighboring samples of the same class. A feed-forward neural network with a continuous projection layer, then learns the mapping from one-hot vectors to multiple two-hot vectors. The hidden layer determines the reduced subspace for the train samples. The obtained transformation is then applied on test data to find a lower-dimensional representation. Proposed method is tested in classification problems on 10 UCI data sets. Experimental results confirm that the proposed method is effective in finding a discriminative representation of the features and outperforms several supervised embedding approaches in terms of classification performance.

Efficient Game-Theoretic Hypergraph Matching

Jian Hou, Nai-Ming Qi

Responsive image

Auto-TLDR; Hypergraph Matching with Game-Theoretic Clustering

Slides Poster Similar

Feature matching is a fundamental problem in computer vision. Compared with graph matching, hypergraph matching is able to encode more invariance between correspondences. Different from the majority of existing hypergraph matching algorithms, a game-theoretic algorithm has been developed by transforming hypergraph matching to hypergraph clustering, which is then solved within the framework of a non-cooperative multi-player clustering game. This algorithm obtains the final matches as a cluster of consistent candidate matches and has high accuracy and robustness to outliers in comparison with other competitors. However, in further works we find that this algorithm tends to generate a small number of matches, and the increase of number of matches can only be obtained at the cost of a huge computation load. Our investigation of the algorithm shows that it has a large requirement of internal similarity in a cluster, and therefore generates small clusters of high density. This motivates us to expand the cluster so that more candidate matches are accepted as final matches. For this purpose, we define the density of vertices in a hypergraph and expand the cluster based on relative density relationship among the vertices. In matching experiments with both synthetic and real datasets, our algorithm is shown to generate the same number of or more matches with much less running time in comparison with the original algorithm. Meanwhile, it preserves the advantage of high accuracy and robustness to outlier in comparison with some competitors.