ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Leveraging Sequential Pattern Information for Active Learning from Sequential Data

Raul Fidalgo-Merino, Lorenzo Gabrielli, Enrico Checchi

Auto-TLDR; Sequential Pattern Information for Active Learning

This paper presents a novel active learning technique for selecting sequences for manual annotation from a database of unlabelled sequences. Supervised machine learning algorithms can use the selected sequences to build better models than those trained on randomly chosen sequences. The main contribution of the proposed method is the use of sequential pattern information contained in the database to select representative and diverse sequences for annotation. Together, these two characteristics ensure proper coverage of the instance space of sequences while avoiding over-fitting of the trained model. The approach, called SPIAL (Sequential Pattern Information for Active Learning), uses sequential pattern mining algorithms to extract frequently occurring sub-sequences from the database and, based on this information, evaluates how representative and diverse each sequence is. The output is a list of sequences for annotation sorted by representativeness and diversity. The algorithm is modular and, unlike current techniques, independent of the features taken into account by the machine learning algorithm that trains the model. Experiments on well-known benchmarks involving sequential data show that models trained using SPIAL converge faster and require less manual effort, because small sets of very informative sequences are selected for annotation. In addition, the computational cost of SPIAL is much lower than that of the state-of-the-art algorithms evaluated.
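
The general idea of ranking sequences by pattern coverage can be illustrated with a minimal sketch. It is not the authors' SPIAL implementation, and every name and scoring rule in it is an assumption made for illustration: frequent contiguous n-grams stand in for a real sequential pattern miner (which would also allow gaps), "representativeness" is taken to be the summed support of the frequent patterns a sequence contains, and "diversity" is approximated by greedily preferring sequences whose patterns are not yet covered by earlier picks.

```python
from collections import Counter


def mine_frequent_ngrams(sequences, max_len=3, min_support=2):
    """Return {pattern: support} for contiguous n-grams (length <= max_len)
    that occur in at least `min_support` sequences.
    Assumption: n-grams are a simplified stand-in for sequential patterns."""
    support = Counter()
    for seq in sequences:
        seen = set()
        for n in range(1, max_len + 1):
            for i in range(len(seq) - n + 1):
                seen.add(tuple(seq[i:i + n]))
        support.update(seen)  # each pattern counts once per sequence
    return {p: s for p, s in support.items() if s >= min_support}


def patterns_in(seq, patterns, max_len=3):
    """Return the set of frequent patterns contained in one sequence."""
    found = set()
    for n in range(1, max_len + 1):
        for i in range(len(seq) - n + 1):
            gram = tuple(seq[i:i + n])
            if gram in patterns:
                found.add(gram)
    return found


def rank_for_annotation(sequences, max_len=3, min_support=2):
    """Greedily order sequence indices for annotation (hypothetical scoring).

    At each step, pick the sequence whose not-yet-covered frequent patterns
    have the highest total support; ties go to the lowest index."""
    patterns = mine_frequent_ngrams(sequences, max_len, min_support)
    contained = [patterns_in(s, patterns, max_len) for s in sequences]
    covered = set()
    remaining = set(range(len(sequences)))
    ranking = []
    while remaining:
        def gain(i):
            return sum(patterns[p] for p in contained[i] - covered)

        best = max(sorted(remaining), key=gain)
        ranking.append(best)
        covered |= contained[best]
        remaining.remove(best)
    return ranking


if __name__ == "__main__":
    toy = ["abc", "abc", "xyz", "xyz", "abq"]
    print(rank_for_annotation(toy, max_len=2, min_support=2))
```

On this toy input the first pick is an "abc" sequence and the second comes from the "xyz" group rather than the duplicate "abc", which is the behaviour the diversity criterion is meant to encourage. Like the approach described in the abstract, the sketch only looks at the raw sequences and does not depend on the features used by the downstream classifier.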

Similar papers

Budgeted Batch Mode Active Learning with Generalized Cost and Utility Functions

Arvind Agarwal, Shashank Mujumdar, Nitin Gupta, Sameep Mehta

Auto-TLDR; Active Learning Based on Utility and Cost Functions

Learning Neural Textual Representations for Citation Recommendation

Algorithm Recommendation for Data Streams

Minority Class Oriented Active Learning for Imbalanced Datasets

Learning to Rank for Active Learning: A Listwise Approach

Scientific Document Summarization using Citation Context and Multi-objective Optimization

Multi-annotator Probabilistic Active Learning

Assessing the Severity of Health States Based on Social Media Posts

Adversarial Training for Aspect-Based Sentiment Analysis with BERT

Text Synopsis Generation for Egocentric Videos

Multimodal Side-Tuning for Document Classification

Named Entity Recognition and Relation Extraction with Graph Neural Networks in Semi Structured Documents

Automatic Classification of Human Granulosa Cells in Assisted Reproductive Technology Using Vibrational Spectroscopy Imaging

Cross-Supervised Joint-Event-Extraction with Heterogeneous Information Networks

Weakly Supervised Learning through Rank-Based Contextual Measures

The eXPose Approach to Crosslier Detection

Categorizing the Feature Space for Two-Class Imbalance Learning

Uncertainty-Aware Data Augmentation for Food Recognition

Textual-Content Based Classification of Bundles of Untranscribed Manuscript Images

PIN: A Novel Parallel Interactive Network for Spoken Language Understanding

Zero-Shot Text Classification with Semantically Extended Graph Convolutional Network

Sketch-Based Community Detection Via Representative Node Sampling

Reinforcement Learning with Dual Attention Guided Graph Convolution for Relation Extraction

Automatic Annotation of Corpora for Emotion Recognition through Facial Expressions Analysis

Explain2Attack: Text Adversarial Attacks via Cross-Domain Interpretability

An Adaptive Video-To-Video Face Identification System Based on Self-Training

Equation Attention Relationship Network (EARN): A Geometric Deep Metric Framework for Learning Similar Math Expression Embedding

A Novel Adaptive Minority Oversampling Technique for Improved Classification in Data Imbalanced Scenarios

Multi-Attribute Learning with Highly Imbalanced Data

Classifier Pool Generation Based on a Two-Level Diversity Approach

Active Sampling for Pairwise Comparisons via Approximate Message Passing and Information Gain Maximization

Rethinking Deep Active Learning: Using Unlabeled Data at Model Training

Personalized Models in Human Activity Recognition Using Deep Learning

Transformer Reasoning Network for Image-Text Matching and Retrieval

Memetic Evolution of Training Sets with Adaptive Radial Basis Kernels for Support Vector Machines

PIF: Anomaly detection via preference embedding

3D Semantic Labeling of Photogrammetry Meshes Based on Active Learning

Attentive Visual Semantic Specialized Network for Video Captioning

Evaluation of BERT and ALBERT Sentence Embedding Performance on Downstream NLP Tasks

Mean Decision Rules Method with Smart Sampling for Fast Large-Scale Binary SVM Classification

Watermelon: A Novel Feature Selection Method Based on Bayes Error Rate Estimation and a New Interpretation of Feature Relevance and Redundancy

Making Every Label Count: Handling Semantic Imprecision by Integrating Domain Knowledge

A Multilinear Sampling Algorithm to Estimate Shapley Values

Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes

Confidence Calibration for Deep Renal Biopsy Immunofluorescence Image Classification

GCNs-Based Context-Aware Short Text Similarity Model

Learning with Delayed Feedback

Supporting Skin Lesion Diagnosis with Content-Based Image Retrieval