ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization

Yair Shemer, Daniel Rotman, Nahum Shimkin

Auto-TLDR; ILS-SUMM: Iterated Local Search for Video Summarization

In recent years, there has been increasing interest in building video summarization tools, where the goal is to automatically create a short summary of an input video that properly represents the original content. We consider shot-based video summarization, where the summary consists of a subset of the video shots, which can be of various lengths. A straightforward way to maximize the representativeness of a subset of shots is to minimize the total distance between shots and their nearest selected shots. We formulate the task of video summarization as an optimization problem with a knapsack-like constraint on the total summary duration. Previous studies have proposed greedy algorithms to solve this problem approximately, but no experiments were presented to measure the ability of these methods to obtain solutions with low total distance. Indeed, our experiments on video summarization datasets show that current methods leave much room for improvement in obtaining results with low total distance. In this paper, we develop ILS-SUMM, a novel video summarization algorithm to solve the subset selection problem under the knapsack constraint. Our algorithm is based on the well-known metaheuristic optimization framework Iterated Local Search (ILS), which is known for its ability to avoid weak local minima and obtain a good near-global minimum. Extensive experiments show that our method finds solutions with significantly lower total distance than previous methods. Moreover, to demonstrate the high scalability of ILS-SUMM, we introduce a new dataset consisting of videos of various lengths.
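
The optimization problem described in the abstract can be illustrated with a minimal sketch. It assumes a precomputed pairwise distance matrix `dist` between shots, a list of shot `durations`, and a duration `budget`. The function names, the add/swap neighbourhood, the perturbation step, and the accept-only-if-better rule are all illustrative assumptions made here; the abstract does not specify them, so this should not be read as the authors' implementation of ILS-SUMM.

```python
import random


def total_distance(selected, dist):
    """Sum, over all shots, of the distance to the nearest selected shot."""
    if not selected:
        return float("inf")
    return sum(min(row[j] for j in selected) for row in dist)


def local_search(selected, dist, durations, budget):
    """First-improvement search over add and swap moves (assumed neighbourhood)."""
    n = len(durations)
    selected = set(selected)
    best = total_distance(selected, dist)
    improved = True
    while improved:
        improved = False
        used = sum(durations[j] for j in selected)
        candidates = []
        for j in range(n):
            if j in selected:
                continue
            # Add shot j if the summary stays within the duration budget.
            if used + durations[j] <= budget:
                candidates.append(selected | {j})
            # Swap a selected shot i for shot j if the result stays feasible.
            for i in selected:
                if used - durations[i] + durations[j] <= budget:
                    candidates.append((selected - {i}) | {j})
        for cand in candidates:
            cost = total_distance(cand, dist)
            if cost < best - 1e-12:
                selected, best, improved = cand, cost, True
                break
    return selected, best


def perturb(selected, durations, budget, rng):
    """Assumed perturbation: drop one random shot, then add a random shot that fits."""
    selected = set(selected)
    if len(selected) > 1:
        selected.remove(rng.choice(sorted(selected)))
    used = sum(durations[j] for j in selected)
    fits = [j for j in range(len(durations))
            if j not in selected and used + durations[j] <= budget]
    if fits:
        selected.add(rng.choice(fits))
    return selected


def ils_summ_sketch(dist, durations, budget, iterations=50, seed=0):
    """Iterated local search: local search, then repeated perturb + local search."""
    rng = random.Random(seed)
    feasible = [j for j in range(len(durations)) if durations[j] <= budget]
    if not feasible:
        raise ValueError("no single shot fits within the budget")
    # Assumed initial solution: the single feasible shot with the lowest cost.
    start = {min(feasible, key=lambda j: total_distance({j}, dist))}
    best_set, best_cost = local_search(start, dist, durations, budget)
    for _ in range(iterations):
        shaken = perturb(best_set, durations, budget, rng)
        cand_set, cand_cost = local_search(shaken, dist, durations, budget)
        # Assumed acceptance criterion: keep the candidate only if it is better.
        if cand_cost < best_cost:
            best_set, best_cost = cand_set, cand_cost
    return sorted(best_set), best_cost


# Toy usage: four shots with 1-D features forming two clusters.
positions = [0, 1, 10, 11]
dist = [[abs(a - b) for b in positions] for a in positions]
durations = [1, 1, 1, 1]
summary, cost = ils_summ_sketch(dist, durations, budget=2)
print(summary, cost)
```

In the toy example the budget allows two shots, so a good summary picks one shot from each cluster. Real shot features would typically be high-dimensional descriptors, with `dist` computed from them.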

Similar papers

Video Summarization with a Dual Attention Capsule Network

Hao Fu, Hongxing Wang, Jianyu Yang

Auto-TLDR; Dual Self-Attention Capsule Network for Video Summarization

Text Synopsis Generation for Egocentric Videos

ActionSpotter: Deep Reinforcement Learning Framework for Temporal Action Spotting in Videos

Scientific Document Summarization using Citation Context and Multi-objective Optimization

Automated Whiteboard Lecture Video Summarization by Content Region Detection and Representation

Hierarchical Multimodal Attention for Deep Video Summarization

Creating Classifier Ensembles through Meta-Heuristic Algorithms for Aerial Scene Classification

Exploiting Local Indexing and Deep Feature Confidence Scores for Fast Image-To-Video Search

RWF-2000: An Open Large Scale Video Database for Violence Detection

Progressive Learning Algorithm for Efficient Person Re-Identification

Audio-Based Near-Duplicate Video Retrieval with Audio Similarity Learning

Deep Convolutional Embedding for Digitized Painting Clustering

A Heuristic-Based Decision Tree for Connected Components Labeling of 3D Volumes

Expectation-Maximization for Scheduling Problems in Satellite Communication

Adaptive Sampling of Pareto Frontiers with Binary Constraints Using Regression and Classification

Understanding Integrated Gradients with SmoothTaylor for Deep Neural Network Attribution

A Bayesian Approach to Reinforcement Learning of Vision-Based Vehicular Control

Attention-Based Deep Metric Learning for Near-Duplicate Video Retrieval

RMS-Net: Regression and Masking for Soccer Event Spotting

3D Semantic Labeling of Photogrammetry Meshes Based on Active Learning

Model Decay in Long-Term Tracking

What and How? Jointly Forecasting Human Action and Pose

Making Every Label Count: Handling Semantic Imprecision by Integrating Domain Knowledge

Deep Composer: A Hash-Based Duplicative Neural Network for Generating Multi-Instrument Songs

Minority Class Oriented Active Learning for Imbalanced Datasets

Trajectory Representation Learning for Multi-Task NMRDP Planning

Learning Natural Thresholds for Image Ranking

Learning from Learners: Adapting Reinforcement Learning Agents to Be Competitive in a Card Game

Memetic Evolution of Training Sets with Adaptive Radial Basis Kernels for Support Vector Machines

Video Face Manipulation Detection through Ensemble of CNNs

ClusterFace: Joint Clustering and Classification for Set-Based Face Recognition

Deep Reinforcement Learning on a Budget: 3D Control and Reasoning without a Supercomputer

Learning Embeddings for Image Clustering: An Empirical Study of Triplet Loss Approaches

N2D: (Not Too) Deep Clustering Via Clustering the Local Manifold of an Autoencoded Embedding

Sketch-Based Community Detection Via Representative Node Sampling

Ballroom Dance Recognition from Audio Recordings

AdaFilter: Adaptive Filter Design with Local Image Basis Decomposition for Optimizing Image Recognition Preprocessing

Edge-Aware Monocular Dense Depth Estimation with Morphology

SSDL: Self-Supervised Domain Learning for Improved Face Recognition

Trainable Spectrally Initializable Matrix Transformations in Convolutional Neural Networks

Interactive Style Space of Deep Features and Style Innovation

Iterative Label Improvement: Robust Training by Confidence Based Filtering and Dataset Partitioning

Trajectory-User Link with Attention Recurrent Networks

Sample-Dependent Distance for 1 : N Identification Via Discriminative Feature Selection

A Grid-Based Representation for Human Action Recognition

A Quantitative Evaluation Framework of Video De-Identification Methods

Video Episode Boundary Detection with Joint Episode-Topic Model

Developing Motion Code Embedding for Action Recognition in Videos