ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Exploiting Local Indexing and Deep Feature Confidence Scores for Fast Image-To-Video Search

Savas Ozkan, Gözde Bozdağı Akar

Auto-TLDR; Fast and Robust Image-to-Video Retrieval Using Local and Global Descriptors

Cost-effective visual representation and fast query-by-example search are two challenging goals that must be met for web-scale visual retrieval on moderate hardware. In this paper, we introduce a fast yet robust method that meets both goals while obtaining state-of-the-art results in an image-to-video search scenario. To this end, we present important enhancements to commonly used indexing and visual representation techniques that make retrieval faster, more accurate, and less demanding on hardware. We also make the method more robust to visual distortion by exploiting the individual decision results of local and global descriptors at query time. In this way, local content descriptors effectively represent copied or duplicated scenes with large geometric deformations, while global descriptors are more practical for near-duplicate and semantic searches. Experiments are conducted on the large-scale Stanford I2V dataset. The experimental results show that the method is effective in terms of complexity and query processing time for large-scale visual retrieval scenarios, even when local and global representations are used together. In addition, the proposed method is fairly accurate and achieves state-of-the-art performance based on the mAP score on the dataset. Lastly, we report additional mAP scores after updating the ground-truth annotations using the retrieval results of the proposed method, which show the actual performance more clearly.
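The abstract reports accuracy as a mAP (mean average precision) score. As a minimal sketch of how that metric is commonly computed for ranked retrieval, the following assumes each query returns a ranked list of video identifiers and has a set of ground-truth relevant identifiers. The function names and the toy identifiers are hypothetical, and the exact evaluation protocol of the Stanford I2V benchmark (for example, any cut-off on the ranked list) may differ from this textbook definition.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked result list (textbook definition).

    Precision is sampled at every rank where a relevant item appears,
    and the sum is divided by the total number of relevant items.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    seen = set()
    for rank, item in enumerate(ranked, start=1):
        # Count each relevant item once, even if the list repeats it.
        if item in relevant and item not in seen:
            seen.add(item)
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of the per-query average precision values."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example with hypothetical video identifiers (not from the dataset).
toy_runs = [
    (["v3", "v1", "v7", "v2"], {"v1", "v2"}),  # relevant items at ranks 2 and 4
    (["v5", "v9"], {"v5"}),                    # relevant item at rank 1
]
toy_map = mean_average_precision(toy_runs)
```

In the toy example the first query scores (1/2 + 2/4) / 2 = 0.5 and the second scores 1.0, so the mean is 0.75. Relevant items that never appear in the ranked list still count in the denominator, which is why revising the ground-truth annotations, as the abstract describes, can change the reported score.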

Similar papers

Hierarchical Deep Hashing for Fast Large Scale Image Retrieval

Yongfei Zhang, Cheng Peng, Zhang Jingtao, Xianglong Liu, Shiliang Pu, Changhuai Chen

Auto-TLDR; Hierarchical indexed deep hashing for fast large scale image retrieval

Attention-Based Deep Metric Learning for Near-Duplicate Video Retrieval

Multi-Scale Keypoint Matching

Audio-Based Near-Duplicate Video Retrieval with Audio Similarity Learning

Visual Localization for Autonomous Driving: Mapping the Accurate Location in the City Maze

Discrete Semantic Matrix Factorization Hashing for Cross-Modal Retrieval

VSB^2-Net: Visual-Semantic Bi-Branch Network for Zero-Shot Hashing

Leveraging Quadratic Spherical Mutual Information Hashing for Fast Image Retrieval

Supporting Skin Lesion Diagnosis with Content-Based Image Retrieval

Do We Really Need Scene-Specific Pose Encoders?

Improved Deep Classwise Hashing with Centers Similarity Learning for Image Retrieval

Object Classification of Remote Sensing Images Based on Optimized Projection Supervised Discrete Hashing

DFH-GAN: A Deep Face Hashing with Generative Adversarial Network

Hybrid Decomposition Convolution Neural Network and Vocabulary Forest for Image Retrieval

Comparison of Deep Learning and Hand Crafted Features for Mining Simulation Data

Cross-Media Hash Retrieval Using Multi-head Attention Network

Fast Discrete Cross-Modal Hashing Based on Label Relaxation and Matrix Factorization

Enhancing Deep Semantic Segmentation of RGB-D Data with Entangled Forests

Not 3D Re-ID: Simple Single Stream 2D Convolution for Robust Video Re-Identification

Aggregating Object Features Based on Attention Weights for Fine-Grained Image Retrieval

Rotation Invariant Aerial Image Retrieval with Group Convolutional Metric Learning

Multi-Level Deep Learning Vehicle Re-Identification Using Ranked-Based Loss Functions

A CNN-RNN Framework for Image Annotation from Visual Cues and Social Network Metadata

On Identification and Retrieval of Near-Duplicate Biological Images: A New Dataset and Protocol

Label Self-Adaption Hashing for Image Retrieval

RSINet: Rotation-Scale Invariant Network for Online Visual Tracking

Story Comparison for Estimating Field of View Overlap in a Video Collection

Large-Scale Historical Watermark Recognition: Dataset and a New Consistency-Based Approach

Localization and Transformation Reconstruction of Image Regions: An Extended Congruent Triangles Approach

ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization

Weakly Supervised Learning through Rank-Based Contextual Measures

Adaptive L2 Regularization in Person Re-Identification

Generalized Local Attention Pooling for Deep Metric Learning

Loop-closure detection by LiDAR scan re-identification

Writer Identification Using Deep Neural Networks: Impact of Patch Size and Number of Patches

Effective Deployment of CNNs for 3DoF Pose Estimation and Grasping in Industrial Settings

Deep Composer: A Hash-Based Duplicative Neural Network for Generating Multi-Instrument Songs

Distinctive 3D Local Deep Descriptors

A Grid-Based Representation for Human Action Recognition

Ordinal Depth Classification Using Region-Based Self-Attention

Progressive Learning Algorithm for Efficient Person Re-Identification

Joint Learning Multiple Curvature Descriptor for 3D Palmprint Recognition

Light3DPose: Real-Time Multi-Person 3D Pose Estimation from Multiple Views

Text Synopsis Generation for Egocentric Videos

Equation Attention Relationship Network (EARN) : A Geometric Deep Metric Framework for Learning Similar Math Expression Embedding

RISEdb: A Novel Indoor Localization Dataset

Modeling the Distribution of Normal Data in Pre-Trained Deep Features for Anomaly Detection

Can You Trust Your Pose? Confidence Estimation in Visual Localization