ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Visual Object Tracking in Drone Images with Deep Reinforcement Learning

Derya Gözen, Sedat Ozer

Auto-TLDR; A Deep Reinforcement Learning based Single Object Tracker for Drone Applications


Demand is growing for camera-equipped drones and their applications in domains ranging from agriculture to entertainment and from sports events to surveillance. In such applications, an essential and common task is visually tracking an object of interest. Drone (or UAV) images have different properties from ground-level (natural) images, and those differences make it harder to apply existing object trackers directly to drone applications. Important differences include (i) smaller objects to be tracked and (ii) different orientations and viewing angles, which yield different textures and features. Therefore, new algorithms trained on drone images are needed for drone-based applications. In this paper, we introduce a deep reinforcement learning (RL) based single object tracker that tracks an object of interest in drone images by estimating a series of actions to find the location of the object in the next frame. This is the first work introducing a single object tracker using a deep RL-based technique for drone images. Our proposed solution introduces a novel reward function that aims to reduce the total number of actions taken to estimate the object's location in the next frame, and also introduces a different backbone network for use on low-resolution images. Additionally, we introduce a set of new actions into the action library to better deal with the above-mentioned complexities. We compare our proposed solutions to a state-of-the-art tracking algorithm from the recent literature and demonstrate up to 3.87% improvement in precision and 3.6% improvement in IoU on the VisDrone2019 dataset. We also provide additional results on the OTB-100 dataset, showing up to 3.15% improvement in precision over the same state-of-the-art algorithm. Lastly, we analyze how well our proposed solutions handle some of the challenges faced during tracking, including but not limited to occlusion, deformation, and scale variation.
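
The abstract reports its results as precision and IoU. As a rough illustration of what these two tracking metrics usually measure, the Python sketch below computes them for axis-aligned boxes. It is not the authors' evaluation code: the `(x, y, width, height)` box format, the function names, and the 20-pixel center-error threshold (a convention commonly used with OTB-style benchmarks) are all assumptions made for this example.

```python
import math
from typing import Sequence, Tuple

# Assumed box format: (x, y, width, height) with (x, y) the top-left corner.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a: Box, b: Box) -> float:
    """Euclidean distance between the centers of two boxes, in pixels."""
    acx, acy = a[0] + a[2] / 2, a[1] + a[3] / 2
    bcx, bcy = b[0] + b[2] / 2, b[1] + b[3] / 2
    return math.hypot(acx - bcx, acy - bcy)


def precision(preds: Sequence[Box], gts: Sequence[Box],
              threshold: float = 20.0) -> float:
    """Fraction of frames whose center error is within `threshold` pixels.

    The default of 20 pixels is an assumed convention, not a value
    taken from the paper.
    """
    if not preds or len(preds) != len(gts):
        raise ValueError("preds and gts must be non-empty and equal length")
    hits = sum(center_error(p, g) <= threshold for p, g in zip(preds, gts))
    return hits / len(preds)
```

Averaging `iou` over all frames of a sequence gives a per-sequence overlap score, while `precision` summarizes how often the predicted center stays close to the ground-truth center.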

Similar papers

RSINet: Rotation-Scale Invariant Network for Online Visual Tracking

Yang Fang, Geunsik Jo, Chang-Hee Lee

Auto-TLDR; RSINet: Rotation-Scale Invariant Network for Adaptive Tracking

Model Decay in Long-Term Tracking

Siamese Fully Convolutional Tracker with Motion Correction

DAL: A Deep Depth-Aware Long-Term Tracker

VTT: Long-Term Visual Tracking with Transformers

Tackling Occlusion in Siamese Tracking with Structured Dropouts

TSDM: Tracking by SiamRPN++ with a Depth-Refiner and a Mask-Generator

Adaptive Context-Aware Discriminative Correlation Filters for Robust Visual Object Tracking

MFST: Multi-Features Siamese Tracker

Robust Visual Object Tracking with Two-Stream Residual Convolutional Networks

Efficient Correlation Filter Tracking with Adaptive Training Sample Update Scheme

Reducing False Positives in Object Tracking with Siamese Network

SiamMT: Real-Time Arbitrary Multi-Object Tracking

AerialMPTNet: Multi-Pedestrian Tracking in Aerial Imagery Using Temporal and Graphical Features

Exploiting Distilled Learning for Deep Siamese Tracking

Compact and Discriminative Multi-Object Tracking with Siamese CNNs

ActionSpotter: Deep Reinforcement Learning Framework for Temporal Action Spotting in Videos

SyNet: An Ensemble Network for Object Detection in UAV Images

Object-Oriented Map Exploration and Construction Based on Auxiliary Task Aided DRL

Tracking Fast Moving Objects by Segmentation Network

RLST: A Reinforcement Learning Approach to Scene Text Detection Refinement

Low Dimensional State Representation Learning with Reward-Shaped Priors

Siamese Dynamic Mask Estimation Network for Fast Video Object Segmentation

SynDHN: Multi-Object Fish Tracker Trained on Synthetic Underwater Videos

A Bayesian Approach to Reinforcement Learning of Vision-Based Vehicular Control

IPT: A Dataset for Identity Preserved Tracking in Closed Domains

Vehicle Lane Merge Visual Benchmark

Vacant Parking Space Detection Based on Task Consistency and Reinforcement Learning

Visual Saliency Oriented Vehicle Scale Estimation

Utilising Visual Attention Cues for Vehicle Detection and Tracking

Adaptive Remote Sensing Image Attribute Learning for Active Object Detection

Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos

Revisiting Sequence-To-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

An Adaptive Fusion Model Based on Kalman Filtering and LSTM for Fast Tracking of Road Signs

Mobile Augmented Reality: Fast, Precise, and Smooth Planar Object Tracking

Meta Learning Via Learned Loss

Object Detection Model Based on Scene-Level Region Proposal Self-Attention

Anomaly Detection, Localization and Classification for Railway Inspection

Precise Temporal Action Localization with Quantified Temporal Structure of Actions

Can Reinforcement Learning Lead to Healthy Life?: Simulation Study Based on User Activity Logs

Improving Robotic Grasping on Monocular Images Via Multi-Task Learning and Positional Loss

Real-Time Drone Detection and Tracking with Visible, Thermal and Acoustic Sensors

Construction Worker Hardhat-Wearing Detection Based on an Improved BiFPN

Early Wildfire Smoke Detection in Videos

A Grid-Based Representation for Human Action Recognition

Learning Object Deformation and Motion Adaption for Semi-Supervised Video Object Segmentation

Detecting and Adapting to Crisis Pattern with Context Based Deep Reinforcement Learning

Iterative Bounding Box Annotation for Object Detection