ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

The DeepScoresV2 Dataset and Benchmark for Music Object Detection

Lukas Tuggener, Yvan Putra Satyawan, Alexander Pacha, Jürgen Schmidhuber, Thilo Stadelmann

Auto-TLDR; DeepScoresV2: an extended version of the DeepScores dataset for optical music recognition

In this paper, we present DeepScoresV2, an extended version of the DeepScores dataset for optical music recognition (OMR). We improve upon the original DeepScores dataset by providing much more detailed annotations, namely (a) annotations for 135 classes including fundamental symbols of non-fixed size and shape, increasing the number of annotated symbols by 23%; (b) oriented bounding boxes; (c) higher-level rhythm and pitch information (onset beat for all symbols and line position for noteheads); and (d) a compatibility mode for easy use in conjunction with the MUSCIMA++ dataset for OMR on handwritten documents. These additions open up the potential for future advancement in OMR research. Additionally, we release two state-of-the-art baselines for DeepScoresV2 based on Faster R-CNN and the Deep Watershed Detector. An analysis of the baselines shows that regular orthogonal bounding boxes are unsuitable for objects which are long, small, and potentially rotated, such as ties and beams, which demonstrates the need for detection algorithms that naturally incorporate object angles. Dataset, code and pre-trained models, as well as user instructions, are publicly available at https://tuggeluk.github.io/dsv2_preview/
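The point about orthogonal bounding boxes can be illustrated with a little geometry. The sketch below is not taken from the DeepScoresV2 code or annotation format; the function names and the example dimensions (a 100 by 4 pixel beam-like shape) are assumptions chosen for illustration. It computes how much of the enclosing axis-aligned box a thin rotated rectangle actually fills.

```python
import math


def oriented_box_corners(cx, cy, width, height, angle_deg):
    """Return the four corners of a rectangle rotated about its center.

    Hypothetical helper for illustration; not the DeepScoresV2 annotation format.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = width / 2.0, height / 2.0
    offsets = ((-half_w, -half_h), (half_w, -half_h),
               (half_w, half_h), (-half_w, half_h))
    # Standard 2D rotation of each corner offset, then translation to the center.
    return [(cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
            for dx, dy in offsets]


def axis_aligned_box(corners):
    """Smallest orthogonal box (x_min, y_min, x_max, y_max) enclosing the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def fill_ratio(width, height, angle_deg):
    """Fraction of the enclosing orthogonal box that the rotated object covers."""
    corners = oriented_box_corners(0.0, 0.0, width, height, angle_deg)
    x_min, y_min, x_max, y_max = axis_aligned_box(corners)
    return (width * height) / ((x_max - x_min) * (y_max - y_min))


if __name__ == "__main__":
    # Assumed example: a long, thin, beam-like shape at several rotation angles.
    for angle in (0, 5, 20, 45):
        print(f"angle={angle:>2} deg  fill ratio={fill_ratio(100, 4, angle):.3f}")
```

With these assumed dimensions, the unrotated shape fills its orthogonal box completely, while at a 20 degree rotation it covers only roughly a tenth of the box. The rest is background or neighbouring symbols, which is one way to see why an oriented box describes such objects more tightly.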

Similar papers

EAGLE: Large-Scale Vehicle Detection Dataset in Real-World Scenarios Using Aerial Imagery

Seyed Majid Azimi, Reza Bahmanyar, Corentin Henry, Franz Kurz

Auto-TLDR; EAGLE: A Large-Scale Dataset for Multi-class Vehicle Detection with Object Orientation Information in Airborne Imagery

Multi-class vehicle detection from airborne imagery with orientation estimation is an important task in the near and remote vision domains with applications in traffic monitoring and disaster management. In the last decade, we have witnessed significant progress in object detection in ground imagery, but it is still in its infancy in airborne imagery, mostly due to the scarcity of diverse and large-scale datasets. Despite being a useful tool for different applications, current airborne datasets only partially reflect the challenges of real-world scenarios. To address this issue, we introduce EAGLE (oriEnted object detection using Aerial imaGery in real-worLd scEnarios), a large-scale dataset for multi-class vehicle detection with object orientation information in aerial imagery. It features high-resolution aerial images capturing different real-world situations with a wide variety of camera sensors, resolutions, flight altitudes, weather, illumination, haze, shadow, times, cities, countries, occlusion, and camera angles. The annotation was done by airborne imagery experts with small- and large-vehicle classes. EAGLE contains 215,986 instances annotated with oriented bounding boxes defined by four points and orientation, making it by far the largest dataset to date in this task. It also supports research on haze and shadow removal as well as super-resolution and in-painting applications. We define three tasks: detection by (1) horizontal bounding boxes, (2) rotated bounding boxes, and (3) oriented bounding boxes. We carried out experiments evaluating several state-of-the-art object detection methods on our dataset to form a baseline. Experiments show that the EAGLE dataset accurately reflects real-world situations and correspondingly challenging applications. The dataset will be made publicly available.

Vision-Based Layout Detection from Scientific Literature Using Recurrent Convolutional Neural Networks

Huichen Yang, William Hsu

Auto-TLDR; Transfer Learning for Scientific Literature Layout Detection Using Convolutional Neural Networks

Detecting Objects with High Object Region Percentage

A Few-Shot Learning Approach for Historical Ciphered Manuscript Recognition

Tiny Object Detection in Aerial Images

SyNet: An Ensemble Network for Object Detection in UAV Images

Iterative Bounding Box Annotation for Object Detection

Image-Based Table Cell Detection: A New Dataset and an Improved Detection Method

StrongPose: Bottom-up and Strong Keypoint Heat Map Based Pose Estimation

CASNet: Common Attribute Support Network for Image Instance and Panoptic Segmentation

IPT: A Dataset for Identity Preserved Tracking in Closed Domains

An Integrated Approach of Deep Learning and Symbolic Analysis for Digital PDF Table Extraction

An Evaluation of DNN Architectures for Page Segmentation of Historical Newspapers

TGCRBNW: A Dataset for Runner Bib Number Detection (and Recognition) in the Wild

HPERL: 3D Human Pose Estimation from RGB and LiDAR

Scene Text Detection with Selected Anchors

Object Features and Face Detection Performance: Analyses with 3D-Rendered Synthetic Data

Effective Deployment of CNNs for 3DoF Pose Estimation and Grasping in Industrial Settings

A Fast and Accurate Object Detector for Handwritten Digit String Recognition

Text Recognition - Real World Data and Where to Find Them

End-To-End Deep Learning Methods for Automated Damage Detection in Extreme Events at Various Scales

FeatureNMS: Non-Maximum Suppression by Learning Feature Embeddings

Automated Whiteboard Lecture Video Summarization by Content Region Detection and Representation

Recursive Recognition of Offline Handwritten Mathematical Expressions

Detective: An Attentive Recurrent Model for Sparse Object Detection

A Novel Region of Interest Extraction Layer for Instance Segmentation

Hybrid Cascade Point Search Network for High Precision Bar Chart Component Detection

Text Baseline Recognition Using a Recurrent Convolutional Neural Network

Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos

SIMCO: SIMilarity-Based Object COunting

The HisClima Database: Historical Weather Logs for Automatic Transcription and Information Extraction

CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images

Tracking Fast Moving Objects by Segmentation Network

Uncertainty Guided Recognition of Tiny Craters on the Moon

Transformer-Encoder Detector Module: Using Context to Improve Robustness to Adversarial Attacks on Object Detection

Multiple-Step Sampling for Dense Object Detection and Counting

Derivation of Geometrically and Semantically Annotated UAV Datasets at Large Scales from 3D City Models

Construction Worker Hardhat-Wearing Detection Based on an Improved BiFPN

Foreground-Guided Vehicle Perception Framework

Hierarchical Head Design for Object Detectors

Feature Embedding Based Text Instance Grouping for Largely Spaced and Occluded Text Detection

SynDHN: Multi-Object Fish Tracker Trained on Synthetic Underwater Videos

An Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped Text

MagnifierNet: Learning Efficient Small-Scale Pedestrian Detector towards Multiple Dense Regions

Tilting at Windmills: Data Augmentation for Deep Pose Estimation Does Not Help with Occlusions

Point In: Counting Trees with Weakly Supervised Segmentation Network

A Modified Single-Shot Multibox Detector for Beyond Real-Time Object Detection

Learning a Dynamic High-Resolution Network for Multi-Scale Pedestrian Detection