ICPR2020 Paper Browser

Paper download is intended for registered attendees only and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Deep Composer: A Hash-Based Duplicative Neural Network for Generating Multi-Instrument Songs

Jacob Galajda, Brandon Royal, Kien Hua

Auto-TLDR; Deep Composer for Intelligence Duplication

Music is one of the most appreciated forms of art, and generating songs has become a popular subject in the artificial intelligence community. Various networks can produce pleasant-sounding music, but no model has been able to produce music that duplicates the style of a specific artist or artists. In this paper, we extend a previous single-instrument model into the Deep Composer, a model we believe to be capable of achieving this. Deep Composer originates from the Deep Segment Hash Learning (DSHL) single-instrument model and is designed to learn how a specific artist would place individual segments of music together, rather than to create music similar to a specific genre. To the best of our knowledge, no other network has been designed to achieve this. For these reasons, we introduce a new field of study, Intelligence Duplication (ID). AI research generally focuses on developing techniques to mimic universal intelligence, whereas ID research focuses on techniques to artificially duplicate or clone a specific mind, such as Mozart's. Additionally, we present a new retrieval algorithm, Segment Barrier Retrieval (SBR), to improve retrieval accuracy within the hash-space, as opposed to the more traditionally used feature-space. SBR prevents retrieval branches from entering areas of low density within the hash-space, a phenomenon we identify and label as segment sparsity. To test our Deep Composer and the effectiveness of SBR, we evaluate various models with different SBR threshold values and conduct qualitative surveys for each model. The survey results indicate that our Deep Composer model is capable of learning music generation from multiple composers. Our extended Deep Composer model provides a more suitable platform for Intelligence Duplication. Future work can apply this platform to duplicate great composers such as Mozart or allow them to collaborate in the virtual space.
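The abstract does not say how SBR is implemented, so the following is only a toy illustration of the general idea it describes: excluding hash codes that sit in sparsely populated regions of the hash-space before retrieving neighbors. Everything here is an assumption made for illustration, not the authors' method: the integer hash codes, the Hamming-ball density measure, and the names hamming, local_density, and retrieve_with_barrier are all hypothetical.

```python
from typing import List


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")


def local_density(code: int, codes: List[int], radius: int) -> int:
    """Count stored codes within `radius` bits of `code` (itself included)."""
    return sum(1 for c in codes if hamming(code, c) <= radius)


def retrieve_with_barrier(
    query: int, codes: List[int], radius: int, threshold: int
) -> List[int]:
    """Rank stored codes by Hamming distance to `query`, skipping sparse ones.

    A code is treated as lying in a sparse region (an assumed stand-in for
    "segment sparsity") when fewer than `threshold` stored codes fall
    within `radius` bits of it.
    """
    dense = [c for c in codes if local_density(c, codes, radius) >= threshold]
    # Ties in distance are broken by code value to keep the order deterministic.
    return sorted(dense, key=lambda c: (hamming(query, c), c))


if __name__ == "__main__":
    stored = [0b0000, 0b0001, 0b0011, 0b1111]
    # 0b1111 has no stored neighbor within 1 bit, so it is filtered out
    # even though it matches the query exactly.
    print(retrieve_with_barrier(0b1111, stored, radius=1, threshold=2))
```

In this sketch, raising the threshold makes the barrier stricter, which loosely mirrors the abstract's mention of evaluating models under different SBR threshold values.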

Similar papers

Heuristics for Evaluation of AI Generated Music

Edmund Dervakos, Giorgos Filandrianos, Giorgos Stamou

Auto-TLDR; Evaluation of generative models in the symbolic music domain using the circle of fifths

Hierarchical Deep Hashing for Fast Large Scale Image Retrieval

VSB^2-Net: Visual-Semantic Bi-Branch Network for Zero-Shot Hashing

Cross-Media Hash Retrieval Using Multi-head Attention Network

Audio-Based Near-Duplicate Video Retrieval with Audio Similarity Learning

Ballroom Dance Recognition from Audio Recordings

Supporting Skin Lesion Diagnosis with Content-Based Image Retrieval

Information Graphic Summarization Using a Collection of Multimodal Deep Neural Networks

Leveraging Quadratic Spherical Mutual Information Hashing for Fast Image Retrieval

On Identification and Retrieval of Near-Duplicate Biological Images: A New Dataset and Protocol

The Effect of Spectrogram Reconstruction on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy

Object Classification of Remote Sensing Images Based on Optimized Projection Supervised Discrete Hashing

The DeepScoresV2 Dataset and Benchmark for Music Object Detection

Cross-spectrum Face Recognition Using Subspace Projection Hashing

ESResNet: Environmental Sound Classification Based on Visual Domain Models

Fast Discrete Cross-Modal Hashing Based on Label Relaxation and Matrix Factorization

Mood Detection Analyzing Lyrics and Audio Signal Based on Deep Learning Architectures

Feature Engineering and Stacked Echo State Networks for Musical Onset Detection

S2I-Bird: Sound-To-Image Generation of Bird Species Using Generative Adversarial Networks

Exploiting Local Indexing and Deep Feature Confidence Scores for Fast Image-To-Video Search

Improved Deep Classwise Hashing with Centers Similarity Learning for Image Retrieval

Discrete Semantic Matrix Factorization Hashing for Cross-Modal Retrieval

Text Synopsis Generation for Egocentric Videos

Interactive Style Space of Deep Features and Style Innovation

Picture-To-Amount (PITA): Predicting Relative Ingredient Amounts from Food Images

ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization

Deep Convolutional Embedding for Digitized Painting Clustering

Attention-Based Deep Metric Learning for Near-Duplicate Video Retrieval

AttendAffectNet: Self-Attention Based Networks for Predicting Affective Responses from Movies

Segmenting Messy Text: Detecting Boundaries in Text Derived from Historical Newspaper Images

Transformer Networks for Trajectory Forecasting

Location Prediction in Real Homes of Older Adults based on K-Means in Low-Resolution Depth Videos

Equation Attention Relationship Network (EARN): A Geometric Deep Metric Framework for Learning Similar Math Expression Embedding

Let's Play Music: Audio-Driven Performance Video Generation

Sketch-SNet: Deeper Subdivision of Temporal Cues for Sketch Recognition

DFH-GAN: A Deep Face Hashing with Generative Adversarial Network

Exploring Spatial-Temporal Representations for fNIRS-based Intimacy Detection via an Attention-enhanced Cascade Convolutional Recurrent Neural Network

Automated Whiteboard Lecture Video Summarization by Content Region Detection and Representation

AG-GAN: An Attentive Group-Aware GAN for Pedestrian Trajectory Prediction

Pose-Based Body Language Recognition for Emotion and Psychiatric Symptom Interpretation

Signal Generation Using 1d Deep Convolutional Generative Adversarial Networks for Fault Diagnosis of Electrical Machines

Label Self-Adaption Hashing for Image Retrieval

A Quantitative Evaluation Framework of Video De-Identification Methods

Hierarchical Multimodal Attention for Deep Video Summarization

Multiple Future Prediction Leveraging Synthetic Trajectories

Local Facial Attribute Transfer through Inpainting

Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes

Progressive Learning Algorithm for Efficient Person Re-Identification