ESResNet: Environmental Sound Classification Based on Visual Domain Models
Andrey Guzhov,
Federico Raue,
Jörn Hees,
Andreas Dengel
Auto-TLDR; Environmental Sound Classification with Short-Time Fourier Transform Spectrograms
Similar papers
Which are the factors affecting the performance of audio surveillance systems?
Antonio Greco, Antonio Roberto, Alessia Saggese, Mario Vento
Auto-TLDR; Sound Event Recognition Using Convolutional Neural Networks and Visual Representations on MIVIA Audio Events
Ballroom Dance Recognition from Audio Recordings
Tomas Pavlin, Jan Cech, Jiri Matas
Auto-TLDR; A CNN-based approach to classify ballroom dances given audio recordings
Mood Detection Analyzing Lyrics and Audio Signal Based on Deep Learning Architectures
Konstantinos Pyrovolakis, Paraskevi Tzouveli, Giorgos Stamou
Auto-TLDR; Automated Music Mood Detection using Music Information Retrieval
Feature Engineering and Stacked Echo State Networks for Musical Onset Detection
Peter Steiner, Azarakhsh Jalalvand, Simon Stone, Peter Birkholz
Auto-TLDR; Echo State Networks for Onset Detection in Music Analysis
Hybrid Network for End-To-End Text-Independent Speaker Identification
Wajdi Ghezaiel, Luc Brun, Olivier Lezoray
Auto-TLDR; Text-Independent Speaker Identification with Scattering Wavelet Network and Convolutional Neural Networks
Trainable Spectrally Initializable Matrix Transformations in Convolutional Neural Networks
Michele Alberti, Angela Botros, Schuetz Narayan, Rolf Ingold, Marcus Liwicki, Mathias Seuret
Auto-TLDR; Trainable and Spectrally Initializable Matrix Transformations for Neural Networks
Audio-Based Near-Duplicate Video Retrieval with Audio Similarity Learning
Pavlos Avgoustinakis, Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Andreas L. Symeonidis, Ioannis Kompatsiaris
Auto-TLDR; AuSiL: Audio Similarity Learning for Near-duplicate Video Retrieval
One-Shot Learning for Acoustic Identification of Bird Species in Non-Stationary Environments
Michelangelo Acconcjaioco, Stavros Ntalampiras
Auto-TLDR; One-shot Learning in the Bioacoustics Domain using Siamese Neural Networks
DenseRecognition of Spoken Languages
Jaybrata Chakraborty, Bappaditya Chakraborty, Ujjwal Bhattacharya
Auto-TLDR; DenseNet: A Dense Convolutional Network Architecture for Speech Recognition in Indian Languages
The Effect of Spectrogram Reconstruction on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy
Kin Wai Cheuk, Yin-Jyun Luo, Emmanouil Benetos, Herremans Dorien
Auto-TLDR; Exploring the effect of spectrogram reconstruction loss on automatic music transcription
Adversarially Training for Audio Classifiers
Raymel Alfonso Sallo, Mohammad Esmaeilpour, Patrick Cardinal
Auto-TLDR; Adversarially Training for Robust Neural Networks against Adversarial Attacks
AttendAffectNet: Self-Attention Based Networks for Predicting Affective Responses from Movies
Thi Phuong Thao Ha, Bt Balamurali, Herremans Dorien, Roig Gemma
Auto-TLDR; AttendAffectNet: A Self-Attention Based Network for Emotion Prediction from Movies
Influence of Event Duration on Automatic Wheeze Classification
Bruno M Rocha, Diogo Pessoa, Alda Marques, Paulo Carvalho, Rui Pedro Paiva
Auto-TLDR; Experimental Design of the Non-wheeze Class for Wheeze Classification
FatNet: A Feature-Attentive Network for 3D Point Cloud Processing
Chaitanya Kaul, Nick Pears, Suresh Manandhar
Auto-TLDR; Feature-Attentive Neural Networks for Point Cloud Classification and Segmentation
3D Attention Mechanism for Fine-Grained Classification of Table Tennis Strokes Using a Twin Spatio-Temporal Convolutional Neural Networks
Pierre-Etienne Martin, Jenny Benois-Pineau, Renaud Péteri, Julien Morlier
Auto-TLDR; Attentional Blocks for Action Recognition in Table Tennis Strokes
The Application of Capsule Neural Network Based CNN for Speech Emotion Recognition
Auto-TLDR; CapCNN: A Capsule Neural Network for Speech Emotion Recognition
Improving Gravitational Wave Detection with 2D Convolutional Neural Networks
Siyu Fan, Yisen Wang, Yuan Luo, Alexander Michael Schmitt, Shenghua Yu
Auto-TLDR; Two-dimensional Convolutional Neural Networks for Gravitational Wave Detection from Time Series with Background Noise
Improving Mix-And-Separate Training in Audio-Visual Sound Source Separation with an Object Prior
Quan Nguyen, Simone Frintrop, Timo Gerkmann, Mikko Lauri, Julius Richter
Auto-TLDR; Object-Prior: Learning the 1-to-1 correspondence between visual and audio signals by audio-visual sound source methods
The Color Out of Space: Learning Self-Supervised Representations for Earth Observation Imagery
Stefano Vincenzi, Angelo Porrello, Pietro Buzzega, Marco Cipriano, Pietro Fronte, Roberto Cuccu, Carla Ippoliti, Annamaria Conte, Simone Calderara
Auto-TLDR; Satellite Image Representation Learning for Remote Sensing
Video Face Manipulation Detection through Ensemble of CNNs
Nicolo Bonettini, Edoardo Daniele Cannas, Sara Mandelli, Luca Bondi, Paolo Bestagini, Stefano Tubaro
Auto-TLDR; Face Manipulation Detection in Video Sequences Using Convolutional Neural Networks
Single-Modal Incremental Terrain Clustering from Self-Supervised Audio-Visual Feature Learning
Reina Ishikawa, Ryo Hachiuma, Akiyoshi Kurobe, Hideo Saito
Auto-TLDR; Multi-modal Variational Autoencoder for Terrain Type Clustering
Dynamically Mitigating Data Discrepancy with Balanced Focal Loss for Replay Attack Detection
Yongqiang Dou, Haocheng Yang, Maolin Yang, Yanyan Xu, Dengfeng Ke
Auto-TLDR; Anti-Spoofing with Balanced Focal Loss Function and Combination Features
Translation Resilient Opportunistic WiFi Sensing
Mohammud Junaid Bocus, Wenda Li, Jonas Paulavičius, Ryan Mcconville, Raul Santos-Rodriguez, Kevin Chetty, Robert Piechocki
Auto-TLDR; Activity Recognition using Fine-Grained WiFi Channel State Information using WiFi CSI
Detection of Calls from Smart Speaker Devices
Vinay Maddali, David Looney, Kailash Patil
Auto-TLDR; Distinguishing Between Smart Speaker and Cell Devices Using Only the Audio Using a Feature Set
Audio-Visual Speech Recognition Using a Two-Step Feature Fusion Strategy
Auto-TLDR; A Two-Step Feature Fusion Network for Speech Recognition
Spatial Bias in Vision-Based Voice Activity Detection
Kalin Stefanov, Mohammad Adiban, Giampiero Salvi
Auto-TLDR; Spatial Bias in Vision-based Voice Activity Detection in Multiparty Human-Human Interactions
Planar 3D Transfer Learning for End to End Unimodal MRI Unbalanced Data Segmentation
Martin Kolarik, Radim Burget, Carlos M. Travieso-Gonzalez, Jan Kocica
Auto-TLDR; Planar 3D Res-U-Net Network for Unbalanced 3D Image Segmentation using Fluid Attenuation Inversion Recovery
End-To-End Triplet Loss Based Emotion Embedding System for Speech Emotion Recognition
Puneet Kumar, Sidharth Jain, Balasubramanian Raman, Partha Pratim Roy, Masakazu Iwamura
Auto-TLDR; End-to-End Neural Embedding System for Speech Emotion Recognition
A Close Look at Deep Learning with Small Data
Auto-TLDR; Low-Complex Neural Networks for Small Data Conditions
Enhancing Semantic Segmentation of Aerial Images with Inhibitory Neurons
Ihsan Ullah, Sean Reilly, Michael Madden
Auto-TLDR; Lateral Inhibition in Deep Neural Networks for Object Recognition and Semantic Segmentation
ResNet-Like Architecture with Low Hardware Requirements
Elena Limonova, Daniil Alfonso, Dmitry Nikolaev, Vladimir V. Arlazarov
Auto-TLDR; BM-ResNet: Bipolar Morphological ResNet for Image Classification
Directional Graph Networks with Hard Weight Assignments
Miguel Dominguez, Raymond Ptucha
Auto-TLDR; Hard Directional Graph Networks for Point Cloud Analysis
Fast and Accurate Real-Time Semantic Segmentation with Dilated Asymmetric Convolutions
Leonel Rosas-Arias, Gibran Benitez-Garcia, Jose Portillo-Portillo, Gabriel Sanchez-Perez, Keiji Yanai
Auto-TLDR; FASSD-Net: Dilated Asymmetric Pyramidal Fusion for Real-Time Semantic Segmentation
S2I-Bird: Sound-To-Image Generation of Bird Species Using Generative Adversarial Networks
Joo Yong Shim, Joongheon Kim, Jong-Kook Kim
Auto-TLDR; Generating bird images from sound using conditional generative adversarial networks
A Systematic Investigation on Deep Architectures for Automatic Skin Lesions Classification
Pierluigi Carcagni, Marco Leo, Andrea Cuna, Giuseppe Celeste, Cosimo Distante
Auto-TLDR; RegNet: Deep Investigation of Convolutional Neural Networks for Automatic Classification of Skin Lesions
Attention Pyramid Module for Scene Recognition
Zhinan Qiao, Xiaohui Yuan, Chengyuan Zhuang, Abolfazl Meyarian
Auto-TLDR; Attention Pyramid Module for Multi-Scale Scene Recognition
SECI-GAN: Semantic and Edge Completion for Dynamic Objects Removal
Francesco Pinto, Andrea Romanoni, Matteo Matteucci, Phil Torr
Auto-TLDR; SECI-GAN: Semantic and Edge Conditioned Inpainting Generative Adversarial Network
Building Computationally Efficient and Well-Generalizing Person Re-Identification Models with Metric Learning
Vladislav Sovrasov, Dmitry Sidnev
Auto-TLDR; Cross-Domain Generalization in Person Re-identification using Omni-Scale Network
Multimodal Side-Tuning for Document Classification
Stefano Zingaro, Giuseppe Lisanti, Maurizio Gabbrielli
Auto-TLDR; Side-tuning for Multimodal Document Classification
Domain Siamese CNNs for Sparse Multispectral Disparity Estimation
David-Alexandre Beaupre, Guillaume-Alexandre Bilodeau
Auto-TLDR; Multispectral Disparity Estimation between Thermal and Visible Images using Deep Neural Networks
Personalized Models in Human Activity Recognition Using Deep Learning
Hamza Amrani, Daniela Micucci, Paolo Napoletano
Auto-TLDR; Incremental Learning for Personalized Human Activity Recognition
Anticipating Activity from Multimodal Signals
Tiziana Rotondo, Giovanni Maria Farinella, Davide Giacalone, Sebastiano Mauro Strano, Valeria Tomaselli, Sebastiano Battiato
Auto-TLDR; Exploiting Multimodal Signal Embedding Space for Multi-Action Prediction
Exploiting the Logits: Joint Sign Language Recognition and Spell-Correction
Christina Runkel, Stefan Dorenkamp, Hartmut Bauermeister, Michael Möller
Auto-TLDR; A Convolutional Neural Network for Spell-correction in Sign Language Videos
GazeMAE: General Representations of Eye Movements Using a Micro-Macro Autoencoder
Louise Gillian C. Bautista, Prospero Naval
Auto-TLDR; Fast and Slow Eye Movement Representations for Sentiment-agnostic Eye Tracking
Automatic Semantic Segmentation of Structural Elements related to the Spinal Cord in the Lumbar Region by Using Convolutional Neural Networks
Jhon Jairo Sáenz Gamboa, Maria De La Iglesia-Vaya, Jon Ander Gómez
Auto-TLDR; Semantic Segmentation of Lumbar Spine Using Convolutional Neural Networks
The DeepHealth Toolkit: A Unified Framework to Boost Biomedical Applications
Michele Cancilla, Laura Canalini, Federico Bolelli, Stefano Allegretti, Salvador Carrión, Roberto Paredes, Jon Ander Gómez, Simone Leo, Marco Enrico Piras, Luca Pireddu, Asaf Badouh, Santiago Marco-Sola, Lluc Alvarez, Miquel Moreto, Costantino Grana
Auto-TLDR; DeepHealth Toolkit: An Open Source Deep Learning Toolkit for Cloud Computing and HPC
Digit Recognition Applied to Reconstructed Audio Signals Using Deep Learning
Anastasia-Sotiria Toufa, Constantine Kotropoulos
Auto-TLDR; Compressed Sensing for Digit Recognition in Audio Reconstruction
MEG: Multi-Evidence GNN for Multimodal Semantic Forensics
Ekraam Sabir, Ayush Jaiswal, Wael Abdalmageed, Prem Natarajan
Auto-TLDR; Scalable Image Repurposing Detection with Graph Neural Network Based Model