Audio-Visual Predictive Coding for Self-Supervised Visual Representation Learning
Mani Kumar Tellamekala,
Michel Valstar,
Michael Pound,
Timo Giesbrecht
![Responsive image](/icpr/media/video_thumbnails/12083.jpg)
Auto-TLDR; AV-PPC: A Multi-task Learning Framework for Learning Semantic Visual Features from Unlabeled Video Data
Similar papers
Mutual Alignment between Audiovisual Features for End-To-End Audiovisual Speech Recognition
Hong Liu, Yawei Wang, Bing Yang
![Responsive image](/icpr/media/video_thumbnails/11510.jpg)
Auto-TLDR; Mutual Iterative Attention for Audio Visual Speech Recognition
Abstract Slides Poster Similar
Robust Audio-Visual Speech Recognition Based on Hybrid Fusion
Hong Liu, Wenhao Li, Bing Yang
![Responsive image](/icpr/media/video_thumbnails/11792.jpg)
Auto-TLDR; Hybrid Fusion Based AVSR with Residual Networks and Bidirectional Gated Recurrent Unit for Robust Speech Recognition in Noise Conditions
Abstract Slides Poster Similar
Audio-Visual Speech Recognition Using a Two-Step Feature Fusion Strategy
![Responsive image](/icpr/media/video_thumbnails/11074.jpg)
Auto-TLDR; A Two-Step Feature Fusion Network for Speech Recognition
Abstract Slides Poster Similar
Temporally Coherent Embeddings for Self-Supervised Video Representation Learning
Joshua Knights, Ben Harwood, Daniel Ward, Anthony Vanderkop, Olivia Mackenzie-Ross, Peyman Moghadam
![Responsive image](/icpr/media/video_thumbnails/11955.jpg)
Auto-TLDR; Temporally Coherent Embeddings for Self-supervised Video Representation Learning
Abstract Slides Poster Similar
Unsupervised Co-Segmentation for Athlete Movements and Live Commentaries Using Crossmodal Temporal Proximity
Yasunori Ohishi, Yuki Tanaka, Kunio Kashino
![Responsive image](/icpr/media/video_thumbnails/11983.jpg)
Auto-TLDR; A guided attention scheme for audio-visual co-segmentation
Abstract Slides Poster Similar
Improving Mix-And-Separate Training in Audio-Visual Sound Source Separation with an Object Prior
Quan Nguyen, Simone Frintrop, Timo Gerkmann, Mikko Lauri, Julius Richter
![Responsive image](/icpr/media/video_thumbnails/11572.jpg)
Auto-TLDR; Object-Prior: Learning the 1-to-1 correspondence between visual and audio signals by audio- visual sound source methods
Talking Face Generation Via Learning Semantic and Temporal Synchronous Landmarks
Aihua Zheng, Feixia Zhu, Hao Zhu, Mandi Luo, Ran He
![Responsive image](/icpr/media/video_thumbnails/11298.jpg)
Auto-TLDR; A semantic and temporal synchronous landmark learning method for talking face generation
Abstract Slides Poster Similar
Single-Modal Incremental Terrain Clustering from Self-Supervised Audio-Visual Feature Learning
Reina Ishikawa, Ryo Hachiuma, Akiyoshi Kurobe, Hideo Saito
![Responsive image](/icpr/media/video_thumbnails/12016.jpg)
Auto-TLDR; Multi-modal Variational Autoencoder for Terrain Type Clustering
Abstract Slides Poster Similar
The Color Out of Space: Learning Self-Supervised Representations for Earth Observation Imagery
Stefano Vincenzi, Angelo Porrello, Pietro Buzzega, Marco Cipriano, Pietro Fronte, Roberto Cuccu, Carla Ippoliti, Annamaria Conte, Simone Calderara
![Responsive image](/icpr/media/video_thumbnails/11216.jpg)
Auto-TLDR; Satellite Image Representation Learning for Remote Sensing
Abstract Slides Poster Similar
A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual News Anchors
Ruobing Zheng, Zhou Zhu, Bo Song, Ji Changjiang
![Responsive image](/icpr/media/video_thumbnails/11502.jpg)
Auto-TLDR; Lip-sync: Synthesis of a Virtual News Anchor for Low-Delayed Applications
Abstract Slides Poster Similar
AttendAffectNet: Self-Attention Based Networks for Predicting Affective Responses from Movies
Thi Phuong Thao Ha, Bt Balamurali, Herremans Dorien, Roig Gemma
![Responsive image](/icpr/media/video_thumbnails/11931.jpg)
Auto-TLDR; AttendAffectNet: A Self-Attention Based Network for Emotion Prediction from Movies
Abstract Slides Poster Similar
Self-Supervised Joint Encoding of Motion and Appearance for First Person Action Recognition
Mirco Planamente, Andrea Bottino, Barbara Caputo
![Responsive image](/icpr/media/video_thumbnails/11935.jpg)
Auto-TLDR; A Single Stream Architecture for Egocentric Action Recognition from the First-Person Point of View
Abstract Slides Poster Similar
Person Recognition with HGR Maximal Correlation on Multimodal Data
Yihua Liang, Fei Ma, Yang Li, Shao-Lun Huang
![Responsive image](/icpr/media/video_thumbnails/11111.jpg)
Auto-TLDR; A correlation-based multimodal person recognition framework that learns discriminative embeddings of persons by joint learning visual features and audio features
Abstract Slides Poster Similar
Visual Oriented Encoder: Integrating Multimodal and Multi-Scale Contexts for Video Captioning
![Responsive image](/icpr/media/video_thumbnails/10852.jpg)
Auto-TLDR; Visual Oriented Encoder for Video Captioning
Abstract Slides Poster Similar
Audio-Video Detection of the Active Speaker in Meetings
Francisco Madrigal, Frederic Lerasle, Lionel Pibre, Isabelle Ferrané
![Responsive image](/icpr/media/video_thumbnails/11154.jpg)
Auto-TLDR; Active Speaker Detection with Visual and Contextual Information from Meeting Context
Abstract Slides Poster Similar
Let's Play Music: Audio-Driven Performance Video Generation
Hao Zhu, Yi Li, Feixia Zhu, Aihua Zheng, Ran He
![Responsive image](/icpr/media/video_thumbnails/11284.jpg)
Auto-TLDR; APVG: Audio-driven Performance Video Generation Using Structured Temporal UNet
Abstract Slides Poster Similar
Self-Supervised Learning of Dynamic Representations for Static Images
Siyang Song, Enrique Sanchez, Linlin Shen, Michel Valstar
![Responsive image](/icpr/media/video_thumbnails/11036.jpg)
Auto-TLDR; Facial Action Unit Intensity Estimation and Affect Estimation from Still Images with Multiple Temporal Scale
Abstract Slides Poster Similar
Feature-Supervised Action Modality Transfer
Fida Mohammad Thoker, Cees Snoek
![Responsive image](/icpr/media/video_thumbnails/11306.jpg)
Auto-TLDR; Cross-Modal Action Recognition and Detection in Non-RGB Video Modalities by Learning from Large-Scale Labeled RGB Data
Abstract Slides Poster Similar
What and How? Jointly Forecasting Human Action and Pose
Yanjun Zhu, Yanxia Zhang, Qiong Liu, Andreas Girgensohn
![Responsive image](/icpr/media/video_thumbnails/10928.jpg)
Auto-TLDR; Forecasting Human Actions and Motion Trajectories with Joint Action Classification and Pose Regression
Abstract Slides Poster Similar
Cascade Attention Guided Residue Learning GAN for Cross-Modal Translation
Bin Duan, Wei Wang, Hao Tang, Hugo Latapie, Yan Yan
![Responsive image](/icpr/media/video_thumbnails/11001.jpg)
Auto-TLDR; Cascade Attention-Guided Residue GAN for Cross-modal Audio-Visual Learning
Abstract Slides Poster Similar
Image Representation Learning by Transformation Regression
Xifeng Guo, Jiyuan Liu, Sihang Zhou, En Zhu, Shihao Dong
![Responsive image](/icpr/media/video_thumbnails/10898.jpg)
Auto-TLDR; Self-supervised Image Representation Learning using Continuous Parameter Prediction
Abstract Slides Poster Similar
Context Matters: Self-Attention for Sign Language Recognition
Fares Ben Slimane, Mohamed Bouguessa
![Responsive image](/icpr/media/video_thumbnails/11830.jpg)
Auto-TLDR; Attentional Network for Continuous Sign Language Recognition
Abstract Slides Poster Similar
Learning Visual Voice Activity Detection with an Automatically Annotated Dataset
Stéphane Lathuiliere, Pablo Mesejo, Radu Horaud
![Responsive image](/icpr/media/video_thumbnails/11447.jpg)
Auto-TLDR; Deep Visual Voice Activity Detection with Optical Flow
Self-Supervised Learning for Astronomical Image Classification
Ana Martinazzo, Mateus Espadoto, Nina S. T. Hirata
![Responsive image](/icpr/media/video_thumbnails/11359.jpg)
Auto-TLDR; Unlabeled Astronomical Images for Deep Neural Network Pre-training
Abstract Slides Poster Similar
Joint Supervised and Self-Supervised Learning for 3D Real World Challenges
Antonio Alliegro, Davide Boscaini, Tatiana Tommasi
![Responsive image](/icpr/media/video_thumbnails/11683.jpg)
Auto-TLDR; Self-supervision for 3D Shape Classification and Segmentation in Point Clouds
A Novel Attention-Based Aggregation Function to Combine Vision and Language
Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
![Responsive image](/icpr/media/video_thumbnails/10985.jpg)
Auto-TLDR; Fully-Attentive Reduction for Vision and Language
Abstract Slides Poster Similar
Spatial Bias in Vision-Based Voice Activity Detection
Kalin Stefanov, Mohammad Adiban, Giampiero Salvi
![Responsive image](/icpr/media/video_thumbnails/12148.jpg)
Auto-TLDR; Spatial Bias in Vision-based Voice Activity Detection in Multiparty Human-Human Interactions
Hierarchical Multimodal Attention for Deep Video Summarization
Melissa Sanabria, Frederic Precioso, Thomas Menguy
![Responsive image](/icpr/media/video_thumbnails/11841.jpg)
Auto-TLDR; Automatic Summarization of Professional Soccer Matches Using Event-Stream Data and Multi- Instance Learning
Abstract Slides Poster Similar
Multi-Modal Deep Clustering: Unsupervised Partitioning of Images
![Responsive image](/icpr/media/video_thumbnails/11431.jpg)
Auto-TLDR; Multi-Modal Deep Clustering for Unlabeled Images
Abstract Slides Poster Similar
Unsupervised Sound Source Localization From Audio-Image Pairs Using Input Gradient Map
Tomohiro Tanaka, Takahiro Shinozaki
![Responsive image](/icpr/media/video_thumbnails/11655.jpg)
Auto-TLDR; Unsupervised Sound Localization Using Gradient Method
Abstract Slides Poster Similar
Generative Latent Implicit Conditional Optimization When Learning from Small Sample
![Responsive image](/icpr/media/video_thumbnails/11915.jpg)
Auto-TLDR; GLICO: Generative Latent Implicit Conditional Optimization for Small Sample Learning
Abstract Slides Poster Similar
End-To-End Triplet Loss Based Emotion Embedding System for Speech Emotion Recognition
Puneet Kumar, Sidharth Jain, Balasubramanian Raman, Partha Pratim Roy, Masakazu Iwamura
![Responsive image](/icpr/media/video_thumbnails/11937.jpg)
Auto-TLDR; End-to-End Neural Embedding System for Speech Emotion Recognition
Abstract Slides Poster Similar
Two-Stream Temporal Convolutional Network for Dynamic Facial Attractiveness Prediction
Nina Weng, Jiahao Wang, Annan Li, Yunhong Wang
![Responsive image](/icpr/media/video_thumbnails/12098.jpg)
Auto-TLDR; 2S-TCN: A Two-Stream Temporal Convolutional Network for Dynamic Facial Attractiveness Prediction
Abstract Slides Poster Similar
Mutual Information Based Method for Unsupervised Disentanglement of Video Representation
Aditya Sreekar P, Ujjwal Tiwari, Anoop Namboodiri
![Responsive image](/icpr/media/video_thumbnails/11641.jpg)
Auto-TLDR; MIPAE: Mutual Information Predictive Auto-Encoder for Video Prediction
Abstract Slides Poster Similar
Reducing the Variance of Variational Estimates of Mutual Information by Limiting the Critic's Hypothesis Space to RKHS
Aditya Sreekar P, Ujjwal Tiwari, Anoop Namboodiri
![Responsive image](/icpr/media/video_thumbnails/12178.jpg)
Auto-TLDR; Mutual Information Estimation from Variational Lower Bounds Using a Critic's Hypothesis Space
Video Semantic Segmentation Using Deep Multi-View Representation Learning
Akrem Sellami, Salvatore Tabbone
![Responsive image](/icpr/media/video_thumbnails/11861.jpg)
Auto-TLDR; Deep Multi-view Representation Learning for Video Object Segmentation
Abstract Slides Poster Similar
Learning Group Activities from Skeletons without Individual Action Labels
Fabio Zappardino, Tiberio Uricchio, Lorenzo Seidenari, Alberto Del Bimbo
![Responsive image](/icpr/media/video_thumbnails/12516.jpg)
Auto-TLDR; Lean Pose Only for Group Activity Recognition
Three-Dimensional Lip Motion Network for Text-Independent Speaker Recognition
Jianrong Wang, Tong Wu, Shanyu Wang, Mei Yu, Qiang Fang, Ju Zhang, Li Liu
![Responsive image](/icpr/media/video_thumbnails/11259.jpg)
Auto-TLDR; Lip Motion Network for Text-Independent and Text-Dependent Speaker Recognition
Abstract Slides Poster Similar
RMS-Net: Regression and Masking for Soccer Event Spotting
Matteo Tomei, Lorenzo Baraldi, Simone Calderara, Simone Bronzin, Rita Cucchiara
![Responsive image](/icpr/media/video_thumbnails/11807.jpg)
Auto-TLDR; An Action Spotting Network for Soccer Videos
Abstract Slides Poster Similar
Not 3D Re-ID: Simple Single Stream 2D Convolution for Robust Video Re-Identification
![Responsive image](/icpr/media/video_thumbnails/11490.jpg)
Auto-TLDR; ResNet50-IBN for Video-based Person Re-Identification using Single Stream 2D Convolution Network
Abstract Slides Poster Similar
Multi-Scale 2D Representation Learning for Weakly-Supervised Moment Retrieval
Ding Li, Rui Wu, Zhizhong Zhang, Yongqiang Tang, Wensheng Zhang
![Responsive image](/icpr/media/video_thumbnails/11919.jpg)
Auto-TLDR; Multi-scale 2D Representation Learning for Weakly Supervised Video Moment Retrieval
Abstract Slides Poster Similar
Semi-Supervised Class Incremental Learning
Alexis Lechat, Stéphane Herbin, Frederic Jurie
![Responsive image](/icpr/media/video_thumbnails/12142.jpg)
Auto-TLDR; incremental class learning with non-annotated batches
Abstract Slides Poster Similar
Beyond the Deep Metric Learning: Enhance the Cross-Modal Matching with Adversarial Discriminative Domain Regularization
Li Ren, Kai Li, Liqiang Wang, Kien Hua
![Responsive image](/icpr/media/video_thumbnails/12115.jpg)
Auto-TLDR; Adversarial Discriminative Domain Regularization for Efficient Cross-Modal Matching
Abstract Slides Poster Similar
S2I-Bird: Sound-To-Image Generation of Bird Species Using Generative Adversarial Networks
Joo Yong Shim, Joongheon Kim, Jong-Kook Kim
![Responsive image](/icpr/media/video_thumbnails/11115.jpg)
Auto-TLDR; Generating bird images from sound using conditional generative adversarial networks
Abstract Slides Poster Similar
Progressive Cluster Purification for Unsupervised Feature Learning
Yifei Zhang, Chang Liu, Yu Zhou, Wei Wang, Weiping Wang, Qixiang Ye
![Responsive image](/icpr/media/video_thumbnails/11904.jpg)
Auto-TLDR; Progressive Cluster Purification for Unsupervised Feature Learning
Abstract Slides Poster Similar
The Effect of Spectrogram Reconstruction on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy
Kin Wai Cheuk, Yin-Jyun Luo, Emmanouil Benetos, Herremans Dorien
![Responsive image](/icpr/media/video_thumbnails/11977.jpg)
Auto-TLDR; Exploring the effect of spectrogram reconstruction loss on automatic music transcription
Attention-Based Deep Metric Learning for Near-Duplicate Video Retrieval
Kuan-Hsun Wang, Chia Chun Cheng, Yi-Ling Chen, Yale Song, Shang-Hong Lai
![Responsive image](/icpr/media/video_thumbnails/11512.jpg)
Auto-TLDR; Attention-based Deep Metric Learning for Near-duplicate Video Retrieval
Motion-Supervised Co-Part Segmentation
Aliaksandr Siarohin, Subhankar Roy, Stéphane Lathuiliere, Sergey Tulyakov, Elisa Ricci, Nicu Sebe
![Responsive image](/icpr/media/video_thumbnails/12049.jpg)
Auto-TLDR; Self-supervised Co-Part Segmentation Using Motion Information from Videos