VSR++: Improving Visual Semantic Reasoning for Fine-Grained Image-Text Matching
Hui Yuan,
Yan Huang,
Dongbo Zhang,
Zerui Chen,
Wenlong Cheng,
Liang Wang
![Responsive image](/icpr/media/video_thumbnails/11304.jpg)
Auto-TLDR; Improving Visual Semantic Reasoning for Fine-Grained Image-Text Matching
Similar papers
A Novel Attention-Based Aggregation Function to Combine Vision and Language
Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
![Responsive image](/icpr/media/video_thumbnails/10985.jpg)
Auto-TLDR; Fully-Attentive Reduction for Vision and Language
Abstract Slides Poster Similar
Transformer Reasoning Network for Image-Text Matching and Retrieval
Nicola Messina, Fabrizio Falchi, Andrea Esuli, Giuseppe Amato
![Responsive image](/icpr/media/video_thumbnails/11494.jpg)
Auto-TLDR; A Transformer Encoder Reasoning Network for Image-Text Matching in Large-Scale Information Retrieval
Abstract Slides Poster Similar
Beyond the Deep Metric Learning: Enhance the Cross-Modal Matching with Adversarial Discriminative Domain Regularization
Li Ren, Kai Li, Liqiang Wang, Kien Hua
![Responsive image](/icpr/media/video_thumbnails/12115.jpg)
Auto-TLDR; Adversarial Discriminative Domain Regularization for Efficient Cross-Modal Matching
Abstract Slides Poster Similar
Dual Path Multi-Modal High-Order Features for Textual Content Based Visual Question Answering
Yanan Li, Yuetan Lin, Hongrui Zhao, Donghui Wang
![Responsive image](/icpr/media/video_thumbnails/11379.jpg)
Auto-TLDR; TextVQA: An End-to-End Visual Question Answering Model for Text-Based VQA
MAGNet: Multi-Region Attention-Assisted Grounding of Natural Language Queries at Phrase Level
Amar Shrestha, Krittaphat Pugdeethosapol, Haowen Fang, Qinru Qiu
![Responsive image](/icpr/media/video_thumbnails/11879.jpg)
Auto-TLDR; MAGNet: A Multi-Region Attention-Aware Grounding Network for Free-form Textual Queries
Abstract Slides Poster Similar
Integrating Historical States and Co-Attention Mechanism for Visual Dialog
Tianling Jiang, Yi Ji, Chunping Liu
![Responsive image](/icpr/media/video_thumbnails/11093.jpg)
Auto-TLDR; Integrating Historical States and Co-attention for Visual Dialog
Abstract Slides Poster Similar
Multi-Scale 2D Representation Learning for Weakly-Supervised Moment Retrieval
Ding Li, Rui Wu, Zhizhong Zhang, Yongqiang Tang, Wensheng Zhang
![Responsive image](/icpr/media/video_thumbnails/11919.jpg)
Auto-TLDR; Multi-scale 2D Representation Learning for Weakly Supervised Video Moment Retrieval
Abstract Slides Poster Similar
Cross-Media Hash Retrieval Using Multi-head Attention Network
Zhixin Li, Feng Ling, Chuansheng Xu, Canlong Zhang, Huifang Ma
![Responsive image](/icpr/media/video_thumbnails/10995.jpg)
Auto-TLDR; Unsupervised Cross-Media Hash Retrieval Using Multi-Head Attention Network
Abstract Slides Poster Similar
Multi-Modal Contextual Graph Neural Network for Text Visual Question Answering
Yaoyuan Liang, Xin Wang, Xuguang Duan, Wenwu Zhu
![Responsive image](/icpr/media/thumbnails/1025_FI.pdf.jpg)
Auto-TLDR; Multi-modal Contextual Graph Neural Network for Text Visual Question Answering
Abstract Slides Poster Similar
More Correlations Better Performance: Fully Associative Networks for Multi-Label Image Classification
![Responsive image](/icpr/media/video_thumbnails/12021.jpg)
Auto-TLDR; Fully Associative Network for Fully Exploiting Correlation Information in Multi-Label Classification
Abstract Slides Poster Similar
Multi-Stage Attention Based Visual Question Answering
Aakansha Mishra, Ashish Anand, Prithwijit Guha
![Responsive image](/icpr/media/video_thumbnails/12017.jpg)
Auto-TLDR; Alternative Bi-directional Attention for Visual Question Answering
Webly Supervised Image-Text Embedding with Noisy Tag Refinement
Niluthpol Mithun, Ravdeep Pasricha, Evangelos Papalexakis, Amit Roy-Chowdhury
![Responsive image](/icpr/media/video_thumbnails/11775.jpg)
Auto-TLDR; Robust Joint Embedding for Image-Text Retrieval Using Web Images
Answer-Checking in Context: A Multi-Modal Fully Attention Network for Visual Question Answering
Hantao Huang, Tao Han, Wei Han, Deep Yap Deep Yap, Cheng-Ming Chiang
![Responsive image](/icpr/media/video_thumbnails/10980.jpg)
Auto-TLDR; Fully Attention Based Visual Question Answering
Abstract Slides Poster Similar
Using Scene Graphs for Detecting Visual Relationships
Anurag Tripathi, Siddharth Srivastava, Brejesh Lall, Santanu Chaudhury
![Responsive image](/icpr/media/video_thumbnails/12103.jpg)
Auto-TLDR; Relationship Detection using Context Aligned Scene Graph Embeddings
Abstract Slides Poster Similar
Object Detection Using Dual Graph Network
Shengjia Chen, Zhixin Li, Feicheng Huang, Canlong Zhang, Huifang Ma
![Responsive image](/icpr/media/video_thumbnails/11247.jpg)
Auto-TLDR; A Graph Convolutional Network for Object Detection with Key Relation Information
Cross-Lingual Text Image Recognition Via Multi-Task Sequence to Sequence Learning
Zhuo Chen, Fei Yin, Xu-Yao Zhang, Qing Yang, Cheng-Lin Liu
![Responsive image](/icpr/media/video_thumbnails/11227.jpg)
Auto-TLDR; Cross-Lingual Text Image Recognition with Multi-task Learning
Abstract Slides Poster Similar
Question-Agnostic Attention for Visual Question Answering
Moshiur R Farazi, Salman Hameed Khan, Nick Barnes
![Responsive image](/icpr/media/video_thumbnails/11280.jpg)
Auto-TLDR; Question-Agnostic Attention for Visual Question Answering
Abstract Slides Poster Similar
Label Incorporated Graph Neural Networks for Text Classification
Yuan Xin, Linli Xu, Junliang Guo, Jiquan Li, Xin Sheng, Yuanyuan Zhou
![Responsive image](/icpr/media/video_thumbnails/11952.jpg)
Auto-TLDR; Graph Neural Networks for Semi-supervised Text Classification
Abstract Slides Poster Similar
Attentive Visual Semantic Specialized Network for Video Captioning
Jesus Perez-Martin, Benjamin Bustos, Jorge Pérez
![Responsive image](/icpr/media/video_thumbnails/11562.jpg)
Auto-TLDR; Adaptive Visual Semantic Specialized Network for Video Captioning
Abstract Slides Poster Similar
Context Visual Information-Based Deliberation Network for Video Captioning
Min Lu, Xueyong Li, Caihua Liu
![Responsive image](/icpr/media/video_thumbnails/12070.jpg)
Auto-TLDR; Context visual information-based deliberation network for video captioning
Abstract Slides Poster Similar
Aggregating Object Features Based on Attention Weights for Fine-Grained Image Retrieval
Hongli Lin, Yongqi Song, Zixuan Zeng, Weisheng Wang
![Responsive image](/icpr/media/thumbnails/0877_FI.pdf.jpg)
Auto-TLDR; DSAW: Unsupervised Dual-selection for Fine-Grained Image Retrieval
Context for Object Detection Via Lightweight Global and Mid-Level Representations
Mesut Erhan Unal, Adriana Kovashka
![Responsive image](/icpr/media/video_thumbnails/11898.jpg)
Auto-TLDR; Context-Based Object Detection with Semantic Similarity
Abstract Slides Poster Similar
Zero-Shot Text Classification with Semantically Extended Graph Convolutional Network
Tengfei Liu, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin
![Responsive image](/icpr/media/video_thumbnails/11889.jpg)
Auto-TLDR; Semantically Extended Graph Convolutional Network for Zero-shot Text Classification
Abstract Slides Poster Similar
Multi-Scale Relational Reasoning with Regional Attention for Visual Question Answering
![Responsive image](/icpr/media/video_thumbnails/11547.jpg)
Auto-TLDR; Question-Guided Relational Reasoning for Visual Question Answering
Abstract Slides Poster Similar
Reinforcement Learning with Dual Attention Guided Graph Convolution for Relation Extraction
Zhixin Li, Yaru Sun, Suqin Tang, Canlong Zhang, Huifang Ma
![Responsive image](/icpr/media/video_thumbnails/10950.jpg)
Auto-TLDR; Dual Attention Graph Convolutional Network for Relation Extraction
Abstract Slides Poster Similar
Picture-To-Amount (PITA): Predicting Relative Ingredient Amounts from Food Images
Jiatong Li, Fangda Han, Ricardo Guerrero, Vladimir Pavlovic
![Responsive image](/icpr/media/video_thumbnails/12137.jpg)
Auto-TLDR; PITA: A Deep Learning Architecture for Predicting the Relative Amount of Ingredients from Food Images
Abstract Slides Poster Similar
PIN: A Novel Parallel Interactive Network for Spoken Language Understanding
Peilin Zhou, Zhiqi Huang, Fenglin Liu, Yuexian Zou
![Responsive image](/icpr/media/video_thumbnails/11206.jpg)
Auto-TLDR; Parallel Interactive Network for Spoken Language Understanding
Abstract Slides Poster Similar
Adaptive Word Embedding Module for Semantic Reasoning in Large-Scale Detection
Yu Zhang, Xiaoyu Wu, Ruolin Zhu
![Responsive image](/icpr/media/video_thumbnails/11100.jpg)
Auto-TLDR; Adaptive Word Embedding Module for Object Detection
Abstract Slides Poster Similar
Unsupervised Co-Segmentation for Athlete Movements and Live Commentaries Using Crossmodal Temporal Proximity
Yasunori Ohishi, Yuki Tanaka, Kunio Kashino
![Responsive image](/icpr/media/video_thumbnails/11983.jpg)
Auto-TLDR; A guided attention scheme for audio-visual co-segmentation
Abstract Slides Poster Similar
GCNs-Based Context-Aware Short Text Similarity Model
![Responsive image](/icpr/media/video_thumbnails/11000.jpg)
Auto-TLDR; Context-Aware Graph Convolutional Network for Text Similarity
Abstract Slides Poster Similar
MEG: Multi-Evidence GNN for Multimodal Semantic Forensics
Ekraam Sabir, Ayush Jaiswal, Wael Abdalmageed, Prem Natarajan
![Responsive image](/icpr/media/video_thumbnails/12069.jpg)
Auto-TLDR; Scalable Image Repurposing Detection with Graph Neural Network Based Model
Abstract Slides Poster Similar
Visual Oriented Encoder: Integrating Multimodal and Multi-Scale Contexts for Video Captioning
![Responsive image](/icpr/media/video_thumbnails/10852.jpg)
Auto-TLDR; Visual Oriented Encoder for Video Captioning
Abstract Slides Poster Similar
Gaussian Constrained Attention Network for Scene Text Recognition
Zhi Qiao, Xugong Qin, Yu Zhou, Fei Yang, Weiping Wang
![Responsive image](/icpr/media/video_thumbnails/11252.jpg)
Auto-TLDR; Gaussian Constrained Attention Network for Scene Text Recognition
Abstract Slides Poster Similar
PICK: Processing Key Information Extraction from Documents Using Improved Graph Learning-Convolutional Networks
Wenwen Yu, Ning Lu, Xianbiao Qi, Ping Gong, Rong Xiao
![Responsive image](/icpr/media/video_thumbnails/11383.jpg)
Auto-TLDR; PICK: A Graph Learning Framework for Key Information Extraction from Documents
Abstract Slides Poster Similar
Equation Attention Relationship Network (EARN) : A Geometric Deep Metric Framework for Learning Similar Math Expression Embedding
Saleem Ahmed, Kenny Davila, Srirangaraj Setlur, Venu Govindaraju
![Responsive image](/icpr/media/video_thumbnails/11626.jpg)
Auto-TLDR; Representational Learning for Similarity Based Retrieval of Mathematical Expressions
Abstract Slides Poster Similar
Attentive Part-Aware Networks for Partial Person Re-Identification
Lijuan Huo, Chunfeng Song, Zhengyi Liu, Zhaoxiang Zhang
![Responsive image](/icpr/media/video_thumbnails/11294.jpg)
Auto-TLDR; Part-Aware Learning for Partial Person Re-identification
Abstract Slides Poster Similar
JECL: Joint Embedding and Cluster Learning for Image-Text Pairs
Sean Yang, Kuan-Hao Huang, Bill Howe
![Responsive image](/icpr/media/video_thumbnails/11888.jpg)
Auto-TLDR; JECL: Clustering Image-Caption Pairs with Parallel Encoders and Regularized Clusters
Price Suggestion for Online Second-Hand Items
Liang Han, Zhaozheng Yin, Zhurong Xia, Li Guo, Mingqian Tang, Rong Jin
![Responsive image](/icpr/media/video_thumbnails/11581.jpg)
Auto-TLDR; An Intelligent Price Suggestion System for Online Second-hand Items
Abstract Slides Poster Similar
Global Context-Based Network with Transformer for Image2latex
Nuo Pang, Chun Yang, Xiaobin Zhu, Jixuan Li, Xu-Cheng Yin
![Responsive image](/icpr/media/video_thumbnails/11421.jpg)
Auto-TLDR; Image2latex with Global Context block and Transformer
Abstract Slides Poster Similar
A Multi-Head Self-Relation Network for Scene Text Recognition
Zhou Junwei, Hongchao Gao, Jiao Dai, Dongqin Liu, Jizhong Han
![Responsive image](/icpr/media/video_thumbnails/11334.jpg)
Auto-TLDR; Multi-head Self-relation Network for Scene Text Recognition
Abstract Slides Poster Similar
Detective: An Attentive Recurrent Model for Sparse Object Detection
Amine Kechaou, Manuel Martinez, Monica Haurilet, Rainer Stiefelhagen
![Responsive image](/icpr/media/video_thumbnails/11509.jpg)
Auto-TLDR; Detective: An attentive object detector that identifies objects in images in a sequential manner
Abstract Slides Poster Similar
You Ought to Look Around: Precise, Large Span Action Detection
Ge Pan, Zhang Han, Fan Yu, Yonghong Song, Yuanlin Zhang, Han Yuan
![Responsive image](/icpr/media/video_thumbnails/11030.jpg)
Auto-TLDR; YOLA: Local Feature Extraction for Action Localization with Variable receptive field
MEAN: A Multi-Element Attention Based Network for Scene Text Recognition
Ruijie Yan, Liangrui Peng, Shanyu Xiao, Gang Yao, Jaesik Min
![Responsive image](/icpr/media/video_thumbnails/11700.jpg)
Auto-TLDR; Multi-element Attention Network for Scene Text Recognition
Abstract Slides Poster Similar
Enriching Video Captions with Contextual Text
Philipp Rimle, Pelin Dogan, Markus Gross
![Responsive image](/icpr/media/video_thumbnails/11526.jpg)
Auto-TLDR; Contextualized Video Captioning Using Contextual Text
Abstract Slides Poster Similar
VSB^2-Net: Visual-Semantic Bi-Branch Network for Zero-Shot Hashing
Xin Li, Xiangfeng Wang, Bo Jin, Wenjie Zhang, Jun Wang, Hongyuan Zha
![Responsive image](/icpr/media/video_thumbnails/11066.jpg)
Auto-TLDR; VSB^2-Net: inductive zero-shot hashing for image retrieval
Abstract Slides Poster Similar
Text Recognition in Real Scenarios with a Few Labeled Samples
Jinghuang Lin, Cheng Zhanzhan, Fan Bai, Yi Niu, Shiliang Pu, Shuigeng Zhou
![Responsive image](/icpr/media/video_thumbnails/10877.jpg)
Auto-TLDR; Few-shot Adversarial Sequence Domain Adaptation for Scene Text Recognition
Abstract Slides Poster Similar
Augmented Bi-Path Network for Few-Shot Learning
Baoming Yan, Chen Zhou, Bo Zhao, Kan Guo, Yang Jiang, Xiaobo Li, Zhang Ming, Yizhou Wang
![Responsive image](/icpr/media/video_thumbnails/11902.jpg)
Auto-TLDR; Augmented Bi-path Network for Few-shot Learning
Abstract Slides Poster Similar
G-FAN: Graph-Based Feature Aggregation Network for Video Face Recognition
He Zhao, Yongjie Shi, Xin Tong, Jingsi Wen, Xianghua Ying, Jinshi Hongbin Zha
![Responsive image](/icpr/media/video_thumbnails/11044.jpg)
Auto-TLDR; Graph-based Feature Aggregation Network for Video Face Recognition
Abstract Slides Poster Similar