Dual Path Multi-Modal High-Order Features for Textual Content Based Visual Question Answering
Yanan Li,
Yuetan Lin,
Hongrui Zhao,
Donghui Wang
Auto-TLDR; TextVQA: An End-to-End Visual Question Answering Model for Text-Based VQA
Similar papers
Multi-Modal Contextual Graph Neural Network for Text Visual Question Answering
Yaoyuan Liang, Xin Wang, Xuguang Duan, Wenwu Zhu
Auto-TLDR; Multi-modal Contextual Graph Neural Network for Text Visual Question Answering
Multi-Stage Attention Based Visual Question Answering
Aakansha Mishra, Ashish Anand, Prithwijit Guha
Auto-TLDR; Alternative Bi-directional Attention for Visual Question Answering
A Novel Attention-Based Aggregation Function to Combine Vision and Language
Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
Auto-TLDR; Fully-Attentive Reduction for Vision and Language
Answer-Checking in Context: A Multi-Modal Fully Attention Network for Visual Question Answering
Hantao Huang, Tao Han, Wei Han, Deep Yap, Cheng-Ming Chiang
Auto-TLDR; Fully Attention Based Visual Question Answering
Integrating Historical States and Co-Attention Mechanism for Visual Dialog
Tianling Jiang, Yi Ji, Chunping Liu
Auto-TLDR; Integrating Historical States and Co-attention for Visual Dialog
Question-Agnostic Attention for Visual Question Answering
Moshiur R Farazi, Salman Hameed Khan, Nick Barnes
Auto-TLDR; Question-Agnostic Attention for Visual Question Answering
Multi-Scale Relational Reasoning with Regional Attention for Visual Question Answering
Auto-TLDR; Question-Guided Relational Reasoning for Visual Question Answering
Transformer Reasoning Network for Image-Text Matching and Retrieval
Nicola Messina, Fabrizio Falchi, Andrea Esuli, Giuseppe Amato
Auto-TLDR; A Transformer Encoder Reasoning Network for Image-Text Matching in Large-Scale Information Retrieval
P ≈ NP, at Least in Visual Question Answering
Shailza Jolly, Sebastian Palacio, Joachim Folz, Federico Raue, Jörn Hees, Andreas Dengel
Auto-TLDR; Polar vs Non-Polar VQA: A Cross-over Analysis of Feature Spaces for Joint Training
Label or Message: A Large-Scale Experimental Survey of Texts and Objects Co-Occurrence
Koki Takeshita, Juntaro Shioyama, Seiichi Uchida
Auto-TLDR; Large-scale Survey of Co-occurrence between Objects and Scene Text with a State-of-the-art Scene Text detector and Recognizer
Recognizing Multiple Text Sequences from an Image by Pure End-To-End Learning
Zhenlong Xu, Shuigeng Zhou, Fan Bai, Cheng Zhanzhan, Yi Niu, Shiliang Pu
Auto-TLDR; Pure End-to-End Learning for Multiple Text Sequences Recognition from Images
Improving Visual Question Answering Using Active Perception on Static Images
Theodoros Bozinis, Nikolaos Passalis, Anastasios Tefas
Auto-TLDR; Fine-Grained Visual Question Answering with Reinforcement Learning-based Active Perception
Cross-Lingual Text Image Recognition Via Multi-Task Sequence to Sequence Learning
Zhuo Chen, Fei Yin, Xu-Yao Zhang, Qing Yang, Cheng-Lin Liu
Auto-TLDR; Cross-Lingual Text Image Recognition with Multi-task Learning
MEAN: A Multi-Element Attention Based Network for Scene Text Recognition
Ruijie Yan, Liangrui Peng, Shanyu Xiao, Gang Yao, Jaesik Min
Auto-TLDR; Multi-element Attention Network for Scene Text Recognition
A Multi-Head Self-Relation Network for Scene Text Recognition
Zhou Junwei, Hongchao Gao, Jiao Dai, Dongqin Liu, Jizhong Han
Auto-TLDR; Multi-head Self-relation Network for Scene Text Recognition
VSR++: Improving Visual Semantic Reasoning for Fine-Grained Image-Text Matching
Hui Yuan, Yan Huang, Dongbo Zhang, Zerui Chen, Wenlong Cheng, Liang Wang
Auto-TLDR; Improving Visual Semantic Reasoning for Fine-Grained Image-Text Matching
Attentive Visual Semantic Specialized Network for Video Captioning
Jesus Perez-Martin, Benjamin Bustos, Jorge Pérez
Auto-TLDR; Adaptive Visual Semantic Specialized Network for Video Captioning
Context Visual Information-Based Deliberation Network for Video Captioning
Min Lu, Xueyong Li, Caihua Liu
Auto-TLDR; Context visual information-based deliberation network for video captioning
Visual Style Extraction from Chart Images for Chart Restyling
Danqing Huang, Jinpeng Wang, Guoxin Wang, Chin-Yew Lin
Auto-TLDR; Exploiting Visual Properties from Reference Chart Images for Chart Restyling
Weakly Supervised Attention Rectification for Scene Text Recognition
Chengyu Gu, Shilin Wang, Yiwei Zhu, Zheng Huang, Kai Chen
Auto-TLDR; An auxiliary supervision branch for attention-based scene text recognition
ReADS: A Rectified Attentional Double Supervised Network for Scene Text Recognition
Qi Song, Qianyi Jiang, Xiaolin Wei, Nan Li, Rui Zhang
Auto-TLDR; ReADS: Rectified Attentional Double Supervised Network for General Scene Text Recognition
PICK: Processing Key Information Extraction from Documents Using Improved Graph Learning-Convolutional Networks
Wenwen Yu, Ning Lu, Xianbiao Qi, Ping Gong, Rong Xiao
Auto-TLDR; PICK: A Graph Learning Framework for Key Information Extraction from Documents
Text Recognition in Real Scenarios with a Few Labeled Samples
Jinghuang Lin, Cheng Zhanzhan, Fan Bai, Yi Niu, Shiliang Pu, Shuigeng Zhou
Auto-TLDR; Few-shot Adversarial Sequence Domain Adaptation for Scene Text Recognition
Gaussian Constrained Attention Network for Scene Text Recognition
Zhi Qiao, Xugong Qin, Yu Zhou, Fei Yang, Weiping Wang
Auto-TLDR; Gaussian Constrained Attention Network for Scene Text Recognition
GCNs-Based Context-Aware Short Text Similarity Model
Auto-TLDR; Context-Aware Graph Convolutional Network for Text Similarity
Text Recognition - Real World Data and Where to Find Them
Klára Janoušková, Lluis Gomez, Dimosthenis Karatzas, Jiri Matas
Auto-TLDR; Exploiting Weakly Annotated Images for Text Extraction
Visual Oriented Encoder: Integrating Multimodal and Multi-Scale Contexts for Video Captioning
Auto-TLDR; Visual Oriented Encoder for Video Captioning
Stratified Multi-Task Learning for Robust Spotting of Scene Texts
Kinjal Dasgupta, Sudip Das, Ujjwal Bhattacharya
Auto-TLDR; Feature Representation Block for Multi-task Learning of Scene Text
Zero-Shot Text Classification with Semantically Extended Graph Convolutional Network
Tengfei Liu, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin
Auto-TLDR; Semantically Extended Graph Convolutional Network for Zero-shot Text Classification
MAGNet: Multi-Region Attention-Assisted Grounding of Natural Language Queries at Phrase Level
Amar Shrestha, Krittaphat Pugdeethosapol, Haowen Fang, Qinru Qiu
Auto-TLDR; MAGNet: A Multi-Region Attention-Aware Grounding Network for Free-form Textual Queries
Feature Embedding Based Text Instance Grouping for Largely Spaced and Occluded Text Detection
Pan Gao, Qi Wan, Renwu Gao, Linlin Shen
Auto-TLDR; Text Instance Embedding Based Feature Embeddings for Multiple Text Instance Grouping
Information Graphic Summarization Using a Collection of Multimodal Deep Neural Networks
Edward Kim, Connor Onweller, Kathleen F. McCoy
Auto-TLDR; A multimodal deep learning framework that can generate summarization text supporting the main idea of an information graphic for presentation to blind or visually impaired
CKG: Dynamic Representation Based on Context and Knowledge Graph
Xunzhu Tang, Tiezhu Sun, Rujie Zhu
Auto-TLDR; CKG: Dynamic Representation Based on Knowledge Graph for Language Sentences
Enhanced User Interest and Expertise Modeling for Expert Recommendation
Tongze He, Caili Guo, Yunfei Chu
Auto-TLDR; A Unified Framework for Expert Recommendation in Community Question Answering
IBN-STR: A Robust Text Recognizer for Irregular Text in Natural Scenes
Xiaoqian Li, Jie Liu, Shuwu Zhang
Auto-TLDR; IBN-STR: A Robust Text Recognition System Based on Data and Feature Representation
PIN: A Novel Parallel Interactive Network for Spoken Language Understanding
Peilin Zhou, Zhiqi Huang, Fenglin Liu, Yuexian Zou
Auto-TLDR; Parallel Interactive Network for Spoken Language Understanding
Enriching Video Captions with Contextual Text
Philipp Rimle, Pelin Dogan, Markus Gross
Auto-TLDR; Contextualized Video Captioning Using Contextual Text
Automatic Student Network Search for Knowledge Distillation
Zhexi Zhang, Wei Zhu, Junchi Yan, Peng Gao, Guotong Xie
Auto-TLDR; NAS-KD: Knowledge Distillation for BERT
Cost-Effective Adversarial Attacks against Scene Text Recognition
Mingkun Yang, Haitian Zheng, Xiang Bai, Jiebo Luo
Auto-TLDR; Adversarial Attacks on Scene Text Recognition
Improving Word Recognition Using Multiple Hypotheses and Deep Embeddings
Siddhant Bansal, Praveen Krishnan, C. V. Jawahar
Auto-TLDR; EmbedNet: fuse recognition-based and recognition-free approaches for word recognition using learning-based methods
ConvMath : A Convolutional Sequence Network for Mathematical Expression Recognition
Zuoyu Yan, Xiaode Zhang, Liangcai Gao, Ke Yuan, Zhi Tang
Auto-TLDR; Convolutional Sequence Modeling for Mathematical Expressions Recognition
Evaluation of BERT and ALBERT Sentence Embedding Performance on Downstream NLP Tasks
Hyunjin Choi, Judong Kim, Seongho Joe, Youngjune Gwon
Auto-TLDR; Sentence Embedding Models for BERT and ALBERT: A Comparison and Evaluation
Label Incorporated Graph Neural Networks for Text Classification
Yuan Xin, Linli Xu, Junliang Guo, Jiquan Li, Xin Sheng, Yuanyuan Zhou
Auto-TLDR; Graph Neural Networks for Semi-supervised Text Classification
Efficient Sentence Embedding Via Semantic Subspace Analysis
Bin Wang, Fenxiao Chen, Yun Cheng Wang, C.-C. Jay Kuo
Auto-TLDR; S3E: Semantic Subspace Sentence Embedding
Graph Discovery for Visual Test Generation
Neil Hallonquist, Laurent Younes, Donald Geman
Auto-TLDR; Visual Question Answering over Graphs: A Probabilistic Framework for VQA
2D License Plate Recognition based on Automatic Perspective Rectification
Hui Xu, Zhao-Hong Guo, Da-Han Wang, Xiang-Dong Zhou, Yu Shi
Auto-TLDR; Perspective Rectification Network for License Plate Recognition
Global Context-Based Network with Transformer for Image2latex
Nuo Pang, Chun Yang, Xiaobin Zhu, Jixuan Li, Xu-Cheng Yin
Auto-TLDR; Image2latex with Global Context block and Transformer
Sample-Aware Data Augmentor for Scene Text Recognition
Guanghao Meng, Tao Dai, Shudeng Wu, Bin Chen, Jian Lu, Yong Jiang, Shutao Xia
Auto-TLDR; Sample-Aware Data Augmentation for Scene Text Recognition