Yirui Wu

Papers from this author

Dynamic Low-Light Image Enhancement for Object Detection Via End-To-End Training

Haifeng Guo, Yirui Wu, Tong Lu

Auto-TLDR; Object Detection using Low-Light Image Enhancement for End-to-End Training

Object detection based on convolutional neural networks is a hot research topic in computer vision. Illumination has a great impact on object detection, and detection performance declines sharply under low-light conditions. Using low-light image enhancement as a pre-processing step can improve image quality and yield better detection results. However, due to the complexity of low-light environments, existing enhancement methods may have negative effects on some samples, which makes it difficult to improve overall detection performance in low-light conditions. In this paper, our goal is to use image enhancement to improve object detection performance rather than perceptual quality for humans. We propose a novel framework that combines low-light enhancement and object detection for end-to-end training. The framework can dynamically select different enhancement subnetworks for each sample to improve the performance of the detector. Our proposed method consists of two stages: the enhancement stage and the detection stage. The enhancement stage dynamically enhances the low-light images under the supervision of several enhancement methods and outputs corresponding weights. During the detection stage, the weights offer information on object classification to generate high-quality region proposals, which in turn results in accurate detection. Our experiments present promising results, which show that the proposed method can significantly improve detection performance in low-light environments.
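
The per-sample weighting described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes, purely for illustration, that each enhancement branch produces a candidate image and that a selector produces one logit per branch. The names `softmax` and `blend_enhancements`, the two toy enhancement functions, and the flat-list image representation are all hypothetical.

```python
import math


def softmax(logits):
    """Turn raw per-branch scores into weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def blend_enhancements(candidates, logits):
    """Blend K candidate enhanced images using per-sample weights.

    candidates: list of K images, each a flat list of pixel intensities.
    logits: K raw scores for this sample (hypothetical selector output).
    Returns the blended image and the weights that produced it.
    """
    weights = softmax(logits)
    n_pixels = len(candidates[0])
    blended = [
        sum(w * img[i] for w, img in zip(weights, candidates))
        for i in range(n_pixels)
    ]
    return blended, weights


# Toy low-light "image" and two stand-in enhancement branches (assumptions).
dark = [0.05, 0.10, 0.20]
candidates = [
    [min(1.0, p * 4.0) for p in dark],  # linear gain
    [p ** 0.5 for p in dark],           # gamma correction
]
blended, weights = blend_enhancements(candidates, logits=[0.0, 0.0])
```

With equal logits, both branches contribute equally. In an end-to-end setting the logits would be learned so that the detector's loss decides which branch dominates for each sample. The abstract also says the weights are passed on to the detection stage, which this sketch does not model.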

Multi-Scale Relational Reasoning with Regional Attention for Visual Question Answering

Yuntao Ma, Yirui Wu, Tong Lu

Auto-TLDR; Question-Guided Relational Reasoning for Visual Question Answering

The main challenges of visual question answering (VQA) lie in modeling the alignment between image and question to find informative regions in the image that are related to the question, and in reasoning about relations among visual objects according to the question. In this paper, we propose question-guided multi-scale relational reasoning for visual question answering, in which each region is enhanced by regional attention. Specifically, we present regional attention, which consists of a soft attention and a hard attention, to pick informative regions of the image according to informativeness evaluations implemented by question-guided soft attention. Combinations of different informative regions are then concatenated with the question embedding at different scales to capture relational information. Relational reasoning can extract question-based relational information between regions, and the multi-scale mechanism gives it the ability to analyze diverse relationships and sensitivity to numbers by modeling relationships at different scales. We conduct experiments to show that our proposed architecture is effective and achieves a new state of the art on VQA v2.
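
The pipeline in the abstract (soft attention scores regions, hard attention keeps the most informative ones, and groups of kept regions are concatenated with the question embedding at several scales) can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's architecture: the dot-product scoring, the top-k selection rule, and the names `soft_attention`, `hard_attention`, and `relation_inputs` are all hypothetical.

```python
import math
from itertools import combinations


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def soft_attention(regions, question):
    """Score each region against the question; softmax over the scores."""
    scores = [dot(r, question) for r in regions]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hard_attention(weights, k):
    """Keep the indices of the k highest-weighted regions, in index order."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:k])


def relation_inputs(regions, question, k, scale):
    """Build relation inputs at one scale.

    Each input concatenates `scale` kept region vectors with the question
    embedding (scale=2 gives pairs, scale=3 gives triples).
    """
    kept = hard_attention(soft_attention(regions, question), k)
    inputs = []
    for group in combinations(kept, scale):
        vec = []
        for i in group:
            vec.extend(regions[i])
        vec.extend(question)
        inputs.append(vec)
    return kept, inputs


# Toy 2-d region features and question embedding (made-up values).
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
question = [1.0, 0.2]
kept, pairs = relation_inputs(regions, question, k=3, scale=2)
_, triples = relation_inputs(regions, question, k=3, scale=3)
```

In a full model each concatenated vector would be fed to a learned relation function and the outputs aggregated per scale. Here the vectors are only constructed, to show how the number and width of the inputs change with the scale.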