Santanu Chaudhury

Papers from this author

A Hierarchical Framework for Leaf Instance Segmentation: Application to Plant Phenotyping

Swati Bhugra, Kanish Garg, Santanu Chaudhury, Brejesh Lall


Auto-TLDR; Under-segmentation of plant images using a graph-based formulation to extract leaf shape knowledge for the task of leaf instance segmentation


Image-based analysis of plants is a high-throughput and non-invasive approach to studying plant traits. The quantitative estimation of many plant traits from image data (leaf area index, biomass, etc.) depends on accurate segmentation of individual leaves. However, this task is challenging because leaves overlap and the boundaries between them are often not discernible. In addition, variability in leaf shape and arrangement across plant species limits the broad use of current leaf instance segmentation algorithms. In this paper, we propose a novel framework that relies on under-segmentation of the plant image, using a graph-based formulation, to extract leaf shape knowledge for the task of leaf instance segmentation. These shape priors are generated from leaf shape characteristics that are independent of plant species. We demonstrate the performance of the proposed framework across multiple plant datasets, i.e., Arabidopsis, Komatsuna and Salad. Experimental results indicate its broad utility.

Collaborative Human Machine Attention Module for Character Recognition

Chetan Ralekar, Tapan Gandhi, Santanu Chaudhury


Auto-TLDR; A Collaborative Human-Machine Attention Module for Deep Neural Networks


Deep learning models that include attention mechanisms have been shown to enhance the performance and efficiency of various computer vision tasks such as pattern recognition, object detection and face recognition. Although the human visual attention mechanism is the source of inspiration for these models, recent attention models treat 'attention' as a pure machine vision optimization problem, and visual attention remains the most neglected aspect. Therefore, this paper presents a collaborative human and machine attention module which considers both human visual attention and the network's attention. The proposed module is inspired by the dorsal ('where') pathway of visual processing and can be integrated with any convolutional neural network (CNN) model. First, the module computes a spatial attention map from the input feature maps, which is then combined with the visual attention maps. The visual attention maps are created from eye fixations obtained in an eye-tracking experiment with human participants. The visual attention map covers the highly salient and discriminative image regions, as humans tend to focus on such regions, whereas the other relevant image regions are handled by the spatial attention map. Combining these two maps refines the feature maps, which in turn improves performance. The comparative analysis reveals that our model not only shows significant improvement over the baseline model but also outperforms the other models. We hope that our findings using a collaborative human-machine attention module will be helpful in other vision tasks as well.
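The data flow described in this abstract (a spatial attention map computed from the feature maps, combined with a human fixation map, then used to refine the features) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the pooling-plus-sigmoid attention, the convex combination weighted by `alpha`, and all function names are assumptions chosen for illustration, since the abstract does not specify how the maps are computed or fused.

```python
import math

def spatial_attention(features):
    """Machine attention map from C feature channels (each an H x W nested list).

    Assumption: channel-wise average and max pooling, summed and squashed by a
    sigmoid. The abstract does not state the actual formulation.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    attention = []
    for i in range(height):
        row = []
        for j in range(width):
            values = [features[c][i][j] for c in range(channels)]
            pooled = sum(values) / channels + max(values)
            row.append(1.0 / (1.0 + math.exp(-pooled)))
        attention.append(row)
    return attention

def combine_maps(machine_map, human_map, alpha=0.5):
    """Fuse the two maps. Assumption: a convex combination weighted by alpha."""
    return [
        [alpha * m + (1.0 - alpha) * h for m, h in zip(m_row, h_row)]
        for m_row, h_row in zip(machine_map, human_map)
    ]

def refine(features, attention):
    """Re-weight every channel element-wise by the combined attention map."""
    return [
        [[v * a for v, a in zip(f_row, a_row)] for f_row, a_row in zip(channel, attention)]
        for channel in features
    ]

# Toy input: 2 channels of size 2 x 2, plus a hypothetical fixation map in [0, 1].
features = [
    [[1.0, -1.0], [0.0, 2.0]],
    [[0.5, -2.0], [0.0, 1.0]],
]
human_map = [[1.0, 0.0], [0.0, 0.5]]

machine_map = spatial_attention(features)
combined = combine_maps(machine_map, human_map, alpha=0.5)
refined = refine(features, combined)
```

In this sketch, `alpha=1.0` reduces the module to machine attention alone and `alpha=0.0` to the human fixation map alone, which makes the contribution of each source easy to inspect.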

Using Scene Graphs for Detecting Visual Relationships

Anurag Tripathi, Siddharth Srivastava, Brejesh Lall, Santanu Chaudhury


Auto-TLDR; Relationship Detection using Context Aligned Scene Graph Embeddings


In this paper, we address the problem of detecting relationships between pairs of objects in an image. We develop spatially aware word embeddings using scene graphs and use joint feature representations containing visual, spatial and semantic embeddings from the input images to train a deep network on the task of relationship detection. Further, we propose to utilize context-aligned scene graph embeddings from the training set, without requiring scene graphs to be explicitly available at test time. We show that the proposed method outperforms state-of-the-art methods for predicate detection and provides competitive results on relationship detection. We also show the generalization ability of the proposed method by performing predictions under zero-shot settings. Finally, we provide an exhaustive empirical evaluation of each component of the proposed network.
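The joint feature representation mentioned in this abstract can be pictured as a concatenation of visual, spatial and semantic vectors for a subject-object pair. The sketch below is only an illustration under stated assumptions: the relative box encoding (normalised centre offsets and log size ratios) is a commonly used choice rather than one confirmed by the abstract, and the function names and toy vectors are hypothetical.

```python
import math

def box_spatial_features(subject_box, object_box):
    """Relative geometry of two boxes given as (x1, y1, x2, y2).

    Assumption: centre offsets normalised by the subject size, plus log
    width and height ratios. The abstract does not specify the encoding.
    """
    sx1, sy1, sx2, sy2 = subject_box
    ox1, oy1, ox2, oy2 = object_box
    s_w, s_h = sx2 - sx1, sy2 - sy1
    o_w, o_h = ox2 - ox1, oy2 - oy1
    s_cx, s_cy = (sx1 + sx2) / 2.0, (sy1 + sy2) / 2.0
    o_cx, o_cy = (ox1 + ox2) / 2.0, (oy1 + oy2) / 2.0
    return [
        (o_cx - s_cx) / s_w,
        (o_cy - s_cy) / s_h,
        math.log(o_w / s_w),
        math.log(o_h / s_h),
    ]

def joint_representation(visual, spatial, semantic):
    """Concatenate the three embeddings into one input vector for a classifier."""
    return list(visual) + list(spatial) + list(semantic)

# Toy, hypothetical values: a 3-d visual feature and a 2-d semantic embedding.
visual = [0.2, 0.7, 0.1]
semantic = [0.9, -0.3]
spatial = box_spatial_features((0, 0, 10, 10), (5, 0, 25, 10))
joint = joint_representation(visual, spatial, semantic)
```

A predicate classifier would then take `joint` as its input; how the semantic part is derived from scene graphs is specific to the paper and is not reproduced here.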