Offset Curves Loss for Imbalanced Problem in Medical Segmentation

Ngan Le, Duc Toan Bui, Khoa Luu, Marios Savvides

Auto-TLDR; Offset Curves Loss for Medical Image Segmentation

Medical image segmentation plays an important role in medical analysis and has been widely developed for many clinical applications. Deep learning-based approaches have achieved high performance in semantic segmentation, but they are limited to the pixel-wise setting and suffer from the imbalanced-class data problem. In this paper, we tackle those limitations by developing a new deep learning-based model which takes into account three feature levels: a higher level, i.e. the region inside the contour; an intermediate level, i.e. the offset curves around the contour; and a lower level, i.e. the contour itself. Our proposed Offset Curves (OsC) loss consists of three main fitting terms. The first fitting term focuses on pixel-wise segmentation, whereas the second acts as an attention model that attends to the area around the boundaries (offset curves). The third term plays the role of a regularization term which takes the length of the boundaries into account. We evaluate our proposed OsC loss on both 2D and 3D networks. Two common medical datasets, the retina DRIVE and brain tumor BRATS 2018 datasets, are used to benchmark the proposed loss. The experiments show that our proposed OsC loss function outperforms other mainstream loss functions such as Cross-Entropy, Dice, and Focal on the most common segmentation networks, U-Net and FCN.
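The abstract does not give the formula of the OsC loss, so the following is only a toy sketch of how a three-term loss of this general shape could be assembled. Everything in it is an assumption made for illustration: the use of binary cross-entropy for the first two terms, the band of pixels around the ground-truth contour standing in for the offset curves, total variation as a proxy for boundary length, and the names `boundary_band`, `toy_three_term_loss`, `width`, `w_band` and `w_len`. It is not the authors' formulation.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for a single pixel."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def boundary_band(mask, width):
    """Pixels within `width` (Chebyshev distance) of the mask contour."""
    h, w = len(mask), len(mask[0])
    # A contour pixel is a foreground pixel with a background (or
    # out-of-image) 4-neighbour.
    contour = [
        (r, c)
        for r in range(h)
        for c in range(w)
        if mask[r][c] == 1
        and any(
            not (0 <= r + dr < h and 0 <= c + dc < w) or mask[r + dr][c + dc] == 0
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
        )
    ]
    return {
        (r, c)
        for r in range(h)
        for c in range(w)
        if any(max(abs(r - br), abs(c - bc)) <= width for br, bc in contour)
    }


def toy_three_term_loss(prob, mask, width=1, w_band=1.0, w_len=0.1):
    """Hypothetical region + band + length loss (illustration only)."""
    h, w = len(mask), len(mask[0])
    pixels = [(r, c) for r in range(h) for c in range(w)]
    # Term 1: pixel-wise fit over the whole image.
    region = sum(bce(prob[r][c], mask[r][c]) for r, c in pixels) / len(pixels)
    # Term 2: the same fit restricted to a band around the contour.
    band = boundary_band(mask, width)
    near = sum(bce(prob[r][c], mask[r][c]) for r, c in band) / max(len(band), 1)
    # Term 3: total variation of the prediction as a contour-length proxy.
    length = sum(
        abs(prob[r][c] - prob[r][c + 1]) for r in range(h) for c in range(w - 1)
    ) + sum(abs(prob[r][c] - prob[r + 1][c]) for r in range(h - 1) for c in range(w))
    return region + w_band * near + w_len * length / len(pixels)


# A 5 x 5 image with a 3 x 3 foreground square in the middle.
mask = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)] for r in range(5)]
sharp = [[float(v) for v in row] for row in mask]  # perfect prediction
flat = [[0.5] * 5 for _ in range(5)]               # uninformative prediction
print(toy_three_term_loss(sharp, mask), toy_three_term_loss(flat, mask))
```

In this sketch the band term gives extra weight to errors near the boundary, which is one plausible reading of "pays attention to the area around the boundaries"; the relative weights would need tuning in practice.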

Similar papers

A Multi-Task Contextual Atrous Residual Network for Brain Tumor Detection & Segmentation

Ngan Le, Kashu Yamazaki, Quach Kha Gia, Thanh-Dat Truong, Marios Savvides

Auto-TLDR; Contextual Brain Tumor Segmentation Using 3D atrous Residual Networks and Cascaded Structures

In recent years, deep neural networks have achieved state-of-the-art performance in a variety of recognition and segmentation tasks in medical imaging, including brain tumor segmentation. We observe that brain tumor segmentation faces an imbalanced data problem, where the number of pixels belonging to the background class (non-tumor pixels) is much larger than the number of pixels belonging to the foreground class (tumor pixels). To address this problem, we propose a multi-task network which is formed as a cascaded structure and designed to share the feature maps. Our model has two targets: (i) effectively differentiating brain tumor regions and (ii) estimating brain tumor masks. The first task is performed by our proposed contextual brain tumor detection network, which plays the role of an attention gate and focuses only on the region around the brain tumor while ignoring the background (non-tumor area). Instead of processing every pixel, our contextual brain tumor detection network only processes contextual regions around ground-truth instances, and this strategy helps to produce meaningful region proposals. The second task is built upon a 3D atrous residual network within an encoder-decoder structure in order to effectively segment both large and small objects (brain tumors). Our 3D atrous residual network is designed with a skip connection that enables the gradient from the deep layers to be directly propagated to shallow layers; thus, features of different depths are preserved and used to refine each other. In order to incorporate larger contextual information in volumetric MRI data, our network uses 3D atrous convolutions with various kernel sizes, which enlarge the receptive field of the filters. Our proposed network has been evaluated on various datasets, including BRATS2015, BRATS2017 and BRATS2018, on both validation and testing sets. Our performance has been benchmarked with both region-based and surface-based metrics. We have also conducted comparisons against state-of-the-art approaches.
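To make the receptive-field claim concrete, here is a small sketch of the standard arithmetic for atrous (dilated) convolutions along one axis. The layer stacks and dilation rates (1, 2, 4) are illustrative assumptions, not the configuration used in the paper, and the helper names are hypothetical.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in voxels along one axis, covered by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field along one axis of a stack of stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    field = 1
    for kernel_size, dilation in layers:
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


# Three plain 3-wide convolutions versus three atrous ones (assumed rates).
plain = receptive_field([(3, 1), (3, 1), (3, 1)])
atrous = receptive_field([(3, 1), (3, 2), (3, 4)])
print(plain, atrous)
```

Both stacks use the same number of weights per layer; only the spacing between kernel taps changes, which is why dilation is a cheap way to see more context in volumetric data.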

A Benchmark Dataset for Segmenting Liver, Vasculature and Lesions from Large-Scale Computed Tomography Data

Bo Wang, Zhengqing Xu, Wei Xu, Qingsen Yan, Liang Zhang, Zheng You

Auto-TLDR; The Biggest Treatment-Oriented Liver Cancer Dataset for Segmentation

How to build a high-performance liver-related computer-assisted diagnosis system is an open question of great interest. However, the performance of state-of-the-art algorithms is always limited by the amount of data and the quality of the labels. To address this problem, we propose the biggest treatment-oriented liver cancer dataset for liver surgery and treatment planning. This dataset provides contrast-enhanced computed tomography (CT) scans of 216 cases (about 268K frames in total). We labeled all the CT images with liver, liver vasculature and liver tumor segmentation ground truth for training and tuning segmentation algorithms. Based on that, we evaluate several recent and state-of-the-art segmentation algorithms, including 7 deep learning methods, on CT sequences. All results are compared to reference segmentations using five error metrics that highlight different aspects of segmentation accuracy. In general, compared with previous datasets, our dataset is considerably more challenging. To our knowledge, the proposed dataset and benchmark allow for the first time systematic exploration of such issues, and will be made available to allow for further research in this field.

Do Not Treat Boundaries and Regions Differently: An Example on Heart Left Atrial Segmentation

Zhou Zhao, Elodie Puybareau, Nicolas Boutry, Thierry Geraud

Auto-TLDR; Attention Full Convolutional Network for Atrial Segmentation using ResNet-101 Architecture

Atrial fibrillation is the most common heart rhythm disease. Due to a lack of understanding of the underlying atrial structures, current treatments are still not satisfactory. Recently, with the popularity of deep learning, many segmentation methods based on fully convolutional networks have been proposed to analyze atrial structures, especially from late gadolinium-enhanced magnetic resonance imaging. However, two problems still occur: 1) segmentation results include the atrial-like background; 2) boundaries are very hard to segment. Most segmentation approaches design a specific network that mainly focuses on the regions, to the detriment of the boundaries. Therefore, this paper proposes an attention full convolutional network framework based on the ResNet-101 architecture, which focuses on boundaries as much as on regions. The additional attention module is added so that the network pays more attention to regions, which reduces the impact of the misleading similarity of neighboring tissues. We also use a hybrid loss composed of a region loss and a boundary loss to treat boundaries and regions at the same time. We demonstrate the efficiency of the proposed approach on the MICCAI 2018 Atrial Segmentation Challenge public dataset.

Learn to Segment Retinal Lesions and Beyond

Qijie Wei, Xirong Li, Weihong Yu, Xiao Zhang, Yongpeng Zhang, Bojie Hu, Bin Mo, Di Gong, Ning Chen, Dayong Ding, Youxin Chen

Auto-TLDR; Multi-task Lesion Segmentation and Disease Classification for Diabetic Retinopathy Grading

Towards automated retinal screening, this paper makes an endeavor to simultaneously achieve pixel-level retinal lesion segmentation and image-level disease classification. Such a multi-task approach is crucial for accurate and clinically interpretable disease diagnosis. Prior art is insufficient due to three challenges, i.e., lesions lacking objective boundaries, clinical importance of lesions irrelevant to their size, and the lack of one-to-one correspondence between lesion and disease classes. This paper attacks the three challenges in the context of diabetic retinopathy (DR) grading. We propose Lesion-Net, a new variant of fully convolutional networks, with its expansive path re-designed to tackle the first challenge. A dual Dice loss that leverages both semantic segmentation and image classification losses is introduced to resolve the second challenge. Lastly, we build a multi-task network that employs Lesion-Net as a side-attention branch for both DR grading and result interpretation. A set of 12K fundus images is manually segmented by 45 ophthalmologists for 8 DR-related lesions, resulting in 290K manual segments in total. Extensive experiments on this large-scale dataset show that our proposed approach surpasses the prior art for multiple tasks including lesion segmentation, lesion classification and DR grading.

Segmentation of Intracranial Aneurysm Remnant in MRA Using Dual-Attention Atrous Net

Subhashis Banerjee, Ashis Kumar Dhara, Johan Wikström, Robin Strand

Auto-TLDR; Dual-Attention Atrous Net for Segmentation of Intracranial Aneurysm Remnant from MRA Images

Due to the advancement of non-invasive medical imaging modalities like Magnetic Resonance Angiography (MRA), an increasing number of Intracranial Aneurysm (IA) cases have been reported in recent years. IAs are typically treated by so-called endovascular coiling, where blood flow in the IA is prevented by embolization with a platinum coil. Accurate quantification of the IA Remnant (IAR), i.e. the volume with blood flow present post-treatment, is the most important factor in choosing the right treatment plan. This is typically done by manually segmenting the aneurysm remnant from the MRA volume. Since manual segmentation of volumetric images is a labour-intensive and error-prone process, development of an automatic volumetric segmentation method is required. Segmentation of small structures such as IAs, which may vary widely in size, shape, and location, is considered extremely difficult. The similar intensity distribution of IAs and surrounding blood vessels makes it more challenging and susceptible to false positives. In this paper we propose a novel 3D CNN architecture called Dual-Attention Atrous Net (DAtt-ANet), which can efficiently segment IAR volumes from MRA images by reconciling features at different scales using the proposed Parallel Atrous Unit (PAU), along with a self-attention mechanism for extracting fine-grained features and intra-class correlation. The proposed DAtt-ANet model is trained and evaluated on a clinical MRA image dataset (prospective research project, approved by the local ethical committee) of IAR consisting of 46 subjects, annotated by an expert radiologist from our group. We compared the proposed DAtt-ANet with five state-of-the-art CNN models based on their segmentation performance. The proposed DAtt-ANet outperformed all other methods and achieved a five-fold cross-validation DICE score of $0.73\pm0.06$.

Planar 3D Transfer Learning for End to End Unimodal MRI Unbalanced Data Segmentation

Martin Kolarik, Radim Burget, Carlos M. Travieso-Gonzalez, Jan Kocica

Auto-TLDR; Planar 3D Res-U-Net Network for Unbalanced 3D Image Segmentation using Fluid Attenuation Inversion Recovery

We present a novel approach to 2D-to-3D transfer learning based on mapping pre-trained 2D convolutional neural network weights into planar 3D kernels. The method is validated with the proposed planar 3D res-u-net network, whose encoder is transferred from the 2D VGG-16 and which is applied to single-stage segmentation of unbalanced 3D image data. In particular, we evaluate the method on the MICCAI 2016 MS lesion segmentation challenge dataset, utilizing solely the Fluid Attenuation Inversion Recovery (FLAIR) sequence without brain extraction for training and inference to simulate real medical practice. The planar 3D res-u-net network performed best in both sensitivity and Dice score among end-to-end methods processing raw MRI scans and achieved a Dice score comparable to a state-of-the-art unimodal approach that is not end-to-end. The complete source code was released under an open-source license, and this paper is in compliance with the Machine Learning Reproducibility Checklist. By implementing practical transfer learning for 3D data representation, we were able to successfully segment heavily unbalanced data without selective sampling and achieved more reliable results using less training data in a single modality. From a medical perspective, the unimodal approach gives an advantage in real practice, as it requires neither co-registration nor additional scanning time during examination. Although modern medical imaging methods capture high-resolution 3D anatomy scans suitable for processing by computer-aided detection systems, deployment of automatic systems for interpretation of radiology imaging is still rather theoretical in many medical areas. Our work aims to bridge this gap by offering solutions to some of the open research questions.
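One simple way to picture a "planar 3D kernel" is a k x k kernel wrapped with a depth of one slice, so that a 3D convolution with it reproduces the 2D convolution on every slice. The sketch below checks that property on a tiny volume. The choice of the depth axis as the singleton axis, and the helper names `to_planar_3d`, `conv2d_valid` and `conv3d_valid`, are assumptions for illustration; the paper's actual mapping may differ.

```python
def to_planar_3d(kernel_2d):
    """Wrap a k x k kernel as a 1 x k x k kernel (depth of one slice)."""
    return [[list(row) for row in kernel_2d]]


def conv2d_valid(image, kernel):
    """Unpadded 2D cross-correlation, as used by deep learning frameworks."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(len(image[0]) - kw + 1)
        ]
        for r in range(len(image) - kh + 1)
    ]


def conv3d_valid(volume, kernel):
    """Unpadded 3D cross-correlation over a depth x height x width volume."""
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    return [
        [
            [
                sum(
                    volume[z + d][r + i][c + j] * kernel[d][i][j]
                    for d in range(kd)
                    for i in range(kh)
                    for j in range(kw)
                )
                for c in range(len(volume[0][0]) - kw + 1)
            ]
            for r in range(len(volume[0]) - kh + 1)
        ]
        for z in range(len(volume) - kd + 1)
    ]


# Stand-in for a pre-trained 2D filter and a two-slice volume.
kernel_2d = [[1, 0], [0, -1]]
volume = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[9, 8, 7], [6, 5, 4], [3, 2, 1]],
]
per_slice = [conv2d_valid(s, kernel_2d) for s in volume]
planar = conv3d_valid(volume, to_planar_3d(kernel_2d))
print(planar == per_slice)
```

Because the planar kernel never mixes neighbouring slices, the transferred 2D weights keep their meaning; any through-plane context would have to come from other layers of the 3D network.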

Learning to Segment Clustered Amoeboid Cells from Brightfield Microscopy Via Multi-Task Learning with Adaptive Weight Selection

Rituparna Sarkar, Suvadip Mukherjee, Elisabeth Labruyere, Jean-Christophe Olivo-Marin

Auto-TLDR; Supervised Cell Segmentation from Microscopy Images in a Multi-Task Learning Paradigm

Detecting and segmenting individual cells from microscopy images is critical to various life science applications. Traditional cell segmentation tools are often ill-suited for applications in brightfield microscopy due to poor contrast and intensity heterogeneity, and only a small subset are applicable to segment cells in a cluster. In this regard, we introduce a novel supervised technique for cell segmentation in a multi-task learning paradigm. A combination of a multi-task loss, based on the region and cell boundary detection, is employed for an improved prediction efficiency of the network. The learning problem is posed in a novel min-max framework which enables adaptive estimation of the hyper-parameters in an automatic fashion. The region and cell boundary predictions are combined via morphological operations and active contour model to segment individual cells. The proposed methodology is particularly suited to segment touching cells from brightfield microscopy images without manual interventions. Quantitatively, we observe an overall Dice score of 0.93 on the validation set, which is an improvement of over 15.9% on a recent unsupervised method, and outperforms the popular supervised U-net algorithm by at least 5.8% on average.

A Deep Learning Approach for the Segmentation of Myocardial Diseases

Khawala Brahim, Abdull Qayyum, Alain Lalande, Arnaud Boucher, Anis Sakly, Fabrice Meriaudeau

Auto-TLDR; Segmentation of Myocardial Infarction Using Late Gadolinium-Enhanced MRI and SegU-Net

Cardiac left ventricular (LV) segmentation is an essential step for both diagnosis and treatment of cardiac pathologies such as ischemia, myocardial infarction, arrhythmia and myocarditis. However, this segmentation is challenging due to high variability across patients and the potential lack of contrast between structures. In this work, we propose and evaluate a (2.5D) SegU-Net model based on the fusion of two deep learning techniques (U-Net and Seg-Net) for automated LGEMRI (late gadolinium-enhanced magnetic resonance imaging) myocardial disease (infarct core and no-reflow region) quantification in a new multifield expert-annotated dataset. Given that the scar tissue represents a small part of the whole MRI slice, we focused on the myocardium area. Segmentation results show that this preprocessing step facilitates the learning procedure. In order to solve the class imbalance problem, we propose to apply the Jaccard loss and the Focal loss as the optimization loss functions and to integrate a class-weights strategy into the objective function. Late combination has been used to merge the outputs of the best models trained on different sets of hyperparameters. The final network segmentation performance will be useful for future comparison of new methods with the current related work for this task. A total of 2237 slices (320 cases) were used for training/validation and 210 slices (35 cases) were used for testing. Experiments on our proposed dataset, using several evaluation metrics such as Jaccard distance (IOU), Accuracy and Dice similarity coefficient (DSC), demonstrate efficient performance in quantifying different zones of myocardial infarction across various patients. As compared to the second intra-observer study, our testing results showed that the SegUNet prediction model leads to the following average Dice coefficients for the segmented tissue classes: 'Background': 0.99999, 'Myocardium': 0.99434, 'Infarctus': 0.95587, 'Noreflow': 0.78187.
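For readers unfamiliar with the two losses named above, the following sketch gives their standard binary forms with a simple class-weight pair. How the paper combines them, and which weights and focusing parameter it uses, is not stated in the abstract; `class_weights`, `gamma` and the function names here are illustrative assumptions.

```python
import math


def focal_loss(probs, labels, gamma=2.0, class_weights=(1.0, 1.0), eps=1e-7):
    """Mean binary focal loss; class_weights = (background, foreground)."""
    total = 0.0
    for p, y in zip(probs, labels):
        # p_t is the probability assigned to the true class of this pixel.
        p_t = p if y == 1 else 1.0 - p
        p_t = min(max(p_t, eps), 1.0)
        # (1 - p_t) ** gamma down-weights pixels that are already easy.
        total += -class_weights[y] * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def soft_jaccard_loss(probs, labels, eps=1e-7):
    """One minus the soft intersection-over-union of prediction and label."""
    inter = sum(p * y for p, y in zip(probs, labels))
    union = sum(probs) + sum(labels) - inter
    return 1.0 - (inter + eps) / (union + eps)


# One foreground and one background pixel, both predicted with 0.9 confidence.
probs, labels = [0.9, 0.1], [1, 0]
print(focal_loss(probs, labels, gamma=0.0))  # reduces to cross-entropy
print(focal_loss(probs, labels, gamma=2.0))  # easy pixels contribute less
print(soft_jaccard_loss([1.0, 0.0], labels))
```

With `gamma=0` and equal weights the focal loss is plain cross-entropy, so the two extra knobs are what address imbalance: `gamma` suppresses the many easy background pixels and the foreground weight boosts the rare scar pixels.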

3D Medical Multi-Modal Segmentation Network Guided by Multi-Source Correlation Constraint

Tongxue Zhou, Stéphane Canu, Pierre Vera, Su Ruan

Auto-TLDR; Multi-modality Segmentation with Correlation Constrained Network

In the field of multimodal segmentation, the correlation between different modalities can be considered for improving the segmentation results. In this paper, we propose a multi-modality segmentation network with a correlation constraint. Our network includes N model-independent encoding paths with N image sources, a correlation constraint block, a feature fusion block, and a decoding path. The model-independent encoding path can capture modality-specific features from the N modalities. Since there exists a strong correlation between different modalities, we first propose a linear correlation block to learn the correlation between modalities; then a loss function is used to guide the network to learn the correlated features based on the correlation representation block. This block forces the network to learn the latent correlated features which are more relevant for segmentation. Considering that not all the features extracted from the encoders are useful for segmentation, we propose to use a dual-attention-based fusion block to recalibrate the features along the modality and spatial paths, which can suppress less informative features and emphasize the useful ones. The fused feature representation is finally projected by the decoder to obtain the segmentation result. Our experimental results on the BraTS-2018 dataset for brain tumor segmentation demonstrate the effectiveness of our proposed method.

Transfer Learning through Weighted Loss Function and Group Normalization for Vessel Segmentation from Retinal Images

Abdullah Sarhan, Jon Rokne, Reda Alhajj, Andrew Crichton

Auto-TLDR; Deep Learning for Segmentation of Blood Vessels in Retinal Images

The vascular structure of blood vessels is important in diagnosing retinal conditions such as glaucoma and diabetic retinopathy. Accurate segmentation of these vessels can help in detecting retinal objects such as the optic disc and optic cup and hence determine if there is damage to these areas. Moreover, the structure of the vessels can help in diagnosing glaucoma. The rapid development of digital imaging and computer-vision techniques has increased the potential for developing approaches for segmenting retinal vessels. In this paper, we propose an approach for segmenting retinal vessels that uses deep learning along with transfer learning. We adapted the U-Net structure to use a customized InceptionV3 as the encoder and used multiple skip connections to form the decoder. Moreover, we used a weighted loss function to handle the issue of class imbalance in retinal images. Furthermore, we contributed a new dataset to this field. We tested our approach on six publicly available datasets and a newly created dataset. We achieved an average accuracy of 95.60% and a Dice coefficient of 80.98%. The results obtained from comprehensive experiments demonstrate the robustness of our approach to the segmentation of blood vessels in retinal images obtained from different sources. Our approach results in greater segmentation accuracy than other approaches.
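The gap between the reported accuracy and Dice coefficient is typical of class-imbalanced segmentation, and a small worked example shows why both are reported. The pixel counts below are made up for illustration (10% vessel pixels) and are not taken from the paper; the function names are hypothetical.

```python
def accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def dice(pred, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for binary label lists."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * inter / total if total else 1.0


# 100 pixels, of which only 10 are vessel (label 1).
truth = [1] * 10 + [0] * 90
all_background = [0] * 100          # predicts no vessels at all
half_vessels = [1] * 5 + [0] * 95   # finds half of the vessel pixels

for name, pred in (("all background", all_background), ("half vessels", half_vessels)):
    print(name, accuracy(pred, truth), dice(pred, truth))
```

A model that finds no vessels at all still scores 90% accuracy here but a Dice of zero, which is why overlap metrics and weighted losses matter when the foreground class is rare.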

CAggNet: Crossing Aggregation Network for Medical Image Segmentation

Xu Cao, Yanghao Lin

Auto-TLDR; Crossing Aggregation Network for Medical Image Segmentation

In this paper, we present the Crossing Aggregation Network (CAggNet), a novel densely connected semantic segmentation method for medical image analysis. The crossing aggregation network builds on the idea of deep layer aggregation and makes significant innovations in layer connection and semantic information fusion. In this architecture, the traditional skip-connection structure of the general U-Net is replaced by aggregations of multi-level down-sampling and up-sampling layers. This enables the network to interactively fuse information flows at different levels of layers in semantic segmentation. It also introduces a weighted aggregation module to aggregate multi-scale output information. We have evaluated and compared our CAggNet with several advanced U-Net based methods on two public medical image datasets, the 2018 Data Science Bowl nuclei detection dataset and the 2015 MICCAI gland segmentation competition dataset. Experimental results indicate that CAggNet improves medical object recognition and achieves more accurate and efficient segmentation than the existing improved U-Net and UNet++ structures.

DARN: Deep Attentive Refinement Network for Liver Tumor Segmentation from 3D CT Volume

Yao Zhang, Jiang Tian, Cheng Zhong, Yang Zhang, Zhongchao Shi, Zhiqiang He

Auto-TLDR; Deep Attentive Refinement Network for Liver Tumor Segmentation from 3D Computed Tomography Using Multi-Level Features

Automatic liver tumor segmentation from 3D Computed Tomography (CT) is a necessary prerequisite in the interventions of hepatic abnormalities and surgery planning. However, accurate liver tumor segmentation remains challenging due to the large variability of tumor sizes and inhomogeneous texture. Recent advances in liver tumor segmentation based on the Fully Convolutional Network (FCN) draw on the success of learning discriminative multi-level features. In this paper, we propose a Deep Attentive Refinement Network (DARN) for improved liver tumor segmentation from CT volumes by fully exploiting both low- and high-level features embedded in different layers of the FCN. Unlike existing works, we exploit an attention mechanism to leverage the relation between different levels of features encoded in different layers of the FCN. Specifically, we introduce a Semantic Attention Refinement (SemRef) module to selectively emphasize global semantic information in low-level features with the guidance of high-level ones, and a Spatial Attention Refinement (SpaRef) module to adaptively enhance spatial details in high-level features with the guidance of low-level ones. We evaluate our network on the public MICCAI 2017 Liver Tumor Segmentation Challenge dataset (LiTS dataset), and it achieves state-of-the-art performance. The proposed refinement modules are an effective strategy to exploit multi-level features and have great potential to generalize to other medical image segmentation tasks.

FOANet: A Focus of Attention Network with Application to Myocardium Segmentation

Zhou Zhao, Elodie Puybareau, Nicolas Boutry, Thierry Geraud

Auto-TLDR; FOANet: A Hybrid Loss Function for Myocardium Segmentation of Cardiac Magnetic Resonance Images

In myocardium segmentation of cardiac magnetic resonance images, ambiguities often appear near the boundaries of the target domains due to tissue similarities. To address this issue, we propose a new architecture, called FOANet, which can be decomposed into three main steps: a localization step, a Gaussian-based contrast enhancement step, and a segmentation step. This architecture is supplied with a hybrid loss function that guides the FOANet to study the transformation relationship between the input image and the corresponding label in a three-level hierarchy (pixel-, patch- and map-level), which helps to improve segmentation and recovery of the boundaries. We demonstrate the efficiency of our approach on two public datasets in terms of regional and boundary segmentations.

Semantic Segmentation of Breast Ultrasound Image with Pyramid Fuzzy Uncertainty Reduction and Direction Connectedness Feature

Kuan Huang, Yingtao Zhang, Heng-Da Cheng, Ping Xing, Boyu Zhang

Auto-TLDR; Uncertainty-Based Deep Learning for Breast Ultrasound Image Segmentation

Deep learning approaches have achieved impressive results in breast ultrasound (BUS) image segmentation. However, these methods do not handle uncertainty and noise in BUS images well. To address this issue, we present a novel deep learning structure for BUS image semantic segmentation by analyzing the uncertainty using a pyramid fuzzy block and generating a novel feature based on connectedness. Firstly, feature maps in the proposed network are down-sampled to different resolutions. Fuzzy transformation and uncertainty representation are applied to each resolution to obtain the uncertainty degree on different scales. Meanwhile, the BUS images contain layer structures. From top to bottom, there are the skin layer, fat layer, mammary layer, muscle layer, and background area. A spatial recurrent neural network (RNN) is utilized to calculate the connectedness between each pixel and the pixels on the four boundaries along horizontal and vertical lines. The spatial-wise context feature can introduce the characteristic of the layer structure to the deep neural network. Finally, the original convolutional features are combined with the connectedness feature according to the uncertainty degrees. The proposed method is applied to two datasets: a BUS image benchmark with two categories (background and tumor) and a five-category BUS image dataset with fat layer, mammary layer, muscle layer, background, and tumor. The proposed method achieves the best results on both datasets compared with eight state-of-the-art deep learning-based approaches.

Leveraging Unlabeled Data for Glioma Molecular Subtype and Survival Prediction

Nicholas Nuechterlein, Beibin Li, Mehmet Saygin Seyfioglu, Sachin Mehta, Patrick Cimino, Linda Shapiro

Auto-TLDR; Multimodal Brain Tumor Segmentation Using Unlabeled MR Data and Genomic Data for Cancer Prediction

In this paper, we address two long-standing challenges in neuro-oncology: (1) how to leverage large amounts of unlabeled magnetic resonance (MR) imaging data for radiogenomic tasks and (2) how to unite glioma MR imaging with genomic data. We examine multi-parametric MR data from 542 patients in the combined training, validation, and testing sets of the 2018 Multimodal Brain Tumor Segmentation Challenge and somatic copy number alteration (SCNA) data from 1090 patients in The Cancer Genome Atlas (TCGA) lower-grade glioma and glioblastoma projects. We propose a novel application of multi-task learning (MTL) that leverages unlabeled MR data by jointly learning tumor segmentation masks with glioma molecular subtype markers and allows for SCNA input when available. There are 235 patients in the intersection of these MR and SCNA datasets, which we divide into an unlabeled training set, a labeled training set, and a validation set. Our MTL model significantly outperforms comparable classification models trained only on labeled MR data for both IDH1/2 mutation and 1p/19q co-deletion glioma subtype marker prediction tasks. We also observe that models trained on genomic and imaging data improve survival prediction results achieved by models trained on either alone. We will release our source code for future research.

Automatic Semantic Segmentation of Structural Elements related to the Spinal Cord in the Lumbar Region by Using Convolutional Neural Networks

Jhon Jairo Sáenz Gamboa, Maria De La Iglesia-Vaya, Jon Ander Gómez

Auto-TLDR; Semantic Segmentation of Lumbar Spine Using Convolutional Neural Networks

This work addresses the problem of automatically segmenting MR images of the lumbar spine. The purpose is to detect and delimit the different structural elements such as vertebrae, intervertebral discs, nerves, and blood vessels. This task is known as semantic segmentation. The approach proposed in this work is based on convolutional neural networks whose output is a mask where each pixel from the input image is classified into one of the possible classes. Classes were defined by radiologists and correspond to structural elements and tissues. The proposed network architectures are variants of the U-Net. Several complementary blocks were used to define the variants: spatial attention models, deep supervision, and multi-kernels at the input; this last block type is based on the idea of Inception. The architectures that obtained the best results are described in this paper, and their results are discussed. Two of the proposed architectures outperform the standard U-Net used as a baseline.

Deep Recurrent-Convolutional Model for Automated Segmentation of Craniomaxillofacial CT Scans

Francesca Murabito, Simone Palazzo, Federica Salanitri Proietto, Francesco Rundo, Ulas Bagci, Daniela Giordano, Rosalia Leonardi, Concetto Spampinato

Auto-TLDR; Automated Segmentation of Anatomical Structures in Craniomaxillofacial CT Scans using Fully Convolutional Deep Networks

In this paper we define a deep learning architecture for automated segmentation of anatomical structures in Craniomaxillofacial (CMF) CT scans that leverages the recent success of encoder-decoder models for semantic segmentation of natural images. In particular, we propose a fully convolutional deep network that combines the advantages of recent fully convolutional models, such as Tiramisu, with squeeze-and-excitation blocks for feature recalibration, integrated with convolutional LSTMs to model spatio-temporal correlations between consecutive slices. The proposed segmentation network shows better performance and generalization capabilities (to different structures and imaging modalities) than state-of-the-art methods on automated segmentation of CMF structures (e.g., mandibles and airways) in several standard benchmarks (e.g., MICCAI datasets) and on new datasets proposed herein, effectively handling shape variability.

NephCNN: A Deep-Learning Framework for Vessel Segmentation in Nephrectomy Laparoscopic Videos

Alessandro Casella, Sara Moccia, Chiara Carlini, Emanuele Frontoni, Elena De Momi, Leonardo Mattos

Auto-TLDR; Adversarial Fully Convolutional Neural Networks for kidney vessel segmentation from nephrectomy laparoscopic videos

Objective: In recent years, robot-assisted partial nephrectomy (RAPN) has become established as the treatment of choice for renal cell carcinoma (RCC). A reduced field of view, field occlusions by surgical tools, and reduced maneuverability may cause accidents, such as unwanted vessel resection with consequent bleeding. Surgical Data Science (SDS) can provide effective context-aware tools for supporting surgeons. However, no tools have yet been developed for automatic vessel segmentation from nephrectomy laparoscopic videos. Herein, we propose a new approach based on adversarial Fully Convolutional Neural Networks (FCNNs) for kidney vessel segmentation from nephrectomy laparoscopic vision. Methods: The proposed approach enhances the existing segmentation framework by (i) encoding 3D kernels for spatio-temporal feature extraction to enforce pixel connectivity in time, and (ii) performing training in an adversarial fashion, which constrains vessel shape. Results: We performed a preliminary study using 8 different RAPN videos (1871 frames), the first in the field, achieving a median Dice Similarity Coefficient of 71.76%. Conclusions: The results show that the proposed approach could be a valuable solution for assisting surgeons during RAPN.

BCAU-Net: A Novel Architecture with Binary Channel Attention Module for MRI Brain Segmentation

Yongpei Zhu, Zicong Zhou, Guojun Liao, Kehong Yuan

Auto-TLDR; BCAU-Net: Binary Channel Attention U-Net for MRI brain segmentation

Recently, deep learning-based networks have achieved advanced performance in medical image segmentation. However, progress has been slower in magnetic resonance image (MRI) segmentation of normal brain tissues. In this paper, inspired by the channel attention module, we propose a new architecture, Binary Channel Attention U-Net (BCAU-Net), which introduces a novel Binary Channel Attention Module (BCAM) into the skip connections of U-Net to take full advantage of the channel information extracted from the encoding path and the corresponding decoding path. To better aggregate multi-scale spatial information of the feature map, spatial pyramid pooling (SPP) modules with different pooling operations are used in BCAM instead of the original average-pooling and max-pooling operations. We verify this model on two datasets, IBSR and MRBrainS18, and obtain better performance on MRI brain segmentation than other methods. We believe the proposed method can advance the performance of brain segmentation and clinical diagnosis.

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

Jiaqi Luo, Zhicheng Zhao, Fei Su, Limei Guo

Auto-TLDR; Triplet-path Network for One-Stage Object Detection and Segmentation in Pathological Images

Deep learning has been widely applied in the field of medical image processing. However, compared with the flourishing visual tasks in natural images, the progress achieved in pathological images is not remarkable, and detection and segmentation, which are among the basic tasks of computer vision, are regarded as two independent tasks. In this paper, we make full use of existing datasets and construct a triplet-path network using dilated convolutions to cooperatively accomplish one-stage object detection and nuclei segmentation for general pathological images. First, in order to meet the requirements of detection and segmentation, a novel structure called triplet feature generation (TFG) is designed to extract high-resolution and multiscale features, where features from different layers can be properly integrated. Second, considering that pathological datasets are usually small, a location-aware and partially truncated loss function is proposed to improve the classification accuracy on datasets with few images and widely varying targets. We compare the performance of both object detection and instance segmentation with state-of-the-art methods. Experimental results demonstrate the effectiveness and efficiency of the proposed network on two datasets collected from multiple organs.

MTGAN: Mask and Texture-Driven Generative Adversarial Network for Lung Nodule Segmentation

Wei Chen, Qiuli Wang, Kun Wang, Dan Yang, Xiaohong Zhang, Chen Liu, Yucong Li

Auto-TLDR; Mask and Texture-driven Generative Adversarial Network for Lung Nodule Segmentation

Accurate segmentation of lung nodules in lung computed tomography (CT) scans plays a key role in the early diagnosis of lung cancer. Many existing methods, especially U-Net, have made significant progress in lung nodule segmentation. However, due to the complex shapes of lung nodules and the similarity of visual characteristics between nodules and lung tissues, accurate segmentation of lung nodules with few false positives is still a challenging problem. Considering that both boundary and texture information of lung nodules are important for obtaining an accurate segmentation result, we propose a novel Mask and Texture-driven Generative Adversarial Network (MTGAN) with a joint multi-scale L1 loss for lung nodule segmentation, which takes full advantage of U-Net and adversarial training. The proposed MTGAN leverages an adversarial learning strategy guided by the boundary and texture information of lung nodules to generate more accurate segmentation results with fewer false positives. We validate our model on the LIDC–IDRI dataset, and experimental results show that our method achieves excellent segmentation results for a variety of lung nodules, especially for juxtapleural nodules and low-dense nodules. Without any bells and whistles, the proposed MTGAN achieves a Dice similarity coefficient (DSC) of 85.24% on the LIDC–IDRI dataset.

Deep Learning-Based Type Identification of Volumetric MRI Sequences

Jean Pablo De Mello, Thiago Paixão, Rodrigo Berriel, Mauricio Reyes, Alberto F. De Souza, Claudine Badue, Thiago Oliveira-Santos

Auto-TLDR; Deep Learning for Brain MRI Sequences Identification Using Convolutional Neural Network

The analysis of Magnetic Resonance Imaging (MRI) sequences enables clinical professionals to monitor the progression of a brain tumor. As interest in automating brain volume MRI analysis increases, it becomes convenient to have each sequence well identified. However, the unstandardized naming of MRI sequences makes their identification difficult for automated systems, and makes it difficult for researchers to generate or use datasets for machine learning research. To address this, we propose a system for identifying types of brain MRI sequences based on deep learning. By training a Convolutional Neural Network (CNN) based on the 18-layer ResNet architecture, our system is able to classify a volumetric brain MRI as a T1, T1c, T2 or FLAIR sequence, or to determine that it does not belong to any of these classes. The network was trained with both pre-processed (BraTS dataset) and non-pre-processed (TCGA-GBM dataset) images with diverse acquisition protocols, requiring only a few layers of the volume for training. Our system is able to classify among sequence types with an accuracy of 96.27%.

DE-Net: Dilated Encoder Network for Automated Tongue Segmentation

Hui Tang, Bin Wang, Jun Zhou, Yongsheng Gao

Auto-TLDR; Automated Tongue Image Segmentation using De-Net

Automated tongue recognition is a growing research field due to global demand for personal health care. Using mobile devices to take tongue pictures is convenient and low-cost for tongue recognition, and it is particularly suitable for self-health evaluation by the public. However, images taken by mobile devices are easily affected by varying imaging environments, which makes fine segmentation more challenging than for images taken by specialized acquisition devices. Deep learning approaches are promising for tongue image segmentation because they have powerful feature learning and representation capability. However, the successive pooling operations in these methods lead to a loss of information on image details, making them fail when segmenting low-quality images captured by mobile devices. To address this issue, we propose a dilated encoder network (DE-Net) to capture more high-level features and produce high-resolution output for automated tongue image segmentation. In addition, we construct two tongue image datasets, which contain images taken by specialized devices and mobile devices, respectively, to verify the effectiveness of the proposed method. Experimental results on both datasets demonstrate that the proposed method outperforms state-of-the-art methods in tongue image segmentation.
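
The motivation for dilated convolutions, enlarging the receptive field without pooling away spatial resolution, can be illustrated with a short calculation. The sketch below uses the standard receptive-field formula for stacked stride-1 convolutions; the function name and the layer configurations are hypothetical and are not taken from DE-Net.

```python
def receptive_field(layers):
    """Receptive field (pixels along one axis) of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs. Each layer widens the
    receptive field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


# Three plain 3x3 convolutions versus three 3x3 convolutions with
# dilation rates 1, 2, 4 (hypothetical rates chosen for illustration).
plain = receptive_field([(3, 1), (3, 1), (3, 1)])    # 1 + 2 + 2 + 2 = 7
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])  # 1 + 2 + 4 + 8 = 15
```

Both stacks keep the feature map at full resolution, but the dilated stack sees roughly twice as much context per output pixel with the same number of weights.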

Breast Anatomy Enriched Tumor Saliency Estimation

Fei Xu, Yingtao Zhang, Heng-Da Cheng, Jianrui Ding, Boyu Zhang, Chunping Ning, Ying Wang

Auto-TLDR; Tumor Saliency Estimation for Breast Ultrasound using enriched breast anatomy knowledge

Breast cancer investigation is of great significance, and developing tumor detection methodologies is a critical need. However, detecting breast cancer in breast ultrasound (BUS) images is challenging due to the complicated breast structure and the poor quality of the images. In this paper, we propose a novel tumor saliency estimation (TSE) model guided by enriched breast anatomy knowledge to localize the tumor. First, the breast anatomy layers are generated by a deep neural network. Then we refine the layers by integrating a non-semantic breast anatomy model to solve the problem of incomplete mammary layers. Meanwhile, a new background map generation method weighted by semantic probability and spatial distance is proposed to improve performance. The experiment demonstrates that the proposed method with the new background map outperforms four state-of-the-art TSE models, with a 10% increase in F-measure on the public BUS dataset.

A Lumen Segmentation Method in Ureteroscopy Images Based on a Deep Residual U-Net Architecture

Jorge Lazo, Marzullo Aldo, Sara Moccia, Michele Catellani, Benoit Rosa, Elena De Momi, Michel De Mathelin, Francesco Calimeri

Auto-TLDR; A Deep Neural Network for Ureteroscopy with Residual Units

Ureteroscopy is becoming the first surgical treatment option for the majority of urinary affections. This procedure is carried out using an endoscope, which provides the surgeon with the visual and spatial information necessary to navigate inside the urinary tract. With a view to developing surgical assistance systems that could enhance the performance of surgeons, lumen segmentation is a fundamental task, since the lumen is the visual reference that marks the path the endoscope should follow. This has not been analyzed in ureteroscopy data before. However, the task presents several challenges given the image quality and the conditions of ureteroscopy procedures themselves. In this paper, we study the implementation of a Deep Neural Network which exploits the advantage of residual units in an architecture based on U-Net. For the training of these networks, we analyze the use of two different color spaces: gray-scale and RGB images. We found that training on gray-scale images gives the best results, obtaining mean values of Dice Score, Precision, and Recall of 0.73, 0.58, and 0.92, respectively. The results show that a residual U-Net could be a suitable model for the further development of a computer-aided system for navigation and guidance through the urinary system.

End-To-End Multi-Task Learning for Lung Nodule Segmentation and Diagnosis

Wei Chen, Qiuli Wang, Dan Yang, Xiaohong Zhang, Chen Liu, Yucong Li

Auto-TLDR; A novel multi-task framework for lung nodule diagnosis based on deep learning and medical features

Computer-Aided Diagnosis (CAD) systems for lung nodule diagnosis based on deep learning have attracted much attention in recent years. However, most existing methods ignore the relationships between the segmentation and classification tasks, which leads to unstable performance. To address this problem, we propose a novel multi-task framework, which can provide a lung nodule segmentation mask, malignancy prediction, and medical features for interpretable diagnosis at the same time. Our framework mainly contains two sub-networks: (1) a Multi-Channel Segmentation Sub-network (MSN) for lung nodule segmentation, and (2) a Joint Classification Sub-network (JCN) for interpretable lung nodule diagnosis. In the proposed framework, we use U-Net down-sampling processes to extract low-level deep learning features, which are shared by the two sub-networks. The JCN forces the down-sampling processes to learn better low-level deep features, which lead to better construction of segmentation masks. Meanwhile, two additional channels, constructed by the OTSU and super-pixel (SLIC) methods, are used to guide the feature extraction. The proposed framework takes advantage of both deep learning methods and classical methods, which can significantly improve the performance of all tasks. We evaluate the proposed framework on the public LIDC-IDRI dataset. Our framework achieves a promising Dice score of 86.43% in segmentation, 87.07% in malignancy level prediction, and convincing results in interpretable medical feature predictions.

Aerial Road Segmentation in the Presence of Topological Label Noise

Corentin Henry, Friedrich Fraundorfer, Eleonora Vig

Auto-TLDR; Improving Road Segmentation with Noise-Aware U-Nets for Fine-Grained Topology delineation

The availability of large-scale annotated datasets has enabled Fully-Convolutional Neural Networks to reach outstanding performance on road extraction in aerial images. However, high-quality pixel-level annotation is expensive to produce and even manually labeled data often contains topological errors. Trading off quality for quantity, many datasets rely on already available yet noisy labels, for example from OpenStreetMap. In this paper, we explore the training of custom U-Nets built with ResNet and DenseNet backbones using noise-aware losses that are robust towards label omission and registration noise. We perform an extensive evaluation of standard and noise-aware losses, including a novel Bootstrapped DICE-Coefficient loss, on two challenging road segmentation benchmarks. Our losses yield a consistent improvement in overall extraction quality and exhibit a strong capacity to cope with severe label noise. Our method generalizes well to two other fine-grained topology delineation tasks: surface crack detection for quality inspection and cell membrane extraction in electron microscopy imagery.

Segmentation of Axillary and Supraclavicular Tumoral Lymph Nodes in PET/CT: A Hybrid CNN/Component-Tree Approach

Diana Lucia Farfan Cabrera, Nicolas Gogin, David Morland, Benoît Naegel, Dimitri Papathanassiou, Nicolas Passat

Auto-TLDR; Coupling Convolutional Neural Networks and Component-Trees for Lymph node Segmentation from PET/CT Images

The analysis of axillary and supraclavicular lymph nodes is a primary prognostic factor for the staging of breast cancer. However, due to the size of lymph nodes and the low resolution of PET data, their segmentation is challenging. We investigate axillary and supraclavicular lymph node segmentation from PET/CT images by coupling Convolutional Neural Networks (CNNs) and Component-Trees (C-Trees). Building upon the U-Net architecture, we propose a framework that couples a multi-modal U-Net fed with PET and CT with a hierarchical model obtained from the PET, which provides additional high-level region-based features as input channels. Our working hypotheses are twofold. First, we take advantage of both anatomical information from CT for detecting the nodes and functional information from PET for detecting the pathological ones. Second, we consider region-based attributes extracted from C-Tree analysis of 3D PET/CT images to improve the CNN segmentation. We carried out experiments on a dataset of 240 pathological lymph nodes from 52 patient scans, and compared our outputs with human expert-defined ground truth, leading to promising results.

BiLuNet: A Multi-Path Network for Semantic Segmentation on X-Ray Images

Van Luan Tran, Huei-Yung Lin, Rachel Liu, Chun-Han Tseng, Chun-Han Tseng

Auto-TLDR; BiLuNet: Multi-path Convolutional Neural Network for Semantic Segmentation of Lumbar vertebrae, sacrum,

Semantic segmentation and shape detection of lumbar vertebrae, sacrum, and femoral heads from clinical X-ray images are important and challenging tasks. In this paper, we propose a new multi-path convolutional neural network, BiLuNet, for semantic segmentation on X-ray images. The network is capable of medical image segmentation with very limited training data. With the shape fitting of the bones, we can identify the location of the target regions very accurately for lumbar vertebra inspection. We collected our own dataset, annotated by doctors, for model training and performance evaluation. Compared to state-of-the-art methods, the proposed technique provides better mIoUs and higher success rates with the same training data. The experimental results demonstrate the feasibility of our network for semantic segmentation of lumbar vertebrae, sacrum, and femoral heads.

Confidence Calibration for Deep Renal Biopsy Immunofluorescence Image Classification

Federico Pollastri, Juan Maroñas, Federico Bolelli, Giulia Ligabue, Roberto Paredes, Riccardo Magistroni, Costantino Grana

Auto-TLDR; A Probabilistic Convolutional Neural Network for Immunofluorescence Classification in Renal Biopsy

With this work we tackle immunofluorescence classification in renal biopsy, employing state-of-the-art Convolutional Neural Networks. In this setting, the aim of the probabilistic model is to assist an expert practitioner towards identifying the location pattern of antibody deposits within a glomerulus. Since modern neural networks often provide overconfident outputs, we stress the importance of having a reliable prediction, demonstrating that Temperature Scaling, a recently introduced re-calibration technique, can be successfully applied to immunofluorescence classification in renal biopsy. Experimental results demonstrate that the designed model yields good accuracy on the specific task, and that Temperature Scaling is able to provide reliable probabilities, which are highly valuable for such a task given the low inter-rater agreement.
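
Temperature Scaling, as mentioned above, recalibrates a trained classifier by dividing its logits by a single scalar T before the softmax. The sketch below shows only that mechanism and is not the authors' implementation; the logits and the value T = 2.0 are hypothetical, and in practice T is fitted on a held-out validation set by minimizing the negative log-likelihood.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature; temperature > 1 softens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the maximum for numerical stability; this does not change the result.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a three-class problem.
logits = [4.0, 1.0, 0.5]
uncalibrated = softmax_with_temperature(logits, temperature=1.0)
calibrated = softmax_with_temperature(logits, temperature=2.0)  # assumed T
```

Because every logit is divided by the same positive constant, the predicted class is unchanged; only the confidence attached to it is adjusted.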

Segmenting Kidney on Multiple Phase CT Images Using ULBNet

Yanling Chi, Yuyu Xu, Gang Feng, Jiawei Mao, Sihua Wu, Guibin Xu, Weimin Huang

Auto-TLDR; A ULBNet network for kidney segmentation on multiple phase CT images

Segmentation of the kidney on CT images is critical to computer-assisted surgical planning for kidney interventional therapy. Segmenting the kidney manually is impractical in clinical practice, so automatic segmentation is desirable. U-Net has been successful in medical image segmentation and is a promising candidate for the task. However, a semantic gap still exists, especially when multiple phase images or multiple center images are involved. In this paper, we propose ULBNet to reduce the semantic gap and improve segmentation performance. The proposed architecture includes new skip connections with local binary convolution (LBC). We also propose a novel strategy for fast retraining of a model for a new task without requiring manual labelling. We evaluated the network for kidney segmentation on multiple phase CT images. ULBNet achieved an overall accuracy of 98.0%, compared with 97.5% for Resunet. Specifically, on the plain phase CT images, ULBNet achieved 98.1% and Resunet 97.6%; on the corticomedullary phase images, 97.8% and 97.2%; on the nephrographic phase images, 97.6% and 97.4%; on the excretory phase images, 98.1% and 97.4%. The proposed network architecture generalizes better than Resunet to multiple phase images.

Dual Encoder Fusion U-Net (DEFU-Net) for Cross-manufacturer Chest X-Ray Segmentation

Zhang Lipei, Aozhi Liu, Jing Xiao

Auto-TLDR; Inception Convolutional Neural Network with Dilation for Chest X-Ray Segmentation

A number of methods based on deep learning have been applied to medical image segmentation and have achieved state-of-the-art performance. The best-known technique is U-Net, which has been applied to many medical datasets, including chest X-rays. Due to the importance of chest X-ray data in studying COVID-19, there is a demand for state-of-the-art models capable of precisely segmenting chest X-rays. In this paper, we propose a dual encoder fusion U-Net framework for chest X-rays, named DEFU-Net, based on an Inception Convolutional Neural Network with dilation and a Densely Connected Recurrent Convolutional Neural Network. The densely connected recurrent path extends the network deeper to facilitate context feature extraction. In order to increase the width of the network and enrich the representation of features, inception blocks with dilation are used. The inception blocks can capture global and local spatial information with various receptive fields to avoid the information loss caused by max-pooling. Meanwhile, fusing the features of the two paths by summation preserves the context and the spatial information for the decoding part. We applied this model to a chest X-ray dataset from two different manufacturers (Montgomery and Shenzhen hospital). DEFU-Net achieves better performance than basic U-Net, residual U-Net, BCDU-Net, R2U-Net and attention R2U-Net. The model approaches state-of-the-art performance on this mixed dataset. The open-source code for the proposed framework is publicly available.

BG-Net: Boundary-Guided Network for Lung Segmentation on Clinical CT Images

Rui Xu, Yi Wang, Tiantian Liu, Xinchen Ye, Lin Lin, Yen-Wei Chen, Shoji Kido, Noriyuki Tomiyama

Auto-TLDR; Boundary-Guided Network for Lung Segmentation on CT Images

Lung segmentation on CT images is a crucial step for a computer-aided diagnosis system for lung diseases. Existing deep learning-based lung segmentation methods are less effective at segmenting lungs on clinical CT images; in particular, the segmentation of lung boundaries is not accurate enough due to complex pulmonary opacities in clinical practice. In this paper, we propose a boundary-guided network (BG-Net) to address this problem. It contains two auxiliary branches that separately segment lungs and extract the lung boundaries, and an aggregation branch that efficiently exploits lung boundary cues to guide the network towards more accurate lung segmentation on clinical CT images. We evaluate the proposed method on a private dataset collected from the Osaka University Hospital and four public datasets: StructSeg, HUG, VESSEL12, and a Novel Coronavirus 2019 (COVID-19) dataset. Experimental results show that the proposed method segments lungs more accurately and outperforms several other deep learning-based methods.

PCANet: Pyramid Context-Aware Network for Retinal Vessel Segmentation

Yi Zhang, Yixuan Chen, Kai Zhang

Auto-TLDR; PCANet: Adaptive Context-Aware Network for Automated Retinal Vessel Segmentation

Automated retinal vessel segmentation plays an important role in the diagnosis of diseases such as diabetes, arteriosclerosis and hypertension. Recent works attempt to improve segmentation performance by exploring either global or local contexts. However, context demands vary across regions within each image and across levels of the network. To address these problems, we propose the Pyramid Context-aware Network (PCANet), which can adaptively capture multi-scale context representations. Specifically, PCANet is composed of multiple Adaptive Context-aware (ACA) blocks arranged in parallel, each of which adaptively obtains context-aware features by estimating affinity coefficients at a specific scale under the guidance of global contextual dependencies. Meanwhile, we insert ACA blocks with specific scales at different levels of the network to obtain a coarse-to-fine result. Furthermore, an integrated test-time augmentation method is developed to further boost the performance of PCANet. Finally, extensive experiments demonstrate the effectiveness of the proposed PCANet, and state-of-the-art performance is achieved with AUCs of 0.9866, 0.9886 and F1 Scores of 0.8274, 0.8371 on two public datasets, DRIVE and STARE, respectively.

A Transformer-Based Network for Anisotropic 3D Medical Image Segmentation

Guo Danfeng, Demetri Terzopoulos

Auto-TLDR; A transformer-based model to tackle the anisotropy problem in 3D medical image analysis

A critical challenge in applying neural networks to 3D medical image analysis is dealing with the anisotropy problem. The inter-slice contextual information contained in medical images is important, especially when the structural information of lesions is needed. However, such information often varies between cases because of variable slice spacing. Image anisotropy degrades model performance, especially when slice spacing varies significantly between training and testing datasets. We propose a transformer-based model to tackle the anisotropy problem. It is adaptable to different levels of anisotropy and is computationally efficient. Experiments are conducted on a 3D lung cancer segmentation task. Our model achieves an average Dice score of approximately 0.87, which generally outperforms baseline models.

SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation

Changlu Guo, Marton Szemenyei, Yugen Yi, Wenle Wang, Buer Chen, Changqi Fan

Auto-TLDR; Spatial Attention U-Net for Segmentation of Retinal Blood Vessels

The precise segmentation of retinal blood vessels is of great significance for the early diagnosis of eye-related diseases such as diabetes and hypertension. In this work, we propose a lightweight network named Spatial Attention U-Net (SA-UNet) that does not require thousands of annotated training samples and can be used with data augmentation to exploit the available annotated samples more efficiently. SA-UNet introduces a spatial attention module which infers the attention map along the spatial dimension, and multiplies the attention map by the input feature map for adaptive feature refinement. In addition, the proposed network employs structured dropout convolutional blocks instead of the original convolutional blocks of U-Net to prevent the network from overfitting. We evaluate SA-UNet on two benchmark retinal datasets: the Vascular Extraction (DRIVE) dataset and the Child Heart and Health Study (CHASE_DB1) dataset. The results show that the proposed SA-UNet achieves state-of-the-art performance on both datasets. The implementation and the trained networks are available on GitHub.
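
The spatial attention step described above (infer a per-pixel attention map, then multiply it with the input feature map) can be sketched as follows. This is a simplified illustration, not the SA-UNet module: it pools across channels and mixes the two pooled values with fixed scalar weights, whereas a real module would typically apply a learned convolution. The function name, the weights, and the toy feature map are assumptions.

```python
import math


def spatial_attention(features, w_avg=1.0, w_max=1.0, bias=0.0):
    """Apply a simplified spatial attention to a feature map.

    `features` is a nested list indexed as [channel][row][col].
    Returns (attention, refined): an [row][col] map with values in (0, 1)
    and the feature map multiplied element-wise by that map.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    attention = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            values = [features[c][i][j] for c in range(channels)]
            avg_pool = sum(values) / channels  # average over channels
            max_pool = max(values)             # maximum over channels
            # Hypothetical fixed mixing weights stand in for a learned convolution.
            score = w_avg * avg_pool + w_max * max_pool + bias
            attention[i][j] = 1.0 / (1.0 + math.exp(-score))  # sigmoid
    refined = [
        [[features[c][i][j] * attention[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(channels)
    ]
    return attention, refined


# Toy 2-channel, 2x2 feature map (hypothetical values).
features = [
    [[0.0, 2.0], [1.0, -1.0]],
    [[0.0, 4.0], [3.0, -3.0]],
]
attention, refined = spatial_attention(features)
```

The same attention value is applied to every channel at a given pixel, so the module reweights locations rather than channels.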

Accurate Cell Segmentation in Digital Pathology Images Via Attention Enforced Networks

Zeyi Yao, Kaiqi Li, Guanhong Zhang, Yiwen Luo, Xiaoguang Zhou, Muyi Sun

Auto-TLDR; AENet: Attention Enforced Network for Automatic Cell Segmentation

Automatic cell segmentation is an essential step in the pipeline of computer-aided diagnosis (CAD), such as the detection and grading of breast cancer. Accurate segmentation of cells can not only help pathologists make a more precise diagnosis, but also save much time and labor. However, this task suffers from stain variation, inhomogeneous cell intensities, background clutter and cells from different tissues. To address these issues, we propose an Attention Enforced Network (AENet), which is built on a spatial attention module and a channel attention module, to integrate local features with global dependencies and weight effective channels adaptively. In addition, we introduce a feature fusion branch to bridge high-level and low-level features. Finally, the marker-controlled watershed algorithm is applied to post-process the predicted segmentation maps and reduce fragmented regions. In the test stage, we present an individual color normalization method to deal with the stain variation problem. We evaluate this model on the MoNuSeg dataset. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.

Vesselness Filters: A Survey with Benchmarks Applied to Liver Imaging

Jonas Lamy, Odyssée Merveille, Bertrand Kerautret, Nicolas Passat, Antoine Vacavant

Auto-TLDR; Comparison of Vessel Enhancement Filters for Liver Vascular Network Segmentation

Accurate knowledge of vascular network geometry is crucial for many clinical applications such as cardiovascular disease diagnosis and surgery planning. Vessel enhancement algorithms are often a key step in improving the robustness of vessel segmentation. A wide variety of enhancement filters exists in the literature, but they are often difficult to compare, as the applications and datasets differ from one paper to another and the code is rarely available. In this article, we compare seven vessel enhancement filters, covering the literature of the last twenty years, in a single common framework. We focus our study on the liver vascular network, which is under-represented in the literature. The evaluation is made from three points of view: in the whole liver, in the vessel neighborhood and near the bifurcations. The study is performed on two publicly available datasets: the Ircad dataset (CT images) and the VascuSynth dataset adapted for MRI simulation. We discuss the strengths and weaknesses of each method in the hepatic context. In addition, the benchmark framework, including a C++ implementation of each compared method, is provided. An online demonstration ensures the reproducibility of the results without requiring any additional software.

Dual Stream Network with Selective Optimization for Skin Disease Recognition in Consumer Grade Images

Krishnam Gupta, Jaiprasad Rampure, Monu Krishnan, Ajit Narayanan, Nikhil Narayan

Auto-TLDR; A Deep Network Architecture for Skin Disease Localisation and Classification on Consumer Grade Images

Skin disease localisation and classification on consumer-grade images is more challenging than on dermoscopic images. Consumer-grade images are images taken using commonly available imaging devices such as a mobile camera or a hand-held digital camera. Such images, in addition to having the skin condition of interest in a very small area of the image, have other noisy non-clinical details introduced by the lighting conditions and the distance of the hand-held device from the anatomy at the time of acquisition. We propose a novel deep network architecture and a new optimization strategy for classification with implicit localisation of skin diseases from clinical/consumer-grade images. A weakly supervised segmentation algorithm is first employed to extract Regions of Interest (RoIs) from the image; the RoI and the original image form the two input streams of the proposed architecture. The two streams learn high-level and low-level features from the original image and the RoI, respectively. The two streams are independently optimised until the loss stops decreasing, after which both streams are optimised collectively with the help of a third combiner sub-network. This strategy resulted in a 5% increase in accuracy over the current state-of-the-art methods on the publicly available SD-198 dataset. The proposed algorithm is also validated on a new dataset containing over 12,000 images across 75 different skin conditions. We intend to release this dataset as SD-75 to aid the advancement of research on skin condition classification on consumer-grade images.

Multi-Label Contrastive Focal Loss for Pedestrian Attribute Recognition

Xiaoqiang Zheng, Zhenxia Yu, Lin Chen, Fan Zhu, Shilong Wang

Auto-TLDR; Multi-label Contrastive Focal Loss for Pedestrian Attribute Recognition

Pedestrian Attribute Recognition (PAR) has received extensive attention during the past few years. With the advances of deep convolutional neural networks (CNNs), the performance of PAR has been significantly improved. Existing methods tend to acquire attribute-specific features by designing various complex network structures with additional modules. Such additional modules, however, dramatically increase the number of parameters. Meanwhile, the problems of class imbalance and hard attribute retrieval remain underestimated in PAR. In this paper, we explore the optimization mechanism of the training process to account for these problems and propose a new loss function called Multi-label Contrastive Focal Loss (MCFL). The proposed MCFL emphasizes the hard and minority attributes by using a separate re-weighting mechanism for positive and negative classes to alleviate the impact of the imbalance. MCFL is also able to enlarge the gaps between the intra-class of multi-label attributes, to force CNNs to extract more subtle discriminative features. We evaluate the proposed MCFL on three large public pedestrian datasets: RAP, PA-100K, and PETA. The experimental results indicate that the proposed MCFL with the ResNet-50 backbone outperforms the other state-of-the-art approaches in the comparison.
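
The re-weighting idea that MCFL builds on can be seen in the standard binary focal loss, which down-weights easy examples and weights positive and negative labels differently. The sketch below implements that standard focal loss averaged over attributes; it is not the MCFL formulation, whose contrastive term and exact weights are not given in the abstract. The function names and the default gamma = 2.0 and alpha = 0.25 are conventional choices assumed here.

```python
import math


def binary_focal_loss(prob, label, gamma=2.0, alpha=0.25):
    """Standard focal loss for one binary attribute.

    `prob` is the predicted probability of the positive class, `label` is 0 or 1.
    gamma controls how strongly easy examples are down-weighted; alpha weights
    positive labels and (1 - alpha) weights negative labels.
    """
    p_t = prob if label == 1 else 1.0 - prob
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def multilabel_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Mean focal loss over all attributes of one sample."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


# A confidently correct prediction contributes far less than a poorly predicted one.
easy = binary_focal_loss(0.95, 1)
hard = binary_focal_loss(0.30, 1)
```

With gamma = 0 and alpha = 0.5 the expression reduces to half the ordinary binary cross-entropy, which is a convenient sanity check.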

Graph Signal Active Contours

Olivier Lezoray

Auto-TLDR; Adaptation of Active Contour Without Edges for Graph Signal Processing

With the advent of data living on the vertices of graphs, there is much interest in processing the so-called graph signals for partitioning tasks. As active contours have had much impact in the image processing community, their formulation on graphs is of importance to the field of graph signal processing. This paper proposes an adaptation to graphs of a model that combines the Geodesic Active Contour and the Active Contour Without Edges models. In addition, specific terms depending on graphs are introduced in the formulation. This adaptation is solved using a level set formulation with a gradient descent that can be expressed as a morphological front evolution process. Experimental results on different kinds of graph signals show the benefit of the approach.

Deep Superpixel Cut for Unsupervised Image Segmentation

Qinghong Lin, Weichan Zhong

Auto-TLDR; Deep Superpixel Cut for Deep Unsupervised Image Segmentation

Image segmentation, one of the most critical vision tasks, has been studied for many years. Most of the early algorithms are unsupervised methods, which use hand-crafted features to divide the image into many regions. Recently, owing to the great success of deep learning technology, CNN-based methods have shown superior performance in image segmentation. However, these methods rely on a large number of human annotations, which are expensive to collect. In this paper, we propose a deep unsupervised method for image segmentation that borrows ideas from classical graph partitioning. Our approach consists of two stages. First, a Superpixel Guided Autoencoder (SGAE) is designed to learn the deep embedding and smooth the image simultaneously; the smoothed image is then used to generate superpixels. Second, based on the learned embedding, we propose a novel segmentation algorithm called Deep Superpixel Cut (DSC), which measures the deep similarity between superpixels and then adaptively partitions the superpixels into perceptual regions. Experimental results on the BSDS500 dataset demonstrate the effectiveness of the proposed method.

OCT Image Segmentation Using Neural Architecture Search and SRGAN

Saba Heidari, Omid Dehzangi, Nasser M. Nasarabadi, Ali Rezai

Auto-TLDR; Automatic Segmentation of Retinal Layers in Optical Coherence Tomography using Neural Architecture Search

Alzheimer’s disease (AD) diagnosis is one of the major research areas in computational medicine. Optical coherence tomography (OCT) is a non-invasive, inexpensive, and time-efficient method that scans the human retina in depth. It has been hypothesized that the thickness of the retinal layers extracted from OCT scans could be an efficient and effective biomarker for early diagnosis of AD. In this work, we aim to design a self-training model architecture for the task of segmenting the retinal layers in OCT scans. Neural architecture search (NAS) is a subfield of the AutoML domain that has a significant impact on improving the accuracy of machine vision tasks. We integrate the NAS algorithm with a Unet auto-encoder architecture as its backbone. Then, we employ our proposed model to segment the retinal nerve fiber layer in our preprocessed OCT images with the aim of AD diagnosis. We also train a super-resolution generative adversarial network on the raw OCT scans to improve the quality of the images before the modeling stage. In our architecture search strategy, different primitive operations are suggested to find down- and up-sampling Unet cell blocks, and the binary gate method is applied to make the search strategy more practical. Our architecture search method is empirically evaluated by training the Unet and the NAS-Unet from scratch. Specifically, the proposed NAS-Unet significantly outperforms the baseline human-designed architecture, achieving 95.1% in the mean Intersection over Union metric and 79.1% in the Dice similarity coefficient.
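For readers unfamiliar with the two metrics named above, here is a small sketch of how Intersection over Union and the Dice coefficient are commonly computed for a single binary mask. The function name and the flat-list mask representation are assumptions for illustration; the abstract does not state how the authors aggregate over classes or images.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two binary masks given as flat 0/1 sequences."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # false negatives
    union = tp + fp + fn
    if union == 0:
        # Both masks are empty; treat this as a perfect match by convention.
        return 1.0, 1.0
    iou = tp / union
    dice = 2 * tp / (2 * tp + fp + fn)
    return iou, dice


iou, dice = iou_and_dice([1, 1, 0, 0], [1, 0, 1, 0])
print(iou, dice)  # one overlapping pixel out of three labelled: 1/3 and 0.5
```

For a single binary mask the two scores are tied by Dice = 2·IoU / (1 + IoU). Averages taken over several classes or images need not satisfy this identity.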

DA-RefineNet: Dual-Inputs Attention RefineNet for Whole Slide Image Segmentation

Ziqiang Li, Rentuo Tao, Qianrun Wu, Bin Li

Auto-TLDR; DA-RefineNet: A dual-inputs attention network for whole slide image segmentation

Automatic medical image segmentation techniques have wide applications in disease diagnosis; however, the task is much more challenging than natural optical image segmentation because of the high resolution of medical images and the correspondingly huge computation cost. The sliding window is a commonly used technique for whole slide image (WSI) segmentation; however, the main drawback of sliding-window-based methods is the lack of global contextual information for supervision. In this paper, we propose a dual-inputs attention network (denoted as DA-RefineNet) for WSI segmentation, in which both local fine-grained information and global coarse information can be efficiently utilized. Comparative experiments were conducted to evaluate the effectiveness of the proposed method; the results show that it achieves better performance on WSI segmentation tasks than methods that rely on a single input.
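The sliding-window baseline mentioned above can be sketched as a tiling routine that returns the top-left corner of every window. This illustrates only the generic technique, not DA-RefineNet; the function name and the `tile` and `stride` parameters are assumptions. Each tile is cut out independently, which is why such methods see no global context.

```python
def sliding_windows(height, width, tile, stride):
    """Return (top, left) corners of tile x tile windows covering the image.

    The last window along each axis is shifted back so that it ends exactly
    at the image border instead of running past it.
    """
    def starts(size):
        if size <= tile:
            # The image is no larger than one tile along this axis.
            return [0]
        positions = list(range(0, size - tile + 1, stride))
        if positions[-1] != size - tile:
            # Add a final window flush with the border.
            positions.append(size - tile)
        return positions

    return [(top, left) for top in starts(height) for left in starts(width)]


print(sliding_windows(8, 4, tile=4, stride=4))  # [(0, 0), (4, 0)]
```

Choosing a stride smaller than the tile size gives overlapping windows, which is a common way to reduce seams between tile predictions at the cost of more computation.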

Walk the Lines: Object Contour Tracing CNN for Contour Completion of Ships

André Peter Kelm, Udo Zölzer

Auto-TLDR; Walk the Lines: A Convolutional Neural Network trained to follow object contours

We develop a new contour tracing algorithm to enhance the results of the latest object contour detectors. The goal is to achieve a perfectly closed, single-pixel wide and detailed object contour, since this type of contour could be analyzed using methods such as Fourier descriptors. Convolutional Neural Networks (CNNs) are rarely used for contour tracing, and we see great potential in using their capabilities for this task. Therefore, we present the Walk the Lines (WtL) algorithm: a standard regression CNN trained to follow object contours. As an initial step, we train the CNN only on ship contours, but the principle is applicable to other objects. The input data are the image and the associated object contour prediction of the recently published RefineContourNet (RCN). The WtL receives the center pixel coordinates, which define an input section, plus an angle for rotating this section. Ideally, the center pixel moves along the contour, while the angle describes upcoming directional contour changes. The WtL predicts its steps pixelwise in a self-routing way. To obtain a complete object contour, the WtL runs in parallel at different image locations and the traces of its individual paths are summed. In contrast to the comparable Non-Maximum Suppression (NMS) method, our approach produces connected contours with finer details. Finally, the object contour is binarized under the condition of being closed. When all procedures work as desired, excellent ship segmentations with high IoUs are produced, showing details such as antennas and ship superstructures that are easily omitted by other segmentation methods.

Boundary-Aware Graph Convolution for Semantic Segmentation

Hanzhe Hu, Jinshi Cui, Hongbin Zha

Auto-TLDR; Boundary-Aware Graph Convolution for Semantic Segmentation

Recent works have made great progress in semantic segmentation by exploiting contextual information in a local or global manner with dilated convolutions, pyramid pooling or self-attention mechanisms. However, few works have focused on harvesting boundary information to improve segmentation performance. In order to enhance the feature similarity within an object and keep it discriminable from other objects, we propose a boundary-aware graph convolution (BGC) module to propagate features within the object. The graph reasoning is performed among pixels of the same object, excluding the boundary pixels. Based on the proposed BGC module, we further introduce the Boundary-aware Graph Convolution Network (BGCNet), which consists of two main components, a basic segmentation network and the BGC module, forming a coarse-to-fine paradigm. Specifically, the BGC module takes the coarse segmentation feature map as node features and uses the boundary prediction to guide graph construction. After graph convolution, the reasoned feature and the input feature are fused to obtain the refined feature, which produces the refined segmentation result. We conduct extensive experiments on three popular semantic segmentation benchmarks, Cityscapes, PASCAL VOC 2012 and COCO Stuff, and achieve state-of-the-art performance on all three.

Neural Machine Registration for Motion Correction in Breast DCE-MRI

Federica Aprea, Stefano Marrone, Carlo Sansone

Auto-TLDR; A Neural Registration Network for Dynamic Contrast Enhanced-Magnetic Resonance Imaging

Cancer is one of the leading causes of death in the western world, with medical imaging playing a key role in early diagnosis. Focusing on breast cancer, one of the emerging imaging methodologies is Dynamic Contrast Enhanced-Magnetic Resonance Imaging (DCE-MRI). The downside of DCE-MRI is its long acquisition time, which often causes the patient to move, resulting in motion artefacts, namely distortions in the acquired image that can affect DCE-MRI analysis. A possible solution is the use of Motion Correction Techniques (MCTs), i.e. procedures intended to re-align the post-contrast image to the corresponding pre-contrast (reference) one. This task is particularly critical in DCE-MRI because of the brightness variations introduced in post-contrast images by the flowing contrast agent. To address this problem, in this work we introduce a new MCT for breast DCE-MRI that leverages Physiologically Based PharmacoKinetic (PBPK) modelling and Artificial Neural Networks (ANN) to determine the most suitable physiologically-compliant transformation. To this end, we propose a Neural Registration Network relying on a task-specific loss function explicitly designed to take the contrast agent flow into account while enforcing a correct re-alignment. We compared the obtained results against some conventional motion correction techniques, evaluating the performance on a patient-by-patient basis. The results show the effectiveness of the proposed approach, which performs best even when compared against other techniques designed to account for brightness variations.

Weakly Supervised Geodesic Segmentation of Egyptian Mummy CT Scans

Avik Hati, Matteo Bustreo, Diego Sona, Vittorio Murino, Alessio Del Bue

Auto-TLDR; A Weakly Supervised and Efficient Interactive Segmentation of Ancient Egyptian Mummies CT Scans Using Geodesic Distance Measure and GrabCut

In this paper, we tackle the task of automatically analyzing 3D volumetric scans obtained from computed tomography (CT) devices. In particular, we address a task for which data is very limited: the segmentation of CT scans of ancient Egyptian mummies. We aim at digitally unwrapping the mummy and identifying different segments such as body, bandages and jewelry. The problem is complex because of the lack of annotated data for the different semantic regions to segment, which discourages the use of strongly supervised approaches. We therefore propose a weakly supervised and efficient interactive segmentation method to solve this challenging problem. After segmenting the wrapped mummy from its exterior region using histogram analysis and template matching, we first design a voxel distance measure to find an approximate solution for the body and bandage segments. Here, we use geodesic distances, since voxel features as well as the spatial relationships among voxels are incorporated in this measure. Next, we refine the solution using a GrabCut-based segmentation together with a tracking method on the slices of the scan that assigns labels to different regions in the volume, using limited supervision in the form of scribbles drawn by the user. The efficiency of the proposed method is demonstrated using visualizations and validated through quantitative measures and qualitative unwrapping of the mummy.
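The idea of a geodesic distance that mixes spatial proximity with feature similarity can be sketched with Dijkstra's algorithm on a small 2D grid. This is a toy illustration under stated assumptions, not the authors' voxel measure: it uses a single intensity value per pixel, 4-connectivity, and a hypothetical trade-off parameter `beta`.

```python
import heapq


def geodesic_distance(image, seed, beta=1.0):
    """Geodesic distance from `seed` to every pixel of a 2D intensity grid.

    Moving to a neighbour costs 1 (spatial step) plus beta times the absolute
    intensity difference, so paths that cross strong edges become expensive.
    """
    rows, cols = len(image), len(image[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    dist[seed[0]][seed[1]] = 0.0
    heap = [(0.0, seed[0], seed[1])]
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry; a shorter path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                step = 1.0 + beta * abs(image[nr][nc] - image[r][c])
                nd = d + step
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist


# The right-hand column is brighter, so reaching it costs an extra 5.
d = geodesic_distance([[0, 0, 5], [0, 0, 5]], seed=(0, 0))
print(d[0][2], d[1][2])  # 7.0 8.0
```

On a uniform image the result reduces to the Manhattan distance from the seed, while an intensity edge acts as a barrier. Pixels can then be assigned to whichever user scribble is geodesically closest, and the same routine extends to 3D by adding the two neighbours along the slice axis.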