Planar 3D Transfer Learning for End to End Unimodal MRI Unbalanced Data Segmentation

Martin Kolarik, Radim Burget, Carlos M. Travieso-Gonzalez, Jan Kocica

Responsive image

Auto-TLDR; Planar 3D Res-U-Net Network for Unbalanced 3D Image Segmentation using Fluid Attenuation Inversion Recover

Slides

We present a novel approach of 2D to 3D transfer learning based on mapping pre-trained 2D convolutional neural network weights into planar 3D kernels. The method is validated by proposed planar 3D res-u-net network with encoder transferred from the 2D VGG-16 which is applied for a single-stage unbalanced 3D image data segmentation. In particular, we evaluate the method on the MICCAI 2016 MS lesion segmentation challenge dataset utilizing solely Fluid Attenuation Inversion Recover (FLAIR) sequence without brain extraction for training and inference to simulate real medical praxis. The planar 3D res-u-net network performed the best both in sensitivity and Dice score amongst end to end methods processing raw MRI scans and achieved comparable Dice score to a state-of-the-art unimodal not end to end approach. Complete source code was released under the open-source license and this paper is in compliance with the Machine learning Reproducibility Checklist. By implementing practical transfer learning for 3D data representation we were able to successfully segment heavily unbalanced data without selective sampling and achieved more reliable results using less training data in single modality. From medical perspective, the unimodal approach gives an advantage in real praxis as it does not require co-registration nor additional scanning time during examination. Although modern medical imaging methods capture high resolution 3D anatomy scans suitable for computer aided detection system processing, deployment of automatic systems for interpretation of radiology imaging is still rather theoretical in many medical areas. Our work aims to bridge the gap offering solution for partial research questions.

Similar papers

Automatic Semantic Segmentation of Structural Elements related to the Spinal Cord in the Lumbar Region by Using Convolutional Neural Networks

Jhon Jairo Sáenz Gamboa, Maria De La Iglesia-Vaya, Jon Ander Gómez

Responsive image

Auto-TLDR; Semantic Segmentation of Lumbar Spine Using Convolutional Neural Networks

Slides Poster Similar

This work addresses the problem of automatically segmenting the MR images corresponding to the lumbar spine. The purpose is to detect and delimit the different structural elements like vertebrae, intervertebral discs, nerves, blood vessels, etc. This task is known as semantic segmentation. The approach proposed in this work is based on convolutional neural networks whose output is a mask where each pixel from the input image is classified into one of the possible classes. Classes were defined by radiologists and correspond to structural elements and tissues. The proposed network architectures are variants of the U-Net. Several complementary blocks were used to define the variants: spatial attention models, deep supervision and multi-kernels at input, this last block type is based on the idea of inception. Those architectures which got the best results are described in this paper, and their results are discussed. Two of the proposed architectures outperform the standard U-Net used as baseline.

A Benchmark Dataset for Segmenting Liver, Vasculature and Lesions from Large-Scale Computed Tomography Data

Bo Wang, Zhengqing Xu, Wei Xu, Qingsen Yan, Liang Zhang, Zheng You

Responsive image

Auto-TLDR; The Biggest Treatment-Oriented Liver Cancer Dataset for Segmentation

Slides Poster Similar

How to build a high-performance liver-related computer assisted diagnosis system is an open question of great interest. However, the performance of the state-of-art algorithm is always limited by the amount of data and quality of the label. To address this problem, we propose the biggest treatment-oriented liver cancer dataset for liver surgery and treatment planning. This dataset provides 216 cases (totally about 268K frames) scanned images in contrast-enhanced computed tomography (CT). We labeled all the CT images with the liver, liver vasculature and liver tumor segmentation ground truth for train and tune segmentation algorithms in advance. Based on that, we evaluate several recent and state-of-the-art segmentation algorithms, including 7 deep learning methods, on CT sequences. All results are compared to reference segmentations five error metrics that highlight different aspects of segmentation accuracy. In general, compared with previous datasets, our dataset is really a challenging dataset. To our knowledge, the proposed dataset and benchmark allow for the first time systematic exploration of such issues, and will be made available to allow for further research in this field.

Deep Recurrent-Convolutional Model for AutomatedSegmentation of Craniomaxillofacial CT Scans

Francesca Murabito, Simone Palazzo, Federica Salanitri Proietto, Francesco Rundo, Ulas Bagci, Daniela Giordano, Rosalia Leonardi, Concetto Spampinato

Responsive image

Auto-TLDR; Automated Segmentation of Anatomical Structures in Craniomaxillofacial CT Scans using Fully Convolutional Deep Networks

Slides Poster Similar

In this paper we define a deep learning architecture for automated segmentation of anatomical structures in Craniomaxillofacial (CMF) CT scans that leverages the recent success of encoder-decoder models for semantic segmentation of natural images. In particular, we propose a fully convolutional deep network that combines the advantages of recent fully convolutional models, such as Tiramisu, with squeeze-and-excitation blocks for feature recalibration, integrated with convolutional LSTMs to model spatio-temporal correlations between consecutive slices. The proposed segmentation network shows superior performance and generalization capabilities (to different structures and imaging modalities) than state of the art methods on automated segmentation of CMF structures (e.g., mandibles and airways) in several standard benchmarks (e.g., MICCAI datasets) and on new datasets proposed herein, effectively facing shape variability.

A Multi-Task Contextual Atrous Residual Network for Brain Tumor Detection & Segmentation

Ngan Le, Kashu Yamazaki, Quach Kha Gia, Thanh-Dat Truong, Marios Savvides

Responsive image

Auto-TLDR; Contextual Brain Tumor Segmentation Using 3D atrous Residual Networks and Cascaded Structures

Poster Similar

In recent years, deep neural networks have achieved state-of-the-art performance in a variety of recognition and segmentation tasks in medical imaging including brain tumor segmentation. We investigate that segmenting brain tumor is facing to the imbalanced data problem where the number of pixels belonging to background class (non tumor pixel) is much larger than the number of pixels belonging to foreground class (tumor pixel). To address this problem, we propose a multi-task network which is formed as a cascaded structure and designed to share the feature maps. Our model consists of two targets, i.e., (i) effectively differentiating brain tumor regions and (ii) estimating brain tumor masks. The first task is performed by our proposed contextual brain tumor detection network, which plays the role of an attention gate and focuses on the region around brain tumor only while ignore the background (non tumor area). Instead of processing every pixel, our contextual brain tumor detection network only processes contextual regions around ground-truth instances and this strategy helps to produce meaningful regions proposals. The second task is built upon a 3D atrous residual network and under an encode-decode network in order to effectively segment both large and small objects (brain tumor). Our 3D atrous residual network is designed with a skip connection to enables the gradient from the deep layers to be directly propagated to shallow layers, thus, features of different depths are preserved and used for refining each other. In order to incorporate larger contextual information in volume MRI data, our network is designed by 3D atrous convolution with various kernel sizes, which enlarges the receptive field of filters. Our proposed network has been evaluated on various datasets including BRATS2015, BRATS2017 and BRATS2018 datasets with both validation set and testing set. Our performance has been benchmarked by both region-based metrics and surface-based metrics. We also have conducted comparisons against state-of-the-art approaches.

A Deep Learning Approach for the Segmentation of Myocardial Diseases

Khawala Brahim, Abdull Qayyum, Alain Lalande, Arnaud Boucher, Anis Sakly, Fabrice Meriaudeau

Responsive image

Auto-TLDR; Segmentation of Myocardium Infarction Using Late GADEMRI and SegU-Net

Slides Poster Similar

Cardiac left ventricular (LV) segmentation is of paramount essential step for both diagnosis and treatment of cardiac pathologies such as ischemia, myocardial infarction, arrhythmia and myocarditis. However, this segmentation is challenging due to high variability across patients and the potential lack of contrast between structures. In this work, we propose and evaluate a (2.5D) SegU-Net model based on the fusion of two deep learning techniques (U-Net and Seg-Net) for automated LGEMRI (Late gadolinium enhanced magnetic resonance imaging) myocardial disease (infarct core and no reflow region) quantification in a new multifield expert annotated dataset. Given that the scar tissue represents a small part of the whole MRI slices, we focused on myocardium area. Segmentation results show that this preprocessing step facilitate the learning procedure. In order to solve the class imbalance problem, we propose to apply the Jaccard loss and the Focal Loss as optimization loss function and to integrate a class weights strategy into the objective function. Late combination has been used to merge the output of the best trained models on a different set of hyperparameters. The final network segmentation performances will be useful for future comparison of new method to the current related work for this task. A total number of 2237 of slices (320 cases) were used for training/validation and 210 slices (35 cases) were used for testing. Experiments over our proposed dataset, using several evaluation metrics such Jaccard distance (IOU), Accuracy and Dice similarity coefficient (DSC), demonstrate efficiency performance in quantifying different zones of myocardium infarction across various patients. As compared to the second intra-observer study, our testing results showed that the SegUNet prediction model leads to these average dice coefficients over all segmented tissue classes, respectively : 'Background': 0.99999, 'Myocardium': 0.99434, 'Infarctus': 0.95587, 'Noreflow': 0.78187.

FOANet: A Focus of Attention Network with Application to Myocardium Segmentation

Zhou Zhao, Elodie Puybareau, Nicolas Boutry, Thierry Geraud

Responsive image

Auto-TLDR; FOANet: A Hybrid Loss Function for Myocardium Segmentation of Cardiac Magnetic Resonance Images

Slides Poster Similar

In myocardium segmentation of cardiac magnetic resonance images, ambiguities often appear near the boundaries of the target domains due to tissue similarities. To address this issue, we propose a new architecture, called FOANet, which can be decomposed in three main steps: a localization step, a Gaussian-based contrast enhancement step, and a segmentation step. This architecture is supplied with a hybrid loss function that guides the FOANet to study the transformation relationship between the input image and the corresponding label in a threelevel hierarchy (pixel-, patch- and map-level), which is helpful to improve segmentation and recovery of the boundaries. We demonstrate the efficiency of our approach on two public datasets in terms of regional and boundary segmentations.

A Systematic Investigation on Deep Architectures for Automatic Skin Lesions Classification

Pierluigi Carcagni, Marco Leo, Andrea Cuna, Giuseppe Celeste, Cosimo Distante

Responsive image

Auto-TLDR; RegNet: Deep Investigation of Convolutional Neural Networks for Automatic Classification of Skin Lesions

Slides Poster Similar

Computer vision-based techniques are more and more employed in healthcare and medical fields nowadays in order, principally, to be as a support to the experienced medical staff to help them to make a quick and correct diagnosis. One of the hot topics in this arena concerns the automatic classification of skin lesions. Several promising works exist about it, mainly leveraging Convolutional Neural Networks (CNN), but proposed pipeline mainly rely on complex data preprocessing and there is no systematic investigation about how available deep models can actually reach the accuracy needed for real applications. In order to overcome these drawbacks, in this work, an end-to-end pipeline is introduced and some of the most recent Convolutional Neural Networks (CNNs) architectures are included in it and compared on the largest common benchmark dataset recently introduced. To this aim, for the first time in this application context, a new network design paradigm, namely RegNet, has been exploited to get the best models among a population of configurations. The paper introduces a threefold level of contribution and novelty with respect the previous literature: the deep investigation of several CNN architectures driving to a consistent improvement of the lesions recognition accuracy, the exploitation of a new network design paradigm able to study the behavior of populations of models and a deep discussion about pro and cons of each analyzed method paving the path towards new research lines.

Do Not Treat Boundaries and Regions Differently: An Example on Heart Left Atrial Segmentation

Zhou Zhao, Elodie Puybareau, Nicolas Boutry, Thierry Geraud

Responsive image

Auto-TLDR; Attention Full Convolutional Network for Atrial Segmentation using ResNet-101 Architecture

Slides Similar

Atrial fibrillation is the most common heart rhythm disease. Due to a lack of understanding in matter of underlying atrial structures, current treatments are still not satisfying. Recently, with the popularity of deep learning, many segmentation methods based on fully convolutional networks have been proposed to analyze atrial structures, especially from late gadolinium-enhanced magnetic resonance imaging. However, two problems still occur: 1) segmentation results include the atrial-like background; 2) boundaries are very hard to segment. Most segmentation approaches design a specific network that mainly focuses on the regions, to the detriment of the boundaries. Therefore, this paper proposes an attention full convolutional network framework based on the ResNet-101 architecture, which focuses on boundaries as much as on regions. The additional attention module is added to have the network pay more attention on regions and then to reduce the impact of the misleading similarity of neighboring tissues. We also use a hybrid loss composed of a region loss and a boundary loss to treat boundaries and regions at the same time. We demonstrate the efficiency of the proposed approach on the MICCAI 2018 Atrial Segmentation Challenge public dataset.

Segmentation of Intracranial Aneurysm Remnant in MRA Using Dual-Attention Atrous Net

Subhashis Banerjee, Ashis Kumar Dhara, Johan Wikström, Robin Strand

Responsive image

Auto-TLDR; Dual-Attention Atrous Net for Segmentation of Intracranial Aneurysm Remnant from MRA Images

Slides Poster Similar

Due to the advancement of non-invasive medical imaging modalities like Magnetic Resonance Angiography (MRA), an increasing number of Intracranial Aneurysm (IA) cases are being reported in recent years. The IAs are typically treated by so-called endovascular coiling, where blood flow in the IA is prevented by embolization with a platinum coil. Accurate quantification of the IA Remnant (IAR), i.e. the volume with blood flow present post treatment is the utmost important factor in choosing the right treatment planning. This is typically done by manually segmenting the aneurysm remnant from the MRA volume. Since manual segmentation of volumetric images is a labour-intensive and error-prone process, development of an automatic volumetric segmentation method is required. Segmentation of small structures such as IA, that may largely vary in size, shape, and location is considered extremely difficult. Similar intensity distribution of IAs and surrounding blood vessels makes it more challenging and susceptible to false positive. In this paper we propose a novel 3D CNN architecture called Dual-Attention Atrous Net (DAtt-ANet), which can efficiently segment IAR volumes from MRA images by reconciling features at different scales using the proposed Parallel Atrous Unit (PAU) along with the use of self-attention mechanism for extracting fine-grained features and intra-class correlation. The proposed DAtt-ANet model is trained and evaluated on a clinical MRA image dataset (prospective research project, approved by the local ethical committee) of IAR consisting of 46 subjects, annotated by an expert radiologist from our group. We compared the proposed DAtt-ANet with five state-of-the-art CNN models based on their segmentation performance. The proposed DAtt-ANet outperformed all other methods and was able to achieve a five-fold cross-validation DICE score of $0.73\pm0.06$.

Bridging the Gap between Natural and Medical Images through Deep Colorization

Lia Morra, Luca Piano, Fabrizio Lamberti, Tatiana Tommasi

Responsive image

Auto-TLDR; Transfer Learning for Diagnosis on X-ray Images Using Color Adaptation

Slides Poster Similar

Deep learning has thrived by training on large-scale datasets. However, in many applications, as for medical image diagnosis, getting massive amount of data is still prohibitive due to privacy, lack of acquisition homogeneity and annotation cost. In this scenario transfer learning from natural image collections is a standard practice that attempts to tackle shape, texture and color discrepancy all at once through pretrained model fine-tuning. In this work we propose to disentangle those challenges and design a dedicated network module that focuses on color adaptation. We combine learning from scratch of the color module with transfer learning of different classification backbones obtaining an end-to-end, easy-to-train architecture for diagnostic image recognition on X-ray images. Extensive experiments show how our approach is particularly efficient in case of data scarcity and provides a new path for further transferring the learned color information across multiple medical datasets.

Transfer Learning through Weighted Loss Function and Group Normalization for Vessel Segmentation from Retinal Images

Abdullah Sarhan, Jon Rokne, Reda Alhajj, Andrew Crichton

Responsive image

Auto-TLDR; Deep Learning for Segmentation of Blood Vessels in Retinal Images

Slides Poster Similar

The vascular structure of blood vessels is important in diagnosing retinal conditions such as glaucoma and diabetic retinopathy. Accurate segmentation of these vessels can help in detecting retinal objects such as the optic disc and optic cup and hence determine if there are damages to these areas. Moreover, the structure of the vessels can help in diagnosing glaucoma. The rapid development of digital imaging and computer-vision techniques has increased the potential for developing approaches for segmenting retinal vessels. In this paper, we propose an approach for segmenting retinal vessels that uses deep learning along with transfer learning. We adapted the U-Net structure to use a customized InceptionV3 as the encoder and used multiple skip connections to form the decoder. Moreover, we used a weighted loss function to handle the issue of class imbalance in retinal images. Furthermore, we contributed a new dataset to this field. We tested our approach on six publicly available datasets and a newly created dataset. We achieved an average accuracy of 95.60\% and a Dice coefficient of 80.98\%. The results obtained from comprehensive experiments demonstrate the robustness of our approach to the segmentation of blood vessels in retinal images obtained from different sources. Our approach results in greater segmentation accuracy than other approaches.

Segmenting Kidney on Multiple Phase CT Images Using ULBNet

Yanling Chi, Yuyu Xu, Gang Feng, Jiawei Mao, Sihua Wu, Guibin Xu, Weimin Huang

Responsive image

Auto-TLDR; A ULBNet network for kidney segmentation on multiple phase CT images

Poster Similar

Abstract—Segmentation of kidney on CT images is critical to computer-assisted surgical planning for kidney interventional therapy. Segmenting kidney manually is impractical in clinical, automatic segmentation is desirable. U-Net has been successful in medical image segmentation and is a promising candidate for the task. However, semantic gap still exists, especially when multiple phase images or multiple center images are involved. In this paper, we proposed an ULBNet to reduce the semantic gap and to improve segmentation performance. The proposed architecture includes new skip connections of local binary convolution (LBC). We also proposed a novel strategy of fast retraining a model for a new task without manually labelling required. We evaluated the network for kidney segmentation on multiple phase CT images. ULBNet resulted in an overall accuracy of 98.0% with comparison to Resunet 97.5%. Specifically, on the plain phase CT images, 98.1% resulted from ULBNet and 97.6% from Resunet; on the corticomedullay phase images, 97.8% from ULBNet and 97.2% from Resunet; on the nephrographic phase images, 97.6% from ULBNet and 97.4% from Resunet; on the excretory phase images, 98.1% from ULBNet and 97.4% from Resunet. The proposed network architecture performs better than Resunet on generalizing to multiple phase images.

A Transformer-Based Network for Anisotropic 3D Medical Image Segmentation

Guo Danfeng, Demetri Terzopoulos

Responsive image

Auto-TLDR; A transformer-based model to tackle the anisotropy problem in 3D medical image analysis

Slides Poster Similar

A critical challenge of applying neural networks to 3D medical image analysis is to deal with the anisotropy problem. The inter-slice contextual information contained in medical images is important, especially when the structural information of lesions is needed. However, such information often varies with cases because of variable slice spacing. Image anisotropy downgrades model performance especially when slice spacing varies significantly among training and testing datasets. ExsiWe proposed a transformer-based model to tackle the anisotropy problem. It is adaptable to different levels of anisotropy and is computationally efficient. Experiments are conducted on 3D lung cancer segmentation task. Our model achieves an average Dice score of approximately 0.87, which generally outperforms baseline models.

Learn to Segment Retinal Lesions and Beyond

Qijie Wei, Xirong Li, Weihong Yu, Xiao Zhang, Yongpeng Zhang, Bojie Hu, Bin Mo, Di Gong, Ning Chen, Dayong Ding, Youxin Chen

Responsive image

Auto-TLDR; Multi-task Lesion Segmentation and Disease Classification for Diabetic Retinopathy Grading

Poster Similar

Towards automated retinal screening, this paper makes an endeavor to simultaneously achieve pixel-level retinal lesion segmentation and image-level disease classification. Such a multi-task approach is crucial for accurate and clinically interpretable disease diagnosis. Prior art is insufficient due to three challenges, i.e., lesions lacking objective boundaries, clinical importance of lesions irrelevant to their size, and the lack of one-to-one correspondence between lesion and disease classes. This paper attacks the three challenges in the context of diabetic retinopathy (DR) grading. We propose Lesion-Net, a new variant of fully convolutional networks, with its expansive path re- designed to tackle the first challenge. A dual Dice loss that leverages both semantic segmentation and image classification losses is introduced to resolve the second challenge. Lastly, we build a multi-task network that employs Lesion-Net as a side- attention branch for both DR grading and result interpretation. A set of 12K fundus images is manually segmented by 45 ophthalmologists for 8 DR-related lesions, resulting in 290K manual segments in total. Extensive experiments on this large- scale dataset show that our proposed approach surpasses the prior art for multiple tasks including lesion segmentation, lesion classification and DR grading.

3D Medical Multi-Modal Segmentation Network Guided by Multi-Source Correlation Constraint

Tongxue Zhou, Stéphane Canu, Pierre Vera, Su Ruan

Responsive image

Auto-TLDR; Multi-modality Segmentation with Correlation Constrained Network

Slides Poster Similar

In the field of multimodal segmentation, the correlation between different modalities can be considered for improving the segmentation results. In this paper, we propose a multi-modality segmentation network with a correlation constraint. Our network includes N model-independent encoding paths with N image sources, a correlation constrain block, a feature fusion block, and a decoding path. The model-independent encoding path can capture modality-specific features from the N modalities. Since there exists a strong correlation between different modalities, we first propose a linear correlation block to learn the correlation between modalities, then a loss function is used to guide the network to learn the correlated features based on the correlation representation block. This block forces the network to learn the latent correlated features which are more relevant for segmentation. Considering that not all the features extracted from the encoders are useful for segmentation, we propose to use dual attention based fusion block to recalibrate the features along the modality and spatial paths, which can suppress less informative features and emphasize the useful ones. The fused feature representation is finally projected by the decoder to obtain the segmentation result. Our experiment results tested on BraTS-2018 dataset for brain tumor segmentation demonstrate the effectiveness of our proposed method.

Deep Learning-Based Type Identification of Volumetric MRI Sequences

Jean Pablo De Mello, Thiago Paixão, Rodrigo Berriel, Mauricio Reyes, Alberto F. De Souza, Claudine Badue, Thiago Oliveira-Santos

Responsive image

Auto-TLDR; Deep Learning for Brain MRI Sequences Identification Using Convolutional Neural Network

Slides Poster Similar

The analysis of Magnetic Resonance Imaging (MRI) sequences enables clinical professionals to monitor the progression of a brain tumor. As the interest for automatizing brain volume MRI analysis increases, it becomes convenient to have each sequence well identified. However, the unstandardized naming of MRI sequences make their identification difficult for automated systems, as well as make it difficult for researches to generate or use datasets for machine learning research. In face of that, we propose a system for identifying types of brain MRI sequences based on deep learning. By training a Convolutional Neural Network (CNN) based on 18-layer ResNet architecture, our system is able to classify a volumetric brain MRI as a T1, T1c, T2 or FLAIR sequence, or whether it does not belong to any of these classes. The network was trained with both pre-processed (BraTS dataset) and non-pre-processed (TCGA-GBM dataset) images with diverse acquisition protocols, requiring only a few layers of the volume for training. Our system is able to classify among sequence types with an accuracy of 96.27%.

CAggNet: Crossing Aggregation Network for Medical Image Segmentation

Xu Cao, Yanghao Lin

Responsive image

Auto-TLDR; Crossing Aggregation Network for Medical Image Segmentation

Slides Poster Similar

In this paper, we present Crossing Aggregation Network (CAggNet), a novel densely connected semantic segmentation method for medical image analysis. The crossing aggregation network absorbs the idea of deep layer aggregation and makes significant innovations in layer connection and semantic information fusion. In this architecture, the traditional skip-connection structure of general U-Net is replaced by aggregations of multi-level down-sampling and up-sampling layers. This enables the network to fuse information interactively flows at different levels of layers in semantic segmentation. It also introduces weighted aggregation module to aggregate multi-scale output information. We have evaluated and compared our CAggNet with several advanced U-Net based methods in two public medical image datasets, including the 2018 Data Science Bowl nuclei detection dataset and the 2015 MICCAI gland segmentation competition dataset. Experimental results indicate that CAggNet improves medical object recognition and achieves a more accurate and efficient segmentation compared to existing improved U-Net and UNet++ structure.

MTGAN: Mask and Texture-Driven Generative Adversarial Network for Lung Nodule Segmentation

Wei Chen, Qiuli Wang, Kun Wang, Dan Yang, Xiaohong Zhang, Chen Liu, Yucong Li

Responsive image

Auto-TLDR; Mask and Texture-driven Generative Adversarial Network for Lung Nodule Segmentation

Slides Poster Similar

Accurate segmentation for lung nodules in lung computed tomography (CT) scans plays a key role in the early diagnosis of lung cancer. Many existing methods, especially UNet, have made significant progress in lung nodule segmentation. However, due to the complex shapes of lung nodules and the similarity of visual characteristics between nodules and lung tissues, an accurate segmentation with few false positives of lung nodules is still a challenging problem. Considering the fact that both boundary and texture information of lung nodules are important for obtaining an accurate segmentation result, we propose a novel Mask and Texture-driven Generative Adversarial Network (MTGAN) with a joint multi-scale L1 loss for lung nodule segmentation, which takes full advantages of U-Net and adversarial training. The proposed MTGAN leverages adversarial learning strategy guided by the boundary and texture information of lung nodules to generate more accurate segmentation results with lesser false positives. We validate our model with the LIDC–IDRI dataset, and experimental results show that our method achieves excellent segmentation results for a variety of lung nodules, especially for juxtapleural nodules and low-dense nodules. Without any bells and whistles, the proposed MTGAN achieves significant segmentation performance with the Dice similarity coefficient (DSC) of 85.24% on the LIDC–IDRI dataset.

DARN: Deep Attentive Refinement Network for Liver Tumor Segmentation from 3D CT Volume

Yao Zhang, Jiang Tian, Cheng Zhong, Yang Zhang, Zhongchao Shi, Zhiqiang He

Responsive image

Auto-TLDR; Deep Attentive Refinement Network for Liver Tumor Segmentation from 3D Computed Tomography Using Multi-Level Features

Slides Poster Similar

Automatic liver tumor segmentation from 3D Computed Tomography (CT) is a necessary prerequisite in the interventions of hepatic abnormalities and surgery planning. However, accurate liver tumor segmentation remains challenging due to the large variability of tumor sizes and inhomogeneous texture. Recent advances based on Fully Convolutional Network (FCN) in liver tumor segmentation draw on success of learning discriminative multi-level features. In this paper, we propose a Deep Attentive Refinement Network (DARN) for improved liver tumor segmentation from CT volumes by fully exploiting both low and high level features embedded in different layers of FCN. Different from existing works, we exploit attention mechanism to leverage the relation of different levels of features encoded in different layers of FCN. Specifically, we introduce a Semantic Attention Refinement (SemRef) module to selectively emphasize global semantic information in low level features with the guidance of high level ones, and a Spatial Attention Refinement (SpaRef) module to adaptively enhance spatial details in high level features with the guidance of low level ones. We evaluate our network on the public MICCAI 2017 Liver Tumor Segmentation Challenge dataset (LiTS dataset) and it achieves state-of-the-art performance. The proposed refinement modules are an effective strategy to exploit multi-level features and has great potential to generalize to other medical image segmentation tasks.

BCAU-Net: A Novel Architecture with Binary Channel Attention Module for MRI Brain Segmentation

Yongpei Zhu, Zicong Zhou, Guojun Liao, Kehong Yuan

Responsive image

Auto-TLDR; BCAU-Net: Binary Channel Attention U-Net for MRI brain segmentation

Slides Poster Similar

Recently deep learning-based networks have achieved advanced performance in medical image segmentation. However, the development of deep learning is slow in magnetic resonance image (MRI) segmentation of normal brain tissues. In this paper, inspired by channel attention module, we propose a new architecture, Binary Channel Attention U-Net (BCAU-Net), by introducing a novel Binary Channel Attention Module (BCAM) into skip connection of U-Net, which can take full advantages of the channel information extracted from the encoding path and corresponding decoding path. To better aggregate multi-scale spatial information of the feature map, spatial pyramid pooling (SPP) modules with different pooling operations are used in BCAM instead of original average-pooling and max-pooling operations. We verify this model on two datasets including IBSR and MRBrainS18, and obtain better performance on MRI brain segmentation compared with other methods. We believe the proposed method can advance the performance in brain segmentation and clinical diagnosis.

The DeepHealth Toolkit: A Unified Framework to Boost Biomedical Applications

Michele Cancilla, Laura Canalini, Federico Bolelli, Stefano Allegretti, Salvador Carrión, Roberto Paredes, Jon Ander Gómez, Simone Leo, Marco Enrico Piras, Luca Pireddu, Asaf Badouh, Santiago Marco-Sola, Lluc Alvarez, Miquel Moreto, Costantino Grana

Responsive image

Auto-TLDR; DeepHealth Toolkit: An Open Source Deep Learning Toolkit for Cloud Computing and HPC

Slides Poster Similar

Given the overwhelming impact of machine learning on the last decade, several libraries and frameworks have been developed in recent years to simplify the design and training of neural networks, providing array-based programming, automatic differentiation and user-friendly access to hardware accelerators. None of those tools, however, was designed with native and transparent support for Cloud Computing or heterogeneous High-Performance Computing (HPC). The DeepHealth Toolkit is an open source deep learning toolkit aimed at boosting productivity of data scientists operating in the medical field by providing a unified framework for the distributed training of neural networks, that is able to leverage hybrid HPC and Cloud environments in a way transparent to the user. The toolkit is composed of a computer vision library, a deep learning library, and a front-end for non-expert users; all of the components are focused on the medical domain, but they are general purpose and can be applied to any other field. In this paper, the principles driving the design of the DeepHealth libraries are described, along with details about the implementation and the interaction between the different elements composing the toolkit. Finally, experiments on common benchmarks prove the efficiency of each separate component, and of the DeepHealth Toolkit overall.

Confidence Calibration for Deep Renal Biopsy Immunofluorescence Image Classification

Federico Pollastri, Juan Maroñas, Federico Bolelli, Giulia Ligabue, Roberto Paredes, Riccardo Magistroni, Costantino Grana

Responsive image

Auto-TLDR; A Probabilistic Convolutional Neural Network for Immunofluorescence Classification in Renal Biopsy

Slides Poster Similar

With this work we tackle immunofluorescence classification in renal biopsy, employing state-of-the-art Convolutional Neural Networks. In this setting, the aim of the probabilistic model is to assist an expert practitioner towards identifying the location pattern of antibody deposits within a glomerulus. Since modern neural networks often provide overconfident outputs, we stress the importance of having a reliable prediction, demonstrating that Temperature Scaling, a recently introduced re-calibration technique, can be successfully applied to immunofluorescence classification in renal biopsy. Experimental results demonstrate that the designed model yields good accuracy on the specific task, and that Temperature Scaling is able to provide reliable probabilities, which are highly valuable for such a task given the low inter-rater agreement.

DE-Net: Dilated Encoder Network for Automated Tongue Segmentation

Hui Tang, Bin Wang, Jun Zhou, Yongsheng Gao

Responsive image

Auto-TLDR; Automated Tongue Image Segmentation using De-Net

Slides Poster Similar

Automated tongue recognition is a growing research field due to global demand for personal health care. Using mobile devices to take tongue pictures is convenient and of low cost for tongue recognition. It is particularly suitable for self-health evaluation of the public. However, images taken by mobile devices are easily affected by various imaging environment, which makes fine segmentation a more challenging task compared with those taken by specialized acquisition devices. Deep learning approaches are promising for tongue image segmentation because they have powerful feature learning and representation capability. However, the successive pooling operations in these methods lead to loss of information on image details, making them fail when segmenting low-quality images captured by mobile devices. To address this issue, we propose a dilated encoder network (DE-Net) to capture more high-level features and get high-resolution output for automated tongue image segmentation. In addition, we construct two tongue image datasets which contain images taken by specialized devices and mobile devices, respectively, to verify the effectiveness of the proposed method. Experimental results on both datasets demonstrate that the proposed method outperforms the state-of-the-art methods in tongue image segmentation.

Unsupervised Detection of Pulmonary Opacities for Computer-Aided Diagnosis of COVID-19 on CT Images

Rui Xu, Xiao Cao, Yufeng Wang, Yen-Wei Chen, Xinchen Ye, Lin Lin, Wenchao Zhu, Chao Chen, Fangyi Xu, Yong Zhou, Hongjie Hu, Shoji Kido, Noriyuki Tomiyama

Responsive image

Auto-TLDR; A computer-aided diagnosis of COVID-19 from CT images using unsupervised pulmonary opacity detection

Slides Poster Similar

COVID-19 emerged towards the end of 2019 which was identified as a global pandemic by the world heath organization (WHO). With the rapid spread of COVID-19, the number of infected and suspected patients has increased dramatically. Chest computed tomography (CT) has been recognized as an efficient tool for the diagnosis of COVID-19. However, the huge CT data make it difficult for radiologist to fully exploit them on the diagnosis. In this paper, we propose a computer-aided diagnosis system that can automatically analyze CT images to distinguish the COVID-19 against to community-acquired pneumonia (CAP). The proposed system is based on an unsupervised pulmonary opacity detection method that locates opacity regions by a detector unsupervisedly trained from CT images with normal lung tissues. Radiomics based features are extracted insides the opacity regions, and fed into classifiers for classification. We evaluate the proposed CAD system by using 200 CT images collected from different patients in several hospitals. The accuracy, precision, recall, f1-score and AUC achieved are 95.5%, 100%, 91%, 95.1% and 95.9% respectively, exhibiting the promising capacity on the differential diagnosis of COVID-19 from CT images.

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

Jiaqi Luo, Zhicheng Zhao, Fei Su, Limei Guo

Responsive image

Auto-TLDR; Triplet-path Network for One-Stage Object Detection and Segmentation in Pathological Images

Slides Similar

Deep learning has been widely applied in the field of medical image processing. However, compared with flourishing visual tasks in natural images, the progress achieved in pathological images is not remarkable, and detection and segmentation, which are among basic tasks of computer vision, are regarded as two independent tasks. In this paper, we make full use of existing datasets and construct a triplet-path network using dilated convolutions to cooperatively accomplish one-stage object detection and nuclei segmentation for general pathological images. First, in order to meet the requirement of detection and segmentation, a novel structure called triplet feature generation (TFG) is designed to extract high-resolution and multiscale features, where features from different layers can be properly integrated. Second, considering that pathological datasets are usually small, a location-aware and partially truncated loss function is proposed to improve the classification accuracy of datasets with few images and widely varying targets. We compare the performance of both object detection and instance segmentation with state-of-the-art methods. Experimental results demonstrate the effectiveness and efficiency of the proposed network on two datasets collected from multiple organs.

Supporting Skin Lesion Diagnosis with Content-Based Image Retrieval

Stefano Allegretti, Federico Bolelli, Federico Pollastri, Sabrina Longhitano, Giovanni Pellacani, Costantino Grana

Responsive image

Auto-TLDR; Skin Images Retrieval Using Convolutional Neural Networks for Skin Lesion Classification and Segmentation

Slides Poster Similar

Given the relevance of skin cancer, many attempts have been dedicated to the creation of automated devices that could assist both expert and beginner dermatologists towards fast and early diagnosis of skin lesions. In recent years, tasks such as skin lesion classification and segmentation have been extensively addressed with deep learning algorithms, which in some cases reach a diagnostic accuracy comparable to that of expert physicians. However, the general lack of interpretability and reliability severely hinders the ability of those approaches to actually support dermatologists in the diagnosis process. In this paper a novel skin images retrieval system is presented, which exploits features extracted by Convolutional Neural Networks to gather similar images from a publicly available dataset, in order to assist the diagnosis process of both expert and novice practitioners. In the proposed framework, Resnet-50 is initially trained for the classification of dermoscopic images; then, the feature extraction part is isolated, and an embedding network is build on top of it. The embedding learns an alternative representation, which allows to check image similarity by means of a distance measure. Experimental results reveal that the proposed method is able to select meaningful images, which can effectively boost the classification accuracy of human dermatologists.

Fine-Tuning Convolutional Neural Networks: A Comprehensive Guide and Benchmark Analysis for Glaucoma Screening

Amed Mvoulana, Rostom Kachouri, Mohamed Akil

Responsive image

Auto-TLDR; Fine-tuning Convolutional Neural Networks for Glaucoma Screening

Slides Poster Similar

This work aimed at giving a comprehensive and in-detailed guide on the route to fine-tuning Convolutional Neural Networks (CNNs) for glaucoma screening. Transfer learning consists in a promising alternative to train CNNs from stratch, to avoid the huge data and resources requirements. After a thorough study of five state-of-the-art CNNs architectures, a complete and well-explained strategy for fine-tuning these networks is proposed, using hyperparameter grid-searching and two-phase training approach. Excellent performance is reached on model evaluation, with a 0.9772 AUROC validation rate, giving arise to reliable glaucoma diagosis-help systems. Also, a benchmark analysis is conducted across all fine-tuned models, studying them according to performance indices such as model complexity and size, AUROC density and inference time. This in-depth analysis allows a rigorous comparison between model characteristics, and is useful for giving practioners important trademarks for prospective applications and deployments.

Deep Multi-Stage Model for Automated Landmarking of Craniomaxillofacial CT Scans

Simone Palazzo, Giovanni Bellitto, Luca Prezzavento, Francesco Rundo, Ulas Bagci, Daniela Giordano, Rosalia Leonardi, Concetto Spampinato

Responsive image

Auto-TLDR; Automated Landmarking of Craniomaxillofacial CT Images Using Deep Multi-Stage Architecture

Slides Similar

In this paper we define a deep multi-stage architecture for automated landmarking of craniomaxillofacial (CMF) CT images. Our model is composed of three subnetworks that first localize, on reduced-resolution images, areas where land-marks may be found and then refine the search, at full-resolution scale, through a hierarchical structure aiming at increasing the granularity of the investigated region. The multi-stage pipeline is designed to deal with full resolution data and does not require any additional pre-processing step to reduce search space, as opposed to existing methods that can be only adopted for searching landmarks located in well-defined anatomical structures (e.g.,mandibles). The automated landmarking system is tested on identifying landmarks located in several CMF regions, achieving an average error of 0.8 mm, significantly lower than expert readings. The proposed model also outperforms baselines and is on par with existing models that employ additional upstream segmentation, on state-of-the-art benchmarks.

End-To-End Multi-Task Learning for Lung Nodule Segmentation and Diagnosis

Wei Chen, Qiuli Wang, Dan Yang, Xiaohong Zhang, Chen Liu, Yucong Li

Responsive image

Auto-TLDR; A novel multi-task framework for lung nodule diagnosis based on deep learning and medical features

Slides Similar

Computer-Aided Diagnosis (CAD) systems for lung nodule diagnosis based on deep learning have attracted much attention in recent years. However, most existing methods ignore the relationships between the segmentation and classification tasks, which leads to unstable performances. To address this problem, we propose a novel multi-task framework, which can provide lung nodule segmentation mask, malignancy prediction, and medical features for interpretable diagnosis at the same time. Our framework mainly contains two sub-network: (1) Multi-Channel Segmentation Sub-network (MSN) for lung nodule segmentation, and (2) Joint Classification Sub-network (JCN) for interpretable lung nodule diagnosis. In the proposed framework, we use U-Net down-sampling processes for extracting low-level deep learning features, which are shared by two sub-networks. The JCN forces the down-sampling processes to learn better lowlevel deep features, which lead to a better construct of segmentation masks. Meanwhile, two additional channels constructed by OTSU and super-pixel (SLIC) methods, are utilized as the guideline of the feature extraction. The proposed framework takes advantages of deep learning methods and classical methods, which can significantly improve the performances of all tasks. We evaluate the proposed framework on public dataset LIDCIDRI. Our framework achieves a promising Dice score of 86.43% in segmentation, 87.07% in malignancy level prediction, and convincing results in interpretable medical feature predictions.

A Comparison of Neural Network Approaches for Melanoma Classification

Maria Frasca, Michele Nappi, Michele Risi, Genoveffa Tortora, Alessia Auriemma Citarella

Responsive image

Auto-TLDR; Classification of Melanoma Using Deep Neural Network Methodologies

Slides Poster Similar

Melanoma is the deadliest form of skin cancer and it is diagnosed mainly visually, starting from initial clinical screening and followed by dermoscopic analysis, biopsy and histopathological examination. A dermatologist’s recognition of melanoma may be subject to errors and may take some time to diagnose it. In this regard, deep learning can be useful in the study and classification of skin cancer. In particular, by classifying images with Deep Neural Network methodologies, it is possible to obtain comparable or even superior results compared to those of dermatologists. In this paper, we propose a methodology for the classification of melanoma by adopting different deep learning techniques applied to a common dataset, composed of images from the ISIC dataset and consisting of different types of skin diseases, including melanoma on which we applied a specific pre-processing phase. In particular, a comparison of the results is performed in order to select the best effective neural network to be applied to the problem of recognition and classification of melanoma. Moreover, we also evaluate the impact of the pre- processing phase on the final classification. Different metrics such as accuracy, sensitivity, and specificity have been selected to assess the goodness of the adopted neural networks and compare them also with the manual classification of dermatologists.

DA-RefineNet: Dual-Inputs Attention RefineNet for Whole Slide Image Segmentation

Ziqiang Li, Rentuo Tao, Qianrun Wu, Bin Li

Responsive image

Auto-TLDR; DA-RefineNet: A dual-inputs attention network for whole slide image segmentation

Slides Poster Similar

Automatic medical image segmentation techniques have wide applications for disease diagnosing, however, its much more challenging than natural optical image segmentation tasks due to the high-resolution of medical images and the corresponding huge computation cost. Sliding window was a commonly used technique for whole slide image (WSI) segmentation, however, for these methods that based on sliding window, the main drawback was lacking of global contextual information for supervision. In this paper, we proposed a dual-inputs attention network (denoted as DA-RefineNet) for WSI segmentation, where both local fine-grained information and global coarse information can be efficiently utilized. Sufficient comparative experiments were conducted to evaluate the effectiveness of the proposed method, the results proved that the proposed method can achieve better performance on WSI segmentation tasks compared to methods rely on single-input.

Trainable Spectrally Initializable Matrix Transformations in Convolutional Neural Networks

Michele Alberti, Angela Botros, Schuetz Narayan, Rolf Ingold, Marcus Liwicki, Mathias Seuret

Responsive image

Auto-TLDR; Trainable and Spectrally Initializable Matrix Transformations for Neural Networks

Slides Poster Similar

In this work, we introduce a new architectural component to Neural Networks (NN), i.e., trainable and spectrally initializable matrix transformations on feature maps. While previous literature has already demonstrated the possibility of adding static spectral transformations as feature processors, our focus is on more general trainable transforms. We study the transforms in various architectural configurations on four datasets of different nature: from medical (ColorectalHist, HAM10000) and natural (Flowers) images to historical documents (CB55). With rigorous experiments that control for the number of parameters and randomness, we show that networks utilizing the introduced matrix transformations outperform vanilla neural networks. The observed accuracy increases appreciably across all datasets. In addition, we show that the benefit of spectral initialization leads to significantly faster convergence, as opposed to randomly initialized matrix transformations. The transformations are implemented as auto-differentiable PyTorch modules that can be incorporated into any neural network architecture. The entire code base is open-source.

Dual Encoder Fusion U-Net (DEFU-Net) for Cross-manufacturer Chest X-Ray Segmentation

Zhang Lipei, Aozhi Liu, Jing Xiao

Responsive image

Auto-TLDR; Inception Convolutional Neural Network with Dilation for Chest X-Ray Segmentation

Slides Similar

A number of methods based on the deep learning have been applied to medical image segmentation and have achieved state-of-the-art performance. The most famous technique is U-Net which has been used to many medical datasets including the Chest X-ray. Due to the importance of chest x- ray data in studying COVID-19, there is a demand for state-of- art models capable of precisely segmenting chest x-rays. In this paper, we propose a dual encoder fusion U-Net framework for Chest X-rays based on Inception Convolutional Neural Network with dilation, Densely Connected Recurrent Convolutional Neural Network, which is named DEFU-Net. The densely connected recurrent path extends the network deeper for facilitating context feature extraction. In order to increase the width of network and enrich representation of features, the inception blocks with dilation have been used. The inception blocks can capture globally and locally spatial information with various receptive fields to avoid information loss caused by max-pooling. Meanwhile, the features fusion of two path by summation preserve the context and the spatial information for decoding part. We applied this model in Chest X-ray dataset from two different manufacturers (Montgomery and Shenzhen hospital). The DEFU-Net achieves the better performance than basic U-Net, residual U-Net, BCDU- Net, R2U-Net and attention R2U-Net. This model approaches state-of-the-art in this mixed dataset. The open source code for this proposed framework is public available.

Enhancing Semantic Segmentation of Aerial Images with Inhibitory Neurons

Ihsan Ullah, Sean Reilly, Michael Madden

Responsive image

Auto-TLDR; Lateral Inhibition in Deep Neural Networks for Object Recognition and Semantic Segmentation

Slides Poster Similar

In a Convolutional Neural Network, each neuron in the output feature map takes input from the neurons in its receptive field. This receptive field concept plays a vital role in today's deep neural networks. However, inspired by neuro-biological research, it has been proposed to add inhibitory neurons outside the receptive field, which may enhance the performance of neural network models. In this paper, we begin with deep network architectures such as VGG and ResNet, and propose an approach to add lateral inhibition in each output neuron to reduce its impact on its neighbours, both in fine-tuning pre-trained models and training from scratch. Our experiments show that notable improvements upon prior baseline deep models can be achieved. A key feature of our approach is that it is easy to add to baseline models; it can be adopted in any model containing convolution layers, and we demonstrate its value in applications including object recognition and semantic segmentation of aerial images, where we show state-of-the-art result on the Aeroscape dataset. On semantic segmentation tasks, our enhancement shows 17.43% higher mIoU than a single baseline model on a single source (the Aeroscape dataset), 13.43% higher performance than an ensemble model on the same single source, and 7.03% higher than an ensemble model on multiple sources (segmentation datasets). Our experiments illustrate the potential impact of using inhibitory neurons in deep learning models, and they also show better results than the baseline models that have standard convolutional layer.

RescueNet: Joint Building Segmentation and Damage Assessment from Satellite Imagery

Rohit Gupta, Mubarak Shah

Responsive image

Auto-TLDR; RescueNet: End-to-End Building Segmentation and Damage Classification for Humanitarian Aid and Disaster Response

Slides Poster Similar

Accurate and fine-grained information about the extent of damage to buildings is essential for directing Humanitarian Aid and Disaster Response (HADR) operations in the immediate aftermath of any natural calamity. In recent years, satellite and UAV (drone) imagery has been used for this purpose, sometimes aided by computer vision algorithms. Existing Computer Vision approaches for building damage assessment typically rely on a two stage approach, consisting of building detection using an object detection model, followed by damage assessment through classification of the detected building tiles. These multi-stage methods are not end-to-end trainable, and suffer from poor overall results. We propose RescueNet, a unified model that can simultaneously segment buildings and assess the damage levels to individual buildings and can be trained end-to end. In order to to model the composite nature of this problem, we propose a novel localization aware loss function, which consists of a Binary Cross Entropy loss for building segmentation, and a foreground only selective Categorical Cross-Entropy loss for damage classification, and show significant improvement over the widely used Cross-Entropy loss. RescueNet is tested on the large scale and diverse xBD dataset and achieves significantly better building segmentation and damage classification performance than previous methods and achieves generalization across varied geographical regions and disaster types.

Investigating and Exploiting Image Resolution for Transfer Learning-Based Skin Lesion Classification

Amirreza Mahbod, Gerald Schaefer, Chunliang Wang, Rupert Ecker, Georg Dorffner, Isabella Ellinger

Responsive image

Auto-TLDR; Fine-tuned Neural Networks for Skin Lesion Classification Using Dermoscopic Images

Slides Poster Similar

Skin cancer is among the most common cancer types. Dermoscopic image analysis improves the diagnostic accuracy for detection of malignant melanoma and other pigmented skin lesions when compared to unaided visual inspection. Hence, computer-based methods to support medical experts in the diagnostic procedure are of great interest. Fine-tuning pre-trained convolutional neural networks (CNNs) has been shown to work well for skin lesion classification. Pre-trained CNNs are usually trained with natural images of a fixed image size which is typically significantly smaller than captured skin lesion images and consequently dermoscopic images are downsampled for fine-tuning. However, useful medical information may be lost during this transformation. In this paper, we explore the effect of input image size on skin lesion classification performance of fine-tuned CNNs. For this, we resize dermoscopic images to different resolutions, ranging from 64x64 to 768x768 pixels and investigate the resulting classification performance of three well-established CNNs, namely DenseNet-121, ResNet-18, and ResNet-50. Our results show that using very small images (of size 64x64 pixels) degrades the classification performance, while images of size 128x128 pixels and above support good performance with larger image sizes leading to slightly improved classification. We further propose a novel fusion approach based on a three-level ensemble strategy that exploits multiple fine-tuned networks trained with dermoscopic images at various sizes. When applied on the ISIC 2017 skin lesion classification challenge, our fusion approach yields an area under the receiver operating characteristic curve of 89.2% and 96.6% for melanoma classification and seborrheic keratosis classification, respectively, outperforming state-of-the-art algorithms.

Leveraging Unlabeled Data for Glioma Molecular Subtype and Survival Prediction

Nicholas Nuechterlein, Beibin Li, Mehmet Saygin Seyfioglu, Sachin Mehta, Patrick Cimino, Linda Shapiro

Responsive image

Auto-TLDR; Multimodal Brain Tumor Segmentation Using Unlabeled MR Data and Genomic Data for Cancer Prediction

Slides Poster Similar

In this paper, we address two long-standing challenges in neuro-oncology: (1) how to leverage large amounts of unlabeled magnetic resonance (MR) imaging data for radiogenomic tasks and (2) how to unite glioma MR imaging with genomic data. We examine multi-parametric MR data from 542 patients in the combined training, validation, and testing sets of the 2018 Multimodal Brain Tumor Segmentation Challenge and somatic copy number alteration (SCNA) data from 1090 patients in The Cancer Genome Archive's (TCGA) lower-grade glioma and glioblastoma projects. We propose a novel application of multi-task learning (MTL) that leverages unlabeled MR data by jointly learning tumor segmentation masks with glioma molecular subtype markers and allows for SCNA input when available. There are 235 patients in the intersection of these MR and SCNA datasets, which we divide into an unlabeled training set, a labeled training set, and a validation set. Our MTL model significantly outperforms comparable classification models trained only on labeled MR data for both IDH1/2 mutation and 1p/19q co-deletion glioma subtype marker prediction tasks. We also observe that models trained on genomic and imaging data improve survival prediction results achieved by models trained on either alone. We will release our source code for future research.

OCT Image Segmentation Using NeuralArchitecture Search and SRGAN

Saba Heidari, Omid Dehzangi, Nasser M. Nasarabadi, Ali Rezai

Responsive image

Auto-TLDR; Automatic Segmentation of Retinal Layers in Optical Coherence Tomography using Neural Architecture Search

Poster Similar

Alzheimer’s disease (AD) diagnosis is one of the major research areas in computational medicine. Optical coherence tomography (OCT) is a non-invasive, inexpensive, and timely efficient method that scans the human’s retina with depth. It has been hypothesized that the thickness of the retinal layers extracted from OCTs could be an efficient and effective biomarker for early diagnosis of AD. In this work, we aim to design a self-training model architecture for the task of segmenting the retinal layers in OCT scans. Neural architecture search (NAS) is a subfield of AutoML domain, which has a significant impact on improving the accuracy of machine vision tasks. We integrate the NAS algorithm with a Unet auto-encoder architecture as its backbone. Then, we employ our proposed model to segment the retinal nerve fiber layer in our preprocessed OCT images with the aim of AD diagnosis. In this work, we trained a super-resolution generative adversarial network on the raw OCT scans to improve the quality of the images before the modeling stage. In our architecture search strategy, different primitive operations suggested to find down- \& up-sampling Unet cell blocks and the binary gate method has been applied to make the search strategy more practical. Our architecture search method is empirically evaluated by training on the Unet and NAS-Unet from scratch. Specifically, the proposed NAS-Unet training significantly outperforms the baseline human-designed architecture by achieving 95.1\% in the mean Intersection over Union metric and 79.1\% in the Dice similarity coefficient.

Fast and Accurate Real-Time Semantic Segmentation with Dilated Asymmetric Convolutions

Leonel Rosas-Arias, Gibran Benitez-Garcia, Jose Portillo-Portillo, Gabriel Sanchez-Perez, Keiji Yanai

Responsive image

Auto-TLDR; FASSD-Net: Dilated Asymmetric Pyramidal Fusion for Real-Time Semantic Segmentation

Slides Poster Similar

Recent works have shown promising results applied to real-time semantic segmentation tasks. To maintain fast inference speed, most of the existing networks make use of light decoders, or they simply do not use them at all. This strategy helps to maintain a fast inference speed; however, their accuracy performance is significantly lower in comparison to non-real-time semantic segmentation networks. In this paper, we introduce two key modules aimed to design a high-performance decoder for real-time semantic segmentation for reducing the accuracy gap between real-time and non-real-time segmentation networks. Our first module, Dilated Asymmetric Pyramidal Fusion (DAPF), is designed to substantially increase the receptive field on the top of the last stage of the encoder, obtaining richer contextual features. Our second module, Multi-resolution Dilated Asymmetric (MDA) module, fuses and refines detail and contextual information from multi-scale feature maps coming from early and deeper stages of the network. Both modules exploit contextual information without excessively increasing the computational complexity by using asymmetric convolutions. Our proposed network entitled “FASSD-Net” reaches 78.8% of mIoU accuracy on the Cityscapes validation dataset at 41.1 FPS on full resolution images (1024x2048). Besides, with a light version of our network, we reach 74.1% of mIoU at 133.1 FPS (full resolution) on a single NVIDIA GTX 1080Ti card with no additional acceleration techniques. The source code and pre-trained models are available at https://github.com/GibranBenitez/FASSD-Net.

Offset Curves Loss for Imbalanced Problem in Medical Segmentation

Ngan Le, Duc Toan Bui, Khoa Luu, Marios Savvides

Responsive image

Auto-TLDR; Offset Curves Loss for Medical Image Segmentation

Poster Similar

Medical image segmentation has played an important role in medical analysis and widely developed for many clinical applications. Deep learning-based approaches have achieved high performance in semantic segmentation but they are limited to pixel-wise setting and imbalanced classes data problem. In this paper, we tackle those limitations by developing a new deep learning-based model which takes into account both higher feature level i.e. region inside contour, intermediate feature level i.e. offset curves around the contour and lower feature level i.e. contour. Our proposed Offset Curves (OsC) loss consists of three main fitting terms. The first fitting term focuses on pixel-wise level segmentation whereas the second fitting term acts as attention model which pays attention to the area around the boundaries (offset curves). The third terms plays a role as regularization term which takes the length of boundaries into account. We evaluate our proposed OsC loss on both 2D network and 3D network. Two common medical datasets, i.e. retina DRIVE and brain tumor BRATS 2018 datasets are used to benchmark our proposed loss performance. The experiments have showed that our proposed OsC loss function outperforms other mainstream loss functions such as Cross-Entropy, Dice, Focal on the most common segmentation networks Unet, FCN.

EM-Net: Deep Learning for Electron Microscopy Image Segmentation

Afshin Khadangi, Thomas Boudier, Vijay Rajagopal

Responsive image

Auto-TLDR; EM-net: Deep Convolutional Neural Network for Electron Microscopy Image Segmentation

Similar

Recent high-throughput electron microscopy techniques such as focused ion-beam scanning electron microscopy (FIB-SEM) provide thousands of serial sections which assist the biologists in studying sub-cellular structures at high resolution and large volume. Low contrast of such images hinder image segmentation and 3D visualisation of these datasets. With recent advances in computer vision and deep learning, such datasets can be segmented and reconstructed in 3D with greater ease and speed than with previous approaches. However, these methods still rely on thousands of ground-truth samples for training and electron microscopy datasets require significant amounts of time for carefully curated manual annotations. We address these bottlenecks with EM-net, a scalable deep convolutional neural network for EM image segmentation. We have evaluated EM-net using two datasets, one of which belongs to an ongoing competition on EM stack segmentation since 2012. We show that EM-net variants achieve better performances than current deep learning methods using small- and medium-sized ground-truth datasets. We also show that the ensemble of top EM-net base classifiers outperforms other methods across a wide variety of evaluation metrics.

A Lumen Segmentation Method in Ureteroscopy Images Based on a Deep Residual U-Net Architecture

Jorge Lazo, Marzullo Aldo, Sara Moccia, Michele Catellani, Benoit Rosa, Elena De Momi, Michel De Mathelin, Francesco Calimeri

Responsive image

Auto-TLDR; A Deep Neural Network for Ureteroscopy with Residual Units

Slides Poster Similar

Ureteroscopy is becoming the first surgical treatment option for the majority of urinary affections. This procedure is carried out using an endoscope which provides the surgeon with the visual and spatial information necessary to navigate inside the urinary tract. Having in mind the development of surgical assistance systems, that could enhance the performance of surgeon, the task of lumen segmentation is a fundamental part since this is the visual reference which marks the path that the endoscope should follow. This is something that has not been analyzed in ureteroscopy data before. However, this task presents several challenges given the image quality and the conditions itself of ureteroscopy procedures. In this paper, we study the implementation of a Deep Neural Network which exploits the advantage of residual units in an architecture based on U-Net. For the training of these networks, we analyze the use of two different color spaces: gray-scale and RGB data images. We found that training on gray-scale images gives the best results obtaining mean values of Dice Score, Precision, and Recall of 0.73, 0.58, and 0.92 respectively. The results obtained show that the use of residual U-Net could be a suitable model for further development for a computer-aided system for navigation and guidance through the urinary system.

Neural Machine Registration for Motion Correction in Breast DCE-MRI

Federica Aprea, Stefano Marrone, Carlo Sansone

Responsive image

Auto-TLDR; A Neural Registration Network for Dynamic Contrast Enhanced-Magnetic Resonance Imaging

Slides Poster Similar

Cancer is one of the leading causes of death in the western world, with medical imaging playing a key role for early diagnosis. Focusing on breast cancer, one of the emerging imaging methodologies is Dynamic Contrast Enhanced-Magnetic Resonance Imaging (DCE-MRI). The flip side of using DCE-MRI is in its long acquisition times, often causing the patient to move, resulting in motion artefacts, namely distortions in the acquired image that can affect DCE-MRI analysis. A possible solution consists in the use of Motion Correction Techniques (MCTs), i.e. procedures intended to re-align the post-contrast image to the corresponding pre-contrast (reference) one. This task is particularly critic in DCE-MRI, due to brightness variations introduced in post-contrast images by the contrast-agent flowing. To face this problem, in this work we introduce a new MCT for breast DCE-MRI leveraging Physiologically Based PharmacoKinetic (PBPK) modelling and Artificial Neural Networks (ANN) to determine the most suitable physiologically-compliant transformation. To this aim, we propose a Neural Registration Network relying on a very task-specific loss function explicitly designed to take into account the contrast agent flowing while enforcing a correct re-alignment. We compared the obtained results against some conventional motion correction techniques, evaluating the performance on a patient-by-patient basis. Results clearly show the effectiveness of the proposed approach, resulting as the best performing even when compares against other techniques designed to take into account for brightness variations.

Dual Stream Network with Selective Optimization for Skin Disease Recognition in Consumer Grade Images

Krishnam Gupta, Jaiprasad Rampure, Monu Krishnan, Ajit Narayanan, Nikhil Narayan

Responsive image

Auto-TLDR; A Deep Network Architecture for Skin Disease Localisation and Classification on Consumer Grade Images

Slides Poster Similar

Skin disease localisation and classification on consumer-grade images is more challenging compared to that on dermoscopic imaging. Consumer grade images refer to the images taken using commonly available imaging devices such as a mobile camera or a hand held digital camera. Such images, in addition to having the skin condition of interest in a very small area of the image, has other noisy non-clinical details introduced due to the lighting conditions and the distance of the hand held device from the anatomy at the time of acquisition. We propose a novel deep network architecture \& a new optimization strategy for classification with implicit localisation of skin diseases from clinical/consumer grade images. A weakly supervised segmentation algorithm is first employed to extract Region of Interests (RoI) from the image, the RoI and the original image form the two input streams of the proposed architecture. Each stream of the architecture learns high level and low level features from the original image and the RoI, respectively. The two streams are independently optimised until the loss stops decreasing after which both the streams are optimised collectively with the help of a third combiner sub-network. Such a strategy resulted in a 5% increase of accuracy over the current state-of-the-art methods on SD-198 dataset, which is publicly available. The proposed algorithm is also validated on a new dataset containing over 12,000 images across 75 different skin conditions. We intend to release this dataset as SD-75 to aid in the advancement of research on skin condition classification on consumer grade images.

A Heuristic-Based Decision Tree for Connected Components Labeling of 3D Volumes

Maximilian Söchting, Stefano Allegretti, Federico Bolelli, Costantino Grana

Responsive image

Auto-TLDR; Entropy Partitioning Decision Tree for Connected Components Labeling

Slides Poster Similar

Connected Components Labeling represents a fundamental step for many Computer Vision and Image Processing pipelines. Since the first appearance of the task in the sixties, many algorithmic solutions to optimize the computational load needed to label an image have been proposed. Among them, block-based scan approaches and decision trees revealed to be some of the most valuable strategies. However, due to the cost of the manual construction of optimal decision trees and the computational limitations of automatic strategies employed in the past, the application of blocks and decision trees has been restricted to small masks, and thus to 2D algorithms. With this paper we present a novel heuristic algorithm based on decision tree learning methodology, called Entropy Partitioning Decision Tree (EPDT). It allows to compute near-optimal decision trees for large scan masks. Experimental results demonstrate that algorithms based on the generated decision trees outperform state-of-the-art competitors.

MedZip: 3D Medical Images Lossless Compressor Using Recurrent Neural Network (LSTM)

Omniah Nagoor, Joss Whittle, Jingjing Deng, Benjamin Mora, Mark W. Jones

Responsive image

Auto-TLDR; Recurrent Neural Network for Lossless Medical Image Compression using Long Short-Term Memory

Poster Similar

As scanners produce higher-resolution and more densely sampled images, this raises the challenge of data storage, transmission and communication within healthcare systems. Since the quality of medical images plays a crucial role in diagnosis accuracy, medical imaging compression techniques are desired to reduce scan bitrate while guaranteeing lossless reconstruction. This paper presents a lossless compression method that integrates a Recurrent Neural Network (RNN) as a 3D sequence prediction model. The aim is to learn the long dependencies of the voxel's neighbourhood in 3D using Long Short-Term Memory (LSTM) network then compress the residual error using arithmetic coding. Experiential results reveal that our method obtains a higher compression ratio achieving 15% saving compared to the state-of-the-art lossless compression standards, including JPEG-LS, JPEG2000, JP3D, HEVC, and PPMd. Our evaluation demonstrates that the proposed method generalizes well to unseen modalities CT and MRI for the lossless compression scheme. To the best of our knowledge, this is the first lossless compression method that uses LSTM neural network for 16-bit volumetric medical image compression.

BG-Net: Boundary-Guided Network for Lung Segmentation on Clinical CT Images

Rui Xu, Yi Wang, Tiantian Liu, Xinchen Ye, Lin Lin, Yen-Wei Chen, Shoji Kido, Noriyuki Tomiyama

Responsive image

Auto-TLDR; Boundary-Guided Network for Lung Segmentation on CT Images

Slides Poster Similar

Lung segmentation on CT images is a crucial step for a computer-aided diagnosis system of lung diseases. The existing deep learning based lung segmentation methods are less efficient to segment lungs on clinical CT images, especially that the segmentation on lung boundaries is not accurate enough due to complex pulmonary opacities in practical clinics. In this paper, we propose a boundary-guided network (BG-Net) to address this problem. It contains two auxiliary branches that separately segment lungs and extract the lung boundaries, and an aggregation branch that efficiently exploits lung boundary cues to guide the network for more accurate lung segmentation on clinical CT images. We evaluate the proposed method on a private dataset collected from the Osaka university hospital and four public datasets including StructSeg, HUG, VESSEL12, and a Novel Coronavirus 2019 (COVID-19) dataset. Experimental results show that the proposed method can segment lungs more accurately and outperform several other deep learning based methods.

Revisiting Sequence-To-Sequence Video Object Segmentation with Multi-Task Loss and Skip-Memory

Fatemeh Azimi, Benjamin Bischke, Sebastian Palacio, Federico Raue, Jörn Hees, Andreas Dengel

Responsive image

Auto-TLDR; Sequence-to-Sequence Learning for Video Object Segmentation

Slides Poster Similar

Video Object Segmentation (VOS) is an active research area of the visual domain. One of its fundamental sub-tasks is semi-supervised / one-shot learning: given only the segmentation mask for the first frame, the task is to provide pixel-accurate masks for the object over the rest of the sequence. Despite much progress in the last years, we noticed that many of the existing approaches lose objects in longer sequences, especially when the object is small or briefly occluded. In this work, we build upon a sequence-to-sequence approach that employs an encoder-decoder architecture together with a memory module for exploiting the sequential data. We further improve this approach by proposing a model that manipulates multi-scale spatio-temporal information using memory-equipped skip connections. Furthermore, we incorporate an auxiliary task based on distance classification which greatly enhances the quality of edges in segmentation masks. We compare our approach to the state of the art and show considerable improvement in the contour accuracy metric and the overall segmentation accuracy.