
    Ensemble deep learning: A review

    Ensemble learning combines several individual models to obtain better generalization performance. Currently, deep learning models with multilayer processing architectures show better performance than shallow or traditional classification models. Deep ensemble learning models combine the advantages of both deep learning and ensemble learning, so that the final model has better generalization performance. This paper reviews state-of-the-art deep ensemble models and hence serves as an extensive summary for researchers. The models are broadly categorised into bagging, boosting and stacking ensembles; negative-correlation-based deep ensembles; explicit/implicit ensembles; homogeneous/heterogeneous ensembles; decision fusion strategies; and unsupervised, semi-supervised, reinforcement learning, online/incremental and multilabel deep ensemble models. Applications of deep ensemble models in different domains are also briefly discussed. Finally, we conclude the paper with some future recommendations and research directions.
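As an illustration of the bagging strategy mentioned in the abstract, the sketch below (plain NumPy, with a least-squares line fit as a deliberately simple stand-in for a deep base learner) fits each model on a bootstrap resample and fuses the predictions by averaging:

```python
import numpy as np

def bagging_predict(X, y, X_test, n_models=10, seed=0):
    """Bootstrap aggregation (bagging): fit each base learner on a
    bootstrap resample of the data, then fuse predictions by averaging."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap resample
        coeffs = np.polyfit(X[idx], y[idx], deg=1)  # simple base learner
        preds.append(np.polyval(coeffs, X_test))
    return np.mean(preds, axis=0)  # decision fusion by averaging

X = np.linspace(0.0, 1.0, 50)
y = 2.0 * X + 1.0                 # noiseless toy data
y_hat = bagging_predict(X, y, np.array([0.5]))
```

Boosting and stacking follow the same combine-many-learners pattern but fit the members sequentially on reweighted data or train a meta-learner on the members' outputs, respectively.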

    Learning Approach to Delineation of Curvilinear Structures in 2D and 3D Images

    Detection of curvilinear structures has long been of interest due to its wide range of applications. Large amounts of imaging data could be readily used in many fields, but it is practically impossible to analyze them manually; hence the need for automated delineation approaches. In recent years, Computer Vision witnessed a paradigm shift from mathematical modelling to data-driven methods based on Machine Learning. This led to improvements in the performance and robustness of detection algorithms. Nonetheless, most Machine Learning methods are general-purpose and do not exploit the specificity of the delineation problem. In this thesis, we present learning methods suited for this task and apply them to various kinds of microscopic and natural images, proving the general applicability of the presented solutions. First, we introduce a topology loss - a new training loss term, which captures higher-level features of curvilinear networks such as smoothness, connectivity and continuity. This is in contrast to most Deep Learning segmentation methods, which do not take into account the geometry of the resulting prediction. In order to compute the new loss term, we extract topology features of the prediction and the ground truth using a pre-trained network, whose filters are activated by structures at different scales and orientations. We show that this approach yields better results in terms of both conventional segmentation metrics and the overall topology of the resulting delineation. Although segmentation of curvilinear structures provides useful information, it is not always sufficient. In many cases, such as neuroscience and cartography, it is crucial to estimate the network connectivity. In order to find the graph representation of the structure depicted in the image, we propose an approach for joint segmentation and connection classification.
Apart from pixel probabilities, this approach also returns the likelihood that a proposed path is part of the reconstructed network. We show that segmentation and path classification are closely related tasks that can benefit from this synergy. The aforementioned methods rely on Machine Learning, which requires significant amounts of annotated ground-truth data to train models. The labelling process often requires expertise and is costly and tiresome. To alleviate this problem, we introduce an Active Learning method that significantly decreases the time spent on annotating images. It queries the annotator only about the most informative examples, in this case the hypothetical paths belonging to the structure of interest. Contrary to conventional Active Learning methods, our approach exploits the local consistency of linear paths to pick the ones that stand out from their neighborhood. Our final contribution is a method suited for both Active Learning and proofreading the result, which often requires more time than the automated delineation itself. It investigates the edges of the delineation graph and determines the ones that are especially significant for the global reconstruction by perturbing their weights. Our Active Learning and proofreading strategies are combined with a new efficient formulation of an optimal subgraph computation and reduce the annotation effort by up to 80%.
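The topology loss described in the abstract compares the prediction and the ground truth in the feature space of a pre-trained network. A minimal sketch of that idea, using random 3x3 filters purely as a self-contained stand-in for the pre-trained multi-scale, multi-orientation filters:

```python
import numpy as np

def conv2d_valid(img, kern):
    """Naive valid-mode 2-D correlation, for illustration only."""
    kh, kw = kern.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kern)
    return out

def topology_loss(pred, gt, filters):
    """Feature-space loss: mean squared difference between the filter
    responses of the prediction and of the ground truth. In the thesis
    the filters come from a pre-trained network; random filters are an
    assumption made only to keep this sketch runnable."""
    loss = 0.0
    for k in filters:
        loss += np.mean((conv2d_valid(pred, k) - conv2d_valid(gt, k)) ** 2)
    return loss / len(filters)

rng = np.random.default_rng(0)
filters = [rng.standard_normal((3, 3)) for _ in range(4)]
gt = np.zeros((16, 16))
gt[8, :] = 1.0   # ground truth: one horizontal curvilinear structure
```

Because the comparison happens after filtering, topological defects such as small gaps change entire filter responses, not just single pixels, which is what lets the loss penalize broken connectivity.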

    Advanced Representation Learning for Dense Prediction Tasks in Medical Image Analysis

    Machine learning is a rapidly growing field of artificial intelligence that allows computers to learn and make predictions from human-labelled data. However, traditional machine learning methods have many drawbacks: they are time-consuming, inefficient and biased towards specific tasks, and they require a large amount of domain knowledge. A subfield of machine learning, representation learning, focuses on learning meaningful and useful features or representations from input data. It aims to automatically learn relevant features from raw data, saving time, increasing efficiency and generalization, and reducing reliance on expert knowledge. Recently, deep learning has further accelerated the development of representation learning. It leverages deep architectures to extract complex and abstract representations, resulting in significantly better performance in many areas. In the field of computer vision, deep learning has made remarkable progress, particularly in high-level and real-world computer vision tasks. Since deep learning methods do not require handcrafted features and are able to understand complex visual information, they enable researchers to design automated systems that make accurate diagnoses and interpretations, especially in the field of medical image analysis. Deep learning has achieved state-of-the-art performance in many medical image analysis tasks, such as medical image regression/classification, generation and segmentation. Compared to regression/classification tasks, medical image generation and segmentation are more complex dense prediction tasks that require understanding semantic representations and generating pixel-level predictions. This thesis focuses on designing representation learning methods to improve the performance of dense prediction tasks in the field of medical image analysis. With advances in imaging technology, more complex medical images become available for use in this field.
In contrast to traditional machine learning algorithms, current deep learning-based representation learning methods provide an end-to-end approach that automatically extracts representations without the need for manual feature engineering from the complex data. In the field of medical image analysis, three unique challenges require the design of advanced representation learning architectures, i.e., limited labeled medical images, overfitting with limited data, and lack of interpretability. To address these challenges, we aim to design robust representation learning architectures for the two main directions of dense prediction tasks, namely medical image generation and segmentation. For medical image generation, the specific topic we focus on is chromosome straightening. This task involves generating a straightened chromosome image from a curved chromosome input. The challenges of this task include insufficient training images and corresponding ground truth, as well as the non-rigid nature of chromosomes, which leads to distorted details and shapes after straightening. We first propose a study for the chromosome straightening task. We introduce a novel framework using image-to-image translation and demonstrate its efficacy and robustness in generating straightened chromosomes. The framework addresses the challenge of limited training data and outperforms existing studies. We then present a subsequent study addressing the limitations of our previous framework, resulting in new state-of-the-art performance and better interpretability and generalization capability. We propose a new robust chromosome straightening framework, named Vit-Patch GAN, which instead learns the motion representation of chromosomes for straightening while retaining more details of shape and banding patterns. For medical image segmentation, we focus on the fovea localization task, which is transferred from localization to small-region segmentation.
Accurate segmentation of the fovea region is crucial for monitoring and analyzing retinal diseases to prevent irreversible vision loss. This task also requires the incorporation of global features to effectively identify the fovea region and overcome hard cases associated with retinal diseases and non-standard fovea locations. We first propose a novel two-branch architecture, Bilateral-ViT, for fovea localization in retinal images. This vision-transformer-based architecture incorporates global image context and blood vessel structure. It surpasses existing methods and achieves state-of-the-art results on two public datasets. We then propose a subsequent method to further improve the performance of fovea localization. We design a novel dual-stream deep learning architecture called Bilateral-Fuser. In contrast to our previous Bilateral-ViT, Bilateral-Fuser globally incorporates long-range connections from multiple cues, including fundus and vessel distribution. Moreover, with the newly designed Bilateral Token Incorporation module, Bilateral-Fuser learns anatomy-aware tokens, significantly reducing computational costs while achieving new state-of-the-art performance. Our comprehensive experiments also demonstrate that Bilateral-Fuser achieves better accuracy and robustness on both normal and diseased retinal images, with excellent generalization capability.
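The transfer from point localization to small-region segmentation mentioned above can be sketched as follows: a point annotation becomes a disk-shaped training mask, and the location estimate is read back from a predicted mask as its centroid. The disk radius and the centroid read-out are illustrative assumptions, not the thesis's exact procedure:

```python
import numpy as np

def point_to_disk_mask(shape, center, radius):
    """Convert a point annotation into a small-region segmentation
    target: a binary disk of the given radius around the point."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    return ((yy - center[0]) ** 2 + (xx - center[1]) ** 2
            <= radius ** 2).astype(float)

def mask_to_point(mask):
    """Recover the location estimate as the centroid of the mask."""
    ys, xs = np.nonzero(mask > 0.5)
    return ys.mean(), xs.mean()

mask = point_to_disk_mask((64, 64), (30, 40), radius=5)
y, x = mask_to_point(mask)   # centroid recovers the annotated point
```

Casting localization as segmentation lets a dense-prediction network supervise every pixel around the fovea instead of regressing two coordinates directly.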

    Multiscale Centerline Detection

    Finding the centerline and estimating the radius of linear structures is a critical first step in many applications, ranging from road delineation in 2D aerial images to modeling blood vessels, lung bronchi, and dendritic arbors in 3D biomedical image stacks. Existing techniques rely either on filters designed to respond to ideal cylindrical structures or on classification techniques. The former tend to become unreliable when the linear structures are very irregular, while the latter often have difficulties distinguishing centerline locations from neighboring ones, thus losing accuracy. We solve this problem by reformulating centerline detection as a regression problem. We first train regressors to return the distances to the closest centerline in scale-space, and we apply them to the input images or volumes. The centerlines and the corresponding scales then correspond to the regressors' local maxima, which can be easily identified. We show that our method outperforms state-of-the-art techniques on various 2D and 3D datasets. Moreover, our approach is very generic and also performs well on contour detection. We show an improvement over recent contour detection algorithms on the BSDS500 dataset.
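The regression formulation can be sketched in one dimension: the training target decays with the distance to the closest centerline, so its local maxima sit exactly on centerline points, which are then read off by non-maximum suppression. The linear decay and the distance cap below are illustrative assumptions, not the thesis's exact target:

```python
import numpy as np

def regression_target(dist_to_centerline, d_max=6.0):
    """Regression target used instead of binary labels: large on the
    centerline and decaying with the distance to it, so local maxima
    coincide with centerline points."""
    return np.maximum(0.0, d_max - dist_to_centerline)

def non_maximum_suppression_1d(score):
    """Keep only local maxima of the (regressed) score profile."""
    keep = np.zeros_like(score, dtype=bool)
    keep[1:-1] = (score[1:-1] > score[:-2]) & (score[1:-1] >= score[2:])
    return keep

x = np.arange(21)
dist = np.abs(x - 10)                 # true centerline at x = 10
score = regression_target(dist)
centerline = np.nonzero(non_maximum_suppression_1d(score))[0]
```

A classifier trained on binary centerline labels scores all near-centerline pixels similarly, producing the double responses the abstract describes; the peaked regression target avoids that by construction.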

    Automated retinal layer segmentation and pre-apoptotic monitoring for three-dimensional optical coherence tomography

    The aim of this PhD thesis was to develop a segmentation algorithm adapted and optimized for retinal OCT data that provides objective 3D layer thickness measurements, which might be used to improve the diagnosis and monitoring of retinal pathologies. Additionally, a 3D stack registration method was produced by modifying an existing algorithm. A related project was to develop pre-apoptotic retinal monitoring based on changes in texture parameters of the OCT scans, in order to enable treatment before the changes become irreversible; apoptosis refers to the programmed cell death that can occur in retinal tissue and lead to blindness. These issues can be critical for the examination of tissues within the central nervous system. A novel statistical model for segmentation has been created and successfully applied to a large data set. The results obtained open a broad range of future research possibilities into advanced pathologies. A separate model has been created for segmentation of the choroid, located deep in the retina, as its appearance is very different from that of the top retinal layers. Choroid thickness and structure are an important index of various pathologies (diabetes etc.). As part of the pre-apoptotic monitoring project it was shown that an increase in the proportion of apoptotic cells in vitro can be accurately quantified. Moreover, the data obtained indicate a similar increase in neuronal scatter in retinal explants following axotomy (removal of retinas from the eye), suggesting that UHR-OCT can be a novel non-invasive technique for the in vivo assessment of neuronal health. Additionally, an independent project within the computer science department, in collaboration with the school of psychology, has been successfully carried out, improving analysis of facial dynamics and behaviour transfer between individuals.
Also, important improvements to a general signal processing algorithm, dynamic time warping (DTW), have been made, allowing potential application in a broad range of signal processing fields.
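For reference, the baseline dynamic time warping algorithm mentioned above can be sketched as follows; this is the classic quadratic-time formulation, not the improvements developed in the thesis:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance between
    two 1-D sequences, with absolute difference as the local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible warping steps
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Because the warping path may repeat elements, sequences that differ only in local timing (e.g. `[1, 2, 3]` vs. `[1, 1, 2, 3]`) still get distance zero, which is what makes DTW useful for comparing signals of different lengths.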

    Multiscale Centerline Extraction Based on Regression and Projection onto the Set of Elongated Structures

    Automatically extracting linear structures from images is a fundamental low-level vision problem with numerous applications in different domains. Centerline detection and radius estimation are the first crucial steps in most Computer Vision pipelines aiming to reconstruct linear structures. Existing techniques rely either on hand-crafted filters, designed to respond to ideal profiles of the linear structure, or on classification-based approaches, which automatically learn to detect centerline points from data. Hand-crafted methods are the most accurate when the content of the image fulfills the ideal model they rely on. However, they lose accuracy in the presence of noise or when the linear structures are irregular and deviate from the ideal case. Machine learning techniques can alleviate this problem, but they are mainly based on a classification framework. In this thesis, we show that classification is not the best formalism to solve the centerline detection problem. In fact, since the appearance of a centerline point is very similar to that of the points immediately next to it, the output of a classifier trained to detect centerlines suffers from low localization accuracy and double responses on the body of the linear structure. To solve this problem, we propose a regression-based formulation for centerline detection. We rely on the distance transform of the centerlines to automatically learn a function whose local maxima correspond to centerline points. The output of our method can be used to directly estimate the location of the centerline, by a simple Non-Maximum Suppression operation, or as input to a tracing pipeline to reconstruct the graph of the linear structure. In both cases, our method gives more accurate results than state-of-the-art techniques on challenging 2D and 3D datasets. Our method relies on features extracted by means of convolutional filters.
In order to process large amounts of data efficiently, we introduce a general filter bank approximation scheme. In particular, we show that a generic filter bank can be approximated by linear combinations of a smaller set of separable filters. Thanks to this method, we can greatly reduce the computation time of the convolutions without loss of accuracy. Our approach is general, and we demonstrate its effectiveness by applying it to different Computer Vision problems, such as linear structure detection and image classification with Convolutional Neural Networks. We further improve our regression-based method for centerline detection by taking advantage of contextual image information. We adopt a multiscale iterative regression approach to efficiently include a large image context in our algorithm. Compared to previous approaches, we use context both in the spatial domain and in the radial one. In this way, our method is also able to return an accurate estimate of the radii of the linear structures. The idea of using regression can also be beneficial for solving other related Computer Vision problems. For example, we show an improvement over previous works when applying it to boundary and membrane detection. Finally, we focus on the particular geometric properties of linear structures. We observe that most methods for detecting them treat each pixel independently and do not model the strong relation that exists between neighboring pixels. As a consequence, their output is geometrically inconsistent. In this thesis, we address this problem by considering the projection of the score map returned by our regressor onto the set of all geometrically admissible ground-truth images. We propose an efficient patch-wise approximation scheme to compute the projection. Moreover, we provide conditions under which the projection is exact. We demonstrate the advantage of our method by applying it to four different problems.
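The separable approximation can be illustrated on a single filter: a truncated SVD expresses a 2-D filter as a sum of rank-1 (separable) terms, each of which factors into two cheap 1-D convolutions. The thesis approximates an entire filter bank with a shared separable basis; the single-filter SVD below is a simplified stand-in for that scheme:

```python
import numpy as np

def separable_approximation(filt, rank):
    """Approximate a 2-D filter by a sum of `rank` separable (rank-1)
    filters via truncated SVD. Each outer-product term corresponds to
    one vertical 1-D filter followed by one horizontal 1-D filter."""
    U, s, Vt = np.linalg.svd(filt)
    return sum(s[k] * np.outer(U[:, k], Vt[k]) for k in range(rank))

rng = np.random.default_rng(0)
F = rng.standard_normal((7, 7))          # a generic 7x7 filter
err_full = np.linalg.norm(F - separable_approximation(F, 7))
err_low = np.linalg.norm(F - separable_approximation(F, 3))
```

With `k` separable terms, convolving an `H x W` image with an `n x n` filter drops from O(HWn^2) to O(HWkn) operations, which is the source of the speed-up claimed in the abstract.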