
    Stochastic Methods for Fine-Grained Image Segmentation and Uncertainty Estimation in Computer Vision

    In this dissertation, we exploit concepts of probability theory, stochastic methods, and machine learning to address three existing limitations of deep learning-based models for image understanding. First, although convolutional neural networks (CNNs) have substantially improved the state of the art in image understanding, conventional CNNs provide segmentation masks that poorly adhere to object boundaries, a critical limitation for many potential applications. Second, training deep learning models requires large amounts of carefully selected and annotated data, but large-scale annotation of image segmentation datasets is often prohibitively expensive. Third, conventional deep learning models lack the capability of uncertainty estimation, which compromises both decision making and model interpretability. To address these limitations, we introduce the Region Growing Refinement (RGR) algorithm, an unsupervised post-processing algorithm that exploits Monte Carlo sampling and pixel similarities to propagate high-confidence labels into regions of low-confidence classification. The probabilistic Region Growing Refinement (pRGR) provides RGR with a rigorous mathematical foundation that exploits concepts of Bayesian estimation and variance reduction techniques. Experiments demonstrate both the effectiveness of (p)RGR for the refinement of segmentation predictions and its suitability for uncertainty estimation, since the variance estimates obtained in its Monte Carlo iterations are highly correlated with segmentation accuracy. We also introduce FreeLabel, an intuitive open-source web interface that exploits RGR to allow users to obtain high-quality segmentation masks with just a few freehand scribbles, in a matter of seconds. Designed to benefit the computer vision community, FreeLabel can be used for both crowdsourced and private annotation and has a modular structure that can be easily adapted to any image dataset. The practical relevance of the methods developed in this dissertation is illustrated through applications in agricultural and healthcare-related domains. We have combined RGR and modern CNNs for fine segmentation of fruit flowers, motivated by the importance of automated bloom intensity estimation for the optimization of fruit orchard management and, possibly, the automation of procedures such as flower thinning and pollination. We also exploited an early version of FreeLabel to annotate novel datasets for segmentation of fruit flowers, which are now publicly available. Finally, this dissertation also describes work on fine segmentation and gaze estimation for images collected from assisted living environments, with the ultimate goal of assisting geriatricians in evaluating the health status of patients in such facilities.
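
    The abstract leaves out implementation detail, so the following is a minimal sketch of the Monte Carlo label-propagation idea behind RGR, assuming a binary foreground score map produced by a CNN. The confidence and colour-similarity thresholds, the random seed sampling, and the greedy 4-neighbour growth are simplifying assumptions rather than the dissertation's exact algorithm; the per-pixel variance across iterations is the kind of quantity the abstract describes as an uncertainty proxy.

        import numpy as np

        def mc_region_refine(image, scores, n_iters=30, conf_thresh=0.9,
                             sim_thresh=0.05, seed_frac=0.01, rng=None):
            """Monte Carlo refinement sketch: image is (H, W, 3) in [0, 1],
            scores is an (H, W) foreground probability map from a CNN."""
            rng = np.random.default_rng(rng)
            H, W = scores.shape
            # pixels the CNN is confident about (either class) may act as seeds
            high_conf = (scores > conf_thresh) | (scores < 1.0 - conf_thresh)
            masks = []
            for _ in range(n_iters):
                ys, xs = np.nonzero(high_conf)
                keep = rng.random(len(ys)) < seed_frac       # random subset of seeds
                labels = np.full((H, W), -1, dtype=np.int8)  # -1 = unlabelled
                labels[ys[keep], xs[keep]] = (scores[ys[keep], xs[keep]] > 0.5).astype(np.int8)
                frontier = list(zip(ys[keep], xs[keep]))
                while frontier:                              # greedy colour-driven growth
                    y, x = frontier.pop()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W and labels[ny, nx] == -1:
                            if np.linalg.norm(image[ny, nx] - image[y, x]) < sim_thresh:
                                labels[ny, nx] = labels[y, x]
                                frontier.append((ny, nx))
                # pixels never reached fall back to the original CNN decision
                labels = np.where(labels == -1, (scores > 0.5).astype(np.int8), labels)
                masks.append(labels.astype(np.float32))
            masks = np.stack(masks)
            return masks.mean(axis=0), masks.var(axis=0)     # refined mask, uncertainty proxy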

    Deep active learning for suggestive segmentation of biomedical image stacks via optimisation of Dice scores and traced boundary length

    Manual segmentation of stacks of 2D biomedical images (e.g., histology) is a time-consuming task that can be sped up with semi-automated techniques. In this article, we present a suggestive deep active learning framework that seeks to minimise the annotation effort required to achieve a certain level of accuracy when labelling such a stack. The framework suggests, at every iteration, a specific region of interest (ROI) in one of the images for manual delineation. Using a deep segmentation neural network and a mixed cross-entropy loss function, we propose a principled strategy to estimate class probabilities for the whole stack, conditioned on heterogeneous partial segmentations of the 2D images, as well as on weak supervision in the form of image indices that bound each ROI. Using the estimated probabilities, we propose a novel active learning criterion based on predictions of the estimated segmentation performance and delineation effort, measured with average Dice scores and total delineated boundary length, respectively, rather than common surrogates such as entropy. The query strategy suggests the ROI that is expected to maximise the ratio between performance and effort, while considering the adjacency of structures that may have already been labelled, which decreases the length of the boundary to be traced. We provide quantitative results on synthetically deformed MRI scans and real histological data, showing that our framework can reduce labelling effort by up to 60–70% without compromising accuracy.
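
    As a sketch of the query criterion described above, the snippet below ranks candidate ROIs by the ratio of predicted Dice improvement to predicted tracing length; the dictionary keys and the way the two quantities would be predicted are assumptions for illustration, not the paper's estimators.

        import numpy as np

        def dice(pred, target, eps=1e-6):
            """Average Dice score between two binary masks."""
            inter = np.logical_and(pred, target).sum()
            return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

        def select_roi(candidates):
            """Pick the ROI with the best predicted benefit/effort ratio.
            Each candidate is a dict with hypothetical keys:
              'expected_dice_gain'    - predicted improvement in average Dice if labelled
              'expected_boundary_len' - predicted boundary length to trace, discounted
                                        for adjacent structures that are already labelled"""
            return max(candidates,
                       key=lambda c: c['expected_dice_gain'] / max(c['expected_boundary_len'], 1e-6))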

    Image Annotation and Topic Extraction Using Super-Word Latent Dirichlet Allocation

    This research presents a multi-domain solution that uses text and images to iteratively improve automated information extraction. Stage I uses the local text surrounding an embedded image to provide clues that help rank-order possible image annotations. These annotations are forwarded to Stage II, where the image annotations from Stage I are used as highly relevant super-words to improve the extraction of topics. The model probabilities from the super-words in Stage II are forwarded to Stage III, where they are used to refine the automated image annotation developed in Stage I. All stages demonstrate improvements over existing equivalent algorithms in the literature.
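
    As an illustration of how Stage-I annotations might act as super-words in Stage II, the sketch below simply up-weights annotation terms in each document's bag-of-words counts before fitting a standard topic model; the multiplicative boost and the use of scikit-learn's LatentDirichletAllocation are assumptions, since the abstract does not specify the weighting scheme.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.decomposition import LatentDirichletAllocation

        def topics_with_super_words(texts, image_annotations, boost=5, n_topics=10):
            """texts: one string per document; image_annotations: one list of
            Stage-I annotation words per document (aligned with texts)."""
            vec = CountVectorizer()
            X = vec.fit_transform(texts).toarray()
            idx = {w: i for i, w in enumerate(vec.get_feature_names_out())}
            for d, words in enumerate(image_annotations):
                for w in words:
                    if w in idx:
                        X[d, idx[w]] *= boost          # emphasise annotation terms
            lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
            doc_topic = lda.fit_transform(X)           # Stage-II topic probabilities
            return doc_topic, lda.components_          # can feed Stage-III refinement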

    Reasoning with Uncertainty in Deep Learning for Safer Medical Image Computing

    Deep learning is now ubiquitous in the research field of medical image computing. As such technologies progress towards clinical translation, the question of safety becomes critical. Once deployed, machine learning systems unavoidably face situations where the correct decision or prediction is ambiguous. However, current methods rely disproportionately on deterministic algorithms, lacking a mechanism to represent and manipulate uncertainty. In safety-critical applications such as medical imaging, reasoning under uncertainty is crucial for developing a reliable decision-making system. Probabilistic machine learning provides a natural framework to quantify the degree of uncertainty over different variables of interest, be it the prediction, the model parameters and structures, or the underlying data (images and labels). Probability distributions are used to represent all the uncertain unobserved quantities in a model and how they relate to the data, and probability theory is used as a language to compute and manipulate these distributions. In this thesis, we explore probabilistic modelling as a framework to integrate uncertainty information into deep learning models, and demonstrate its utility in various high-dimensional medical imaging applications. In the process, we make several fundamental enhancements to current methods. We categorise our contributions into three groups according to the types of uncertainties being modelled: (i) predictive, (ii) structural, and (iii) human uncertainty. Firstly, we discuss the importance of quantifying predictive uncertainty and understanding its sources for developing a risk-averse and transparent medical image enhancement application. We demonstrate how a measure of predictive uncertainty can be used as a proxy for predictive accuracy in the absence of ground truths. Furthermore, assuming the structure of the model is flexible enough for the task, we introduce a way to decompose the predictive uncertainty into its orthogonal sources, i.e., aleatoric and parameter uncertainty. We show the potential utility of such decoupling in providing a quantitative “explanation” of model performance. Secondly, we introduce our recent attempts at learning model structures directly from data. One work proposes a method based on variational inference to learn a posterior distribution over connectivity structures within a neural network architecture for multi-task learning, and shares some preliminary results in the MR-only radiotherapy planning application. Another work explores how the training algorithm of decision trees could be extended to grow the architecture of a neural network to adapt to the given availability of data and the complexity of the task. Lastly, we develop methods to model the “measurement noise” (e.g., biases and skill levels) of human annotators, and integrate this information into the learning process of the neural network classifier. In particular, we show that explicitly modelling the uncertainty involved in the annotation process not only leads to an improvement in robustness to label noise, but also yields useful insights into the patterns of errors that characterise individual experts.
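
    One common way to realise the decomposition into aleatoric and parameter uncertainty mentioned above is Monte Carlo dropout; the sketch below uses that approximation (predictive entropy = expected entropy + mutual information) and is not necessarily the thesis's exact estimator.

        import torch

        @torch.no_grad()
        def decompose_uncertainty(model, x, n_samples=20):
            """model: classifier with dropout layers returning logits of shape (B, C)."""
            model.train()                    # keep dropout active at prediction time
            probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(n_samples)])
            mean_p = probs.mean(dim=0)
            total = -(mean_p * mean_p.clamp_min(1e-12).log()).sum(-1)            # predictive entropy
            aleatoric = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean(0)  # expected entropy
            parameter = total - aleatoric                                        # mutual information
            return total, aleatoric, parameter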

    Extraction of Unfoliaged Trees from Terrestrial Image Sequences

    This thesis presents a generative statistical approach for the fully automatic three-dimensional (3D) extraction and reconstruction of unfoliaged deciduous trees from wide-baseline image sequences. Tree models improve the realism of 3D Geoinformation systems (GIS) by adding a natural touch. Unfoliaged trees are, however, difficult to reconstruct from images due to partly weak contrast, background clutter, occlusions, and particularly the possibly varying order of branches in images from different viewpoints. The proposed approach combines generative modeling by L-systems and statistical maximum a posteriori (MAP) estimation for the extraction of the 3D branching structure of trees. Background estimation is conducted by means of mathematical (gray-scale) morphology as the basis for generative modeling. A Gaussian likelihood function based on intensity differences is employed to evaluate the hypotheses. A mechanism has been devised to control the sampling sequence of the multiple parameters in the Markov chain, taking into account their characteristics and their performance in the previous step. After the extraction of the first level of branches, a tree is classified into one of three typical branching types, and more specific L-system production rules are used accordingly. Generic prior distributions for the parameters are refined based on already extracted branches in a Bayesian framework and integrated into the MAP estimation. In this way, most of the branching structure apart from tiny twigs can be reconstructed. Results are presented in the form of VRML (Virtual Reality Modeling Language) models, demonstrating the potential of the approach as well as its current shortcomings.
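
    A generic way to carry out the MAP estimation sketched above is a Metropolis-Hastings search over the branch parameters, scoring each L-system hypothesis with the Gaussian intensity-difference likelihood plus a prior. In the sketch below, the random-walk proposal stands in for the thesis's adaptive parameter-ordering scheme, and render and log_prior are hypothetical callables supplied by the user.

        import numpy as np

        def map_branch_params(render, observed, log_prior, theta0,
                              n_steps=5000, step=0.05, sigma=10.0, rng=None):
            """render(theta) -> synthetic intensity image for an L-system hypothesis;
            observed is the (background-estimated) image the hypothesis is scored against."""
            rng = np.random.default_rng(rng)

            def log_post(theta):
                diff = render(theta) - observed
                return -0.5 * np.sum(diff ** 2) / sigma ** 2 + log_prior(theta)

            theta = best = np.asarray(theta0, dtype=float)
            lp = best_lp = log_post(theta)
            for _ in range(n_steps):
                prop = theta + step * rng.standard_normal(theta.shape)  # random-walk proposal
                lp_prop = log_post(prop)
                if np.log(rng.random()) < lp_prop - lp:                 # Metropolis acceptance
                    theta, lp = prop, lp_prop
                    if lp > best_lp:
                        best, best_lp = theta.copy(), lp
            return best                                                 # MAP estimate (best visited sample)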