184 research outputs found

    Manifold Learning Approaches to Compressing Latent Spaces of Unsupervised Feature Hierarchies

    Field robots encounter dynamic unstructured environments containing a vast array of unique objects. In order to make sense of the world in which they are placed, they collect large quantities of unlabelled data with a variety of sensors. Producing robust and reliable applications depends entirely on the ability of the robot to understand the unlabelled data it obtains. Deep Learning techniques have had a high level of success in learning powerful unsupervised representations for a variety of discriminative and generative models. Applying these techniques to problems encountered in field robotics remains a challenging endeavour. Modern Deep Learning methods are typically trained with a substantial labelled dataset, while datasets produced in a field robotics context contain limited labelled training data. The primary motivation for this thesis stems from the problem of applying large scale Deep Learning models to field robotics datasets that are label poor. While the lack of labelled ground truth data drives the desire for unsupervised methods, the need for improving the model scaling is driven by two factors: performance and computational requirements. When utilising unsupervised layer outputs as representations for classification, the classification performance increases with layer size. Scaling up models with multiple large layers of features is problematic, as the size of each subsequent hidden layer scales with the size of the previous layer. This quadratic scaling, and the associated time required to train such networks, has prevented adoption of large Deep Learning models beyond cluster computing. The contributions in this thesis are developed from the observation that parameters or filter elements learnt in Deep Learning systems are typically highly structured, and contain related elements. Firstly, the structure of unsupervised filters is utilised to construct a mapping from the high dimensional filter space to a low dimensional manifold.
This creates a significantly smaller representation for subsequent feature learning. This mapping, and its effect on the resulting encodings, highlights the need for the ability to learn highly overcomplete sets of convolutional features. Driven by this need, the unsupervised pretraining of Deep Convolutional Networks is developed to include a number of modern training and regularisation methods. These pretrained models are then used to provide initialisations for supervised convolutional models trained on low quantities of labelled data. By utilising pretraining, a significant increase in classification performance on a number of publicly available datasets is achieved. In order to apply these techniques to outdoor 3D Laser Illuminated Detection And Ranging data, we develop a set of resampling techniques to provide uniform input to Deep Learning models. The features learnt in these systems outperform the high effort hand engineered features developed specifically for 3D data. The representation of a given signal is then reinterpreted as a combination of modes that exist on the learnt low dimensional filter manifold. From this, we develop an encoding technique that allows the high dimensional layer output to be represented as a combination of low dimensional components. This allows the growth of subsequent layers to depend only on the intrinsic dimensionality of the filter manifold and not on the number of elements contained in the previous layer. Finally, the resulting unsupervised convolutional model, the encoding frameworks and the embedding methodology are used to produce a new unsupervised learning strategy that is able to encode images in terms of overcomplete filter spaces, without producing an explosion in the size of the intermediate parameter spaces.
This model produces classification results on par with state of the art models, yet requires significantly less computational resources and is suitable for use in the constrained computation environment of a field robot.
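The quadratic scaling mentioned in this abstract can be illustrated with a small parameter count. This is only a sketch under simplifying assumptions: it uses fully connected layers rather than the convolutional models of the thesis, and the layer widths and the manifold dimension of 10 are hypothetical numbers chosen for illustration.

```python
def dense_weights(n_in, n_out):
    """Number of weights in a fully connected layer (biases ignored)."""
    return n_in * n_out


def weights_after_scaling(base_width, factor, manifold_dim):
    """Return (direct, compressed) weight counts for a layer of scaled width.

    direct:     the layer reads all `width` outputs of the previous layer.
    compressed: the layer reads a `manifold_dim`-dimensional encoding instead
                (a hypothetical stand-in for the low dimensional encoding).
    """
    width = base_width * factor
    direct = dense_weights(width, width)
    compressed = dense_weights(manifold_dim, width)
    return direct, compressed


# Widen both layers by the same factor and compare the two counts.
for factor in (1, 2, 4):
    direct, compressed = weights_after_scaling(100, factor, manifold_dim=10)
    print(factor, direct, compressed)
```

Under these assumptions, doubling the layer width quadruples the direct weight count, while the count fed from a fixed-dimension encoding only doubles.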

    A retinal vasculature tracking system guided by a deep architecture

    Many diseases such as diabetic retinopathy (DR) and cardiovascular diseases show their early signs on retinal vasculature. Analysing the vasculature in fundus images may provide a tool for ophthalmologists to diagnose eye-related diseases and to monitor their progression. These analyses may also facilitate the discovery of new relations between changes on retinal vasculature and the existence or progression of related diseases, or help to validate present relations. In this thesis, a data driven method, namely a Translational Deep Belief Net (TDBN), is adapted to vasculature segmentation. The segmentation performance of the TDBN on low resolution images was found to be comparable to that of the best-performing methods. Later, this network is used for the implementation of super-resolution for the segmentation of high resolution images. This approach accelerated segmentation by an amount related to the down-sampling ratio of the input fundus image. Finally, the TDBN is extended for the generation of probability maps for the existence of vessel parts, namely vessel interior, centreline, boundary and crossing/bifurcation patterns in centrelines. These probability maps are used to guide a probabilistic vasculature tracking system. Although segmentation can indicate where vasculature exists in a fundus image, it does not give quantifiable measures of the vasculature. The latter have more practical value in medical clinics. In the second half of the thesis, a retinal vasculature tracking system is presented. This system uses Particle Filters to describe vessel morphology and topology. Unlike previous studies, the guidance for tracking is provided by the combination of probability maps generated by the TDBN. The experiments on a publicly available dataset, REVIEW, showed that the consistency of vessel widths predicted by the proposed method was better than that obtained from observers.
Moreover, very noisy and low contrast vessel boundaries, which were hardly identifiable to the naked eye, were accurately estimated by the proposed tracking system. Also, bifurcation/crossing locations during the course of tracking were detected almost completely. Considering these promising initial results, future work involves analysing the performance of the tracking system on automatic detection of complete vessel networks in fundus images.

    MRI Artefact Augmentation: Robust Deep Learning Systems and Automated Quality Control

    Quality control (QC) of magnetic resonance imaging (MRI) is essential to establish whether a scan or dataset meets a required set of standards. In MRI, many potential artefacts must be identified so that problematic images can either be excluded or accounted for in further image processing or analysis. To date, the gold standard for the identification of these issues is visual inspection by experts. A primary source of MRI artefacts is patient movement, which can affect clinical diagnosis and impact the accuracy of Deep Learning systems. In this thesis, I present a method to simulate motion artefacts from artefact-free images to augment the training of convolutional neural networks (CNNs), increasing training appearance variability and robustness to motion artefacts. I show that models trained with artefact augmentation generalise better and are more robust to real-world artefacts, with negligible cost to performance on clean data. I argue that it is often better to optimise frameworks end-to-end with artefact augmentation rather than learning to retrospectively remove artefacts, thus enforcing robustness to artefacts at the feature level representation of the data. The labour-intensive and subjective nature of QC has increased interest in automated methods. To address this, I approach MRI quality estimation as the uncertainty in performing a downstream task, using probabilistic CNNs to predict segmentation uncertainty as a function of the input data. Extending this framework, I introduce a novel decoupled uncertainty model, enabling separate uncertainty predictions for different types of image degradation. Training with an extended k-space artefact augmentation pipeline, the model provides informative measures of uncertainty on problematic real-world scans classified by QC raters and enables sources of segmentation uncertainty to be identified. Suitable quality for algorithmic processing may differ from an image's perceptual quality.
Exploring this, I pose MRI visual quality assessment as an image restoration task. Using Bayesian CNNs to recover clean images from noisy data, I show that the uncertainty indicates the possible recoverability of an image. A multi-task network combining uncertainty-aware artefact recovery with tissue segmentation highlights the distinction between visual and algorithmic quality, which implies that, depending on the downstream task, less data should be discarded for purely visual quality reasons.
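As a rough illustration of simulating a motion artefact in k-space from a clean image, the sketch below corrupts a chosen subset of k-space rows with the phase ramp of a rigid in-plane translation. It is not the thesis's augmentation pipeline: the single fixed shift, the function name `simulate_translation_artefact` and its parameters are all assumptions made for this example.

```python
import numpy as np


def simulate_translation_artefact(image, shift_pixels, corrupted_rows):
    """Apply a translation-induced phase error to selected k-space rows.

    image:          2D array holding an artefact-free image.
    shift_pixels:   assumed rigid shift along axis 1, in pixels.
    corrupted_rows: k-space row indices assumed to be acquired after the
                    subject moved (hypothetical acquisition model).
    """
    kspace = np.fft.fft2(image)
    # Fourier shift theorem: a shift along axis 1 multiplies each k-space
    # row by a linear phase ramp over the axis-1 frequencies.
    kx = np.fft.fftfreq(image.shape[1])[None, :]
    phase_ramp = np.exp(-2j * np.pi * kx * shift_pixels)
    rows = np.asarray(corrupted_rows, dtype=int)
    kspace[rows, :] *= phase_ramp
    # Return the magnitude image, as is typical for reconstructed MRI.
    return np.abs(np.fft.ifft2(kspace))


rng = np.random.default_rng(0)
clean = rng.random((8, 8))
corrupted = simulate_translation_artefact(clean, shift_pixels=3, corrupted_rows=[1, 2])
```

If every row is corrupted with the same integer shift, the result is simply the circularly shifted image; corrupting only some rows mixes two positions in one acquisition, which is what produces ghosting-like artefacts in this simplified model.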

    DEF: Deep Estimation of Sharp Geometric Features in 3D Shapes

    Sharp feature lines carry essential information about human-made objects, enabling compact 3D shape representations and high-quality surface reconstruction, and serve as a signal source for mesh processing. While extracting high-quality lines from noisy and undersampled data is challenging for traditional methods, deep learning-powered algorithms can leverage global and semantic information from the training data to aid in the process. We propose Deep Estimators of Features (DEFs), a learning-based framework for predicting sharp geometric features in sampled 3D shapes. Unlike existing data-driven methods, which reduce this problem to feature classification, we propose to regress a scalar field representing the distance from point samples to the closest feature line on local patches. By fusing the results of individual patches, we can process large 3D models, which existing data-driven methods cannot process due to their size and complexity. Extensive experimental evaluation of DEFs is performed on synthetic and real-world 3D shape datasets and suggests advantages of our image- and point-based estimators over competitor methods, as well as improved noise robustness and scalability of our approach.
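The regression target described in this abstract, a scalar field giving each point sample's distance to the closest feature line, can be sketched as follows. This is an illustrative assumption rather than the DEF implementation: feature lines are approximated by straight segments, and the `truncation` cap on the distance is a parameter introduced here for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from 3D point p to the segment from a to b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    # Parameter of the closest point along the segment, clamped to [0, 1].
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(p, closest)


def distance_field(points, segments, truncation):
    """Per-point distance to the nearest segment, capped at `truncation`."""
    return [
        min(truncation, min(point_segment_distance(p, a, b) for a, b in segments))
        for p in points
    ]


# One feature segment along the x axis and three sample points.
segments = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))]
points = [(0.5, 0.3, 0.0), (2.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
field = distance_field(points, segments, truncation=0.5)
```

A regression model would be trained to predict values like `field` from the point samples alone, instead of a per-point sharp or not-sharp label.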

    3D exemplar-based image inpainting in electron microscopy

    In electron microscopy (EM) a common problem is the non-availability of data, which causes artefacts in reconstructions. In this thesis the goal is to generate artificial data where it is missing in EM by using exemplar-based inpainting (EBI). We implement an accelerated 3D version tailored to applications in EM, which reduces reconstruction times from days to minutes. We develop intelligent sampling strategies to find optimal data as input for reconstruction methods. Further, we investigate approaches to reduce electron dose and acquisition time. Sparse sampling followed by inpainting is the most promising approach. As common evaluation measures may lead to misinterpretation of results in EM and falsify a subsequent analysis, we propose to use application driven metrics and demonstrate this in a segmentation task. A further application of our technique is the artificial generation of projections in tilt-based EM. EBI is used to generate missing projections, such that the full angular range is covered. Subsequent reconstructions are significantly enhanced in terms of resolution, which facilitates further analysis of samples. In conclusion, EBI proves promising when used as an additional data generation step to tackle the non-availability of data in EM, which is evaluated in selected applications. Enhancing adaptive sampling methods and refining EBI, especially considering their mutual influence, promotes higher throughput in EM using less electron dose without noticeably reducing image quality.

    PatchMatch Belief Propagation for Correspondence Field Estimation and its Applications

    Correspondence field estimation is an important process that lies at the core of many different applications. It is often seen as an energy minimisation problem, which is usually decomposed into the combined minimisation of two energy terms. The first is the unary energy, or data term, which reflects how well the solution agrees with the data. The second is the pairwise energy, or smoothness term, which ensures that the solution displays a certain level of smoothness, which is crucial for many applications. This thesis explores the possibility of combining two well-established algorithms for correspondence field estimation, PatchMatch and Belief Propagation, in order to benefit from the strengths of both and overcome some of their weaknesses. Belief Propagation is a common algorithm that can be used to optimise energies comprising both unary and pairwise terms. It is, however, computationally expensive and thus not suited to the continuous spaces that are often needed in imaging applications. On the other hand, PatchMatch is a simple yet very efficient method for optimising the unary energy of such problems on continuous and high dimensional spaces. The algorithm has two main components: the update of the solution space by sampling and the use of the spatial neighbourhood to propagate samples. We show how these components are related to the components of a specific form of Belief Propagation, called Particle Belief Propagation (PBP). PatchMatch, however, suffers from the lack of an explicit smoothness term. We show that unifying the two approaches yields a new algorithm, PMBP, which has improved performance compared to PatchMatch and is orders of magnitude faster than PBP. We apply our new optimiser to two different applications: stereo matching and optical flow. We validate the benefits of PMBP through a series of experiments and show that we consistently obtain lower errors than both PatchMatch and Belief Propagation.
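The energy decomposition described in this abstract can be sketched on a one-dimensional label field. The squared-error data term, the absolute-difference smoothness term, the `smooth_weight` parameter and the toy observations below are assumptions for illustration only; they are not the costs used in the thesis, and the code only evaluates the energy rather than optimising it.

```python
def field_energy(labels, unary_cost, smooth_weight):
    """Energy of a 1D label field: data term plus weighted smoothness term."""
    # Unary (data) term: how well each label agrees with the data.
    data_term = sum(unary_cost(i, label) for i, label in enumerate(labels))
    # Pairwise (smoothness) term: penalise differences between neighbours.
    smooth_term = sum(abs(a - b) for a, b in zip(labels, labels[1:]))
    return data_term + smooth_weight * smooth_term


observed = [1, 1, 5, 1]  # toy measurements with one outlier


def squared_error(i, label):
    """Hypothetical data cost: squared distance to the observation."""
    return (label - observed[i]) ** 2


fits_data = [1, 1, 5, 1]   # matches the observations exactly
is_smooth = [1, 1, 1, 1]   # ignores the outlier

for weight in (1, 3):
    print(weight, field_energy(fits_data, squared_error, weight),
          field_energy(is_smooth, squared_error, weight))
```

With a smoothness weight of 1 the data-fitting field has the lower energy, while with a weight of 3 the smooth field wins, which shows how the pairwise term trades data agreement for smoothness.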

    Just-in-time deep learning for real-time X-ray computed tomography

    Real-time X-ray tomography pipelines, such as those implemented by RECAST3D, compute and visualize tomographic reconstructions in milliseconds, and enable the observation of dynamic experiments in synchrotron beamlines and laboratory scanners. For extending real-time reconstruction with image processing and analysis components, Deep Neural Networks (DNNs) are a promising technology, due to their strong performance and much faster run-times compared to conventional algorithms. DNNs may prevent experiment repetition by simplifying real-time steering and optimization of the ongoing experiment. The main challenge of integrating DNNs into real-time tomography pipelines, however, is that they need to learn their task from representative data before the start of the experiment. In scientific environments, such training data may not exist, and other uncertain and variable factors, such as the set-up configuration, reconstruction parameters, or user interaction, cannot easily be anticipated beforehand, either. To overcome these problems, we developed just-in-time learning, an online DNN training strategy that takes advantage of the spatio-temporal continuity of consecutive reconstructions in the tomographic pipeline. This allows training and deploying comparatively small DNNs during the experiment. We provide software implementations, and study the feasibility and challenges of the approach by training the self-supervised Noise2Inverse denoising task with X-ray data replayed from real-world dynamic experiments.

    Generalizable automated pixel-level structural segmentation of medical and biological data

    Over the years, the rapid expansion in imaging techniques and equipment has driven the demand for more automation in handling large medical and biological data sets. A wealth of approaches have been suggested as optimal solutions for their respective imaging types. These solutions span various image resolutions, modalities and contrast (staining) mechanisms. Few approaches generalise well across multiple image types, contrasts or resolutions. This thesis proposes an automated pixel-level framework that addresses 2D, 2D+t and 3D structural segmentation in a more generalizable manner, yet has enough adaptability to address a number of specific image modalities, spanning retinal funduscopy, sequential fluorescein angiography and two-photon microscopy. The pixel-level segmentation scheme involves: i) constructing a phase-invariant orientation field of the local spatial neighbourhood; ii) combining local feature maps with intensity-based measures in a structural patch context; iii) using a complex supervised learning process to interpret the combination of all the elements in the patch in order to reach a classification decision. This has the advantage of transferability from retinal blood vessels in 2D to neural structures in 3D. To process the temporal components in non-standard 2D+t retinal angiography sequences, we first introduce a co-registration procedure: at the pairwise level, we combine projective RANSAC with a quadratic homography transformation to map the coordinate systems between any two frames. At the joint level, we construct a hierarchical approach in order for each individual frame to be registered to the global reference intra- and inter-sequence(s). We then take a non-training approach that searches in both the spatial neighbourhood of each pixel and the filter output across varying scales to locate and link microvascular centrelines to (sub-)pixel accuracy.
In essence, this "link while extract" piece-wise segmentation approach combines the local phase-invariant orientation field information with additional local phase estimates to obtain a soft classification of the centreline (sub-)pixel locations. Unlike retinal segmentation problems where vasculature is the main focus, 3D neural segmentation requires additional flexibility, allowing a variety of structures of anatomical importance yet with different geometric properties to be differentiated both from the background and against other structures. Notably, cellular structures, such as Purkinje cells, neural dendrites and interneurons, all display certain elongation along their medial axes, yet each class has a characteristic shape captured by an orientation field that distinguishes it from other structures. To take this into consideration, we introduce a 5D orientation mapping to capture these orientation properties. This mapping is incorporated into the local feature map description prior to a learning machine. Extensive performance evaluations and validation of each of the techniques presented in this thesis are carried out. For retinal fundus images, we compute Receiver Operating Characteristic (ROC) curves on existing public databases (DRIVE & STARE) to assess and compare our algorithms with other benchmark methods. For 2D+t retinal angiography sequences, we compare the error metric ("Centreline Error") of our scheme with that of other benchmark methods. For microscopic cortical data stacks, we present segmentation results on both surrogate data with known ground-truth and experimental rat cerebellar cortex two-photon microscopic tissue stacks.