
    Visual Odometry Using Line Features and Machine Learning Enhanced Line Description

    The research on 2D lines in images has grown strongly in the last decade, on the one hand due to the greater computing power available, and on the other due to increased interest in odometry methods and autonomous systems. Line features have some advantages over the more thoroughly researched point features. Lines are detected on gradients; they do not need texture to be found. Thus, as long as there are gradients between homogeneous regions, they can cope with difficult situations in which mostly homogeneous areas are present. By being detected on gradients, they are also well suited to represent structure. Furthermore, lines have very high accuracy orthogonal to their direction, as they consist of numerous points which all lie on the gradient and contribute to this locational accuracy. First, we introduce a visual odometry approach which achieves real-time performance and runs solely on line features; it does not require point features. We developed a heuristic filter algorithm which takes neighbouring line features into account and thereby improves the tracking and matching of lines in images taken from arbitrary camera locations. This increases the number of tracked lines and is especially beneficial in difficult scenes where it is hard to match lines by tracking them. Additionally, we employed the Cayley representation for 3D lines to avoid overparameterization in the optimization. To demonstrate the advancement of the method, it is benchmarked on commonly used datasets and compared to other state-of-the-art approaches. Second, we developed a machine-learning-based line feature descriptor for line matching. This descriptor can be used to match lines from arbitrary camera locations. The training data was created synthetically using Unreal Engine 4. We trained a model based on the ResNet architecture using a triplet loss. We evaluated the descriptor on real-world scenes and show its improvement over the well-known Line Band Descriptor. Third, we build upon our previous descriptor to create an improved version. To this end, we added an image pyramid and Gabor wavelets and increased the descriptor size. The evaluation of the new descriptor additionally includes competing recent approaches which are also machine-learning based, and shows that our improved approach outperforms them. Finally, we provide an extended evaluation of our descriptor which shows the influence of different settings and processing steps, and we present an analysis of settings for practical usage scenarios. We investigated the influence of a maximum descriptor distance threshold, of a left-right consistency check, and of a descriptor distance ratio threshold between the first- and second-best match. It turns out that, for the ratio of true to false matches, it is almost always better to use a descriptor distance ratio threshold than a maximum descriptor distance threshold.
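
    The matching tests evaluated in this abstract (a descriptor distance ratio threshold between the first- and second-best match, plus a left-right consistency check) can be sketched in a few lines. The following Python sketch is a minimal illustration assuming descriptors are rows of NumPy arrays; the function name and the 0.8 default are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def match_line_descriptors(desc_a, desc_b, ratio_thresh=0.8):
    """Match two sets of line descriptors (one per row) using a
    first-to-second nearest-neighbour distance ratio test and a
    left-right consistency check. Names and the 0.8 default are
    illustrative, not from the thesis."""
    # Pairwise Euclidean descriptor distances, shape (len(a), len(b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)

    best_b_for_a = np.argmin(dists, axis=1)
    best_a_for_b = np.argmin(dists, axis=0)
    matches = []
    for i, j in enumerate(best_b_for_a):
        # Left-right consistency: a <-> b must agree in both directions.
        if best_a_for_b[j] != i:
            continue
        # Ratio test: best match must be clearly better than the runner-up.
        row = np.sort(dists[i])
        if len(row) > 1 and row[0] > ratio_thresh * row[1]:
            continue
        matches.append((i, j))
    return matches
```

    The ratio test rejects a match when the best and second-best candidates are nearly equidistant, which is consistent with the abstract's finding that it yields a better true-to-false match ratio than a fixed distance cut-off.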

    Robust thalamic nuclei segmentation method based on local diffusion magnetic resonance properties.

    The thalamus is an essential relay station in the cortical-subcortical connections. It is characterized by a complex anatomical architecture composed of numerous small nuclei, which mediate the involvement of the thalamus in a wide range of neurological functions. We present a novel framework for segmenting the thalamic nuclei which explores the orientation distribution functions (ODFs) from diffusion magnetic resonance images at 3 T. The differentiation of the complex intra-thalamic microstructure is improved by using the spherical harmonic (SH) representation of the ODFs, which provides a full angular characterization of the diffusion process in each voxel. The clustering was performed using the k-means algorithm initialized in a data-driven manner. The method was tested on 35 healthy volunteers, and our results show a robust, reproducible and accurate segmentation of the thalamus into seven nuclei groups. Six of them closely matched the anatomy and were labeled as anterior, ventral anterior, medio-dorsal, ventral latero-ventral, ventral latero-dorsal and pulvinar, while the seventh cluster included the centro-lateral and the latero-posterior nuclei. Results were evaluated both qualitatively, by comparing the segmented nuclei to the histological atlas of Morel, and quantitatively, by measuring the clusters' extent and their spatial distribution across subjects and hemispheres. We also showed the robustness of our approach across different sequences and scanners, as well as the intra-subject reproducibility of the segmented clusters, using two additional scan-rescan datasets. We further observed an overlap between the path of the main long-connection tracts passing through the thalamus and the spatial distribution of the nuclei identified with our clustering algorithm. Our approach, based on SH representations of the ODFs, outperforms the one based on angular differences between the principal diffusion directions, which has so far been considered the state-of-the-art method. Our findings show an anatomically reliable segmentation of the main groups of thalamic nuclei that could be of potential use in many clinical applications.
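
    At its core, the clustering step described above reduces to running k-means on per-voxel SH coefficient vectors. Below is a minimal scikit-learn sketch, assuming the SH coefficients and a thalamus mask are already computed; the fixed random seed stands in for the paper's data-driven initialization, which is not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_thalamic_voxels(sh_coeffs, mask, n_clusters=7, seed=0):
    """Cluster thalamic voxels into nuclei groups via k-means on the
    spherical-harmonic (SH) coefficients of their ODFs.
    sh_coeffs: (X, Y, Z, n_coeffs) array of per-voxel SH coefficients.
    mask:      (X, Y, Z) boolean thalamus mask.
    The fixed seed is an illustrative stand-in for the paper's
    data-driven initialization."""
    features = sh_coeffs[mask]                       # (n_voxels, n_coeffs)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(features)
    # Write labels back into a volume (0 = background, 1..7 = nuclei groups).
    label_vol = np.zeros(mask.shape, dtype=np.int32)
    label_vol[mask] = labels + 1
    return label_vol
```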

    From Fully-Supervised Single-Task to Semi-Supervised Multi-Task Deep Learning Architectures for Segmentation in Medical Imaging Applications

    Medical imaging is routinely performed in clinics worldwide for the diagnosis and treatment of numerous medical conditions in children and adults. With these imaging modalities, radiologists can visualize both the structure of the body and the tissues within it. However, analyzing these high-dimensional (2D/3D/4D) images demands a significant amount of time and effort from radiologists. Hence, there is an ever-growing need for medical image computing tools that extract relevant information from the image data to help radiologists work efficiently. Image analysis based on machine learning has great potential to improve the entire medical imaging pipeline, providing support for clinical decision-making and computer-aided diagnosis. Deep learning approaches have shown significant performance improvements on challenging image analysis tasks such as classification, detection, registration, and segmentation, specifically for medical imaging applications. While deep learning has shown its potential in a variety of medical image analysis problems, including segmentation and motion estimation, generalizability is still an unsolved problem, and many of these successes are achieved at the cost of a large pool of datasets. For most practical applications, obtaining access to a copious dataset can be very difficult, often impossible. Annotation is tedious and time-consuming, and this cost is further amplified when annotation must be done by a clinical expert, as in medical imaging applications. Additionally, the applications of deep learning in real-world clinical settings are still limited by the lack of reliability caused by the limited prediction capabilities of some deep learning models. Moreover, when using a CNN in an automated image analysis pipeline, it is critical to understand which segmentation results are problematic and require further manual examination. To date, the estimation of uncertainty calibration in a semi-supervised setting for medical image segmentation is rarely reported. This thesis focuses on developing and evaluating optimized machine learning models for a variety of medical imaging applications, ranging from fully-supervised, single-task learning to semi-supervised, multi-task learning that makes efficient use of annotated training data. The contributions of this dissertation are as follows: (1) developing fully-supervised, single-task transfer learning for surgical instrument segmentation from laparoscopic images; (2) utilizing supervised, single-task transfer learning for segmenting and digitally removing surgical instruments from endoscopic/laparoscopic videos to allow visualization of the anatomy obscured by the tool, where the tool removal algorithms use a tool segmentation mask and either instrument-free reference frames or previous instrument-containing frames to fill in (inpaint) the instrument segmentation mask; (3) developing fully-supervised, single-task learning via efficient weight pruning and learned group convolution for accurate left ventricle (LV) and right ventricle (RV) blood pool and myocardium localization and segmentation from 4D cine cardiac MR images; (4) demonstrating the use of our fully-supervised, memory-efficient model to generate dynamic patient-specific right ventricle (RV) models from cine cardiac MRI datasets via an unsupervised learning-based deformable registration field; (5) integrating Monte Carlo dropout into our fully-supervised, memory-efficient model for inherent uncertainty estimation, with the overall goal of estimating the uncertainty associated with the obtained segmentation and its error, as a means to flag regions featuring less than optimal segmentation results; (6) developing semi-supervised, single-task learning via self-training (through meta pseudo-labeling) in concert with a Teacher network that instructs the Student network by generating pseudo-labels for unlabeled input data; (7) proposing largely-unsupervised, multi-task learning to demonstrate the power of a simple combination of a disentanglement block, variational autoencoder (VAE), generative adversarial network (GAN), and a conditioning layer-based reconstructor for performing two of the most critical tasks in medical imaging: segmentation of cardiac structures and reconstruction of cine cardiac MR images; and (8) demonstrating the use of 3D semi-supervised, multi-task learning for jointly learning multiple tasks in a single backbone module: uncertainty estimation, geometric shape generation, and anatomical segmentation of the left atrial cavity from 3D gadolinium-enhanced magnetic resonance (GE-MR) images. This dissertation summarizes the impact of these contributions by demonstrating the adaptation and use of deep learning architectures featuring different levels of supervision to build a variety of image segmentation tools and techniques that can be used across a wide spectrum of medical image computing applications, centered on facilitating and promoting widespread computer-integrated diagnosis and therapy data science.
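
    Contribution (5), Monte Carlo dropout for segmentation uncertainty, can be illustrated with a short PyTorch sketch: dropout stays active at test time, and the spread across several stochastic forward passes serves as a per-pixel uncertainty map. This is a generic sketch of the technique, not the dissertation's exact model; `model` is assumed to be any segmentation network containing nn.Dropout layers.

```python
import torch

def mc_dropout_segment(model, image, n_samples=20):
    """Monte Carlo dropout inference for a segmentation network:
    run several stochastic forward passes with dropout enabled and
    use the per-pixel variance of the predicted class probabilities
    as an uncertainty map. Names are illustrative."""
    model.eval()
    # Re-enable only the dropout layers, leaving batch norm in eval mode.
    for m in model.modules():
        if isinstance(m, torch.nn.Dropout):
            m.train()
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(image), dim=1)
                             for _ in range(n_samples)])
    mean_prob = probs.mean(dim=0)           # averaged class probabilities
    uncertainty = probs.var(dim=0).sum(1)   # per-pixel predictive variance
    return mean_prob.argmax(dim=1), uncertainty
```

    High-variance pixels can then be flagged for manual review, matching the stated goal of marking regions with less than optimal segmentation results.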

    Descriptor Based Analysis of Digital 3D Shapes


    3D object reconstruction using computer vision : reconstruction and characterization applications for external human anatomical structures

    Doctoral thesis. Informatics Engineering. Faculdade de Engenharia, Universidade do Porto. 201

    Multi-scale active shape description in medical imaging

    Shape description in medical imaging has become an increasingly important research field in recent years. Fast, high-resolution image acquisition methods like Magnetic Resonance (MR) imaging produce very detailed cross-sectional images of the human body; shape description is then a post-processing operation which abstracts quantitative descriptions of anatomically relevant object shapes. This task is usually performed by clinicians and other experts by first segmenting the shapes of interest and then making volumetric and other quantitative measurements. The high demand on expert time and the inter- and intra-observer variability impose a clinical need to automate this process. Furthermore, recent studies in clinical neurology on the correspondence between disease status and degree of shape deformation necessitate the use of more sophisticated, higher-level shape description techniques. In this work a new hierarchical tool for shape description has been developed, combining two recently developed and powerful techniques in image processing: differential invariants in scale-space, and active contour models. This tool enables quantitative and qualitative shape studies at multiple levels of image detail, exploring scale as an extra image degree of freedom. Using scale-space continuity, the global object shape can be detected at a coarse level of image detail, and finer shape characteristics can be found at higher levels of detail, or scales. New methods for active shape evolution and focusing have been developed for the extraction of shapes at a large set of scales, using an active contour model whose energy function is regularized with respect to scale and geometric differential image invariants. The resulting set of shapes is formulated as a multiscale shape stack which is analysed and described at each scale level with a large set of shape descriptors to obtain and analyse shape changes across scales. This shape stack naturally raises several questions regarding variable sampling and the appropriate levels of detail at which to investigate an image. The relationship between active contour sampling precision and scale-space is addressed. After a thorough review of modern shape description, multi-scale image processing and active contour model techniques, the novel framework for multi-scale active shape description is presented and tested on synthetic and medical images. An interesting result is the recovery of the fractal dimension of a known fractal boundary using this framework. The medical applications addressed are grey-matter deformations in patients with epilepsy, spinal cord atrophy in patients with multiple sclerosis, and cortical impairment in neonates. Extensions to non-linear scale-spaces, comparisons to binary curve and curvature evolution schemes, and other hierarchical shape descriptors are discussed.
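
    The coarse-to-fine construction of the shape stack can be sketched with off-the-shelf tools: smooth the image at a decreasing sequence of Gaussian scales and refine the same active contour at each level, collecting one shape per scale. A minimal scikit-image sketch under illustrative assumptions; it does not reproduce the thesis's scale-regularized energy function.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.segmentation import active_contour

def multiscale_shape_stack(image, init_snake, sigmas=(8, 4, 2, 1)):
    """Coarse-to-fine shape extraction: refine one active contour
    (an (N, 2) array of row/col points) across a Gaussian scale-space,
    from coarse to fine, and return the resulting shape stack.
    Parameters are illustrative, not the thesis's."""
    stack = []
    snake = init_snake
    for sigma in sigmas:                     # coarse -> fine scales
        smoothed = gaussian_filter(image.astype(float), sigma)
        snake = active_contour(smoothed, snake)
        stack.append((sigma, snake.copy()))  # one shape per scale level
    return stack
```

    Each entry of the returned stack can then be fed to the shape descriptors discussed above to trace shape changes across scales.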

    Integrity Determination for Image Rendering Vision Navigation

    This research addresses the lack of quantitative integrity approaches for vision navigation relying on image or image-rendering techniques. The ability to provide quantifiable integrity is critical for the utilization of vision systems as a viable means of precision navigation. This research describes the development of two unique approaches for determining uncertainty and integrity for a vision-based, precision, relative navigation system, based on the concept of using a single-camera vision system, such as an electro-optical (EO) or infrared (IR) imaging sensor, to monitor for unacceptably large and potentially unsafe relative navigation errors. The first approach formulates the integrity solution by means of discrete detection methods, in which the system monitors for conditions where the platform is outside a defined operational area, thus preventing hazardously misleading information (HMI). The second approach utilizes a generalized Bayesian inference approach, in which a full probability density function (pdf) of the estimated navigation state is determined. These integrity approaches are demonstrated, in the context of an aerial refueling application, to provide extremely high levels (10⁻⁶) of navigation integrity. Additionally, various sensitivity analyses show the robustness of these integrity approaches to various vision sensor effects and sensor trade-offs.
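
    As an illustration of the second (Bayesian) approach, integrity can be read off the posterior error pdf: the integrity risk is the probability mass lying outside an alert limit, and an alert is raised when that risk exceeds the required level. A minimal Monte Carlo sketch under illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def integrity_risk(error_samples, alert_limit, required_level=1e-6):
    """Estimate navigation integrity from samples of the relative
    navigation error pdf (e.g. drawn from a Bayesian posterior):
    the integrity risk is the fraction of probability mass outside
    the alert limit. A didactic Monte Carlo approximation; verifying
    risks near 1e-6 would in practice require analytic tail bounds
    rather than a modest number of samples."""
    error_samples = np.asarray(error_samples, dtype=float)
    risk = float(np.mean(np.abs(error_samples) > alert_limit))
    # Raise an alert (solution unavailable) if the risk exceeds the
    # required integrity level.
    return risk, risk > required_level
```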

    Federated Domain Generalization: A Survey

    Machine learning typically relies on the assumption that training and testing distributions are identical and that data is centrally stored for training and testing. However, in real-world scenarios, distributions may differ significantly, and data is often distributed across different devices, organizations, or edge nodes. Consequently, it is imperative to develop models that can effectively generalize to unseen distributions while the data remains distributed across different domains. In response to this challenge, there has been a surge of interest in federated domain generalization (FDG) in recent years. FDG combines the strengths of federated learning (FL) and domain generalization (DG) techniques to enable multiple source domains to collaboratively learn a model capable of generalizing directly to unseen domains while preserving data privacy. However, generalizing the federated model under domain shift is a technically challenging problem that has so far received scant attention in this research area. This paper presents the first survey of recent advances in this area. First, we discuss the development from traditional machine learning to domain adaptation and domain generalization, leading to FDG, and provide the corresponding formal definition. Then, we categorize recent methodologies into four classes: federated domain alignment, data manipulation, learning strategies, and aggregation optimization, and present suitable algorithms in detail for each category. Next, we introduce commonly used datasets, applications, evaluations, and benchmarks. Finally, we conclude this survey by outlining some potential research topics for the future.
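
    As background for the "aggregation optimization" category, the baseline that such methods refine is plain federated averaging (FedAvg), in which client updates are combined in proportion to local dataset size. A minimal sketch operating on PyTorch-style state dicts; names are illustrative and this does not implement any particular surveyed method.

```python
import copy

def fedavg_aggregate(client_states, client_sizes):
    """Weighted federated averaging (FedAvg): combine client model
    state dicts into a global state, weighting each client by its
    local dataset size. client_states is a list of state dicts with
    identical keys; an illustrative sketch only."""
    total = float(sum(client_sizes))
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        # Size-weighted average of each parameter tensor across clients.
        global_state[key] = sum(
            state[key] * (n / total)
            for state, n in zip(client_states, client_sizes))
    return global_state
```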