362 research outputs found

    Multiscale Centerline Extraction Based on Regression and Projection onto the Set of Elongated Structures

    Get PDF
    Automatically extracting linear structures from images is a fundamental low-level vision problem with numerous applications in different domains. Centerline detection and radial estimation are the first crucial steps in most Computer Vision pipelines aiming to reconstruct linear structures. Existing techniques rely either on hand-crafted filters, designed to respond to ideal profiles of the linear structure, or on classification-based approaches, which automatically learn to detect centerline points from data. Hand-crafted methods are the most accurate when the content of the image fulfills the ideal model they rely on. However, they lose accuracy in the presence of noise or when the linear structures are irregular and deviate from the ideal case. Machine learning techniques can alleviate this problem. However, they are mainly based on a classification framework. In this thesis, we show that classification is not the best formalism to solve the centerline detection problem. In fact, since the appearance of a centerline point is very similar to the points immediately next to it, the output of a classifier trained to detect centerlines presents low localization accuracy and double responses on the body of the linear structure. To solve this problem, we propose a regression-based formulation for centerline detection. We rely on the distance transform of the centerlines to automatically learn a function whose local maxima correspond to centerline points. The output of our method can be used to directly estimate the location of the centerline, by a simple Non-Maximum Suppression operation, or it can be used as input to a tracing pipeline to reconstruct the graph of the linear structure. In both cases, our method gives more accurate results than state-of-the-art techniques on challenging 2D and 3D datasets. Our method relies on features extracted by means of convolutional filters. In order to process large amount of data efficiently, we introduce a general filter bank approximation scheme. In particular, we show that a generic filter bank can be approximated by a linear combination of a smaller set of separable filters. Thanks to this method, we can greatly reduce the computation time of the convolutions, without loss of accuracy. Our approach is general, and we demonstrate its effectiveness by applying it to different Computer Vision problems, such as linear structure detection and image classification with Convolutional Neural Networks. We further improve our regression-based method for centerline detection by taking advantage of contextual image information. We adopt a multiscale iterative regression approach to efficiently include a large image context in our algorithm. Compared to previous approaches, we use context both in the spatial domain and in the radial one. In this way, our method is also able to return an accurate estimation of the radii of the linear structures. The idea of using regression can also be beneficial for solving other related Computer Vision problems. For example, we show an improvement compared to previous works when applying it to boundary and membrane detection. Finally, we focus on the particular geometric properties of the linear structures. We observe that most methods for detecting them treat each pixel independently and do not model the strong relation that exists between neighboring pixels. As a consequence, their output is geometrically inconsistent. In this thesis, we address this problem by considering the projection of the score map returned by our regressor onto the set of all geometrically admissible ground truth images. We propose an efficient patch-wise approximation scheme to compute the projection. Moreover, we provide conditions under which the projection is exact. We demonstrate the advantage of our method by applying it to four different problems

    Inferring Geodesic Cerebrovascular Graphs: Image Processing, Topological Alignment and Biomarkers Extraction

    Get PDF
    A vectorial representation of the vascular network that embodies quantitative features - location, direction, scale, and bifurcations - has many potential neuro-vascular applications. Patient-specific models support computer-assisted surgical procedures in neurovascular interventions, while analyses on multiple subjects are essential for group-level studies on which clinical prediction and therapeutic inference ultimately depend. This first motivated the development of a variety of methods to segment the cerebrovascular system. Nonetheless, a number of limitations, ranging from data-driven inhomogeneities, the anatomical intra- and inter-subject variability, the lack of exhaustive ground-truth, the need for operator-dependent processing pipelines, and the highly non-linear vascular domain, still make the automatic inference of the cerebrovascular topology an open problem. In this thesis, brain vessels’ topology is inferred by focusing on their connectedness. With a novel framework, the brain vasculature is recovered from 3D angiographies by solving a connectivity-optimised anisotropic level-set over a voxel-wise tensor field representing the orientation of the underlying vasculature. Assuming vessels joining by minimal paths, a connectivity paradigm is formulated to automatically determine the vascular topology as an over-connected geodesic graph. Ultimately, deep-brain vascular structures are extracted with geodesic minimum spanning trees. The inferred topologies are then aligned with similar ones for labelling and propagating information over a non-linear vectorial domain, where the branching pattern of a set of vessels transcends a subject-specific quantized grid. Using a multi-source embedding of a vascular graph, the pairwise registration of topologies is performed with the state-of-the-art graph matching techniques employed in computer vision. Functional biomarkers are determined over the neurovascular graphs with two complementary approaches. Efficient approximations of blood flow and pressure drop account for autoregulation and compensation mechanisms in the whole network in presence of perturbations, using lumped-parameters analog-equivalents from clinical angiographies. Also, a localised NURBS-based parametrisation of bifurcations is introduced to model fluid-solid interactions by means of hemodynamic simulations using an isogeometric analysis framework, where both geometry and solution profile at the interface share the same homogeneous domain. Experimental results on synthetic and clinical angiographies validated the proposed formulations. Perspectives and future works are discussed for the group-wise alignment of cerebrovascular topologies over a population, towards defining cerebrovascular atlases, and for further topological optimisation strategies and risk prediction models for therapeutic inference. Most of the algorithms presented in this work are available as part of the open-source package VTrails

    Generalizations of the Multicut Problem for Computer Vision

    Get PDF
    Graph decomposition has always been a very important concept in machine learning and computer vision. Many tasks like image and mesh segmentation, community detection in social networks, as well as object tracking and human pose estimation can be formulated as a graph decomposition problem. The multicut problem in particular is a popular model to optimize for a decomposition of a given graph. Its main advantage is that no prior knowledge about the number of components or their sizes is required. However, it has several limitations, which we address in this thesis: Firstly, the multicut problem allows to specify only cost or reward for putting two direct neighbours into distinct components. This limits the expressibility of the cost function. We introduce special edges into the graph that allow to define cost or reward for putting any two vertices into distinct components, while preserving the original set of feasible solutions. We show that this considerably improves the quality of image and mesh segmentations. Second, multicut is notorious to be NP-hard for general graphs, that limits its applications to small super-pixel graphs. We define and implement two primal feasible heuristics to solve the problem. They do not provide any guarantees on the runtime or quality of solutions, but in practice show good convergence behaviour. We perform an extensive comparison on multiple graphs of different sizes and properties. Third, we extend the multicut framework by introducing node labels, so that we can jointly optimize for graph decomposition and nodes classification by means of exactly the same optimization algorithm, thus eliminating the need to hand-tune optimizers for a particular task. To prove its universality we applied it to diverse computer vision tasks, including human pose estimation, multiple object tracking, and instance-aware semantic segmentation. We show that we can improve the results over the prior art using exactly the same data as in the original works. Finally, we use employ multicuts in two applications: 1) a client-server tool for interactive video segmentation: After the pre-processing of the video a user draws strokes on several frames and a time-coherent segmentation of the entire video is performed on-the-fly. 2) we formulate a method for simultaneous segmentation and tracking of living cells in microscopy data. This task is challenging as cells split and our algorithm accounts for this, creating parental hierarchies. We also present results on multiple model fitting. We find models in data heavily corrupted by noise by finding components defining these models using higher order multicuts. We introduce an interesting extension that allows our optimization to pick better hyperparameters for each discovered model. In summary, this thesis extends the multicut problem in different directions, proposes algorithms for optimization, and applies it to novel data and settings.Die Zerlegung von Graphen ist ein sehr wichtiges Konzept im maschinellen Lernen und maschinellen Sehen. Viele Aufgaben wie Bild- und Gittersegmentierung, Kommunitätserkennung in sozialen Netzwerken, sowie Objektverfolgung und Schätzung von menschlichen Posen können als Graphzerlegungsproblem formuliert werden. Der Mehrfachschnitt-Ansatz ist ein populäres Mittel um über die Zerlegungen eines gegebenen Graphen zu optimieren. Sein größter Vorteil ist, dass kein Vorwissen über die Anzahl an Komponenten und deren Größen benötigt wird. Dennoch hat er mehrere ernsthafte Limitierungen, welche wir in dieser Arbeit behandeln: Erstens erlaubt der klassische Mehrfachschnitt nur die Spezifikation von Kosten oder Belohnungen für die Trennung von zwei Nachbarn in verschiedene Komponenten. Dies schränkt die Ausdrucksfähigkeit der Kostenfunktion ein und führt zu suboptimalen Ergebnissen. Wir fügen dem Graphen spezielle Kanten hinzu, welche es erlauben, Kosten oder Belohnungen für die Trennung von beliebigen Paaren von Knoten in verschiedene Komponenten zu definieren, ohne die Menge an zulässigen Lösungen zu verändern. Wir zeigen, dass dies die Qualität von Bild- und Gittersegmentierungen deutlich verbessert. Zweitens ist das Mehrfachschnittproblem berüchtigt dafür NP-schwer für allgemeine Graphen zu sein, was die Anwendungen auf kleine superpixel-basierte Graphen einschränkt. Wir definieren und implementieren zwei primal-zulässige Heuristiken um das Problem zu lösen. Diese geben keine Garantien bezüglich der Laufzeit oder der Qualität der Lösungen, zeigen in der Praxis jedoch gutes Konvergenzverhalten. Wir führen einen ausführlichen Vergleich auf vielen Graphen verschiedener Größen und Eigenschaften durch. Drittens erweitern wir den Mehrfachschnitt-Ansatz um Knoten-Kennzeichnungen, sodass wir gemeinsam über Zerlegungen und Knoten-Klassifikationen mit dem gleichen Optimierungs-Algorithmus optimieren können. Dadurch wird der Bedarf der Feinabstimmung einzelner aufgabenspezifischer Löser aus dem Weg geräumt. Um die Allgemeingültigkeit dieses Ansatzes zu überprüfen, haben wir ihn auf verschiedenen Aufgaben des maschinellen Sehens, einschließlich menschliche Posenschätzung, Mehrobjektverfolgung und instanz-bewusste semantische Segmentierung, angewandt. Wir zeigen, dass wir Resultate von vorherigen Arbeiten mit exakt den gleichen Daten verbessern können. Abschließend benutzen wir Mehrfachschnitte in zwei Anwendungen: 1) Ein Nutzer-Server-Werkzeug für interaktive Video Segmentierung: Nach der Vorbearbeitung eines Videos zeichnet der Nutzer Striche auf mehrere Einzelbilder und eine zeit-kohärente Segmentierung des gesamten Videos wird in Echtzeit berechnet. 2) Wir formulieren eine Methode für simultane Segmentierung und Verfolgung von lebenden Zellen in Mikroskopie-Aufnahmen. Diese Aufgabe ist anspruchsvoll, da Zellen sich aufteilen und unser Algorithmus dies in der Erstellung von Eltern-Hierarchien mitberücksichtigen muss. Wir präsentieren außerdem Resultate zur Mehrmodellanpassung. Wir berechnen Modelle in stark verrauschten Daten indem wir mithilfe von Mehrfachschnitten höherer Ordnung Komponenten finden, die diesen Modellen entsprechen. Wir führen eine interessante Erweiterung ein, die es unserer Optimierung erlaubt, bessere Hyperparameter für jedes entdeckte Modell auszuwählen. Zusammenfassend erweitert diese Arbeit den Mehrfachschnitt-Ansatz in unterschiedlichen Richtungen, schlägt Algorithmen zur Inferenz in den resultierenden Modellen vor und wendet ihn auf neuartigen Daten und Umgebungen an

    Pixel-level semantic understanding of ophthalmic images and beyond

    Get PDF
    Computer-assisted semantic image understanding constitutes the substrate of applications that range from biomarker detection to intraoperative guidance or street scene understanding for self-driving systems. This PhD thesis is on the development of deep learning-based, pixel-level, semantic segmentation methods for medical and natural images. For vessel segmentation in OCT-A, a method comprising iterative refinement of the extracted vessel maps and an auxiliary loss function that penalizes structural inaccuracies, is proposed and tested on data captured from real clinical conditions comprising various pathological cases. Ultimately, the presented method enables the extraction of a detailed vessel map of the retina with potential applications to diagnostics or intraoperative localization. Furthermore, for scene segmentation in cataract surgery, the major challenge of class imbalance is identified among several factors. Subsequently, a method addressing it is proposed, achieving state-of-the-art performance on a challenging public dataset. Accurate semantic segmentation in this domain can be used to monitor interactions between tools and anatomical parts for intraoperative guidance and safety. Finally, this thesis proposes a novel contrastive learning framework for supervised semantic segmentation, that aims to improve the discriminative power of features in deep neural networks. The proposed approach leverages contrastive loss function applied both at multiple model layers and across them. Importantly, the proposed framework is easy to combine with various model architectures and is experimentally shown to significantly improve performance on both natural and medical domain

    Automatic Autism Spectrum Disorder Detection Using Artificial Intelligence Methods with MRI Neuroimaging: A Review

    Full text link
    Autism spectrum disorder (ASD) is a brain condition characterized by diverse signs and symptoms that appear in early childhood. ASD is also associated with communication deficits and repetitive behavior in affected individuals. Various ASD detection methods have been developed, including neuroimaging modalities and psychological tests. Among these methods, magnetic resonance imaging (MRI) imaging modalities are of paramount importance to physicians. Clinicians rely on MRI modalities to diagnose ASD accurately. The MRI modalities are non-invasive methods that include functional (fMRI) and structural (sMRI) neuroimaging methods. However, the process of diagnosing ASD with fMRI and sMRI for specialists is often laborious and time-consuming; therefore, several computer-aided design systems (CADS) based on artificial intelligence (AI) have been developed to assist the specialist physicians. Conventional machine learning (ML) and deep learning (DL) are the most popular schemes of AI used for diagnosing ASD. This study aims to review the automated detection of ASD using AI. We review several CADS that have been developed using ML techniques for the automated diagnosis of ASD using MRI modalities. There has been very limited work on the use of DL techniques to develop automated diagnostic models for ASD. A summary of the studies developed using DL is provided in the appendix. Then, the challenges encountered during the automated diagnosis of ASD using MRI and AI techniques are described in detail. Additionally, a graphical comparison of studies using ML and DL to diagnose ASD automatically is discussed. We conclude by suggesting future approaches to detecting ASDs using AI techniques and MRI neuroimaging

    Multispectral Image Road Extraction Based Upon Automated Map Conflation

    Get PDF
    Road network extraction from remotely sensed imagery enables many important and diverse applications such as vehicle tracking, drone navigation, and intelligent transportation studies. There are, however, a number of challenges to road detection from an image. Road pavement material, width, direction, and topology vary across a scene. Complete or partial occlusions caused by nearby buildings, trees, and the shadows cast by them, make maintaining road connectivity difficult. The problems posed by occlusions are exacerbated with the increasing use of oblique imagery from aerial and satellite platforms. Further, common objects such as rooftops and parking lots are made of materials similar or identical to road pavements. This problem of common materials is a classic case of a single land cover material existing for different land use scenarios. This work addresses these problems in road extraction from geo-referenced imagery by leveraging the OpenStreetMap digital road map to guide image-based road extraction. The crowd-sourced cartography has the advantages of worldwide coverage that is constantly updated. The derived road vectors follow only roads and so can serve to guide image-based road extraction with minimal confusion from occlusions and changes in road material. On the other hand, the vector road map has no information on road widths and misalignments between the vector map and the geo-referenced image are small but nonsystematic. Properly correcting misalignment between two geospatial datasets, also known as map conflation, is an essential step. A generic framework requiring minimal human intervention is described for multispectral image road extraction and automatic road map conflation. The approach relies on the road feature generation of a binary mask and a corresponding curvilinear image. A method for generating the binary road mask from the image by applying a spectral measure is presented. The spectral measure, called anisotropy-tunable distance (ATD), differs from conventional measures and is created to account for both changes of spectral direction and spectral magnitude in a unified fashion. The ATD measure is particularly suitable for differentiating urban targets such as roads and building rooftops. The curvilinear image provides estimates of the width and orientation of potential road segments. Road vectors derived from OpenStreetMap are then conflated to image road features by applying junction matching and intermediate point matching, followed by refinement with mean-shift clustering and morphological processing to produce a road mask with piecewise width estimates. The proposed approach is tested on a set of challenging, large, and diverse image data sets and the performance accuracy is assessed. The method is effective for road detection and width estimation of roads, even in challenging scenarios when extensive occlusion occurs

    ISCR Annual Report: Fical Year 2004

    Full text link
    corecore