869 research outputs found

    Variational segmentation problems using prior knowledge in imaging and vision

    Get PDF

    Continuous Max-Flow for Image Segmentation with Shape Priors

    Get PDF
    In this thesis we propose a stable method for image segmentation with shape priors. The original Chan-Vese intensity based segmentation model with regularisation term is extended to include shape prior information. We study shape priors which are pose invariant under the group of similarity transformations, that is under rotation, scaling and translation. In order to solve this problem robustly and effectively, an algorithm based on the theory of max-flow and min-cut is used in addition to a gradient descent procedure for updating the pose parameters. Comprehensive experiments are provided to demonstrate the behaviour of the proposed method on different images.Master i Anvendt og beregningsorientert matematikkMAMN-MABMAB39

    SEGMENTATION, RECOGNITION, AND ALIGNMENT OF COLLABORATIVE GROUP MOTION

    Get PDF
    Modeling and recognition of human motion in videos has broad applications in behavioral biometrics, content-based visual data analysis, security and surveillance, as well as designing interactive environments. Significant progress has been made in the past two decades by way of new models, methods, and implementations. In this dissertation, we focus our attention on a relatively less investigated sub-area called collaborative group motion analysis. Collaborative group motions are those that typically involve multiple objects, wherein the motion patterns of individual objects may vary significantly in both space and time, but the collective motion pattern of the ensemble allows characterization in terms of geometry and statistics. Therefore, the motions or activities of an individual object constitute local information. A framework to synthesize all local information into a holistic view, and to explicitly characterize interactions among objects, involves large scale global reasoning, and is of significant complexity. In this dissertation, we first review relevant previous contributions on human motion/activity modeling and recognition, and then propose several approaches to answer a sequence of traditional vision questions including 1) which of the motion elements among all are the ones relevant to a group motion pattern of interest (Segmentation); 2) what is the underlying motion pattern (Recognition); and 3) how two motion ensembles are similar and how we can 'optimally' transform one to match the other (Alignment). Our primary practical scenario is American football play, where the corresponding problems are 1) who are offensive players; 2) what are the offensive strategy they are using; and 3) whether two plays are using the same strategy and how we can remove the spatio-temporal misalignment between them due to internal or external factors. The proposed approaches discard traditional modeling paradigm but explore either concise descriptors, hierarchies, stochastic mechanism, or compact generative model to achieve both effectiveness and efficiency. In particular, the intrinsic geometry of the spaces of the involved features/descriptors/quantities is exploited and statistical tools are established on these nonlinear manifolds. These initial attempts have identified new challenging problems in complex motion analysis, as well as in more general tasks in video dynamics. The insights gained from nonlinear geometric modeling and analysis in this dissertation may hopefully be useful toward a broader class of computer vision applications

    Efficient Methods for Continuous and Discrete Shape Analysis

    Get PDF
    When interpreting an image of a given object, humans are able to abstract from the presented color information in order to really see the presented object. This abstraction is also known as shape. The concept of shape is not defined exactly in Computer Vision and in this work, we use three different forms of these definitions in order to acquire and analyze shapes. This work is devoted to improve the efficiency of methods that solve important applications of shape analysis. The most important problem in order to analyze shapes is the problem of shape acquisition. To simplify this very challenging problem, numerous researchers have incorporated prior knowledge into the acquisition of shapes. We will present the first approach to acquire shapes given a certain shape knowledge that computes always the global minimum of the involved functional which incorporates a Mumford-Shah like functional with a certain class of shape priors including statistic shape prior and dynamical shape prior. In order to analyze shapes, it is not only important to acquire shapes, but also to classify shapes. In this work, we follow the concept of defining a distance function that measures the dissimilarity of two given shapes. There are two different ways of obtaining such a distance function that we address in this work. Firstly, we model the set of all shapes as a metric space induced by the shortest path on an orbifold. The shortest path will provide us with a shape morphing, i.e., a continuous transformation from one shape into another. Secondly, we address the problem of shape matching that finds corresponding points on two shapes with respect to a preselected feature. Our main contribution for the problem of shape morphing lies in the immense acceleration of the morphing computation. Instead of solving partial resp. ordinary differential equations, we are able to solve this problem via a gradient descent approach that subsequently shortens the length of a path on the given manifold. During our runtime test, we observed a run-time acceleration of up to a factor of 1000. Shape matching is a classical discrete problem. If each shape is discretized by N shape points, most Computer Vision methods needed a cubic run-time. We will provide two approaches how to reduce this worst-case complexity to O(N2 log(N)). One approach exploits the planarity of the involved graph in order to efficiently compute N shortest path in a graph of O(N2) vertices. The other approach computes a minimal cut in a planar graph in O(N log(N)). In order to make this approach applicable to shape matching, we improved the run-time of a recently developed graph cut approach by an empirical factor of 2–4

    Segmentation and quantification of spinal cord gray matter–white matter structures in magnetic resonance images

    Get PDF
    This thesis focuses on finding ways to differentiate the gray matter (GM) and white matter (WM) in magnetic resonance (MR) images of the human spinal cord (SC). The aim of this project is to quantify tissue loss in these compartments to study their implications on the progression of multiple sclerosis (MS). To this end, we propose segmentation algorithms that we evaluated on MR images of healthy volunteers. Segmentation of GM and WM in MR images can be done manually by human experts, but manual segmentation is tedious and prone to intra- and inter-rater variability. Therefore, a deterministic automation of this task is necessary. On axial 2D images acquired with a recently proposed MR sequence, called AMIRA, we experiment with various automatic segmentation algorithms. We first use variational model-based segmentation approaches combined with appearance models and later directly apply supervised deep learning to train segmentation networks. Evaluation of the proposed methods shows accurate and precise results, which are on par with manual segmentations. We test the developed deep learning approach on images of conventional MR sequences in the context of a GM segmentation challenge, resulting in superior performance compared to the other competing methods. To further assess the quality of the AMIRA sequence, we apply an already published GM segmentation algorithm to our data, yielding higher accuracy than the same algorithm achieves on images of conventional MR sequences. On a different topic, but related to segmentation, we develop a high-order slice interpolation method to address the large slice distances of images acquired with the AMIRA protocol at different vertebral levels, enabling us to resample our data to intermediate slice positions. From the methodical point of view, this work provides an introduction to computer vision, a mathematically focused perspective on variational segmentation approaches and supervised deep learning, as well as a brief overview of the underlying project's anatomical and medical background

    Dense Vision in Image-guided Surgery

    Get PDF
    Image-guided surgery needs an efficient and effective camera tracking system in order to perform augmented reality for overlaying preoperative models or label cancerous tissues on the 2D video images of the surgical scene. Tracking in endoscopic/laparoscopic scenes however is an extremely difficult task primarily due to tissue deformation, instrument invasion into the surgical scene and the presence of specular highlights. State of the art feature-based SLAM systems such as PTAM fail in tracking such scenes since the number of good features to track is very limited. When the scene is smoky and when there are instrument motions, it will cause feature-based tracking to fail immediately. The work of this thesis provides a systematic approach to this problem using dense vision. We initially attempted to register a 3D preoperative model with multiple 2D endoscopic/laparoscopic images using a dense method but this approach did not perform well. We subsequently proposed stereo reconstruction to directly obtain the 3D structure of the scene. By using the dense reconstructed model together with robust estimation, we demonstrate that dense stereo tracking can be incredibly robust even within extremely challenging endoscopic/laparoscopic scenes. Several validation experiments have been conducted in this thesis. The proposed stereo reconstruction algorithm has turned out to be the state of the art method for several publicly available ground truth datasets. Furthermore, the proposed robust dense stereo tracking algorithm has been proved highly accurate in synthetic environment (< 0.1 mm RMSE) and qualitatively extremely robust when being applied to real scenes in RALP prostatectomy surgery. This is an important step toward achieving accurate image-guided laparoscopic surgery.Open Acces

    Trends in Mathematical Imaging and Surface Processing

    Get PDF
    Motivated both by industrial applications and the challenge of new problems, one observes an increasing interest in the field of image and surface processing over the last years. It has become clear that even though the applications areas differ significantly the methodological overlap is enormous. Even if contributions to the field come from almost any discipline in mathematics, a major role is played by partial differential equations and in particular by geometric and variational modeling and by their numerical counterparts. The aim of the workshop was to gather a group of leading experts coming from mathematics, engineering and computer graphics to cover the main developments

    Interactive Segmentation of 3D Medical Images with Implicit Surfaces

    Get PDF
    To cope with a variety of clinical applications, research in medical image processing has led to a large spectrum of segmentation techniques that extract anatomical structures from volumetric data acquired with 3D imaging modalities. Despite continuing advances in mathematical models for automatic segmentation, many medical practitioners still rely on 2D manual delineation, due to the lack of intuitive semi-automatic tools in 3D. In this thesis, we propose a methodology and associated numerical schemes enabling the development of 3D image segmentation tools that are reliable, fast and interactive. These properties are key factors for clinical acceptance. Our approach derives from the framework of variational methods: segmentation is obtained by solving an optimization problem that translates the expected properties of target objects in mathematical terms. Such variational methods involve three essential components that constitute our main research axes: an objective criterion, a shape representation and an optional set of constraints. As objective criterion, we propose a unified formulation that extends existing homogeneity measures in order to model the spatial variations of statistical properties that are frequently encountered in medical images, without compromising efficiency. Within this formulation, we explore several shape representations based on implicit surfaces with the objective to cover a broad range of typical anatomical structures. Firstly, to model tubular shapes in vascular imaging, we introduce convolution surfaces in the variational context of image segmentation. Secondly, compact shapes such as lesions are described with a new representation that generalizes Radial Basis Functions with non-Euclidean distances, which enables the design of basis functions that naturally align with salient image features. Finally, we estimate geometric non-rigid deformations of prior templates to recover structures that have a predictable shape such as whole organs. Interactivity is ensured by restricting admissible solutions with additional constraints. Translating user input into constraints on the sign of the implicit representation at prescribed points in the image leads us to consider inequality-constrained optimization

    Efficient Dense Registration, Segmentation, and Modeling Methods for RGB-D Environment Perception

    Get PDF
    One perspective for artificial intelligence research is to build machines that perform tasks autonomously in our complex everyday environments. This setting poses challenges to the development of perception skills: A robot should be able to perceive its location and objects in its surrounding, while the objects and the robot itself could also be moving. Objects may not only be composed of rigid parts, but could be non-rigidly deformable or appear in a variety of similar shapes. Furthermore, it could be relevant to the task to observe object semantics. For a robot acting fluently and immediately, these perception challenges demand efficient methods. This theses presents novel approaches to robot perception with RGB-D sensors. It develops efficient registration, segmentation, and modeling methods for scene and object perception. We propose multi-resolution surfel maps as a concise representation for RGB-D measurements. We develop probabilistic registration methods that handle rigid scenes, scenes with multiple rigid parts that move differently, and scenes that undergo non-rigid deformations. We use these methods to learn and perceive 3D models of scenes and objects in both static and dynamic environments. For learning models of static scenes, we propose a real-time capable simultaneous localization and mapping approach. It aligns key views in RGB-D video using our rigid registration method and optimizes the pose graph of the key views. The acquired models are then perceived in live images through detection and tracking within a Bayesian filtering framework. An assumption frequently made for environment mapping is that the observed scene remains static during the mapping process. Through rigid multi-body registration, we take advantage of releasing this assumption: Our registration method segments views into parts that move independently between the views and simultaneously estimates their motion. Within simultaneous motion segmentation, localization, and mapping, we separate scenes into objects by their motion. Our approach acquires 3D models of objects and concurrently infers hierarchical part relations between them using probabilistic reasoning. It can be applied for interactive learning of objects and their part decomposition. Endowing robots with manipulation skills for a large variety of objects is a tedious endeavor if the skill is programmed for every instance of an object class. Furthermore, slight deformations of an instance could not be handled by an inflexible program. Deformable registration is useful to perceive such shape variations, e.g., between specific instances of a tool. We develop an efficient deformable registration method and apply it for the transfer of robot manipulation skills between varying object instances. On the object-class level, we segment images using random decision forest classifiers in real-time. The probabilistic labelings of individual images are fused in 3D semantic maps within a Bayesian framework. We combine our object-class segmentation method with simultaneous localization and mapping to achieve online semantic mapping in real-time. The methods developed in this thesis are evaluated in experiments on publicly available benchmark datasets and novel own datasets. We publicly demonstrate several of our perception approaches within integrated robot systems in the mobile manipulation context.Effiziente Dichte Registrierungs-, Segmentierungs- und Modellierungsmethoden für die RGB-D Umgebungswahrnehmung In dieser Arbeit beschäftigen wir uns mit Herausforderungen der visuellen Wahrnehmung für intelligente Roboter in Alltagsumgebungen. Solche Roboter sollen sich selbst in ihrer Umgebung zurechtfinden, und Wissen über den Verbleib von Objekten erwerben können. Die Schwierigkeit dieser Aufgaben erhöht sich in dynamischen Umgebungen, in denen ein Roboter die Bewegung einzelner Teile differenzieren und auch wahrnehmen muss, wie sich diese Teile bewegen. Bewegt sich ein Roboter selbständig in dieser Umgebung, muss er auch seine eigene Bewegung von der Veränderung der Umgebung unterscheiden. Szenen können sich aber nicht nur durch die Bewegung starrer Teile verändern. Auch die Teile selbst können ihre Form in nicht-rigider Weise ändern. Eine weitere Herausforderung stellt die semantische Interpretation von Szenengeometrie und -aussehen dar. Damit intelligente Roboter unmittelbar und flüssig handeln können, sind effiziente Algorithmen für diese Wahrnehmungsprobleme erforderlich. Im ersten Teil dieser Arbeit entwickeln wir effiziente Methoden zur Repräsentation und Registrierung von RGB-D Messungen. Zunächst stellen wir Multi-Resolutions-Oberflächenelement-Karten (engl. multi-resolution surfel maps, MRSMaps) als eine kompakte Repräsentation von RGB-D Messungen vor, die unseren effizienten Registrierungsmethoden zugrunde liegt. Bilder können effizient in dieser Repräsentation aggregiert werde, wobei auch mehrere Bilder aus verschiedenen Blickpunkten integriert werden können, um Modelle von Szenen und Objekte aus vielfältigen Ansichten darzustellen. Für die effiziente, robuste und genaue Registrierung von MRSMaps wird eine Methode vorgestellt, die Rigidheit der betrachteten Szene voraussetzt. Die Registrierung schätzt die Kamerabewegung zwischen den Bildern und gewinnt ihre Effizienz durch die Ausnutzung der kompakten multi-resolutionalen Darstellung der Karten. Die Registrierungsmethode erzielt hohe Bildverarbeitungsraten auf einer CPU. Wir demonstrieren hohe Effizienz, Genauigkeit und Robustheit unserer Methode im Vergleich zum bisherigen Stand der Forschung auf Vergleichsdatensätzen. In einem weiteren Registrierungsansatz lösen wir uns von der Annahme, dass die betrachtete Szene zwischen Bildern statisch ist. Wir erlauben nun, dass sich rigide Teile der Szene bewegen dürfen, und erweitern unser rigides Registrierungsverfahren auf diesen Fall. Unser Ansatz segmentiert das Bild in Bereiche einzelner Teile, die sich unterschiedlich zwischen Bildern bewegen. Wir demonstrieren hohe Segmentierungsgenauigkeit und Genauigkeit in der Bewegungsschätzung unter Echtzeitbedingungen für die Verarbeitung. Schließlich entwickeln wir ein Verfahren für die Wahrnehmung von nicht-rigiden Deformationen zwischen zwei MRSMaps. Auch hier nutzen wir die multi-resolutionale Struktur in den Karten für ein effizientes Registrieren von grob zu fein. Wir schlagen Methoden vor, um aus den geschätzten Deformationen die lokale Bewegung zwischen den Bildern zu berechnen. Wir evaluieren Genauigkeit und Effizienz des Registrierungsverfahrens. Der zweite Teil dieser Arbeit widmet sich der Verwendung unserer Kartenrepräsentation und Registrierungsmethoden für die Wahrnehmung von Szenen und Objekten. Wir verwenden MRSMaps und unsere rigide Registrierungsmethode, um dichte 3D Modelle von Szenen und Objekten zu lernen. Die räumlichen Beziehungen zwischen Schlüsselansichten, die wir durch Registrierung schätzen, werden in einem Simultanen Lokalisierungs- und Kartierungsverfahren (engl. simultaneous localization and mapping, SLAM) gegeneinander abgewogen, um die Blickposen der Schlüsselansichten zu schätzen. Für das Verfolgen der Kamerapose bezüglich der Modelle in Echtzeit, kombinieren wir die Genauigkeit unserer Registrierung mit der Robustheit von Partikelfiltern. Zu Beginn der Posenverfolgung, oder wenn das Objekt aufgrund von Verdeckungen oder extremen Bewegungen nicht weiter verfolgt werden konnte, initialisieren wir das Filter durch Objektdetektion. Anschließend wenden wir unsere erweiterten Registrierungsverfahren für die Wahrnehmung in nicht-rigiden Szenen und für die Übertragung von Objekthandhabungsfähigkeiten von Robotern an. Wir erweitern unseren rigiden Kartierungsansatz auf dynamische Szenen, in denen sich rigide Teile bewegen. Die Bewegungssegmente in Schlüsselansichten werden zueinander in Bezug gesetzt, um Äquivalenz- und Teilebeziehungen von Objekten probabilistisch zu inferieren, denen die Segmente entsprechen. Auch hier liefert unsere Registrierungsmethode die Bewegung der Kamera bezüglich der Objekte, die wir in einem SLAM Verfahren optimieren. Aus diesen Blickposen wiederum können wir die Bewegungssegmente in dichten Objektmodellen vereinen. Objekte einer Klasse teilen oft eine gemeinsame Topologie von funktionalen Elementen, die durch Formkorrespondenzen ermittelt werden kann. Wir verwenden unsere deformierbare Registrierung, um solche Korrespondenzen zu finden und die Handhabung eines Objektes durch einen Roboter auf neue Objektinstanzen derselben Klasse zu übertragen. Schließlich entwickeln wir einen echtzeitfähigen Ansatz, der Kategorien von Objekten in RGB-D Bildern erkennt und segmentiert. Die Segmentierung basiert auf Ensemblen randomisierter Entscheidungsbäume, die Geometrie- und Texturmerkmale zur Klassifikation verwenden. Wir fusionieren Segmentierungen von Einzelbildern einer Szene aus mehreren Ansichten in einer semantischen Objektklassenkarte mit Hilfe unseres SLAM-Verfahrens. Die vorgestellten Methoden werden auf öffentlich verfügbaren Vergleichsdatensätzen und eigenen Datensätzen evaluiert. Einige unserer Ansätze wurden auch in integrierten Robotersystemen für mobile Objekthantierungsaufgaben öffentlich demonstriert. Sie waren ein wichtiger Bestandteil für das Gewinnen der RoboCup-Roboterwettbewerbe in der RoboCup@Home Liga in den Jahren 2011, 2012 und 2013
    • …
    corecore