1,232 research outputs found

    Semantic 3D Occupancy Mapping through Efficient High Order CRFs

    Full text link
    Semantic 3D mapping can be used for many applications such as robot navigation and virtual interaction. In recent years, there has been great progress in semantic segmentation and geometric 3D mapping. However, it is still challenging to combine these two tasks for accurate and large-scale semantic mapping from images. In the paper, we propose an incremental and (near) real-time semantic mapping system. A 3D scrolling occupancy grid map is built to represent the world, which is memory and computationally efficient and bounded for large scale environments. We utilize the CNN segmentation as prior prediction and further optimize 3D grid labels through a novel CRF model. Superpixels are utilized to enforce smoothness and form robust P N high order potential. An efficient mean field inference is developed for the graph optimization. We evaluate our system on the KITTI dataset and improve the segmentation accuracy by 10% over existing systems.Comment: IROS 201

    Advances in Graph-Cut Optimization: Multi-Surface Models, Label Costs, and Hierarchical Costs

    Get PDF
    Computer vision is full of problems that are elegantly expressed in terms of mathematical optimization, or energy minimization. This is particularly true of low-level inference problems such as cleaning up noisy signals, clustering and classifying data, or estimating 3D points from images. Energies let us state each problem as a clear, precise objective function. Minimizing the correct energy would, hypothetically, yield a good solution to the corresponding problem. Unfortunately, even for low-level problems we are confronted by energies that are computationally hard—often NP-hard—to minimize. As a consequence, a rather large portion of computer vision research is dedicated to proposing better energies and better algorithms for energies. This dissertation presents work along the same line, specifically new energies and algorithms based on graph cuts. We present three distinct contributions. First we consider biomedical segmentation where the object of interest comprises multiple distinct regions of uncertain shape (e.g. blood vessels, airways, bone tissue). We show that this common yet difficult scenario can be modeled as an energy over multiple interacting surfaces, and can be globally optimized by a single graph cut. Second, we introduce multi-label energies with label costs and provide algorithms to minimize them. We show how label costs are useful for clustering and robust estimation problems in vision. Third, we characterize a class of energies with hierarchical costs and propose a novel hierarchical fusion algorithm with improved approximation guarantees. Hierarchical costs are natural for modeling an array of difficult problems, e.g. segmentation with hierarchical context, simultaneous estimation of motions and homographies, or detecting hierarchies of patterns

    Rapid Segmentation Techniques for Cardiac and Neuroimage Analysis

    Get PDF
    Recent technological advances in medical imaging have allowed for the quick acquisition of highly resolved data to aid in diagnosis and characterization of diseases or to guide interventions. In order to to be integrated into a clinical work flow, accurate and robust methods of analysis must be developed which manage this increase in data. Recent improvements in in- expensive commercially available graphics hardware and General-Purpose Programming on Graphics Processing Units (GPGPU) have allowed for many large scale data analysis problems to be addressed in meaningful time and will continue to as parallel computing technology improves. In this thesis we propose methods to tackle two clinically relevant image segmentation problems: a user-guided segmentation of myocardial scar from Late-Enhancement Magnetic Resonance Images (LE-MRI) and a multi-atlas segmentation pipeline to automatically segment and partition brain tissue from multi-channel MRI. Both methods are based on recent advances in computer vision, in particular max-flow optimization that aims at solving the segmentation problem in continuous space. This allows for (approximately) globally optimal solvers to be employed in multi-region segmentation problems, without the particular drawbacks of their discrete counterparts, graph cuts, which typically present with metrication artefacts. Max-flow solvers are generally able to produce robust results, but are known for being computationally expensive, especially with large datasets, such as volume images. Additionally, we propose two new deformable registration methods based on Gauss-Newton optimization and smooth the resulting deformation fields via total-variation regularization to guarantee the problem is mathematically well-posed. We compare the performance of these two methods against four highly ranked and well-known deformable registration methods on four publicly available databases and are able to demonstrate a highly accurate performance with low run times. The best performing variant is subsequently used in a multi-atlas segmentation pipeline for the segmentation of brain tissue and facilitates fast run times for this computationally expensive approach. All proposed methods are implemented using GPGPU for a substantial increase in computational performance and so facilitate deployment into clinical work flows. We evaluate all proposed algorithms in terms of run times, accuracy, repeatability and errors arising from user interactions and we demonstrate that these methods are able to outperform established methods. The presented approaches demonstrate high performance in comparison with established methods in terms of accuracy and repeatability while largely reducing run times due to the employment of GPU hardware

    Contributions of Continuous Max-Flow Theory to Medical Image Processing

    Get PDF
    Discrete graph cuts and continuous max-flow theory have created a paradigm shift in many areas of medical image processing. As previous methods limited themselves to analytically solvable optimization problems or guaranteed only local optimizability to increasingly complex and non-convex functionals, current methods based now rely on describing an optimization problem in a series of general yet simple functionals with a global, but non-analytic, solution algorithms. This has been increasingly spurred on by the availability of these general-purpose algorithms in an open-source context. Thus, graph-cuts and max-flow have changed every aspect of medical image processing from reconstruction to enhancement to segmentation and registration. To wax philosophical, continuous max-flow theory in particular has the potential to bring a high degree of mathematical elegance to the field, bridging the conceptual gap between the discrete and continuous domains in which we describe different imaging problems, properties and processes. In Chapter 1, we use the notion of infinitely dense and infinitely densely connected graphs to transfer between the discrete and continuous domains, which has a certain sense of mathematical pedantry to it, but the resulting variational energy equations have a sense of elegance and charm. As any application of the principle of duality, the variational equations have an enigmatic side that can only be decoded with time and patience. The goal of this thesis is to show the contributions of max-flow theory through image enhancement and segmentation, increasing incorporation of topological considerations and increasing the role played by user knowledge and interactivity. These methods will be rigorously grounded in calculus of variations, guaranteeing fuzzy optimality and providing multiple solution approaches to addressing each individual problem

    Image Parsing with a Wide Range of Classes and Scene-Level Context

    Full text link
    This paper presents a nonparametric scene parsing approach that improves the overall accuracy, as well as the coverage of foreground classes in scene images. We first improve the label likelihood estimates at superpixels by merging likelihood scores from different probabilistic classifiers. This boosts the classification performance and enriches the representation of less-represented classes. Our second contribution consists of incorporating semantic context in the parsing process through global label costs. Our method does not rely on image retrieval sets but rather assigns a global likelihood estimate to each label, which is plugged into the overall energy function. We evaluate our system on two large-scale datasets, SIFTflow and LMSun. We achieve state-of-the-art performance on the SIFTflow dataset and near-record results on LMSun.Comment: Published at CVPR 2015, Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference o

    A Comparative Study of Modern Inference Techniques for Structured Discrete Energy Minimization Problems

    Get PDF
    International audienceSzeliski et al. published an influential study in 2006 on energy minimization methods for Markov Random Fields (MRF). This study provided valuable insights in choosing the best optimization technique for certain classes of problems. While these insights remain generally useful today, the phenomenal success of random field models means that the kinds of inference problems that have to be solved changed significantly. Specifically , the models today often include higher order interactions, flexible connectivity structures, large label-spaces of different car-dinalities, or learned energy tables. To reflect these changes, we provide a modernized and enlarged study. We present an empirical comparison of more than 27 state-of-the-art optimization techniques on a corpus of 2,453 energy minimization instances from diverse applications in computer vision. To ensure reproducibility, we evaluate all methods in the OpenGM 2 framework and report extensive results regarding runtime and solution quality. Key insights from our study agree with the results of Szeliski et al. for the types of models they studied. However, on new and challenging types of models our findings disagree and suggest that polyhedral methods and integer programming solvers are competitive in terms of runtime and solution quality over a large range of model types

    Lecture Notes of Tensor Network Contractions

    Get PDF
    Tensor network (TN), a young mathematical tool of high vitality and great potential, has been undergoing extremely rapid developments in the last two decades, gaining tremendous success in condensed matter physics, atomic physics, quantum information science, statistical physics, and so on. In this lecture notes, we focus on the contraction algorithms of TN as well as some of the applications to the simulations of quantum many-body systems. Starting from basic concepts and definitions, we first explain the relations between TN and physical problems, including the TN representations of classical partition functions, quantum many-body states (by matrix product state, tree TN, and projected entangled pair state), time evolution simulations, etc. These problems, which are challenging to solve, can be transformed to TN contraction problems. We present then several paradigm algorithms based on the ideas of the numerical renormalization group and/or boundary states, including density matrix renormalization group, time-evolving block decimation, coarse-graining/corner tensor renormalization group, and several distinguished variational algorithms. Finally, we revisit the TN approaches from the perspective of multi-linear algebra (also known as tensor algebra or tensor decompositions) and quantum simulation. Despite the apparent differences in the ideas and strategies of different TN algorithms, we aim at revealing the underlying relations and resemblances in order to present a systematic picture to understand the TN contraction approaches.Comment: 134 pages, 68 figures. In this version, the manuscript has been changed into the format of book; new sections about tensor network and quantum circuits have been adde
    • 

    corecore