22 research outputs found

    A First Derivative Potts Model for Segmentation and Denoising Using ILP

    Unsupervised image segmentation and denoising are two fundamental tasks in image processing. Usually, graph-based models such as multicut are used for segmentation, and variational models are employed for denoising. Our approach addresses both problems at the same time. We propose a novel ILP formulation of the first derivative Potts model with the $\ell_1$ data term, where binary variables are introduced to deal with the $\ell_0$ norm of the regularization term. The ILP is then solved by a standard off-the-shelf MIP solver. Numerical experiments are compared with the multicut problem. Comment: 6 pages, 2 figures. To appear at Proceedings of International Conference on Operations Research 2017, Berli
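
The objective behind the first derivative (piecewise constant) Potts model can be illustrated on a 1D signal. The sketch below is not the ILP from the abstract: it swaps in a simple dynamic program, which is exact only in one dimension, to minimise the sum of absolute deviations plus a penalty `lam` per jump. The function name `potts_l1_1d` and the sample signal are assumptions made for illustration.

```python
from statistics import median


def potts_l1_1d(y, lam):
    """Minimise sum_i |w_i - y_i| + lam * (number of jumps in w) for a 1D signal.

    Illustrative dynamic program (cubic time), not the ILP formulation.
    """
    n = len(y)

    def seg_cost(l, r):
        # Best constant fit of y[l:r] under the l1 norm is its median.
        m = median(y[l:r])
        return sum(abs(v - m) for v in y[l:r]), m

    # best[r] is the optimal energy of the prefix y[:r].
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for r in range(1, n + 1):
        for l in range(r):
            cost, _ = seg_cost(l, r)
            # A jump penalty is paid for every segment except the first.
            cand = best[l] + cost + (lam if l > 0 else 0.0)
            if cand < best[r]:
                best[r], back[r] = cand, l

    # Backtrack the segment boundaries and fill in the segment medians.
    w = [0.0] * n
    r = n
    while r > 0:
        l = back[r]
        _, m = seg_cost(l, r)
        w[l:r] = [m] * (r - l)
        r = l
    return w, best[n]


if __name__ == "__main__":
    signal = [1.0, 1.0, 1.2, 5.0, 5.0, 4.8]  # hypothetical noisy step
    denoised, energy = potts_l1_1d(signal, lam=1.0)
    print(denoised, energy)
```

The piecewise constant output is a denoised signal, and its jump locations are a segmentation, which is the sense in which the model handles both tasks at once. On 2D images no such dynamic program applies, which is where an integer programming formulation becomes relevant.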

    MILP Formulations for Unsupervised and Interactive Image Segmentation and Denoising

    Image segmentation and denoising are two key components of modern computer vision systems. The Potts model plays an important role in denoising piecewise defined functions, and Markov Random Field (MRF) models using Potts terms are popular in image segmentation. We propose Mixed Integer Linear Programming (MILP) formulations for both models, and utilize standard MILP solvers to solve them efficiently. Firstly, we investigate the discrete first derivative (piecewise constant) Potts model with the $\ell_1$ norm data term. We propose a novel MILP formulation by introducing binary edge variables to model the Potts prior. We look into the facet-defining inequalities for the associated integer polytope. We apply the model to generate superpixels on noisy images. Secondly, we propose a MILP formulation for the discrete piecewise affine Potts model. To obtain consistent partitions, the inclusion of multicut constraints is necessary; these are added iteratively using the cutting plane method. We apply the model to simultaneously segment and denoise depth images. Thirdly, MILP formulations of MRF models with global connectivity constraints were investigated previously, but only simplified versions of the problem were solved. We investigate this problem via a branch-and-cut method and propose a user-interactive way for segmentation. Our proposed MILPs are in general NP-hard, but they can be used to generate globally optimal solutions and ground-truth results. We also propose three fast heuristic algorithms that provide good solutions in very short time. The MILPs can be applied as a post-processing method on top of any algorithm, not only providing a guarantee on solution quality but also seeking better solutions within the branch-and-cut framework of the solver.
We demonstrate the power and usefulness of our methods by extensive experiments against other state-of-the-art methods on synthetic images, standard image datasets, as well as medical images with trained probability maps.
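
The role of the multicut constraints can be sketched with a small consistency check. A binary edge labelling describes a valid partition only if no cut edge joins two nodes that remain connected through uncut edges; a cutting plane loop would add a constraint for each such violation and re-solve. The code below is a generic illustration under that standard definition, not the procedure from the thesis, and the name `violated_cut_edges` is hypothetical.

```python
def violated_cut_edges(num_nodes, edges, cut):
    """Return the cut edges whose endpoints are still connected via uncut edges.

    edges: list of (u, v) node pairs; cut: list of 0/1 flags, one per edge.
    An empty result means the labelling is a consistent partition (a multicut).
    """
    parent = list(range(num_nodes))

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Merge the endpoints of every uncut edge into one component.
    for (u, v), is_cut in zip(edges, cut):
        if not is_cut:
            parent[find(u)] = find(v)

    # A cut edge inside a single component contradicts the partition.
    return [
        (u, v)
        for (u, v), is_cut in zip(edges, cut)
        if is_cut and find(u) == find(v)
    ]


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]  # hypothetical 3-node graph
    print(violated_cut_edges(3, triangle, [1, 0, 0]))  # one cut edge on a cycle
    print(violated_cut_edges(3, triangle, [1, 0, 1]))  # node 0 separated from {1, 2}
```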

    Toward more scalable structured models

    While deep learning has achieved huge success across different disciplines, from computer vision and natural language processing to computational biology and physical sciences, training such models is known to require significant amounts of data. One possible reason is that the structural properties of the data and problem are not modeled explicitly. Effectively exploiting the structure can help build more efficient and better-performing models. The complexity of the structure requires models with enough representation capabilities. However, increased structured model complexity usually leads to increased inference complexity and trickier learning procedures. Also, making progress on real-world applications requires learning paradigms that circumvent the limitation of evaluating the partition function and scale to high-dimensional datasets. In this dissertation, we develop more scalable structured models, i.e., models with inference procedures that can handle complex dependencies between variables efficiently, and learning algorithms that operate in high-dimensional spaces. First, we extend Gaussian conditional random fields, traditionally unimodal and only capturing pairwise variable interactions, to model multi-modal distributions with high-order dependencies between the output space variables, while enabling exact inference and incorporating external constraints at runtime. We show compelling results on the task of diverse gray-image colorization. Then, we introduce a reinforcement learning-based method for solving inference in models with general higher-order potentials, which are intractable with traditional techniques. We show promising results on semantic segmentation. Finally, we propose a new loss, max-sliced score matching (MSSM), for learning structured models at scale. We assess our model on the estimation of densities and scores for implicit distributions in Variational and Wasserstein auto-encoders.

    Optimization of Markov Random Fields in Computer Vision

    A large variety of computer vision tasks can be formulated using Markov Random Fields (MRF). Except in certain special cases, optimizing an MRF is intractable, due to a large number of variables and complex dependencies between them. In this thesis, we present new algorithms to perform inference in MRFs that are either more efficient (in terms of running time and/or memory usage) or more effective (in terms of solution quality) than the state-of-the-art methods. First, we introduce a memory efficient max-flow algorithm for multi-label submodular MRFs. In fact, such MRFs have been shown to be optimally solvable using max-flow based on an encoding of the labels proposed by Ishikawa, in which each variable $X_i$ is represented by $\ell$ nodes (where $\ell$ is the number of labels) arranged in a column. However, this method in general requires $2\,\ell^2$ edges for each pair of neighbouring variables. This makes it inapplicable to realistic problems with many variables and labels, due to its excessive memory requirements. By contrast, our max-flow algorithm stores $2\,\ell$ values per variable pair, requiring much less storage. Consequently, our algorithm makes it possible to optimally solve multi-label submodular problems involving large numbers of variables and labels on a standard computer. Next, we present a move-making style algorithm for multi-label MRFs with robust non-convex priors. In particular, our algorithm iteratively approximates the original MRF energy with an appropriately weighted surrogate energy that is easier to minimize. Furthermore, it guarantees that the original energy decreases at each iteration. To this end, we consider the scenario where the weighted surrogate energy is multi-label submodular (i.e., it can be optimally minimized by max-flow), and show that our algorithm then lets us handle a large variety of non-convex priors.
Finally, we consider the fully connected Conditional Random Field (dense CRF) with Gaussian pairwise potentials that has proven popular and effective for multi-class semantic segmentation. While the energy of a dense CRF can be minimized accurately using a Linear Programming (LP) relaxation, the state-of-the-art algorithm is too slow to be useful in practice. To alleviate this deficiency, we introduce an efficient LP minimization algorithm for dense CRFs. To this end, we develop a proximal minimization framework, where the dual of each proximal problem is optimized via block-coordinate descent. We show that each block of variables can be optimized in a time linear in the number of pixels and labels. Consequently, our algorithm enables efficient and effective optimization of dense CRFs with Gaussian pairwise potentials. We evaluated all our algorithms on standard energy minimization datasets consisting of computer vision problems, such as stereo, inpainting and semantic segmentation. The experiments at the end of each chapter provide compelling evidence that all our approaches are either more efficient or more effective than all existing baselines.
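
The memory argument in this abstract comes down to simple arithmetic: $2\,\ell^2$ stored edges versus $2\,\ell$ stored values for each pair of neighbouring variables. The sketch below works this out for an image, assuming a 4-connected pixel grid; the grid assumption, the function name `pairwise_storage`, and the example sizes are illustrative and not taken from the thesis.

```python
def pairwise_storage(width, height, labels):
    """Count stored pairwise quantities on a 4-connected grid (assumed topology).

    Returns (neighbour_pairs, ishikawa_edges, compact_values), using the
    per-pair counts 2*l^2 and 2*l quoted in the abstract.
    """
    # Horizontal neighbour pairs plus vertical neighbour pairs.
    pairs = (width - 1) * height + width * (height - 1)
    ishikawa_edges = 2 * labels**2 * pairs
    compact_values = 2 * labels * pairs
    return pairs, ishikawa_edges, compact_values


if __name__ == "__main__":
    # Hypothetical problem size: a 640x480 image with 64 labels.
    pairs, dense, compact = pairwise_storage(640, 480, 64)
    print(f"neighbour pairs:      {pairs:,}")
    print(f"Ishikawa graph edges: {dense:,}")
    print(f"compact values:       {compact:,}")
    print(f"reduction factor:     {dense // compact}")
```

The reduction factor equals the number of labels, so the saving grows with the size of the label set.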

    Discrete curvature: theory and applications (Courbure discrète : théorie et applications)

    The present volume contains the proceedings of the 2013 Meeting on discrete curvature, held at CIRM, Luminy, France. The aim of this meeting was to bring together researchers from various backgrounds, ranging from mathematics to computer science, with a focus on both theory and applications. With 27 invited talks and 8 posters, the conference attracted 70 researchers from all over the world. The challenge of finding a common ground on the topic of discrete curvature was met with success, and these proceedings are a testimony of this work.

    Optimization for Image Segmentation

    Image segmentation, i.e., assigning each pixel a discrete label, is an essential task in computer vision with many applications. Major techniques for segmentation include, for example, Markov Random Fields (MRF), Kernel Clustering (KC), and the now-popular Convolutional Neural Networks (CNN). In this work, we focus on optimization for image segmentation. Techniques like MRF, KC, and CNN optimize MRF energies, KC criteria, or CNN losses respectively, and their corresponding optimization methods are very different. We are interested in the synergy and the complementary benefits of MRF, KC, and CNN for interactive segmentation and semantic segmentation. Our first contribution is pseudo-bound optimization for binary MRF energies that are high-order or non-submodular. Secondly, we propose Kernel Cut, a novel formulation for segmentation, which combines MRF regularization with Kernel Clustering. We show why KC should be combined with MRF and how to optimize the joint objective. In the third part, we discuss how deep CNN segmentation can benefit from non-deep (i.e., shallow) methods like MRF and KC. In particular, we propose regularized losses for weakly-supervised CNN segmentation, in which we can integrate MRF energy or KC criteria as part of the losses. Minimization of regularized losses is a principled approach to semi-supervised learning, in general. Our regularized loss method is very simple and allows different kinds of regularization losses for CNN segmentation. We also study the optimization of regularized losses beyond gradient descent. Our regularized losses approach achieves state-of-the-art accuracy in semantic segmentation with near full supervision quality.

    Holistic interpretation of visual data based on topology: semantic segmentation of architectural facades

    The work presented in this dissertation is a step towards effectively incorporating contextual knowledge in the task of semantic segmentation. To date, the use of context has been confined to the genre of the scene with a few exceptions in the field. Research has been directed towards enhancing appearance descriptors. While this is unarguably important, recent studies show that computer vision has reached a near-human level of performance in relying on these descriptors when objects have stable distinctive surface properties and in proper imaging conditions. When these conditions are not met, humans exploit their knowledge about the intrinsic geometric layout of the scene to make local decisions. Computer vision lags behind when it comes to this asset. For this reason, we aim to bridge the gap by presenting algorithms for semantic segmentation of building facades making use of scene topological aspects. We provide a classification scheme to carry out segmentation and recognition simultaneously. The algorithm is able to solve a single optimization function and yield a semantic interpretation of facades, relying on the modeling power of probabilistic graphs and efficient discrete combinatorial optimization tools. We tackle the same problem of semantic facade segmentation with the neural network approach. We attain accuracy figures that are on par with the state-of-the-art in a fully automated pipeline. Starting from pixelwise classifications obtained via Convolutional Neural Networks (CNN), the results are structurally validated through a cascade of Restricted Boltzmann Machines (RBM) and a Multi-Layer Perceptron (MLP) that regenerates the most likely layout. In the domain of architectural modeling, there is geometric multi-model fitting. We introduce a novel guided sampling algorithm based on Minimum Spanning Trees (MST), which surpasses other propagation techniques in terms of robustness to noise.
We make a number of additional contributions, such as a measure of model deviation which captures variations among fitted models.