8 research outputs found

    Multiclass pixel labeling with non-local matching constraints

    Full text link

    What’s the Point: Semantic Segmentation with Point Supervision

    Get PDF
    The semantic image segmentation task presents a trade-off between test-time accuracy and training-time annotation cost. Detailed per-pixel annotations enable training accurate models but are very time-consuming to obtain, while image-level class labels are an order of magnitude cheaper but result in less accurate models. We take a natural step from image-level annotation towards stronger supervision: we ask annotators to point to an object if one exists. We incorporate this point supervision along with a novel objectness potential in the training loss function of a CNN model. Experimental results on the PASCAL VOC 2012 benchmark reveal that the combined effect of point-level supervision and the objectness potential yields an improvement of 12.9% mIOU over image-level supervision. Further, we demonstrate that models trained with point-level supervision are more accurate than models trained with image-level, squiggle-level or full supervision given a fixed annotation budget. Comment: ECCV (2016) submission.
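
    As a rough illustration of how point clicks and an objectness prior can enter a segmentation training loss, the Python/NumPy sketch below combines a cross-entropy term at the annotated points with an objectness-weighted background term. The function names, the background-class convention, and the small numerical stabilizer are illustrative assumptions, not the paper's exact formulation.

        import numpy as np

        def softmax(logits, axis=-1):
            z = logits - logits.max(axis=axis, keepdims=True)
            e = np.exp(z)
            return e / e.sum(axis=axis, keepdims=True)

        def point_supervised_loss(logits, point_coords, point_labels, objectness, bg_class=0):
            # logits:       (H, W, C) per-pixel class scores from a CNN
            # point_coords: list of (row, col) annotated points, one per clicked object
            # point_labels: list of class indices, one per point
            # objectness:   (H, W) prior probability that a pixel lies on some object
            # Illustrative formulation only; not the loss used in the paper.
            probs = softmax(logits)
            # Supervised term: only the clicked pixels carry class labels.
            point_term = -np.mean([np.log(probs[r, c, k] + 1e-12)
                                   for (r, c), k in zip(point_coords, point_labels)])
            # Objectness prior: pixels unlikely to belong to any object are
            # pushed towards the background class.
            bg_term = -np.mean((1.0 - objectness) * np.log(probs[..., bg_class] + 1e-12))
            return point_term + bg_term

        # Toy usage on a random 4x4 image with 3 classes and one annotated point.
        rng = np.random.default_rng(0)
        logits = rng.normal(size=(4, 4, 3))
        objectness = rng.uniform(size=(4, 4))
        print(point_supervised_loss(logits, [(1, 2)], [1], objectness))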

    Closed-Form Approximate CRF Training for Scalable Image Segmentation

    Get PDF
    We present LS-CRF, a new method for training cyclic Conditional Random Fields (CRFs) from large datasets that is inspired by classical closed-form expressions for the maximum likelihood parameters of a generative graphical model with tree topology. Training a CRF with LS-CRF requires only solving a set of independent regression problems, each of which can be solved efficiently in closed form or by an iterative solver. This makes LS-CRF orders of magnitude faster than classical CRF training based on probabilistic inference, and at the same time more flexible and easier to implement than other approximate techniques such as pseudolikelihood or piecewise training. We apply LS-CRF to the task of semantic image segmentation, showing that it achieves accuracy on par with other training techniques at higher speed, thereby allowing efficient CRF training from very large training sets. For example, training a linearly parameterized pairwise CRF on 150,000 images requires less than one hour on a modern workstation.
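
    To convey the flavour of training by independent closed-form regressions rather than probabilistic inference, here is a minimal ridge-regression sketch in Python. The feature and target encodings, the regularizer, and the function name are assumptions for illustration and do not reproduce the LS-CRF estimator itself.

        import numpy as np

        def fit_factor_regressor(features, targets, reg=1e-3):
            # Closed-form ridge regression: W = (X^T X + reg*I)^{-1} X^T Y.
            # Each such regression problem is independent of the others, so this
            # kind of "training" step parallelizes trivially and needs no
            # iterative inference over the graph.
            X, Y = features, targets        # X: (N, D) features, Y: (N, C) label encodings
            D = X.shape[1]
            return np.linalg.solve(X.T @ X + reg * np.eye(D), X.T @ Y)

        # Toy usage: 1000 pixels, 16-dimensional features, 5 classes (one-hot targets).
        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 16))
        Y = np.eye(5)[rng.integers(0, 5, size=1000)]
        W = fit_factor_regressor(X, Y)      # (16, 5) weight matrix
        print(W.shape)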

    Region-Based Approach for Single Image Super-Resolution

    Get PDF
    Single image super-resolution (SR) is a technique that generates a high-resolution image from a single low-resolution image [1,2,10,11]. Single image super-resolution algorithms can generally be classified into two groups: example-based and self-similarity-based SR algorithms. The performance of an example-based SR algorithm depends on the similarity between the testing data and the database; a large database is usually needed for good performance, which results in heavy computational cost. A self-similarity-based SR algorithm can generate a high-resolution (HR) image with sharper edges and fewer ringing artifacts if there is sufficient recurrence within or across scales of the same image [10, 11], but it is hard to generate HR details for an image region with fine texture. Given the limitations of each type of SR algorithm, we propose to combine the two: we segment each image into regions based on image content, and choose the appropriate SR algorithm to recover the HR image for each region based on its texture feature. Our experimental results show that our proposed method takes advantage of each SR algorithm and can produce natural-looking results with sharp edges while suppressing ringing artifacts. We compute PSNR to quantitatively evaluate the SR results, and our proposed method outperforms both the self-similarity-based and the example-based SR algorithms with higher PSNR (+0.1 dB).
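
    A minimal Python sketch of the two ingredients the abstract relies on: PSNR as the evaluation metric and a per-region choice between the two kinds of SR algorithm. The texture-score threshold, the region representation, and the function names are hypothetical stand-ins, not the authors' actual pipeline.

        import numpy as np

        def psnr(reference, estimate, peak=255.0):
            # Peak signal-to-noise ratio in dB, used to compare SR results.
            mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
            return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

        def choose_sr_per_region(texture_scores, threshold=0.5):
            # Hypothetical dispatch rule: regions with fine texture go to the
            # example-based SR, smooth or edge-dominated regions to the
            # self-similarity-based SR. Returns a region -> method map.
            return {region_id: ("example_based" if score > threshold else "self_similarity")
                    for region_id, score in texture_scores.items()}

        # Toy usage: region 0 is smooth sky, region 1 is grass with fine texture.
        print(choose_sr_per_region({0: 0.1, 1: 0.9}))
        print(psnr(np.full((8, 8), 128.0), np.full((8, 8), 130.0)))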

    Fluid Morphing for 2D Animations

    Get PDF
    Creation of professional-level animations is time-consuming and expensive, especially for independent game developers. It is therefore valuable to find a method that can programmatically increase the number of frames in any two-dimensional raster animation. Experimenting with a fluid simulator gave the authors the insight that elements from fluid dynamics can be used to achieve visually pleasing, smooth transitions between frames. The result is fluid image morphing, which allows animators to produce many more frames than they could with classic methods and can increase an animator's efficiency several times over. The authors believe that this discovery could reintroduce hand-drawn animations to modern computer games.

    Proceedings of the MICCAI Challenge on Multimodal Brain Tumor Image Segmentation (BRATS) 2013

    Get PDF
    Because of their unpredictable appearance and shape, segmenting brain tumors from multi-modal imaging data is one of the most challenging tasks in medical image analysis. Although many different segmentation strategies have been proposed in the literature, it is hard to compare existing methods because the validation datasets that are used differ widely in terms of input data (structural MR contrasts; perfusion or diffusion data; ...), the type of lesion (primary or secondary tumors; solid or infiltratively growing), and the state of the disease (pre- or post-treatment). In order to gauge the current state of the art in automated brain tumor segmentation and to compare different methods, we are organizing a Multimodal Brain Tumor Image Segmentation (BRATS) challenge held in conjunction with the 16th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2013) on September 22nd, 2013 in Nagoya, Japan.

    Computational models for image contour grouping

    Get PDF
    Contours are one-dimensional curves which may correspond to meaningful entities such as object boundaries. Accurate contour detection simplifies many vision tasks such as object detection and image recognition. Due to the large variety of image content and contour topology, contours are often detected as edge fragments at first, followed by a second step known as "contour grouping" to connect them. Because of ambiguities in local image patches, contour grouping is essential for constructing a globally coherent contour representation. This thesis aims to group contours so that they are consistent with human perception. We draw inspiration from Gestalt principles, which describe the perceptual grouping ability of the human visual system. In particular, our work is most relevant to the principles of closure, similarity, and past experience. The first part of our contribution is a new computational model for contour closure. Most existing contour grouping methods have focused on pixel-wise detection accuracy and ignored the psychological evidence for topological correctness. This chapter proposes a higher-order CRF model to achieve contour closure in the contour domain. We also propose an efficient inference method which is guaranteed to find integer solutions. Tested on the BSDS benchmark, our method achieves superior contour grouping performance, comparable precision-recall curves, and more visually pleasing results. Our work makes progress towards a better computational model of human perceptual grouping. The second part is an energy minimization framework for the salient contour detection problem. Region cues, such as color/texture homogeneity, and contour cues, such as local contrast, are both useful for this task. In order to capture both kinds of cues in a joint energy function, topological consistency between region and contour labels must be satisfied. Our technique makes use of the topological concept of winding numbers. By using a fast method for winding number computation, we find that a small number of linear constraints is sufficient for label consistency. Our method is instantiated with ratio-based energy functions. Due to cue integration, our method obtains improved results, and user interaction can be incorporated to improve them further. The third part of our contribution is an efficient category-level image contour detector. The objective is to detect contours which most likely belong to a prescribed category. Our method, which is based on three levels of shape representation and non-parametric Bayesian learning, shows flexibility in learning from either human-labeled edge images or unlabeled raw images. In both cases, our experiments obtain better contour detection results than competing methods. In addition, our training process is robust even with a limited number of training samples, whereas state-of-the-art methods require more training samples and often human intervention for new category training. Last but not least, in Chapter 7 we also show how to leverage contour information for symmetry detection. Our method is simple yet effective for detecting the symmetric axes of bilaterally symmetric objects in unsegmented natural scene images. Compared with methods based on feature points, our model often produces better results for images containing limited texture.
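
    Since the second part hinges on winding numbers, the Python sketch below shows the standard angle-summation winding number of a closed polygon around a point, the kind of topological quantity used to tie contour labels to region labels. It is a generic illustration, not the thesis's fast computation method or its linear constraints.

        import numpy as np

        def winding_number(point, polygon):
            # Signed number of times a closed polygon (list of (x, y) vertices)
            # wraps around a point; non-zero means the point is enclosed.
            px, py = point
            total = 0.0
            n = len(polygon)
            for i in range(n):
                x0, y0 = polygon[i][0] - px, polygon[i][1] - py
                x1, y1 = polygon[(i + 1) % n][0] - px, polygon[(i + 1) % n][1] - py
                # Signed angle between consecutive edge vectors seen from the point.
                total += np.arctan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
            return int(round(total / (2 * np.pi)))

        # A unit square traversed counter-clockwise winds once around its centre.
        square = [(0, 0), (1, 0), (1, 1), (0, 1)]
        print(winding_number((0.5, 0.5), square))   # 1 (inside)
        print(winding_number((2.0, 2.0), square))   # 0 (outside)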

    Multiclass pixel labeling with non-local matching constraints

    No full text
    A popular approach to pixel labeling problems, such as multiclass image segmentation, is to construct a pairwise conditional Markov random field (CRF) over image pixels, where the pairwise term encodes a preference for smoothness within local 4-connected or 8-connected pixel neighborhoods. Recently, researchers have considered higher-order models that encode soft non-local constraints (e.g., label consistency, connectedness, or co-occurrence statistics). These new models and the associated energy minimization algorithms have significantly pushed the state of the art for pixel labeling problems. In this paper, we consider a new non-local constraint that penalizes inconsistent pixel labels between disjoint image regions having similar appearance. We encode this constraint as a truncated higher-order matching potential function between pairs of image regions in a conditional Markov random field model and show how to perform efficient approximate MAP inference in the model. We experimentally demonstrate quantitative and qualitative improvements over a strong baseline pairwise conditional Markov random field model on two challenging multiclass pixel labeling datasets.
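
    To make the idea of a truncated matching potential concrete, the Python sketch below scores the label disagreement between two matched regions, scaled by their appearance similarity and capped by a truncation constant. The specific functional form, parameter names, and pixel-correspondence assumption are illustrative, not the paper's exact potential.

        import numpy as np

        def truncated_matching_potential(labels_a, labels_b, appearance_similarity, truncation):
            # Non-local term between two disjoint but similar-looking regions:
            # penalize the fraction of mismatched labels over corresponding pixels,
            # truncated so the cost of fully inconsistent regions saturates
            # instead of growing without bound.
            mismatch = np.mean(np.asarray(labels_a) != np.asarray(labels_b))
            return appearance_similarity * min(mismatch, truncation)

        # Toy usage: two matched regions of 4 corresponding pixels each.
        print(truncated_matching_potential([1, 1, 2, 2], [1, 1, 2, 3],
                                           appearance_similarity=0.8, truncation=0.5))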