18 research outputs found

    A Human Visual System-Driven Image Segmentation Algorithm

    This paper presents a novel image segmentation algorithm driven by human visual system (HVS) properties. Quality metrics for evaluating the segmentation result, from both region-based and boundary-based perspectives, are integrated into an objective function. The objective function encodes the HVS properties into a Markov random fields (MRF) framework, where the just-noticeable difference (JND) model is employed when calculating the difference between the image contents. Experiments are carried out to compare the performances of three variations of the presented algorithm and several representative segmentation algorithms available in the literature. Results are very encouraging and show that the presented algorithms outperform the state-of-the-art image segmentation algorithms.

    Oil Spill Candidate Detection Using a Conditional Random Field Model on Simulated Compact Polarimetric Imagery

    This is an Accepted Manuscript of an article published by Taylor & Francis in Canadian Journal of Remote Sensing on 20 April 2022, available online: https://doi.org/10.1080/07038992.2022.2055534
    Although the compact polarimetric (CP) synthetic aperture radar (SAR) mode of the RADARSAT Constellation Mission (RCM) offers new opportunities for oil spill candidate detection, there has not been an efficient machine learning model explicitly designed to utilize this new CP SAR data for improved detection. This paper presents a conditional random field model based on the Wishart mixture model (CRF-WMM) to detect oil spill candidates in CP SAR imagery. First, a “Wishart mixture model” (WMM) is designed as the unary potential in the CRF-WMM to address the class-dependent information of oil spill candidates and oil-free water. Second, we introduce a new similarity measure based on CP statistics designed as a pairwise potential in the CRF-WMM model so that pixels with strong spatial connections have the same class label. Finally, we investigate three different optimization approaches to solve the resulting maximum a posteriori (MAP) problem, namely iterated conditional modes (ICM), simulated annealing (SA), and graph cuts (GC). The results show that our proposed CRF-WMM model can delineate oil spill candidates better than the traditional CRF approaches and that the GC algorithm provides the best optimization.
    Natural Sciences and Engineering Research Council of Canada (NSERC), Grant RGPIN-2017-04869; NSERC, Grant DGDND-2017-00078; NSERC, Grant RGPAS2017-50794; NSERC, Grant RGPIN-2019-06744
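
    To illustrate one of the optimizers named in the abstract above, the following is a minimal sketch of iterated conditional modes (ICM) for a two-label grid model. It is not the CRF-WMM: the unary costs are placeholder numbers standing in for negative log-likelihoods, the pairwise term is a plain Potts penalty (an assumption that replaces the CP-statistics similarity measure), and the names `icm_binary` and `beta` are hypothetical.

```python
def icm_binary(unary, beta=1.0, max_iters=10):
    """Greedy MAP labeling on a 4-connected grid with two labels.

    unary[i][j] is a pair (cost of label 0, cost of label 1).
    beta is the Potts penalty paid for each neighbor with a different label.
    """
    h, w = len(unary), len(unary[0])
    # Start from the labeling that minimizes the unary cost alone.
    labels = [[0 if unary[i][j][0] <= unary[i][j][1] else 1
               for j in range(w)] for i in range(h)]
    for _ in range(max_iters):
        changed = False
        for i in range(h):
            for j in range(w):
                costs = [float(unary[i][j][0]), float(unary[i][j][1])]
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        # Penalize the label that disagrees with this neighbor.
                        costs[1 - labels[ni][nj]] += beta
                best = 0 if costs[0] <= costs[1] else 1
                if best != labels[i][j]:
                    labels[i][j] = best
                    changed = True
        if not changed:
            break  # No pixel changed in a full sweep: local minimum reached.
    return labels


# A 5x5 image where every pixel prefers label 0 except the center pixel.
unary = [[(0.0, 1.0) for _ in range(5)] for _ in range(5)]
unary[2][2] = (1.0, 0.0)
smoothed = icm_binary(unary, beta=1.0)
```

    With `beta=1.0` the isolated center pixel is relabeled to agree with its neighbors, while `beta=0.0` leaves the unary decision unchanged. ICM only finds a local minimum of the energy, which is consistent with the abstract's finding that graph cuts gave the best optimization.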

    Geodesic Active Fields: A Geometric Framework for Image Registration

    Image registration is the concept of mapping homologous points in a pair of images. In other words, one is looking for an underlying deformation field that matches one image to a target image. The spectrum of applications of image registration is extremely large: It ranges from bio-medical imaging and computer vision, to remote sensing or geographic information systems, and even involves consumer electronics. Mathematically, image registration is an inverse problem that is ill-posed, which means that the exact solution might not exist or not be unique. In order to render the problem tractable, it is usual to write the problem as an energy minimization, and to introduce additional regularity constraints on the unknown data. In the case of image registration, one often minimizes an image mismatch energy, and adds an additive penalty on the deformation field regularity as smoothness prior. Here, we focus on the registration of the human cerebral cortex. Precise cortical registration is required, for example, in statistical group studies in functional MR imaging, or in the analysis of brain connectivity. In particular, we work with spherical inflations of the extracted hemispherical surface and associated features, such as cortical mean curvature. Spatial mapping between cortical surfaces can then be achieved by registering the respective spherical feature maps. Despite the simplified spherical geometry, inter-subject registration remains a challenging task, mainly due to the complexity and inter-subject variability of the involved brain structures. In this thesis, we therefore present a registration scheme, which takes the peculiarities of the spherical feature maps into particular consideration. First, we realize that we need an appropriate hierarchical representation, so as to coarsely align based on the important structures with greater inter-subject stability, before taking smaller and more variable details into account. 
Based on arguments from brain morphogenesis, we propose an anisotropic scale-space of mean-curvature maps, built around the Beltrami framework. Second, inspired by concepts from vision-related elements of psycho-physical Gestalt theory, we hypothesize that anisotropic Beltrami regularization better suits the requirements of image registration regularization, compared to traditional Gaussian filtering. Different objects in an image should be allowed to move separately, and regularization should be limited to within the individual Gestalts. We render the regularization feature-preserving by limiting diffusion across edges in the deformation field, which is in clear contrast to the indifferent linear smoothing. We do so by embedding the deformation field as a manifold in higher-dimensional space, and minimize the associated Beltrami energy which represents the hyperarea of this embedded manifold as measure of deformation field regularity. Further, instead of simply adding this regularity penalty to the image mismatch in lieu of the standard penalty, we propose to incorporate the local image mismatch as weighting function into the Beltrami energy. The image registration problem is thus reformulated as a weighted minimal surface problem. This approach has several appealing aspects, including (1) invariance to re-parametrization and ability to work with images defined on non-flat, Riemannian domains (e.g., curved surfaces, scale-spaces), and (2) intrinsic modulation of the local regularization strength as a function of the local image mismatch and/or noise level. On a side note, we show that the proposed scheme can easily keep up with recent trends in image registration towards using diffeomorphic and inverse consistent deformation models. The proposed registration scheme, called Geodesic Active Fields (GAF), is non-linear and non-convex. Therefore we propose an efficient optimization scheme, based on splitting. 
Data-mismatch and deformation field regularity are optimized over two different deformation fields, which are constrained to be equal. The constraint is addressed using an augmented Lagrangian scheme, and the resulting optimization problem is solved efficiently using alternating minimization of simpler sub-problems. In particular, we show that the proposed method can easily compete with state-of-the-art registration methods, such as Demons. Finally, we provide an implementation of the fast GAF method on the sphere, so as to register the triangulated cortical feature maps. We build an automatic parcellation algorithm for the human cerebral cortex, which combines the delineations available on a set of atlas brains in a Bayesian approach, so as to automatically delineate the corresponding regions on a subject brain given its feature map. In a leave-one-out cross-validation study on 39 brain surfaces with 35 manually delineated gyral regions, we show that the pairwise subject-atlas registration with the proposed spherical registration scheme significantly improves the individual alignment of cortical labels between subject and atlas brains, and, consequently, that the estimated automatic parcellations after label fusion are of better quality.

    Noise-Enhanced and Human Visual System-Driven Image Processing: Algorithms and Performance Limits

    This dissertation investigates the problem of image processing based on stochastic resonance (SR) noise and human visual system (HVS) properties, where several novel frameworks and algorithms for object detection in images, image enhancement and image segmentation as well as the method to estimate the performance limit of image segmentation algorithms are developed. Object detection in images is a fundamental problem whose goal is to make a decision if the object of interest is present or absent in a given image. We develop a framework and algorithm to enhance the detection performance of suboptimal detectors using SR noise, where we add a suitable dose of noise into the original image data and obtain the performance improvement. Micro-calcification detection is employed in this dissertation as an illustrative example. The comparative experiments with a large number of images verify the efficiency of the presented approach. Image enhancement plays an important role and is widely used in various vision tasks. We develop two image enhancement approaches. One is based on SR noise, HVS-driven image quality evaluation metrics and the constrained multi-objective optimization (MOO) technique, which aims at refining the existing suboptimal image enhancement methods. Another is based on the selective enhancement framework, under which we develop several image enhancement algorithms. The two approaches are applied to many low quality images, and they outperform many existing enhancement algorithms. Image segmentation is critical to image analysis. We present two segmentation algorithms driven by HVS properties, where we incorporate the human visual perception factors into the segmentation procedure and encode the prior expectation on the segmentation results into the objective functions through Markov random fields (MRF). 
Our experimental results show that the presented algorithms achieve higher segmentation accuracy than many representative segmentation and clustering algorithms available in the literature. Performance limit, or performance bound, is very useful to evaluate different image segmentation algorithms and to analyze the segmentability of the given image content. We formulate image segmentation as a parameter estimation problem and derive a lower bound on the segmentation error, i.e., the mean square error (MSE) of the pixel labels considered in our work, using a modified Cramér-Rao bound (CRB). The derivation is based on the biased estimator assumption, whose reasonability is verified in this dissertation. Experimental results demonstrate the validity of the derived bound.

    Learning Discriminative Features and Structured Models for Segmentation in Microscopy and Natural Images

    Segmenting images is a significant challenge that has drawn a lot of attention from different fields of artificial intelligence and has many practical applications. One such challenge addressed in this thesis is the segmentation of electron microscope (EM) imaging of neural tissue. EM microscopy is one of the key tools used to analyze neural tissue and understand the brain, but the huge amounts of data it produces make automated analysis necessary. In addition to the challenges specific to EM data, the common problems encountered in image segmentation must also be addressed. These problems include extracting discriminative features from the data and constructing a statistical model using ground-truth data. Although complex models appear to be more attractive because they allow for more expressiveness, they also lead to a higher computational complexity. On the other hand, simple models come with a lower complexity but less faithfully express the real world. Therefore, one of the most challenging tasks in image segmentation is in constructing models that are expressive enough while remaining tractable. In this work, we propose several automated graph partitioning approaches that address these issues. These methods reduce the computational complexity by operating on supervoxels instead of voxels, incorporating features capable of describing the 3D shape of the target objects and using structured models to account for correlation in output variables. One of the non-trivial issues with such models is that their parameters must be carefully chosen for optimal performance. A popular approach to learning model parameters is a maximum-margin approach called Structured SVM (SSVM) that provides optimality guarantees but also suffers from two main drawbacks. First, SSVM-based approaches are usually limited to linear kernels, since more powerful nonlinear kernels cause the learning to become prohibitively expensive. 
In this thesis, we introduce an approach to “kernelize” the features so that a linear SSVM framework can leverage the power of nonlinear kernels without incurring their high computational cost. Second, the optimality guarantees are violated for complex models with strong inter-relations between the output variables. We propose a new subgradient-based method that is more robust and leads to improved convergence properties and increased reliability. The different approaches presented in this thesis are applicable to both natural and medical images. They are able to segment mitochondria at a performance level close to that of a human annotator, and outperform state-of-the-art segmentation techniques while still benefiting from a low learning time.

    Novel block-based motion estimation and segmentation for video coding

    EThOS - Electronic Theses Online Service, GB, United Kingdom

    Tracking-by-Assignment as a Probabilistic Graphical Model with Applications in Developmental Biology

    This thesis presents a novel approach for tracking a varying number of divisible objects with similar appearance in the presence of a non-negligible number of false positive detections (more than 10%). It is applied to the reconstruction of cell lineages in developing zebrafish and fruit fly embryos from 3D time-lapse recordings. The model takes the form of a chain graph (a mixed directed-undirected probabilistic graphical model), and a tracking is obtained simultaneously over all time slices from the maximum a posteriori configuration. The tracking model is used as the second step in a two-step pipeline to produce digital embryos (maps of cell nuclei in an embryo and their ancestral fate); the first step is the segmentation of the fluorescently stained cell nuclei in light sheet microscopy images. The pipeline is implemented as software with an intuitive graphical user interface. It is the first freely available program of its kind and makes the presented methods accessible to a broad audience of users from the life sciences.

    Efficient multi-level scene understanding in videos

    Automatic video parsing is a key step towards human-level dynamic scene understanding, and a fundamental problem in computer vision. A core issue in video understanding is to infer multiple scene properties of a video in an efficient and consistent manner. This thesis addresses the problem of holistic scene understanding from monocular videos, which jointly reasons about semantic and geometric scene properties from multiple levels, including pixelwise annotation of video frames, object instance segmentation in spatio-temporal domain, and/or scene-level description in terms of scene categories and layouts. We focus on four main issues in holistic video understanding: 1) what is the representation for consistent semantic and geometric parsing of videos? 2) how do we integrate high-level reasoning (e.g., objects) with pixel-wise video parsing? 3) how can we do efficient inference for multi-level video understanding? and 4) what is the representation learning strategy for efficient/cost-aware scene parsing? We discuss three multi-level video scene segmentation scenarios based on different aspects of scene properties and efficiency requirements. The first case addresses the problem of consistent geometric and semantic video segmentation for outdoor scenes. We propose a geometric scene layout representation, or a stage scene model, to efficiently capture the dependency between the semantic and geometric labels. We build a unified conditional random field for joint modeling of the semantic class, geometric label and the stage representation, and design an alternating inference algorithm to minimize the resulting energy function. The second case focuses on the problem of simultaneous pixel-level and object-level segmentation in videos. We propose to incorporate foreground object information into pixel labeling by jointly reasoning semantic labels of supervoxels, object instance tracks and geometric relations between objects. 
In order to model objects, we take an exemplar approach based on a small set of object annotations to generate a set of object proposals. We then design a conditional random field framework that jointly models the supervoxel labels and object instance segments. To scale up our method, we develop an active inference strategy to improve the efficiency of multi-level video parsing, which adaptively selects an informative subset of object proposals and performs inference on the resulting compact model. The last case explores the problem of learning a flexible representation for efficient scene labeling. We propose a dynamic hierarchical model that allows us to achieve flexible trade-offs between efficiency and accuracy. Our approach incorporates the cost of feature computation and model inference, and optimizes the model performance for any given test-time budget. We evaluate all our methods on several publicly available video and image semantic segmentation datasets, and demonstrate superior performance in efficiency and accuracy. Keywords: Semantic video segmentation, Multi-level scene understanding, Efficient inference, Cost-aware scene parsing

    Spatial Modeling of Compact Polarimetric Synthetic Aperture Radar Imagery

    The RADARSAT Constellation Mission (RCM) utilizes compact polarimetric (CP) mode to provide data with varying resolutions, supporting a wide range of applications including oil spill detection, sea ice mapping, and land cover analysis. However, the complexity and variability of CP data, influenced by factors such as weather conditions and satellite infrastructure, introduce signature ambiguity. This ambiguity poses challenges in accurate object classification, reducing discriminability and increasing uncertainty. To address these challenges, this thesis introduces tailored spatial models in CP SAR imagery through the utilization of machine learning techniques. Firstly, to enhance oil spill monitoring, a novel conditional random field (CRF) is introduced. The CRF model leverages the statistical properties of CP SAR data and exploits similarities in labels and features among neighboring pixels to effectively model spatial interactions. By mitigating the impact of speckle noise and accurately distinguishing oil spill candidates from oil-free water, the CRF model achieves successful results even in scenarios where the availability of labeled samples is limited. This highlights the capability of CRF in handling situations with a scarcity of training data. Secondly, to improve the accuracy of sea ice mapping, a region-based automated classification methodology is developed. This methodology incorporates learned features, spatial context, and statistical properties from various SAR modes, resulting in enhanced classification accuracy and improved algorithmic efficiency. Thirdly, the presence of a high degree of heterogeneity in target distribution presents an additional challenge in land cover mapping tasks, further compounded by signature ambiguity. To address this, a novel transformer model is proposed. 
The transformer model incorporates both fine- and coarse-grained spatial dependencies between pixels and leverages different levels of features to enhance the accuracy of land cover type detection. The proposed approaches have undergone extensive experimentation in various remote sensing tasks, validating their effectiveness. By introducing tailored spatial models and innovative algorithms, this thesis successfully addresses the inherent complexity and variability of CP data, thereby ensuring the accuracy and reliability of diverse applications in the field of remote sensing.