36,277 research outputs found

    Doctor of Philosophy in Computing

    Get PDF
    dissertationImage segmentation is the problem of partitioning an image into disjoint segments that are perceptually or semantically homogeneous. As one of the most fundamental computer vision problems, image segmentation is used as a primary step for high-level vision tasks, such as object recognition and image understanding, and has even wider applications in interdisciplinary areas, such as longitudinal brain image analysis. Hierarchical models have gained popularity as a key component in image segmentation frameworks. By imposing structures, a hierarchical model can efficiently utilize features from larger image regions and make optimal inference for final segmentation feasible. We develop a hierarchical merge tree (HMT) model for image segmentation. Motivated by the application in large-scale segmentation of neuronal structures in electron microscopy (EM) images, our model provides a compact representation of region merging hypotheses and utilizes higher order information for efficient segmentation inference. Taking advantage of supervised learning, our model is free from parameter tuning and outperforms previous state-of-the-art methods on both two-dimensional (2D) and three-dimensional EM image data sets without any change. We also extend HMT to the hierarchical merge forest (HMF) model. By identifying region correspondences, HMF utilizes inter-section information to correct intra-section errors and improves 2D EM segmentation accuracy. HMT is a generic segmentation model. We demonstrate this by applying it to natural image segmentation problems. We propose a constrained conditional model formulation with a globally optimal inference algorithm for HMT and an iterative merge tree sampling algorithm that significantly improves its performance. Experimental results show our approach achieves state-of-the-art accuracy for object-independent image segmentation. Finally, we propose a semi-supervised HMT (SSHMT) model to reduce the high demand for labeled data by supervised learning. We introduce a differentiable unsupervised loss term that enforces consistent boundary predictions and develop a Bayesian learning model that combines supervised and unsupervised information. We show that with a very small amount of labeled data, SSHMT consistently performs close to the supervised HMT with full labeled data sets and significantly outperforms HMT trained with the same labeled subsets

    STV-based Video Feature Processing for Action Recognition

    Get PDF
    In comparison to still image-based processes, video features can provide rich and intuitive information about dynamic events occurred over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progresses have been made in the last decade on image processing and seen its successful applications in face matching and object recognition, video-based event detection still remains one of the most difficult challenges in computer vision research due to its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and the often ambiguous analytical methods. In this paper, a Spatio-Temporal Volume (STV) and region intersection (RI) based 3D shape-matching method has been proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stemmed from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. This paper also reported the investigation into techniques for efficient STV data filtering to reduce the amount of voxels (volumetric-pixels) that need to be processed in each operational cycle in the implemented system. The encouraging features and improvements on the operational performance registered in the experiments have been discussed at the end

    Automated detection of extended sources in radio maps: progress from the SCORPIO survey

    Get PDF
    Automated source extraction and parameterization represents a crucial challenge for the next-generation radio interferometer surveys, such as those performed with the Square Kilometre Array (SKA) and its precursors. In this paper we present a new algorithm, dubbed CAESAR (Compact And Extended Source Automated Recognition), to detect and parametrize extended sources in radio interferometric maps. It is based on a pre-filtering stage, allowing image denoising, compact source suppression and enhancement of diffuse emission, followed by an adaptive superpixel clustering stage for final source segmentation. A parameterization stage provides source flux information and a wide range of morphology estimators for post-processing analysis. We developed CAESAR in a modular software library, including also different methods for local background estimation and image filtering, along with alternative algorithms for both compact and diffuse source extraction. The method was applied to real radio continuum data collected at the Australian Telescope Compact Array (ATCA) within the SCORPIO project, a pathfinder of the ASKAP-EMU survey. The source reconstruction capabilities were studied over different test fields in the presence of compact sources, imaging artefacts and diffuse emission from the Galactic plane and compared with existing algorithms. When compared to a human-driven analysis, the designed algorithm was found capable of detecting known target sources and regions of diffuse emission, outperforming alternative approaches over the considered fields.Comment: 15 pages, 9 figure

    Combining multiple resolutions into hierarchical representations for kernel-based image classification

    Get PDF
    Geographic object-based image analysis (GEOBIA) framework has gained increasing interest recently. Following this popular paradigm, we propose a novel multiscale classification approach operating on a hierarchical image representation built from two images at different resolutions. They capture the same scene with different sensors and are naturally fused together through the hierarchical representation, where coarser levels are built from a Low Spatial Resolution (LSR) or Medium Spatial Resolution (MSR) image while finer levels are generated from a High Spatial Resolution (HSR) or Very High Spatial Resolution (VHSR) image. Such a representation allows one to benefit from the context information thanks to the coarser levels, and subregions spatial arrangement information thanks to the finer levels. Two dedicated structured kernels are then used to perform machine learning directly on the constructed hierarchical representation. This strategy overcomes the limits of conventional GEOBIA classification procedures that can handle only one or very few pre-selected scales. Experiments run on an urban classification task show that the proposed approach can highly improve the classification accuracy w.r.t. conventional approaches working on a single scale.Comment: International Conference on Geographic Object-Based Image Analysis (GEOBIA 2016), University of Twente in Enschede, The Netherland
    • …
    corecore