Combining Multiple Clusterings via Crowd Agreement Estimation and Multi-Granularity Link Analysis
The clustering ensemble technique aims to combine multiple clusterings into a
probably better and more robust clustering and has received increasing
attention in recent years. Existing clustering ensemble approaches have two
main limitations. Firstly, many approaches lack the
ability to weight the base clusterings without access to the original data and
can be affected significantly by low-quality, or even ill, clusterings.
Secondly, they generally focus on the instance level or cluster level in the
ensemble system and fail to integrate multi-granularity cues into a unified
model. To address these two limitations, this paper proposes to solve the
clustering ensemble problem via crowd agreement estimation and
multi-granularity link analysis. We present the normalized crowd agreement
index (NCAI) to evaluate the quality of base clusterings in an unsupervised
manner and thus weight the base clusterings in accordance with their clustering
validity. To explore the relationship between clusters, the source-aware
connected triple (SACT) similarity is introduced with regard to their common
neighbors and the source reliability. Based on NCAI and multi-granularity
information collected among base clusterings, clusters, and data instances, we
further propose two novel consensus functions, termed weighted evidence
accumulation clustering (WEAC) and graph partitioning with multi-granularity
link analysis (GP-MGLA), respectively. The experiments are conducted on eight
real-world datasets. The experimental results demonstrate the effectiveness and
robustness of the proposed methods.
Comment: The MATLAB source code of this work is available at:
https://www.researchgate.net/publication/28197031
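The idea of weighting base clusterings by how well they agree with the rest of the ensemble, and then accumulating weighted co-occurrence evidence, can be sketched as follows. This is a minimal illustration and not the authors' code: the abstract does not define NCAI, so the weight used here (mean normalized mutual information with the other base clusterings) is a stand-in assumption, and the function names `nmi`, `agreement_weights`, and `weighted_coassociation` are hypothetical. The final consensus step (for example, agglomerative clustering on the matrix) is omitted.

```python
from collections import Counter
from math import log


def nmi(a, b):
    """Normalized mutual information between two label lists (sqrt normalization)."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    h_a = -sum(c / n * log(c / n) for c in ca.values())
    h_b = -sum(c / n * log(c / n) for c in cb.values())
    if h_a == 0 or h_b == 0:
        return 0.0  # a single-cluster labeling carries no information
    mi = sum(c / n * log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    return mi / (h_a * h_b) ** 0.5


def agreement_weights(clusterings):
    """Assumed stand-in for NCAI: mean NMI with the other clusterings, normalized to sum to 1."""
    m = len(clusterings)
    raw = [
        sum(nmi(clusterings[i], clusterings[j]) for j in range(m) if j != i) / (m - 1)
        for i in range(m)
    ]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else [1.0 / m] * m


def weighted_coassociation(clusterings, weights):
    """Entry (i, j) sums the weights of the clusterings that put i and j together."""
    n = len(clusterings[0])
    co = [[0.0] * n for _ in range(n)]
    for labels, w in zip(clusterings, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += w
    return co


# Two base clusterings agree (up to label names); the third disagrees with both.
base = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
weights = agreement_weights(base)
matrix = weighted_coassociation(base, weights)
```

In this toy ensemble the dissenting third clustering receives a weight of zero, so it contributes nothing to the co-association matrix, which is the behaviour the weighting is meant to achieve for low-quality base clusterings.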
Segmentation of turbulent computational fluid dynamics simulations with unsupervised ensemble learning
Computer vision and machine learning tools offer an exciting new way to automatically analyze and categorize information from complex computer simulations. Here we design an ensemble machine learning framework that can independently and robustly categorize and dissect the simulation output of turbulent flow patterns into distinct structure catalogs. The segmentation is performed using an unsupervised clustering algorithm, which segments physical structures by grouping together similar pixels in simulation images. The accuracy and robustness of the resulting segment region boundaries are enhanced by combining information from multiple simultaneously evaluated clustering operations. The stacking of object segmentation evaluations is performed using image mask combination operations. This statistically combined ensemble (SCE) of different cluster masks allows us to construct cluster reliability metrics for each pixel and for the associated segments without any prior user input. By comparing the similarity of different cluster occurrences in the ensemble, we can also assess the optimal number of clusters needed to describe the data. Furthermore, by relying on ensemble-averaged spatial segment region boundaries, the SCE method enables reconstruction of more accurate and robust region of interest (ROI) boundaries for the different image data clusters. We apply the SCE algorithm to two-dimensional simulation data snapshots of magnetically dominated fully kinetic turbulent plasma flows where accurate ROI boundaries are needed for geometrical measurements of intermittent flow structures known as current sheets.
Peer reviewed
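The mask-stacking step described above can be illustrated with a small sketch. It assumes that each ensemble member yields a binary mask (1 where a pixel belongs to a given cluster) and that per-pixel reliability is the fraction of members that agree; the majority threshold of 0.5 and the function names `pixel_reliability` and `consensus_mask` are assumptions made for illustration, not details taken from the paper.

```python
def pixel_reliability(masks):
    """Fraction of ensemble members that mark each pixel as part of the cluster."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(m[r][c] for m in masks) / n for c in range(cols)]
        for r in range(rows)
    ]


def consensus_mask(masks, threshold=0.5):
    """Keep pixels whose reliability exceeds the (assumed) threshold."""
    return [
        [1 if value > threshold else 0 for value in row]
        for row in pixel_reliability(masks)
    ]


# Three hypothetical 2x3 masks for the same cluster from different clustering runs.
masks = [
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 1, 1]],
]
reliability = pixel_reliability(masks)
roi = consensus_mask(masks)
```

Pixels on which all runs agree get reliability 1.0, while pixels near disputed boundaries get intermediate values, which is what makes the stacked ensemble usable as a reliability map.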
Automatic region-of-interest extraction in low depth-of-field images
PhD Thesis
Automatic extraction of focused regions from images with low depth-of-field
(DOF) is a problem that does not yet have an efficient solution. The capability of
extracting focused regions can help to bridge the semantic gap by integrating
image regions which are meaningfully relevant and generally do not exhibit
uniform visual characteristics. There are two main difficulties in extracting
focused regions from low DOF images using high-frequency-based techniques:
computational complexity and performance.
A novel unsupervised segmentation approach based on ensemble clustering is
proposed to extract the focused regions from low DOF images in two stages.
The first stage is to cluster image blocks in a joint contrast-energy feature space
into three constituent groups. To achieve this, we make use of a normal
mixture-based model along with the standard expectation-maximization (EM)
algorithm at two consecutive levels of block size. To avoid the common
problem of local optima experienced in many models, an ensemble EM
clustering algorithm is proposed. As a result, relevant blocks, i.e., block-based
region-of-interest (ROI), closely conforming to image objects are extracted.
In stage two, two different approaches have been developed to extract
pixel-based ROI. In the first approach, a binary saliency map is constructed
from the relevant blocks at the pixel level, which is based on difference of
Gaussian (DOG) and binarization methods. Then, a set of morphological
operations is employed to create the pixel-based ROI from the map.
Experimental results demonstrate that the proposed approach achieves an
average segmentation performance of 91.3% and is computationally 3 times
faster than the best existing approach. In the second approach, a minimal graph
cut is constructed by using the max-flow method and also by using
object/background seeds provided by the ensemble clustering algorithm.
Experimental results demonstrate an average segmentation performance of 91.7%
and approximately 50% reduction of the average computational time by the
proposed colour-based approach compared with existing unsupervised
approaches.
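The abstract does not say how the ensemble EM algorithm combines its runs. One common way to reduce sensitivity to local optima is to run EM from several random starting points and keep the fit with the highest log-likelihood. The sketch below shows that restart strategy for a one-dimensional normal mixture; it is an assumption-laden illustration, not the thesis's algorithm, and the names `em_gmm_1d` and `best_of_restarts` as well as the toy data are hypothetical.

```python
import math
import random


def _mixture_density(x, weights, means, variances):
    """Density of a 1-D normal mixture at x."""
    return sum(
        w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    )


def em_gmm_1d(data, k, rng, iters=100):
    """Fit a k-component 1-D normal mixture with EM; return (log_likelihood, sorted means)."""
    means = rng.sample(data, k)  # random initialization from the data points
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [
                w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            s = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pi / s for pi in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the variance positive
            weights[j] = nj / len(data)
    ll = sum(
        math.log(_mixture_density(x, weights, means, variances) or 1e-300)
        for x in data
    )
    return ll, sorted(means)


def best_of_restarts(data, k, restarts=10, seed=0):
    """Run EM from several random starts and keep the highest-likelihood fit."""
    rng = random.Random(seed)
    return max((em_gmm_1d(data, k, rng) for _ in range(restarts)), key=lambda fit: fit[0])


# Hypothetical 1-D feature values forming two well-separated groups.
data = [0.0, 0.1, 0.2, -0.1, 5.0, 5.1, 4.9, 5.2]
log_likelihood, means = best_of_restarts(data, k=2)
```

Selecting by log-likelihood is only one possible combination rule; an ensemble could instead merge the label assignments of all runs, which is closer to the consensus approaches described in the other abstracts.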
- …