Non-Parametric Probabilistic Image Segmentation
We propose a simple probabilistic generative model for
image segmentation. Like other probabilistic algorithms
(such as EM on a Mixture of Gaussians) the proposed model
is principled, provides both hard and probabilistic cluster
assignments, as well as the ability to naturally incorporate
prior knowledge. While previous probabilistic approaches
are restricted to parametric models of clusters (e.g., Gaussians),
we eliminate this limitation. The suggested approach
does not make heavy assumptions on the shape of the clusters
and can thus handle complex structures. Our experiments
show that the suggested approach outperforms previous
work on a variety of image segmentation tasks.
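As a point of reference for the parametric baseline the abstract mentions (a Mixture of Gaussians), the sketch below shows how probabilistic and hard cluster assignments relate for a one-dimensional, two-component mixture. It is not the non-parametric model proposed in the paper; the function names and the mixture parameters are illustrative assumptions.

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mean, std):
    # Density of a 1-D Gaussian with the given mean and standard deviation.
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))

def responsibilities(x, weights, means, stds):
    # Posterior probability of each component given x (the soft assignment).
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical mixture: two equally weighted unit-variance components.
weights, means, stds = [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]

soft = responsibilities(1.0, weights, means, stds)   # probabilistic assignment
hard = max(range(len(soft)), key=soft.__getitem__)   # hard assignment (argmax)
print(soft, hard)
```

The hard assignment is simply the most probable component, so it discards the uncertainty that the soft assignment keeps.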
DSMK-means “Density-based Split-and-Merge K-means clustering Algorithm”
Clustering is widely used to explore and understand large collections of data. The K-means clustering method is one of the most popular approaches because it is easy to use and simple to implement. This paper introduces the Density-based Split-and-Merge K-means clustering Algorithm (DSMK-means), which was developed to address the stability problems of the standard K-means clustering algorithm and to improve clustering performance on datasets that contain clusters of different, complex shapes as well as noise or outliers. Based on a set of experiments, the paper concludes that DSMK-means produces more accurate results than other algorithms, especially because it can process datasets containing clusters with different shapes and densities, or those with outliers and noise.
Mapping Topographic Structure in White Matter Pathways with Level Set Trees
Fiber tractography on diffusion imaging data offers rich potential for
describing white matter pathways in the human brain, but characterizing the
spatial organization in these large and complex data sets remains a challenge.
We show that level set trees---which provide a concise representation of the
hierarchical mode structure of probability density functions---offer a
statistically-principled framework for visualizing and analyzing topography in
fiber streamlines. Using diffusion spectrum imaging data collected on
neurologically healthy controls (N=30), we mapped white matter pathways from
the cortex into the striatum using a deterministic tractography algorithm that
estimates fiber bundles as dimensionless streamlines. Level set trees were used
for interactive exploration of patterns in the endpoint distributions of the
mapped fiber tracks and an efficient segmentation of the tracks that has
empirical accuracy comparable to standard nonparametric clustering methods. We
show that level set trees can also be generalized to model pseudo-density
functions in order to analyze a broader array of data types, including entire
fiber streamlines. Finally, resampling methods show the reliability of the
level set tree as a descriptive measure of topographic structure, illustrating
its potential as a statistical descriptor in brain imaging analysis. These
results highlight the broad applicability of level set trees for visualizing
and analyzing high-dimensional data like fiber tractography output.
Analysis of Mass Based and Density Based Clustering Techniques on Numerical Datasets
Clustering is a technique adopted by data mining tools across a range of applications. It provides several algorithms that can assess large data sets based on specific parameters and group related points. This paper gives a comparative analysis of density-based and mass-based clustering algorithms. DBSCAN [15] is a base algorithm for density-based clustering techniques. One advantage of these techniques is that they do not require the number of clusters to be given a priori, and they can detect clusters of different shapes and sizes in large amounts of data that contain noise and outliers. OPTICS [14], on the other hand, does not produce a clustering of a data set explicitly, but instead creates an augmented ordering of the database representing its density-based clustering structure. Mass-based clustering algorithms use a mass estimation technique as an alternative to density estimation. In the mass-based clustering algorithm [22], core regions and noise points are also used as parameters. We analyze the algorithms in terms of the parameters essential for creating meaningful clusters. All the algorithms are tested on numerical data sets, both low- and high-dimensional. Keywords: Mass Based (DEMassDBSCAN), DBSCAN, OPTICS
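To illustrate the point that density-based methods need no preset number of clusters and flag outliers as noise, here is a minimal DBSCAN-style sketch in plain Python. It is a simplified illustration, not the implementation evaluated in the paper; the names `dbscan`, `eps`, and `min_pts` and the toy data are assumptions, and the brute-force neighbor search is quadratic in the number of points.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1           # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)   # j is also core, so keep expanding
    return labels

# Toy data (hypothetical): two tight groups and one isolated outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The number of clusters (two here) is discovered from the density parameters rather than supplied, and the isolated point is reported as noise.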
Semi-supervised model-based clustering with controlled clusters leakage
In this paper, we focus on finding clusters in partially categorized data
sets. We propose a semi-supervised version of the Gaussian mixture model, called
C3L, which retrieves natural subgroups of given categories. In contrast to
other semi-supervised models, C3L is parametrized by a user-defined leakage
level, which controls the maximal inconsistency between the initial categorization
and the resulting clustering. Our method can be implemented as a module in practical
expert systems to detect clusters that combine expert knowledge with the true
distribution of the data. Moreover, it can be used to improve the results of
less flexible clustering techniques, such as projection pursuit clustering. The
paper presents an extensive theoretical analysis of the model and a fast algorithm
for its efficient optimization. Experimental results show that C3L finds a
high-quality clustering model, which can be applied to discovering meaningful groups
in partially classified data.