Clustering in a hyperbolic model of complex networks
In this paper we consider the clustering coefficient and clustering function in a random graph model proposed by Krioukov et al. in 2010. In this model, nodes are chosen randomly inside a disk in the hyperbolic plane and two nodes are connected if they are at most a certain hyperbolic distance from each other. It has been shown that this model has various properties associated with complex networks, e.g. a power-law degree distribution, short distances and a non-vanishing clustering coefficient. Here we show that the clustering coefficient tends in probability to a constant $\gamma$ that we give explicitly as a closed-form expression in terms of the model parameters $\alpha, \nu$ and certain special functions. This improves earlier work by Gugelmann et al., who proved that the clustering coefficient remains bounded away from zero with high probability, but left open the issue of convergence to a limiting constant. Similarly, we are able to show that $c(k)$, the average clustering coefficient over all vertices of degree exactly $k$, tends in probability to a limit $\gamma(k)$ which we give explicitly as a closed-form expression in terms of $\alpha, \nu$ and certain special functions. We are able to extend this last result also to sequences $(k_n)_n$ where $k_n$ grows as a function of $n$. Our results show that $\gamma(k)$ scales differently, as $k$ grows, for different ranges of $\alpha$. More precisely, there exist constants $c_{\alpha,\nu}$ depending on $\alpha$ and $\nu$, such that as $k \to \infty$, $\gamma(k) \sim c_{\alpha,\nu} \cdot k^{2-4\alpha}$ if $\frac{1}{2} < \alpha < \frac{3}{4}$, $\gamma(k) \sim c_{\alpha,\nu} \cdot \log(k) \cdot k^{-1}$ if $\alpha = \frac{3}{4}$, and $\gamma(k) \sim c_{\alpha,\nu} \cdot k^{-1}$ when $\frac{3}{4} < \alpha < 1$. These results contradict a claim of Krioukov et al., which stated that the limiting values $\gamma(k)$ should always scale with $k^{-1}$ as we let $k$ grow. Comment: 127 pages
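The model described above can be simulated directly. The sketch below samples the hyperbolic random graph (nodes placed in a disk of radius R with the usual quasi-uniform radial density, connected when their hyperbolic distance is at most R) and estimates the global clustering coefficient. Parameter names alpha and nu follow the standard convention for this model; this is an illustration, not the paper's code.

```python
import math
import random

def sample_hyperbolic_graph(n, alpha=0.8, nu=1.0, seed=0):
    """Sample the Krioukov et al. hyperbolic random graph as an adjacency dict."""
    rng = random.Random(seed)
    R = 2 * math.log(n / nu)  # disk radius
    pts = []
    for _ in range(n):
        u = rng.random()
        # inverse-CDF sample of the radial density alpha*sinh(alpha*r)/(cosh(alpha*R)-1)
        r = math.acosh(1 + (math.cosh(alpha * R) - 1) * u) / alpha
        pts.append((r, rng.uniform(0, 2 * math.pi)))
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            (r1, t1), (r2, t2) = pts[i], pts[j]
            dt = math.pi - abs(math.pi - abs(t1 - t2))  # angle difference
            # hyperbolic law of cosines; clamp for floating-point safety
            ch = math.cosh(r1) * math.cosh(r2) - math.sinh(r1) * math.sinh(r2) * math.cos(dt)
            if math.acosh(max(ch, 1.0)) <= R:  # connect if within hyperbolic distance R
                adj[i].add(j)
                adj[j].add(i)
    return adj

def clustering_coefficient(adj):
    """Global clustering coefficient: (3 * triangles) / wedges."""
    # summing common neighbours over each edge counts every triangle three times
    tri = sum(len(adj[i] & adj[j]) for i in adj for j in adj[i] if i < j)
    wedges = sum(len(adj[i]) * (len(adj[i]) - 1) // 2 for i in adj)
    return tri / wedges if wedges else 0.0
```

Even at small n, repeated runs show the non-vanishing clustering the abstract refers to; the limit statements above concern n tending to infinity.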
On Variants of k-means Clustering
\textit{Clustering problems} often arise in fields like data mining and machine learning, where the goal is to group a collection of objects into similar groups with respect to a similarity (or dissimilarity) measure. Among clustering problems, \textit{$k$-means} clustering in particular has received much attention from researchers. Despite the fact that $k$-means is a very well studied problem, its status in the plane is still open. In particular, it is unknown whether it admits a PTAS in the plane. The best known approximation bound achievable in polynomial time is $9+\varepsilon$.
In this paper, we consider the following variant of $k$-means. Given a set $P$ of $n$ points in $\mathbb{R}^d$ and a real $f > 0$, find a finite set $C$ of points in $\mathbb{R}^d$ that minimizes the quantity $\sum_{p \in P} d(p, C)^2 + f\,|C|$. For any fixed dimension $d$, we design a local search PTAS for this problem. We also give a "bi-criterion" local search algorithm for $k$-means which uses $(1+\varepsilon)k$ centers and yields a solution whose cost is at most $(1+\varepsilon)$ times the cost of an optimal $k$-means solution. The algorithm runs in polynomial time for any fixed dimension.
The contribution of this paper is twofold. On the one hand, we are able to handle the square of distances in an elegant manner, which yields a near-optimal approximation bound. This leads us towards a better understanding of the $k$-means problem. On the other hand, our analysis of local search might also be useful for other geometric problems. This is important considering that very little is known about the local search method for geometric approximation. Comment: 15 pages
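The variant objective, squared distances plus a per-center opening cost f, is easy to state in code. The sketch below evaluates that objective and runs a deliberately simple add/drop local search restricted to input points; the paper's PTAS uses richer swap neighbourhoods, so this is only a toy illustration of the objective.

```python
def sq(p, q):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def cost(points, centers, f):
    """Variant objective: sum of squared distances to the nearest center + f * |C|."""
    return sum(min(sq(p, c) for c in centers) for p in points) + f * len(centers)

def local_search(points, f):
    """Naive add/drop local search over centers drawn from the input points."""
    centers = [points[0]]  # start from a single arbitrary center
    improved = True
    while improved:
        improved = False
        for p in points:  # try adding a candidate center
            if p not in centers and cost(points, centers + [p], f) < cost(points, centers, f):
                centers = centers + [p]
                improved = True
        for c in list(centers):  # try dropping a center
            rest = [d for d in centers if d != c]
            if rest and cost(points, rest, f) < cost(points, centers, f):
                centers = rest
                improved = True
    return centers
```

Because every accepted move strictly decreases the objective and the candidate set is finite, the loop terminates; the parameter f trades off cluster compactness against the number of centers opened.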
Clustering properties of a generalised critical Euclidean network
Many real-world networks exhibit scale-free features, have a small diameter and a high clustering tendency. We have studied the properties of a growing network, which has all these features, in which an incoming node is connected to its $i$th predecessor of degree $k_i$ with a link of length $\ell$ using a probability proportional to $k_i^{\beta}\ell^{\alpha}$. For $\alpha = 0$, the network is scale-free at $\beta = 1$ with the degree distribution $P(k) \sim k^{-\gamma}$ and $\gamma = 3$, as in the Barabási-Albert model. We find a phase boundary in the $\alpha$-$\beta$ plane along which the network is scale-free. Interestingly, we find scale-free behaviour even away from $\beta = 1$ for negative $\alpha$, where the existence of a new universality class is indicated by the behaviour of the degree distribution and the clustering coefficients. The network has a small diameter in the entire scale-free region. The clustering coefficients emulate the behaviour of most real networks for increasing negative values of $\alpha$ on the phase boundary. Comment: 4 pages REVTeX, 4 figures
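The growth rule is straightforward to simulate. The toy version below places nodes uniformly in the unit square and attaches each newcomer to one predecessor chosen with probability proportional to k**beta * l**alpha; the alpha/beta naming and the single-link-per-node simplification are assumptions for illustration, not the paper's exact setup.

```python
import math
import random

def grow_network(n, alpha=0.0, beta=1.0, seed=0):
    """Grow a Euclidean network: attachment weight = degree**beta * distance**alpha."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    deg = [0] * n
    edges = [(0, 1)]  # seed edge between the first two nodes
    deg[0] = deg[1] = 1
    for i in range(2, n):
        weights = []
        for j in range(i):
            l = math.dist(pos[i], pos[j])  # Euclidean link length
            weights.append((deg[j] ** beta) * (l ** alpha))
        j = rng.choices(range(i), weights=weights)[0]
        edges.append((i, j))
        deg[i] += 1
        deg[j] += 1
    return deg, edges
```

With alpha = 0 and beta = 1 the rule reduces to degree-proportional (Barabási-Albert-like) attachment, so hubs emerge; negative alpha biases links toward nearby predecessors, the regime the abstract's clustering discussion concerns.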
Centroid-Based Clustering with αβ-Divergences
Centroid-based clustering is a widely used technique within unsupervised learning algorithms in many research fields. The success of any centroid-based clustering relies on the choice of the similarity measure in use. In recent years, most studies have focused on including several divergence measures in the traditional hard k-means algorithm. In this article, we consider the problem of centroid-based clustering using the family of αβ-divergences, which is governed by two parameters, α and β. We propose a new iterative algorithm, αβ-k-means, giving closed-form solutions for the computation of the sided centroids. The algorithm can be fine-tuned by means of this pair of values, yielding a wide range of the most frequently used divergences. Moreover, it is guaranteed to converge to local minima for a wide range of values of the pair (α, β). Our theoretical contribution has been validated by several experiments performed with synthetic and real data, exploring the (α, β) plane. The numerical results confirm the quality of the algorithm and its suitability for use in several practical applications.
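The sided-centroid idea can be illustrated with one member of the αβ family: the generalized Kullback-Leibler divergence. Because KL is a Bregman divergence, the right-sided centroid argmin over c of the summed divergence D(x_i, c) is simply the arithmetic mean, which gives a closed-form update. This sketch shows divergence-based hard clustering for that special case only; it is not the paper's general αβ-k-means update.

```python
import math

def gkl(x, c):
    """Generalized KL divergence between positive vectors x and c."""
    return sum(xi * math.log(xi / ci) - xi + ci for xi, ci in zip(x, c))

def kl_kmeans(data, k, iters=20):
    """Hard clustering with generalized KL assignment and mean (right-sided) centroids."""
    centroids = [list(data[i]) for i in range(k)]  # seed with the first k points
    assign = [0] * len(data)
    for _ in range(iters):
        # assignment step: nearest centroid under the divergence
        assign = [min(range(k), key=lambda j: gkl(x, centroids[j])) for x in data]
        # update step: closed-form right-sided centroid = arithmetic mean
        for j in range(k):
            members = [x for x, a in zip(data, assign) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids
```

For other (α, β) settings the assignment divergence and the closed-form centroid both change, which is exactly the tunability the abstract highlights.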
Self-stabilizing k-clustering in mobile ad hoc networks
In this thesis, two silent self-stabilizing asynchronous distributed algorithms are given for constructing a k-clustering of a connected network of processes. These are the first self-stabilizing solutions to this problem. One algorithm, FLOOD, takes O(k) time and uses O(k log n) space per process, while the second algorithm, BFS-MIS-CLSTR, takes O(n) time and uses O(log n) space, where n is the size of the network. Processes have unique IDs, and there is no designated leader. BFS-MIS-CLSTR solves three problems: it elects a leader and constructs a BFS tree for the network, constructs a minimal independent set, and finally a k-clustering. Finding a minimal k-clustering is known to be NP-hard. If the network is a unit disk graph in a plane, BFS-MIS-CLSTR is within a factor of O(7.2552k) of choosing the minimal number of clusters. A lower bound is given, showing that any comparison-based algorithm for the k-clustering problem that takes o(diam) rounds has very bad worst-case performance.
Keywords: BFS tree construction, k-clustering, leader election, MIS construction, self-stabilization, unit disk graph
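The target structure, a k-clustering in which every process is within k hops of its cluster head, can be shown with a minimal sequential sketch. The thesis algorithms are distributed and self-stabilizing; the greedy BFS version below only illustrates what a valid k-clustering looks like, with sorted node order standing in for unique IDs.

```python
from collections import deque

def k_clustering(adj, k):
    """Greedy k-clustering: map each node to a head within k hops of it."""
    head = {}
    for v in sorted(adj):          # deterministic order stands in for process IDs
        if v in head:
            continue
        head[v] = v                # v elects itself cluster head
        q = deque([(v, 0)])
        seen = {v}
        while q:                   # claim every unclaimed node within k hops of v
            u, d = q.popleft()
            if d == k:
                continue
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    head.setdefault(w, v)  # keep an earlier head if already claimed
                    q.append((w, d + 1))
    return head
```

Every node receives a head at BFS depth at most k from that head, so the output is a valid k-clustering; minimizing the number of heads is the NP-hard part the abstract mentions.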
Complex-valued K-means clustering of interpolative separable density fitting algorithm for large-scale hybrid functional enabled \textit{ab initio} molecular dynamics simulations within plane waves
K-means clustering, as a classic unsupervised machine learning algorithm, is
the key step to select the interpolation sampling points in interpolative
separable density fitting (ISDF) decomposition. Real-valued K-means clustering
for accelerating the ISDF decomposition has been demonstrated for large-scale
hybrid functional enabled \textit{ab initio} molecular dynamics (hybrid AIMD)
simulations within plane-wave basis sets where the Kohn-Sham orbitals are
real-valued. However, it is unclear whether such K-means clustering works for complex-valued Kohn-Sham orbitals. Here, we apply K-means clustering to hybrid AIMD simulations with complex-valued Kohn-Sham orbitals and use an improved weight function, defined as the sum of the square moduli of the complex-valued Kohn-Sham orbitals, in the K-means clustering. Numerical results demonstrate that this improved weight function yields smoother and more delocalized interpolation sampling points, resulting in a smoother energy potential, smaller energy drift and longer time steps for hybrid AIMD simulations compared to the weight function previously used in the real-valued K-means algorithm. In particular, we find that this improved algorithm obtains more accurate oxygen-oxygen radial distribution functions in liquid water and a more accurate power spectrum in crystalline silicon dioxide compared to the previous K-means algorithm. Finally, we describe a massively parallel implementation of this ISDF decomposition to accelerate large-scale complex-valued hybrid AIMD simulations containing thousands of atoms (2,744 atoms), which can scale up to 5,504 CPU cores on modern supercomputers. Comment: 43 pages, 12 figures
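The selection step the abstract builds on is a weighted K-means over grid points, where each point's weight is the summed square modulus of the (complex-valued) orbitals there and centroids are weight-averaged. The sketch below shows that step only; the grid, orbital values, and sizes are made-up stand-ins, not ISDF production code.

```python
import cmath
import random

def orbital_weight(orbital_values):
    """Weight of a grid point: sum over orbitals of |psi_n(r)|^2 (complex psi)."""
    return sum(abs(z) ** 2 for z in orbital_values)

def weighted_kmeans(points, weights, k, iters=25, seed=0):
    """K-means on grid points where centroids are weighted averages."""
    rng = random.Random(seed)
    centroids = [points[i] for i in rng.sample(range(len(points)), k)]
    for _ in range(iters):
        assign = [min(range(k),
                      key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
                  for p in points]
        for j in range(k):
            idx = [i for i, a in enumerate(assign) if a == j]
            w = sum(weights[i] for i in idx)
            if w > 0:  # weighted centroid update; empty clusters keep their centroid
                centroids[j] = tuple(sum(weights[i] * points[i][d] for i in idx) / w
                                     for d in range(len(points[0])))
    return centroids
```

The resulting centroids (snapped to the nearest grid point in practice) serve as the interpolation sampling points; changing the weight function shifts where they concentrate, which is the lever the improved complex-valued weight exploits.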
Algorithms for Stable Matching and Clustering in a Grid
We study a discrete version of a geometric stable marriage problem originally proposed in a continuous setting by Hoffman, Holroyd, and Peres, in which points in the plane are stably matched to cluster centers, as prioritized by their distances, so that each cluster center is apportioned a set of points of equal area. We show that, for a discretization of the problem to an $n \times n$ grid of pixels with $k$ centers, the problem can be solved in time $O(n^2 \log^5 n)$, and we experiment with two slower but more practical algorithms and a hybrid method that switches from one of these algorithms to the other to gain greater efficiency than either algorithm alone. We also show how to combine geometric stable matchings with a $k$-means clustering algorithm, so as to provide a geometric political-districting algorithm that views distance in economic terms, and we experiment with weighted versions of stable $k$-means in order to improve the connectivity of the resulting clusters. Comment: 23 pages, 12 figures. To appear (without the appendices) at the 18th International Workshop on Combinatorial Image Analysis, June 19-21, 2017, Plovdiv, Bulgaria
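A simple way to see the discrete problem is the greedy-by-distance construction: process all pixel-center pairs in increasing distance and match a pixel to a center while that center still has quota. Since pixels and centers both rank partners by distance, this greedy yields a stable assignment. The grid size, quota rule, and centers below are toy values for illustration, not the paper's faster algorithms.

```python
def stable_grid_assignment(n, centers):
    """Stably assign the pixels of an n-by-n grid to centers with equal quotas."""
    pixels = [(x + 0.5, y + 0.5) for x in range(n) for y in range(n)]
    quota = len(pixels) // len(centers)     # equal-area apportionment
    # all (distance, pixel, center) triples, cheapest first
    pairs = sorted(
        ((px - cx) ** 2 + (py - cy) ** 2, i, j)
        for i, (px, py) in enumerate(pixels)
        for j, (cx, cy) in enumerate(centers)
    )
    match, load = {}, [0] * len(centers)
    for _, i, j in pairs:
        # greedy: match while the pixel is free and the center has quota left
        if i not in match and load[j] < quota:
            match[i] = j
            load[j] += 1
    return match, load
```

This quadratic-time greedy is far slower than the algorithms the abstract analyzes, but it makes the stability condition concrete: no unmatched pixel-center pair can both do better by deviating.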
Evaluating the index of panoramic X-ray image quality using K-means clustering method
Background: A panoramic X-ray image is generally considered optimal when the occlusal plane is slightly arched, presenting a gentle curve. However, the ideal angle of the occlusal plane has not been determined. This study provides a simple evaluation index for panoramic X-ray image quality, built using various image and cluster analyses, which can be used as a training tool for radiological technologists and as a reference for image-quality improvement.
Results: A reference panoramic X-ray image was acquired using a phantom with the Frankfurt plane positioned horizontally, centered in the middle, and the frontal plane centered on the canine teeth. Other images, with positioning errors, were acquired with anteroposterior shifts, vertical rotations of the Frankfurt plane, and horizontal left/right rotations. The reference and positioning-error images were compared using the cross-correlation coefficients of the occlusal plane profiles, the left/right angle difference, the peak signal-to-noise ratio (PSNR), and deformation vector fields (DVF). The results of these image analyses were scored for the positioning-error images using K-means clustering analysis. Next, we analyzed the correlations between the total score, the cross-correlation analysis of the occlusal plane curves, the left/right angle difference, PSNR, and DVF. In the scoring, the positioning-error images with the highest quality were those with posterior shifts of 1 mm. In the analysis of the correlations between each pair of results, the strongest correlations (r = 0.7–0.9) were between all combinations of PSNR, DVF, and total score.
Conclusions: The scoring of positioning-error images using K-means clustering analysis is a valid indicator of correct patient positioning for technologists in training.
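The scoring idea reduces to clustering each image's metric vector into quality tiers. The sketch below groups made-up (PSNR, angle-difference) pairs with a plain K-means; the feature choice and all numbers are illustrative stand-ins, not the study's data or scoring rubric.

```python
def kmeans(data, k, iters=30):
    """Plain K-means over tuples; returns the cluster index of each row."""
    cents = [data[i] for i in range(k)]  # seed with the first k rows
    assign = [0] * len(data)
    for _ in range(iters):
        assign = [min(range(k),
                      key=lambda j: sum((a - b) ** 2 for a, b in zip(x, cents[j])))
                  for x in data]
        for j in range(k):
            grp = [x for x, a in zip(data, assign) if a == j]
            if grp:
                cents[j] = tuple(sum(col) / len(grp) for col in zip(*grp))
    return assign

# rows: (PSNR in dB, |left/right angle difference| in degrees) -- hypothetical values
images = [(42.0, 0.2), (41.5, 0.3), (30.0, 2.5), (29.0, 3.0)]
tiers = kmeans(images, 2)  # 2 tiers: well-positioned vs. badly positioned
```

In practice the features would be standardized first (PSNR and degrees live on different scales); the cluster labels then act as the ordinal quality score the abstract describes.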