
    Clustering in a hyperbolic model of complex networks

    In this paper we consider the clustering coefficient and clustering function in a random graph model proposed by Krioukov et al.~in 2010. In this model, nodes are chosen randomly inside a disk in the hyperbolic plane and two nodes are connected if they are at most a certain hyperbolic distance from each other. It has been shown that this model has various properties associated with complex networks, e.g. a power-law degree distribution, short distances and a non-vanishing clustering coefficient. Here we show that the clustering coefficient tends in probability to a constant $\gamma$ that we give explicitly as a closed-form expression in terms of $\alpha, \nu$ and certain special functions. This improves earlier work by Gugelmann et al., who proved that the clustering coefficient remains bounded away from zero with high probability, but left open the issue of convergence to a limiting constant. Similarly, we show that $c(k)$, the average clustering coefficient over all vertices of degree exactly $k$, tends in probability to a limit $\gamma(k)$ which we give explicitly as a closed-form expression in terms of $\alpha, \nu$ and certain special functions. We extend this last result to sequences $(k_n)_n$ where $k_n$ grows as a function of $n$. Our results show that $\gamma(k)$ scales differently, as $k$ grows, for different ranges of $\alpha$. More precisely, there exist constants $c_{\alpha,\nu}$ depending on $\alpha$ and $\nu$ such that, as $k \to \infty$, $\gamma(k) \sim c_{\alpha,\nu} \cdot k^{2-4\alpha}$ if $\frac{1}{2} < \alpha < \frac{3}{4}$, $\gamma(k) \sim c_{\alpha,\nu} \cdot \log(k) \cdot k^{-1}$ if $\alpha = \frac{3}{4}$, and $\gamma(k) \sim c_{\alpha,\nu} \cdot k^{-1}$ when $\alpha > \frac{3}{4}$. These results contradict a claim of Krioukov et al., which stated that the limiting values $\gamma(k)$ should always scale with $k^{-1}$ as we let $k$ grow.
    Comment: 127 pages
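
    To make the model concrete, here is a minimal Python sketch (not from the paper) that samples a Krioukov-style hyperbolic random graph and estimates its empirical clustering coefficient. The disk radius $R = 2\log(n/\nu)$ and the radial density proportional to $\sinh(\alpha r)$ follow the standard formulation; the values of $n$, $\alpha$, $\nu$ and the seed are arbitrary illustrative choices.

```python
import numpy as np

def hyperbolic_rgg(n=200, alpha=0.75, nu=1.0, seed=0):
    """Sample a hyperbolic random graph: n points in a disk of radius
    R = 2 log(n/nu); radii drawn with density proportional to
    sinh(alpha * r) via inverse-CDF sampling, angles uniform; two nodes
    are joined iff their hyperbolic distance is at most R."""
    rng = np.random.default_rng(seed)
    R = 2.0 * np.log(n / nu)
    u = rng.random(n)
    r = np.arccosh(1.0 + u * (np.cosh(alpha * R) - 1.0)) / alpha
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    # pairwise distances via the hyperbolic law of cosines:
    # cosh d = cosh r1 cosh r2 - sinh r1 sinh r2 cos(dtheta)
    dtheta = np.pi - np.abs(np.pi - np.abs(theta[:, None] - theta[None, :]))
    cosh_d = (np.cosh(r[:, None]) * np.cosh(r[None, :])
              - np.sinh(r[:, None]) * np.sinh(r[None, :]) * np.cos(dtheta))
    adj = np.arccosh(np.maximum(cosh_d, 1.0)) <= R
    np.fill_diagonal(adj, False)
    return adj

def clustering_coefficient(adj):
    """Average local clustering coefficient over vertices of degree >= 2."""
    vals = []
    for i in range(adj.shape[0]):
        nbrs = np.flatnonzero(adj[i])
        k = len(nbrs)
        if k < 2:
            continue
        links = adj[np.ix_(nbrs, nbrs)].sum() / 2  # edges among neighbours
        vals.append(links / (k * (k - 1) / 2))
    return float(np.mean(vals)) if vals else 0.0
```

    Running `clustering_coefficient(hyperbolic_rgg())` for growing $n$ is a quick empirical check of the convergence the paper proves.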

    On Variants of k-means Clustering

    \textit{Clustering problems} often arise in fields like data mining and machine learning, where a collection of objects must be grouped into similar groups with respect to a similarity (or dissimilarity) measure. Among clustering problems, $k$-means clustering in particular has received much attention from researchers. Despite the fact that $k$-means is a very well studied problem, its status in the plane is still open. In particular, it is unknown whether it admits a PTAS in the plane; the best known approximation bound in polynomial time is $9+\epsilon$. In this paper, we consider the following variant of $k$-means. Given a set $C$ of points in $\mathcal{R}^d$ and a real $f > 0$, find a finite set $F$ of points in $\mathcal{R}^d$ that minimizes the quantity $f \cdot |F| + \sum_{p\in C} \min_{q \in F} \|p-q\|^2$. For any fixed dimension $d$, we design a local search PTAS for this problem. We also give a "bi-criterion" local search algorithm for $k$-means which uses $(1+\epsilon)k$ centers and yields a solution whose cost is at most $(1+\epsilon)$ times the cost of an optimal $k$-means solution; the algorithm runs in polynomial time for any fixed dimension. The contribution of this paper is twofold. On the one hand, we are able to handle the squares of distances in an elegant manner, which yields a near-optimal approximation bound and leads us towards a better understanding of the $k$-means problem. On the other hand, our analysis of local search may also be useful for other geometric problems, which is important considering that very little is known about the local search method for geometric approximation.
    Comment: 15 pages
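
    The objective $f \cdot |F| + \sum_{p\in C} \min_{q \in F} \|p-q\|^2$ and the local search idea can be sketched in a few lines of Python. This is only an illustration: the paper's PTAS performs multi-swaps over a carefully constructed candidate set, whereas this toy version does single add/drop/swap moves and restricts facilities to the input points.

```python
def kmeans_facility_cost(points, facilities, f):
    """Cost of the k-means variant: f * |F| plus the sum over clients
    of the squared distance to the nearest open facility."""
    open_cost = f * len(facilities)
    conn = sum(min((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                   for q in facilities)
               for p in points)
    return open_cost + conn

def one_swap_local_search(points, candidates, f):
    """Toy local search: repeatedly apply the first strictly improving
    add, drop or 1-swap move until none exists."""
    F = [candidates[0]]
    def cost(S):
        return kmeans_facility_cost(points, S, f) if S else float("inf")
    improved = True
    while improved:
        improved = False
        current = cost(F)
        moves = ([F + [q] for q in candidates if q not in F] +
                 [[x for x in F if x != q] for q in F] +
                 [[x for x in F if x != q1] + [q2]
                  for q1 in F for q2 in candidates if q2 not in F])
        for S in moves:
            if cost(S) < current - 1e-12:
                F, improved = S, True
                break
    return F
```

    On two well-separated groups of points with a small opening cost $f$, the search opens one facility per group.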

    Clustering properties of a generalised critical Euclidean network

    Many real-world networks exhibit a scale-free feature, a small diameter and a high clustering tendency. We have studied the properties of a growing network which has all these features, in which an incoming node is connected to its $i$th predecessor of degree $k_i$ with a link of length $\ell$ using a probability proportional to $k^\beta_i \ell^{\alpha}$. For $\alpha > -0.5$, the network is scale-free at $\beta = 1$ with the degree distribution $P(k) \propto k^{-\gamma}$ and $\gamma = 3.0$, as in the Barab\'asi-Albert model ($\alpha = 0, \beta = 1$). We find a phase boundary in the $\alpha$-$\beta$ plane along which the network is scale-free. Interestingly, we find scale-free behaviour even for $\beta > 1$ for $\alpha < -0.5$, where the behaviour of the degree distribution and the clustering coefficients indicates the existence of a new universality class. The network has a small diameter in the entire scale-free region. The clustering coefficients emulate the behaviour of most real networks for increasingly negative values of $\alpha$ on the phase boundary.
    Comment: 4 pages REVTeX, 4 figures
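
    A minimal simulation of the growth rule helps fix ideas: each new node, placed uniformly at random in the unit square as a stand-in for the paper's geometry, attaches to one predecessor $i$ with probability proportional to $k_i^\beta \ell^\alpha$. The placement, network size and seed are illustrative assumptions, not details taken from the paper.

```python
import math
import random

def grow_network(n=500, alpha=-1.0, beta=1.0, seed=1):
    """Grow a network where node t attaches to one predecessor i with
    probability proportional to deg(i)^beta * dist(t, i)^alpha."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    deg = [0] * n
    edges = [(0, 1)]            # seed the growth with a single edge
    deg[0] = deg[1] = 1
    for t in range(2, n):
        weights = [(deg[i] ** beta) * (math.dist(pos[t], pos[i]) ** alpha)
                   for i in range(t)]
        j = rng.choices(range(t), weights=weights)[0]
        edges.append((t, j))
        deg[t] += 1
        deg[j] += 1
    return edges, deg
```

    Plotting the tail of `deg` for various $(\alpha, \beta)$ is a quick way to see where $P(k) \propto k^{-\gamma}$ behaviour appears.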

    Centroid-Based Clustering with ab-Divergences

    Centroid-based clustering is a widely used technique within unsupervised learning algorithms across many research fields. The success of any centroid-based clustering relies on the choice of the similarity measure in use. In recent years, most studies have focused on including various divergence measures in the traditional hard k-means algorithm. In this article, we consider the problem of centroid-based clustering using the family of ab-divergences, which is governed by two parameters, a and b. We propose a new iterative algorithm, ab-k-means, giving closed-form solutions for the computation of the sided centroids. The algorithm can be fine-tuned by means of this pair of values, yielding a wide range of the most frequently used divergences. Moreover, it is guaranteed to converge to local minima for a wide range of values of the pair (a, b). Our theoretical contribution has been validated by several experiments performed with synthetic and real data exploring the (a, b) plane. The numerical results confirm the quality of the algorithm and its suitability for several practical applications.
    MINECO TEC2017-82807-
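
    As an illustration (not the authors' exact algorithm), the sketch below runs hard clustering with the alpha-beta divergence of Cichocki et al. for strictly positive data and a, b, a+b all nonzero. Setting the gradient of $\sum_i D_{ab}(x_i \| c)$ to zero gives a closed-form right-sided centroid, the a-th power mean $c = (\operatorname{mean}(x^a))^{1/a}$, which is the kind of closed-form sided centroid the abstract refers to.

```python
import numpy as np

def ab_divergence(p, q, a, b):
    """Alpha-beta divergence for positive vectors, with a, b, a+b != 0.
    For a = b = 1 it reduces to half the squared Euclidean distance."""
    return -np.sum(p**a * q**b
                   - a / (a + b) * p**(a + b)
                   - b / (a + b) * q**(a + b)) / (a * b)

def ab_kmeans(X, k, a=1.0, b=1.0, iters=50, seed=0):
    """Hard clustering: assign each point to the centroid minimising
    D_ab(x || c); update each centroid with the closed-form a-th power
    mean of its cluster."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.array([np.argmin([ab_divergence(x, c, a, b) for c in C])
                           for x in X])
        newC = np.array([np.mean(X[labels == j]**a, axis=0)**(1.0 / a)
                         if np.any(labels == j) else C[j]
                         for j in range(k)])
        if np.allclose(newC, C):
            break
        C = newC
    return labels, C
```

    Varying (a, b) changes the assignment geometry while the update above stays closed-form; the paper's full algorithm covers both sided centroids.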

    Self-stabilizing k-clustering in mobile ad hoc networks

    In this thesis, two silent self-stabilizing asynchronous distributed algorithms are given for constructing a k-clustering of a connected network of processes; these are the first self-stabilizing solutions to this problem. One algorithm, FLOOD, takes O(k) time and uses O(k log n) space per process, while the second algorithm, BFS-MIS-CLSTR, takes O(n) time and uses O(log n) space, where n is the size of the network. Processes have unique IDs, and there is no designated leader. BFS-MIS-CLSTR solves three problems: it elects a leader and constructs a BFS tree for the network, constructs a minimal independent set, and finally constructs a k-clustering. Finding a minimal k-clustering is known to be NP-hard. If the network is a unit disk graph in a plane, BFS-MIS-CLSTR is within a factor of O(7.2552k) of choosing the minimal number of clusters. A lower bound is given, showing that any comparison-based algorithm for the k-clustering problem that takes o(diam) rounds has very bad worst-case performance.
    Keywords: BFS tree construction, k-clustering, leader election, MIS construction, self-stabilization, unit disk graph
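
    For readers unfamiliar with the problem, a k-clustering picks cluster heads so that every process is within graph distance k of some head. The sequential sketch below (not the thesis's distributed, self-stabilizing construction) builds such a set greedily; a deterministic node ordering stands in for the unique process IDs.

```python
from collections import deque

def bfs_dist(adj, src, limit):
    """Return the set of nodes within graph distance <= limit of src."""
    dist = {src: 0}
    dq = deque([src])
    while dq:
        u = dq.popleft()
        if dist[u] == limit:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                dq.append(v)
    return set(dist)

def greedy_k_clustering(adj, k):
    """Pick each still-uncovered node (in ID order) as a cluster head
    and cover everything within distance k of it; the heads form a
    k-dominating set, i.e. a valid set of k-clustering heads."""
    uncovered = set(adj)
    heads = []
    for v in sorted(adj):
        if v in uncovered:
            heads.append(v)
            uncovered -= bfs_dist(adj, v, k)
    return heads
```

    On a path 0-1-2-3-4-5 with k = 1, this selects heads 0, 2 and 4, each node being adjacent to (or equal to) a head.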

    Complex-valued K-means clustering of interpolative separable density fitting algorithm for large-scale hybrid functional enabled \textit{ab initio} molecular dynamics simulations within plane waves

    K-means clustering, as a classic unsupervised machine learning algorithm, is the key step in selecting the interpolation sampling points in interpolative separable density fitting (ISDF) decomposition. Real-valued K-means clustering for accelerating the ISDF decomposition has been demonstrated for large-scale hybrid functional enabled \textit{ab initio} molecular dynamics (hybrid AIMD) simulations within plane-wave basis sets where the Kohn-Sham orbitals are real-valued. However, it has been unclear whether such K-means clustering works for complex-valued Kohn-Sham orbitals. Here, we apply K-means clustering to hybrid AIMD simulations with complex-valued Kohn-Sham orbitals, using an improved weight function defined as the sum of the square modulus of the complex-valued Kohn-Sham orbitals. Numerical results demonstrate that this improved weight function yields smoother and more delocalized interpolation sampling points, resulting in a smoother energy potential, smaller energy drift and longer time steps for hybrid AIMD simulations compared to the weight function used in the previous real-valued K-means algorithm. In particular, we find that the improved algorithm obtains more accurate oxygen-oxygen radial distribution functions in liquid water and a more accurate power spectrum in crystalline silicon dioxide than the previous K-means algorithm. Finally, we describe a massively parallel implementation of this ISDF decomposition to accelerate large-scale complex-valued hybrid AIMD simulations containing thousands of atoms (2,744 atoms), which can scale up to 5,504 CPU cores on modern supercomputers.
    Comment: 43 pages, 12 figures
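
    The weight function from the abstract, $w(\mathbf{r}) = \sum_i |\psi_i(\mathbf{r})|^2$, plugs into an ordinary weighted K-means over the real-space grid points. The sketch below shows only that coupling, with random orbitals and grid points as placeholder data; a real ISDF implementation would then take the grid points nearest the converged centroids as interpolation sampling points.

```python
import numpy as np

def orbital_weights(psi):
    """Weight per grid point: sum of the square modulus of the
    complex-valued orbitals (psi has shape n_orbitals x n_gridpoints)."""
    return np.sum(np.abs(psi) ** 2, axis=0)

def weighted_kmeans(points, weights, k, iters=100, seed=0):
    """Weighted K-means on grid point coordinates: assignments use
    Euclidean distance, centroids are weight-averaged cluster means."""
    rng = np.random.default_rng(seed)
    C = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((points[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        newC = np.array([
            np.average(points[labels == j], axis=0,
                       weights=weights[labels == j])
            if np.any(labels == j) else C[j]
            for j in range(k)])
        if np.allclose(newC, C):
            break
        C = newC
    return labels, C
```

    Since $|\psi|^2$ is insensitive to the orbital phase, the same weight works for real- and complex-valued orbitals, which is the point the paper exploits.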

    Algorithms for Stable Matching and Clustering in a Grid

    We study a discrete version of a geometric stable marriage problem originally proposed in a continuous setting by Hoffman, Holroyd, and Peres, in which points in the plane are stably matched to cluster centers, as prioritized by their distances, so that each cluster center is apportioned a set of points of equal area. We show that, for a discretization of the problem to an $n \times n$ grid of pixels with $k$ centers, the problem can be solved in time $O(n^2 \log^5 n)$, and we experiment with two slower but more practical algorithms and a hybrid method that switches from one of these algorithms to the other to gain greater efficiency than either algorithm alone. We also show how to combine geometric stable matchings with a $k$-means clustering algorithm, so as to provide a geometric political-districting algorithm that views distance in economic terms, and we experiment with weighted versions of stable $k$-means in order to improve the connectivity of the resulting clusters.
    Comment: 23 pages, 12 figures. To appear (without the appendices) at the 18th International Workshop on Combinatorial Image Analysis, June 19-21, 2017, Plovdiv, Bulgaria
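
    Because pixels and centers rank each other by the same distances, the stable matching can be obtained by a simple (if slow) greedy pass over all pixel-center pairs in increasing distance order, assigning a pair whenever the pixel is free and the center has quota left. This sketch assumes $k$ divides $n^2$ so quotas are exactly equal; the paper's contribution is computing the same matching far faster.

```python
import itertools
import math

def stable_grid_matching(n, centers):
    """Match each pixel of an n x n grid to one of the given centers,
    each center receiving an equal quota of n*n // len(centers) pixels,
    by scanning pixel-center pairs in increasing distance order."""
    k = len(centers)
    quota = (n * n) // k
    pairs = sorted(
        ((math.dist(p, c), p, ci)
         for p in itertools.product(range(n), repeat=2)
         for ci, c in enumerate(centers)),
        key=lambda t: t[0])
    assignment, load = {}, [0] * k
    for _, p, ci in pairs:
        if p not in assignment and load[ci] < quota:
            assignment[p] = ci
            load[ci] += 1
    return assignment
```

    Each pixel ends up with the nearest center that had not already filled its quota with closer pixels, which is exactly the stability condition here.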

    Evaluating the index of panoramic X-ray image quality using K-means clustering method

    Background: A panoramic X-ray image is generally considered optimal when the occlusal plane is slightly arched, presenting a gentle curve; however, the ideal angle of the occlusal plane has not been determined. This study provides a simple evaluation index for panoramic X-ray image quality, built using various image and cluster analyses, which can be used as a training tool for radiological technologists and as a reference for image quality improvement. Results: A reference panoramic X-ray image was acquired using a phantom with the Frankfurt plane positioned horizontally, centered in the middle, and the frontal plane centered on the canine teeth. Other images with positioning errors were acquired with anteroposterior shifts, vertical rotations of the Frankfurt plane, and horizontal left/right rotations. The reference and positioning-error images were evaluated with cross-correlation coefficients for the occlusal plane profile, the left/right angle difference, the peak signal-to-noise ratio (PSNR), and deformation vector fields (DVF). The results of the image analyses were scored for the positioning-error images using K-means clustering analysis. Next, we analyzed the correlations between the total score, the cross-correlation analysis of the occlusal plane curves, the left/right angle difference, PSNR, and DVF. In the scoring, the positioning-error images with the highest quality were those with posterior shifts of 1 mm. In the analysis of the correlations between each pair of results, the strongest correlations (r = 0.7-0.9) were between all combinations of PSNR, DVF, and total score. Conclusions: The scoring of positioning-error images using K-means clustering analysis is a valid evaluation indicator of correct patient positioning for technologists in training.
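
    Of the metrics scored above, PSNR is the most self-contained; a minimal sketch of its standard definition (assuming 8-bit images, so a 255 peak value) is:

```python
import numpy as np

def psnr(reference, image, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and a
    positioning-error image: 10 * log10(max_val^2 / MSE)."""
    mse = np.mean((reference.astype(float) - image.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)
```

    Higher PSNR means the positioning-error image deviates less from the reference, which is why it correlates strongly with the study's total quality score.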