
    Clustering in a hyperbolic model of complex networks

    In this paper we consider the clustering coefficient and clustering function in a random graph model proposed by Krioukov et al.~in 2010. In this model, nodes are chosen randomly inside a disk in the hyperbolic plane and two nodes are connected if they are at most a certain hyperbolic distance from each other. It has been shown that this model has various properties associated with complex networks, e.g. a power-law degree distribution, short distances and a non-vanishing clustering coefficient. Here we show that the clustering coefficient tends in probability to a constant $\gamma$ that we give explicitly as a closed-form expression in terms of $\alpha, \nu$ and certain special functions. This improves earlier work by Gugelmann et al., who proved that the clustering coefficient remains bounded away from zero with high probability, but left open the issue of convergence to a limiting constant. Similarly, we show that $c(k)$, the average clustering coefficient over all vertices of degree exactly $k$, tends in probability to a limit $\gamma(k)$ which we give explicitly as a closed-form expression in terms of $\alpha, \nu$ and certain special functions. We extend this last result to sequences $(k_n)_n$ where $k_n$ grows as a function of $n$. Our results show that $\gamma(k)$ scales differently, as $k$ grows, for different ranges of $\alpha$. More precisely, there exist constants $c_{\alpha,\nu}$ depending on $\alpha$ and $\nu$ such that, as $k \to \infty$, $\gamma(k) \sim c_{\alpha,\nu} \cdot k^{2-4\alpha}$ if $\frac{1}{2} < \alpha < \frac{3}{4}$, $\gamma(k) \sim c_{\alpha,\nu} \cdot \log(k) \cdot k^{-1}$ if $\alpha = \frac{3}{4}$, and $\gamma(k) \sim c_{\alpha,\nu} \cdot k^{-1}$ when $\alpha > \frac{3}{4}$. These results contradict a claim of Krioukov et al., which stated that the limiting values $\gamma(k)$ should always scale with $k^{-1}$ as we let $k$ grow.
    Comment: 127 pages
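
    To make the model concrete, here is a minimal Python sketch (not from the paper) that samples a Krioukov-style hyperbolic random graph and estimates its empirical clustering coefficient. The disk radius $R = 2\log(n/\nu)$ and the radial density proportional to $\sinh(\alpha r)$ follow the standard formulation; the values of $n$, $\alpha$, $\nu$ and the seed are arbitrary illustrative choices.

```python
import numpy as np

def hyperbolic_rgg(n=200, alpha=0.75, nu=1.0, seed=0):
    """Sample a hyperbolic random graph: n points in a disk of radius
    R = 2 log(n/nu); radii drawn with density proportional to
    sinh(alpha * r) via inverse-CDF sampling, angles uniform; two nodes
    are joined iff their hyperbolic distance is at most R."""
    rng = np.random.default_rng(seed)
    R = 2.0 * np.log(n / nu)
    u = rng.random(n)
    r = np.arccosh(1.0 + u * (np.cosh(alpha * R) - 1.0)) / alpha
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    # pairwise distances via the hyperbolic law of cosines:
    # cosh d = cosh r1 cosh r2 - sinh r1 sinh r2 cos(dtheta)
    dtheta = np.pi - np.abs(np.pi - np.abs(theta[:, None] - theta[None, :]))
    cosh_d = (np.cosh(r[:, None]) * np.cosh(r[None, :])
              - np.sinh(r[:, None]) * np.sinh(r[None, :]) * np.cos(dtheta))
    adj = np.arccosh(np.maximum(cosh_d, 1.0)) <= R
    np.fill_diagonal(adj, False)
    return adj

def clustering_coefficient(adj):
    """Average local clustering coefficient over vertices of degree >= 2."""
    vals = []
    for i in range(adj.shape[0]):
        nbrs = np.flatnonzero(adj[i])
        k = len(nbrs)
        if k < 2:
            continue
        links = adj[np.ix_(nbrs, nbrs)].sum() / 2  # edges among neighbours
        vals.append(links / (k * (k - 1) / 2))
    return float(np.mean(vals)) if vals else 0.0
```

    Running `clustering_coefficient(hyperbolic_rgg())` for growing $n$ is a quick empirical check of the convergence the paper proves.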

    On Variants of k-means Clustering

    \textit{Clustering problems} often arise in fields like data mining and machine learning, where a collection of objects must be grouped into similar groups with respect to a similarity (or dissimilarity) measure. Among clustering problems, $k$-means clustering in particular has received much attention from researchers. Despite the fact that $k$-means is a very well studied problem, its status in the plane is still open. In particular, it is unknown whether it admits a PTAS in the plane; the best known approximation bound in polynomial time is $9+\epsilon$. In this paper, we consider the following variant of $k$-means. Given a set $C$ of points in $\mathcal{R}^d$ and a real $f > 0$, find a finite set $F$ of points in $\mathcal{R}^d$ that minimizes the quantity $f \cdot |F| + \sum_{p\in C} \min_{q \in F} \|p-q\|^2$. For any fixed dimension $d$, we design a local search PTAS for this problem. We also give a "bi-criterion" local search algorithm for $k$-means which uses $(1+\epsilon)k$ centers and yields a solution whose cost is at most $(1+\epsilon)$ times the cost of an optimal $k$-means solution; the algorithm runs in polynomial time for any fixed dimension. The contribution of this paper is twofold. On the one hand, we are able to handle the squares of distances in an elegant manner, which yields a near-optimal approximation bound and leads us towards a better understanding of the $k$-means problem. On the other hand, our analysis of local search may also be useful for other geometric problems, which is important considering that very little is known about the local search method for geometric approximation.
    Comment: 15 pages
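
    The objective $f \cdot |F| + \sum_{p\in C} \min_{q \in F} \|p-q\|^2$ and the local search idea can be sketched in a few lines of Python. This is only an illustration: the paper's PTAS performs multi-swaps over a carefully constructed candidate set, whereas this toy version does single add/drop/swap moves and restricts facilities to the input points.

```python
def kmeans_facility_cost(points, facilities, f):
    """Cost of the k-means variant: f * |F| plus the sum over clients
    of the squared distance to the nearest open facility."""
    open_cost = f * len(facilities)
    conn = sum(min((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                   for q in facilities)
               for p in points)
    return open_cost + conn

def one_swap_local_search(points, candidates, f):
    """Toy local search: repeatedly apply the first strictly improving
    add, drop or 1-swap move until none exists."""
    F = [candidates[0]]
    def cost(S):
        return kmeans_facility_cost(points, S, f) if S else float("inf")
    improved = True
    while improved:
        improved = False
        current = cost(F)
        moves = ([F + [q] for q in candidates if q not in F] +
                 [[x for x in F if x != q] for q in F] +
                 [[x for x in F if x != q1] + [q2]
                  for q1 in F for q2 in candidates if q2 not in F])
        for S in moves:
            if cost(S) < current - 1e-12:
                F, improved = S, True
                break
    return F
```

    On two well-separated groups of points with a small opening cost $f$, the search opens one facility per group.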

    Clustering properties of a generalised critical Euclidean network

    Many real-world networks exhibit a scale-free feature, a small diameter and a high clustering tendency. We have studied the properties of a growing network which has all these features, in which an incoming node is connected to its $i$th predecessor of degree $k_i$ with a link of length $\ell$ using a probability proportional to $k^\beta_i \ell^{\alpha}$. For $\alpha > -0.5$, the network is scale-free at $\beta = 1$ with the degree distribution $P(k) \propto k^{-\gamma}$ and $\gamma = 3.0$, as in the Barab\'asi-Albert model ($\alpha = 0, \beta = 1$). We find a phase boundary in the $\alpha$-$\beta$ plane along which the network is scale-free. Interestingly, we find scale-free behaviour even for $\beta > 1$ for $\alpha < -0.5$, where the behaviour of the degree distribution and the clustering coefficients indicates the existence of a new universality class. The network has a small diameter in the entire scale-free region. The clustering coefficients emulate the behaviour of most real networks for increasingly negative values of $\alpha$ on the phase boundary.
    Comment: 4 pages REVTeX, 4 figures
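
    A minimal simulation of the growth rule helps fix ideas: each new node, placed uniformly at random in the unit square as a stand-in for the paper's geometry, attaches to one predecessor $i$ with probability proportional to $k_i^\beta \ell^\alpha$. The placement, network size and seed are illustrative assumptions, not details taken from the paper.

```python
import math
import random

def grow_network(n=500, alpha=-1.0, beta=1.0, seed=1):
    """Grow a network where node t attaches to one predecessor i with
    probability proportional to deg(i)^beta * dist(t, i)^alpha."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    deg = [0] * n
    edges = [(0, 1)]            # seed the growth with a single edge
    deg[0] = deg[1] = 1
    for t in range(2, n):
        weights = [(deg[i] ** beta) * (math.dist(pos[t], pos[i]) ** alpha)
                   for i in range(t)]
        j = rng.choices(range(t), weights=weights)[0]
        edges.append((t, j))
        deg[t] += 1
        deg[j] += 1
    return edges, deg
```

    Plotting the tail of `deg` for various $(\alpha, \beta)$ is a quick way to see where $P(k) \propto k^{-\gamma}$ behaviour appears.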

    Centroid-Based Clustering with ab-Divergences

    Centroid-based clustering is a widely used technique within unsupervised learning algorithms across many research fields. The success of any centroid-based clustering relies on the choice of the similarity measure in use. In recent years, most studies have focused on including various divergence measures in the traditional hard k-means algorithm. In this article, we consider the problem of centroid-based clustering using the family of ab-divergences, which is governed by two parameters, a and b. We propose a new iterative algorithm, ab-k-means, giving closed-form solutions for the computation of the sided centroids. The algorithm can be fine-tuned by means of this pair of values, yielding a wide range of the most frequently used divergences. Moreover, it is guaranteed to converge to local minima for a wide range of values of the pair (a, b). Our theoretical contribution has been validated by several experiments performed with synthetic and real data exploring the (a, b) plane. The numerical results confirm the quality of the algorithm and its suitability for several practical applications.
    MINECO TEC2017-82807-
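
    As an illustration (not the authors' exact algorithm), the sketch below runs hard clustering with the alpha-beta divergence of Cichocki et al. for strictly positive data and a, b, a+b all nonzero. Setting the gradient of $\sum_i D_{ab}(x_i \| c)$ to zero gives a closed-form right-sided centroid, the a-th power mean $c = (\operatorname{mean}(x^a))^{1/a}$, which is the kind of closed-form sided centroid the abstract refers to.

```python
import numpy as np

def ab_divergence(p, q, a, b):
    """Alpha-beta divergence for positive vectors, with a, b, a+b != 0.
    For a = b = 1 it reduces to half the squared Euclidean distance."""
    return -np.sum(p**a * q**b
                   - a / (a + b) * p**(a + b)
                   - b / (a + b) * q**(a + b)) / (a * b)

def ab_kmeans(X, k, a=1.0, b=1.0, iters=50, seed=0):
    """Hard clustering: assign each point to the centroid minimising
    D_ab(x || c); update each centroid with the closed-form a-th power
    mean of its cluster."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.array([np.argmin([ab_divergence(x, c, a, b) for c in C])
                           for x in X])
        newC = np.array([np.mean(X[labels == j]**a, axis=0)**(1.0 / a)
                         if np.any(labels == j) else C[j]
                         for j in range(k)])
        if np.allclose(newC, C):
            break
        C = newC
    return labels, C
```

    Varying (a, b) changes the assignment geometry while the update above stays closed-form; the paper's full algorithm covers both sided centroids.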

    Self-stabilizing k-clustering in mobile ad hoc networks

    In this thesis, two silent self-stabilizing asynchronous distributed algorithms are given for constructing a k-clustering of a connected network of processes; these are the first self-stabilizing solutions to this problem. One algorithm, FLOOD, takes O(k) time and uses O(k log n) space per process, while the second algorithm, BFS-MIS-CLSTR, takes O(n) time and uses O(log n) space, where n is the size of the network. Processes have unique IDs, and there is no designated leader. BFS-MIS-CLSTR solves three problems: it elects a leader and constructs a BFS tree for the network, constructs a minimal independent set, and finally constructs a k-clustering. Finding a minimal k-clustering is known to be NP-hard. If the network is a unit disk graph in a plane, BFS-MIS-CLSTR is within a factor of O(7.2552k) of choosing the minimal number of clusters. A lower bound is given, showing that any comparison-based algorithm for the k-clustering problem that takes o(diam) rounds has very bad worst-case performance.
    Keywords: BFS tree construction, k-clustering, leader election, MIS construction, self-stabilization, unit disk graph
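
    For readers unfamiliar with the problem, a k-clustering picks cluster heads so that every process is within graph distance k of some head. The sequential sketch below (not the thesis's distributed, self-stabilizing construction) builds such a set greedily; a deterministic node ordering stands in for the unique process IDs.

```python
from collections import deque

def bfs_dist(adj, src, limit):
    """Return the set of nodes within graph distance <= limit of src."""
    dist = {src: 0}
    dq = deque([src])
    while dq:
        u = dq.popleft()
        if dist[u] == limit:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                dq.append(v)
    return set(dist)

def greedy_k_clustering(adj, k):
    """Pick each still-uncovered node (in ID order) as a cluster head
    and cover everything within distance k of it; the heads form a
    k-dominating set, i.e. a valid set of k-clustering heads."""
    uncovered = set(adj)
    heads = []
    for v in sorted(adj):
        if v in uncovered:
            heads.append(v)
            uncovered -= bfs_dist(adj, v, k)
    return heads
```

    On a path 0-1-2-3-4-5 with k = 1, this selects heads 0, 2 and 4, each node being adjacent to (or equal to) a head.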

    Complex-valued K-means clustering of interpolative separable density fitting algorithm for large-scale hybrid functional enabled \textit{ab initio} molecular dynamics simulations within plane waves

    K-means clustering, as a classic unsupervised machine learning algorithm, is the key step in selecting the interpolation sampling points in interpolative separable density fitting (ISDF) decomposition. Real-valued K-means clustering for accelerating the ISDF decomposition has been demonstrated for large-scale hybrid functional enabled \textit{ab initio} molecular dynamics (hybrid AIMD) simulations within plane-wave basis sets where the Kohn-Sham orbitals are real-valued. However, it has been unclear whether such K-means clustering works for complex-valued Kohn-Sham orbitals. Here, we apply K-means clustering to hybrid AIMD simulations with complex-valued Kohn-Sham orbitals, using an improved weight function defined as the sum of the square modulus of the complex-valued Kohn-Sham orbitals. Numerical results demonstrate that this improved weight function yields smoother and more delocalized interpolation sampling points, resulting in a smoother energy potential, smaller energy drift and longer time steps for hybrid AIMD simulations compared to the weight function used in the previous real-valued K-means algorithm. In particular, we find that the improved algorithm obtains more accurate oxygen-oxygen radial distribution functions in liquid water and a more accurate power spectrum in crystalline silicon dioxide than the previous K-means algorithm. Finally, we describe a massively parallel implementation of this ISDF decomposition to accelerate large-scale complex-valued hybrid AIMD simulations containing thousands of atoms (2,744 atoms), which can scale up to 5,504 CPU cores on modern supercomputers.
    Comment: 43 pages, 12 figures
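
    The weight function from the abstract, $w(\mathbf{r}) = \sum_i |\psi_i(\mathbf{r})|^2$, plugs into an ordinary weighted K-means over the real-space grid points. The sketch below shows only that coupling, with random orbitals and grid points as placeholder data; a real ISDF implementation would then take the grid points nearest the converged centroids as interpolation sampling points.

```python
import numpy as np

def orbital_weights(psi):
    """Weight per grid point: sum of the square modulus of the
    complex-valued orbitals (psi has shape n_orbitals x n_gridpoints)."""
    return np.sum(np.abs(psi) ** 2, axis=0)

def weighted_kmeans(points, weights, k, iters=100, seed=0):
    """Weighted K-means on grid point coordinates: assignments use
    Euclidean distance, centroids are weight-averaged cluster means."""
    rng = np.random.default_rng(seed)
    C = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((points[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        newC = np.array([
            np.average(points[labels == j], axis=0,
                       weights=weights[labels == j])
            if np.any(labels == j) else C[j]
            for j in range(k)])
        if np.allclose(newC, C):
            break
        C = newC
    return labels, C
```

    Since $|\psi|^2$ is insensitive to the orbital phase, the same weight works for real- and complex-valued orbitals, which is the point the paper exploits.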

    Algorithms for Stable Matching and Clustering in a Grid

    We study a discrete version of a geometric stable marriage problem originally proposed in a continuous setting by Hoffman, Holroyd, and Peres, in which points in the plane are stably matched to cluster centers, as prioritized by their distances, so that each cluster center is apportioned a set of points of equal area. We show that, for a discretization of the problem to an $n \times n$ grid of pixels with $k$ centers, the problem can be solved in time $O(n^2 \log^5 n)$, and we experiment with two slower but more practical algorithms and a hybrid method that switches from one of these algorithms to the other to gain greater efficiency than either algorithm alone. We also show how to combine geometric stable matchings with a $k$-means clustering algorithm, so as to provide a geometric political-districting algorithm that views distance in economic terms, and we experiment with weighted versions of stable $k$-means in order to improve the connectivity of the resulting clusters.
    Comment: 23 pages, 12 figures. To appear (without the appendices) at the 18th International Workshop on Combinatorial Image Analysis, June 19-21, 2017, Plovdiv, Bulgaria
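
    Because pixels and centers rank each other by the same distances, the stable matching can be obtained by a simple (if slow) greedy pass over all pixel-center pairs in increasing distance order, assigning a pair whenever the pixel is free and the center has quota left. This sketch assumes $k$ divides $n^2$ so quotas are exactly equal; the paper's contribution is computing the same matching far faster.

```python
import itertools
import math

def stable_grid_matching(n, centers):
    """Match each pixel of an n x n grid to one of the given centers,
    each center receiving an equal quota of n*n // len(centers) pixels,
    by scanning pixel-center pairs in increasing distance order."""
    k = len(centers)
    quota = (n * n) // k
    pairs = sorted(
        ((math.dist(p, c), p, ci)
         for p in itertools.product(range(n), repeat=2)
         for ci, c in enumerate(centers)),
        key=lambda t: t[0])
    assignment, load = {}, [0] * k
    for _, p, ci in pairs:
        if p not in assignment and load[ci] < quota:
            assignment[p] = ci
            load[ci] += 1
    return assignment
```

    Each pixel ends up with the nearest center that had not already filled its quota with closer pixels, which is exactly the stability condition here.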

    Evaluating the index of panoramic X-ray image quality using K-means clustering method

    Background: A panoramic X-ray image is generally considered optimal when the occlusal plane is slightly arched, presenting a gentle curve; however, the ideal angle of the occlusal plane has not been determined. This study provides a simple evaluation index for panoramic X-ray image quality, built using various image and cluster analyses, which can be used as a training tool for radiological technologists and as a reference for image quality improvement. Results: A reference panoramic X-ray image was acquired using a phantom with the Frankfurt plane positioned horizontally, centered in the middle, and the frontal plane centered on the canine teeth. Other images with positioning errors were acquired with anteroposterior shifts, vertical rotations of the Frankfurt plane, and horizontal left/right rotations. The reference and positioning-error images were evaluated with cross-correlation coefficients for the occlusal plane profile, the left/right angle difference, the peak signal-to-noise ratio (PSNR), and deformation vector fields (DVF). The results of the image analyses were scored for the positioning-error images using K-means clustering analysis. Next, we analyzed the correlations between the total score, the cross-correlation analysis of the occlusal plane curves, the left/right angle difference, PSNR, and DVF. In the scoring, the positioning-error images with the highest quality were those with posterior shifts of 1 mm. In the analysis of the correlations between each pair of results, the strongest correlations (r = 0.7-0.9) were between all combinations of PSNR, DVF, and total score. Conclusions: The scoring of positioning-error images using K-means clustering analysis is a valid evaluation indicator of correct patient positioning for technologists in training.
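
    Of the metrics scored above, PSNR is the most self-contained; a minimal sketch of its standard definition (assuming 8-bit images, so a 255 peak value) is:

```python
import numpy as np

def psnr(reference, image, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and a
    positioning-error image: 10 * log10(max_val^2 / MSE)."""
    mse = np.mean((reference.astype(float) - image.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)
```

    Higher PSNR means the positioning-error image deviates less from the reference, which is why it correlates strongly with the study's total quality score.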