On the centroid of increasing trees

Author: Durant Kevin
Wagner Stephan
Publication date: 01/01/2019

A centroid node in a tree is a node for which the sum of the distances to all other nodes attains its minimum, or equivalently a node with the property that none of its branches contains more than half of the other nodes. We generalise some known results regarding the behaviour of centroid nodes in random recursive trees (due to Moon) to the class of very simple increasing trees, which also includes the families of plane-oriented and d-ary increasing trees. In particular, we derive limits of distributions and moments for the depth and label of the centroid node nearest to the root, as well as for the size of the subtree rooted at this node.
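The branch-size characterisation of a centroid in the abstract above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the paper: the function name `tree_centroids`, the node numbering 0..n-1, and the edge-list input are assumptions made here, and the check uses the common textbook criterion that no branch holds more than n/2 of the n nodes.

```python
from collections import defaultdict


def tree_centroids(n, edges):
    """Return the centroid node(s) of a tree with nodes 0..n-1.

    Hypothetical helper for illustration: a node is reported when its
    largest branch contains at most n/2 nodes.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS from node 0 to record a parent for every node.
    parent = [-1] * n
    seen = [False] * n
    seen[0] = True
    order = []
    stack = [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if not seen[w]:
                seen[w] = True
                parent[w] = u
                stack.append(w)

    # Accumulate subtree sizes from the leaves upwards.
    size = [1] * n
    for u in reversed(order):
        if parent[u] != -1:
            size[parent[u]] += size[u]

    centroids = []
    for u in range(n):
        # Branches of u: each child subtree, plus everything above u.
        largest = n - size[u]
        for w in adj[u]:
            if w != parent[u]:
                largest = max(largest, size[w])
        if 2 * largest <= n:
            centroids.append(u)
    return centroids
```

A tree has either one centroid or two adjacent ones, so on a path with an even number of nodes the function returns both middle nodes.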

arXiv.org e-Print Archive

Episciences.org

Directory of Open Access Journals

Stellenbosch University SUNScholar Repository

Bounds on the radius and status of graphs

Author: Burkard Rainer E.
Rissner Roswitha
Publication venue: 'Wiley'
Publication date: 08/01/2014

Two classical concepts of centrality in a graph are the median and the center. The associated notions of the status and the radius of a graph seem to be unrelated. In this paper, however, we show a clear connection between the two concepts: they attain their minimum and maximum values at the same type of tree graphs. Trees with fixed maximum degree and extremal radius and status, respectively, are characterized. The bounds on radius and status can be transferred to general connected graphs via spanning trees. A new method of proof not only recovers results of Lin et al. on graphs with extremal status, but also yields analogous results on graphs with extremal radius.
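To make the two quantities in the abstract above concrete, the following sketch computes both for a small unweighted connected graph by breadth-first search. It assumes the usual definitions, which the abstract does not spell out: the status of a vertex is the sum of its distances to all other vertices, its eccentricity is the largest such distance, the radius is the minimum eccentricity, and the value reported as the status of the graph is the minimum vertex status. The function names and the adjacency-dict input are illustrative choices, not taken from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def radius_and_min_status(adj):
    """Return (radius, minimum vertex status) of a connected graph.

    adj maps each vertex to a list of its neighbours (assumed format).
    """
    eccentricity = {}
    status = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        eccentricity[v] = max(dist.values())  # farthest vertex from v
        status[v] = sum(dist.values())        # total distance from v
    return min(eccentricity.values()), min(status.values())
```

On a path with five vertices the middle vertex attains both minima (radius 2, status 6), while on a star the hub does, which matches the abstract's remark that both quantities are extremal on the same kinds of trees.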

arXiv.org e-Print Archive

CiteSeerX

Faster K-Means Cluster Estimation

Author: A Likas
DT Pham
SP Lloyd
T Kanungo
Publication date: 17/01/2017

There has been considerable work on improving the popular clustering algorithm k-means in terms of both mean squared error (MSE) and speed. However, most k-means variants compute the distance from each data point to every cluster centroid in every iteration. We propose a fast heuristic to overcome this bottleneck with only a marginal increase in MSE. We observe that, across all iterations of k-means, a data point changes its membership only among a small subset of clusters. Our heuristic predicts such clusters for each data point by looking at nearby clusters after the first iteration of k-means. We augment well-known variants of k-means with our heuristic to demonstrate its effectiveness. On various synthetic and real-world datasets, our heuristic achieves a speed-up of up to 3 times compared to efficient variants of k-means. Comment: 6 pages, Accepted at ECIR 201
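The idea of restricting each point to a small set of candidate clusters can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the candidate list here is simply the `n_candidates` centroids nearest to each point during the first full assignment pass, and the function name, parameters, and random initialisation are all hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def candidate_kmeans(points, k, n_candidates, iterations=10, seed=0):
    """Lloyd-style k-means that only checks a fixed candidate list per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)

    # First pass: full distance computation, remembering the nearest
    # n_candidates clusters for every point (assumed candidate rule).
    candidates = []
    labels = []
    for p in points:
        ranked = sorted(range(k), key=lambda j: sq_dist(p, centroids[j]))
        candidates.append(ranked[:n_candidates])
        labels.append(ranked[0])

    for _ in range(iterations):
        # Update step: recompute each centroid from its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
        # Assignment step: compare against candidate clusters only.
        new_labels = [
            min(cands, key=lambda j: sq_dist(p, centroids[j]))
            for p, cands in zip(points, candidates)
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels
```

With `n_candidates` equal to `k` the sketch reduces to ordinary Lloyd iterations; smaller values cut the per-iteration distance computations from k to `n_candidates` per point, at the risk of missing a better cluster outside the candidate list.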

arXiv.org e-Print Archive

Crossref

An Even Faster and More Unifying Algorithm for Comparing Trees via Unbalanced Bipartite Matchings

Author: Kao Ming-Yang
Lam Tak-Wah
Sung Wing-Kin
Ting Hing-Fung
Publication date: 01/01/2001

A widely used method for determining the similarity of two labeled trees is to compute a maximum agreement subtree of the two trees. Previous work on this similarity measure is only concerned with the comparison of labeled trees of two special kinds, namely, uniformly labeled trees (i.e., trees with all their nodes labeled by the same symbol) and evolutionary trees (i.e., leaf-labeled trees with distinct symbols for distinct leaves). This paper presents an algorithm for comparing trees that are labeled in an arbitrary manner. In addition to this generality, this algorithm is faster than the previous algorithms. Another contribution of this paper is on maximum weight bipartite matchings. We show how to speed up the best known matching algorithms when the input graphs are node-unbalanced or weight-unbalanced. Based on these enhancements, we obtain an efficient algorithm for a new matching problem called the hierarchical bipartite matching problem, which is at the core of our maximum agreement subtree algorithm. Comment: To appear in Journal of Algorithm

arXiv.org e-Print Archive

HKU Scholars Hub

Dynamic load balancing in parallel KD-tree k-means

Author: Di Fatta Giuseppe
Pettinger David
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/06/2010

One of the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have largely been explored in two main directions. First, the amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set, and a third solution incorporates a dynamic load balancing policy.
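The static-partitioning direction mentioned in the abstract above can be illustrated with a single k-means update step in which each worker handles a fixed share of the data and the partial results are then combined. This is only a sequential simulation under assumptions made here: it uses plain distance computations rather than a KD-Tree, a round-robin split rather than any partitioning scheme from the paper, and hypothetical function names throughout.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def partial_sums(chunk, centroids):
    """Work of one node: assign its points, accumulate per-cluster sums."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in chunk:
        j = min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def distributed_kmeans_step(points, centroids, n_workers):
    """One k-means update with a static round-robin partition of the data."""
    chunks = [points[i::n_workers] for i in range(n_workers)]
    # On a real system each call below would run on a separate node.
    results = [partial_sums(chunk, centroids) for chunk in chunks]

    # Reduction: combine partial sums and counts into new centroids.
    dim = len(centroids[0])
    new_centroids = []
    for j, old in enumerate(centroids):
        total = sum(counts[j] for _, counts in results)
        if total == 0:
            new_centroids.append(tuple(old))  # keep empty clusters in place
            continue
        new_centroids.append(
            tuple(sum(sums[j][d] for sums, _ in results) / total
                  for d in range(dim))
        )
    return new_centroids
```

In this plain version every point costs the same amount of work, so a static split is balanced. Once a KD-Tree prunes distance computations unevenly across partitions, that no longer holds, which is the load imbalance the abstract refers to.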

Central Archive at the University of Reading

Crossref