Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm
Over the past five decades, k-means has become the clustering algorithm of
choice in many application domains primarily due to its simplicity, time/space
efficiency, and invariance to the ordering of the data points. Unfortunately,
the algorithm's sensitivity to the initial selection of the cluster centers
remains its most serious drawback. Numerous initialization methods have
been proposed to address this drawback. Many of these methods, however, have
time complexity superlinear in the number of data points, which makes them
impractical for large data sets. On the other hand, linear methods are often
random and/or sensitive to the ordering of the data points. These methods are
generally unreliable in that the quality of their results is unpredictable.
Therefore, it is common practice to perform multiple runs of such methods and
take the output of the run that produces the best results. Such a practice,
however, greatly increases the computational requirements of the otherwise
highly efficient k-means algorithm. In this chapter, we investigate the
empirical performance of six linear, deterministic (non-random), and
order-invariant k-means initialization methods on a large and diverse
collection of data sets from the UCI Machine Learning Repository. The results
demonstrate that two relatively unknown hierarchical initialization methods due
to Su and Dy outperform the remaining four methods with respect to two
objective effectiveness criteria. In addition, a recent method due to Erisoglu
et al. performs surprisingly poorly.
Comment: 21 pages, 2 figures, 5 tables, Partitional Clustering Algorithms
(Springer, 2014). arXiv admin note: substantial text overlap with
arXiv:1304.7465, arXiv:1209.196
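
The trade-off described in the abstract above (several randomly seeded runs versus one deterministic, order-invariant seeding) can be illustrated with a minimal pure-Python sketch. This is not the code evaluated in the chapter: the function names (`lloyd`, `maximin_init`, `best_of_random_restarts`), the toy data, and the seeding rule (start from the point nearest the data mean, then repeatedly add the point farthest from the centers chosen so far) are all assumptions made for illustration, and the rule is not claimed to be one of the six methods studied. Barring exact distance ties, the seeding does not depend on the order of the points, and for a fixed number of clusters its cost grows linearly with the number of points.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def lloyd(points, centers, max_iter=100):
    """Batch k-means from the given centers; returns (centers, SSE)."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [mean(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse

def maximin_init(points, k):
    """Hypothetical deterministic seeding (illustrative assumption only)."""
    center = mean(points)
    centers = [min(points, key=lambda p: sq_dist(p, center))]
    while len(centers) < k:
        # Add the point whose nearest chosen center is farthest away.
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def best_of_random_restarts(points, k, runs, seed=0):
    """Common practice: several random seedings, keep the lowest SSE."""
    rng = random.Random(seed)
    best = None
    for _ in range(runs):
        result = lloyd(points, rng.sample(points, k))
        if best is None or result[1] < best[1]:
            best = result
    return best

# Toy data: three small, well-separated groups (made up for this sketch).
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]

det_centers, det_sse = lloyd(points, maximin_init(points, 3))
rev = list(reversed(points))
_, det_sse_reversed = lloyd(rev, maximin_init(rev, 3))
rand_centers, rand_sse = best_of_random_restarts(points, 3, runs=10)
```

The deterministic route runs k-means once, whereas the restart route multiplies the cost by `runs`; that multiplied cost is the computational overhead the abstract refers to.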
Fronthaul-Constrained Cloud Radio Access Networks: Insights and Challenges
As a promising paradigm for fifth generation (5G) wireless communication
systems, cloud radio access networks (C-RANs) have been shown to reduce both
capital and operating expenditures, as well as to provide high spectral
efficiency (SE) and energy efficiency (EE). The fronthaul in such networks,
defined as the transmission link between a baseband unit (BBU) and a remote
radio head (RRH), requires high capacity, but is often constrained. This
article comprehensively surveys recent advances in fronthaul-constrained
C-RANs, including system architectures and key techniques. In particular, key
techniques for alleviating the impact of constrained fronthaul on SE/EE and
quality of service for users, including compression and quantization,
large-scale coordinated processing and clustering, and resource allocation
optimization, are discussed. Open issues in terms of software-defined
networking, network function virtualization, and partial centralization are
also identified.
Comment: 5 figures, accepted by IEEE Wireless Communications. arXiv admin
note: text overlap with arXiv:1407.3855 by other author