
    Towards Theoretical Foundations of Clustering

    Clustering is a central unsupervised learning task with a wide variety of applications. Unlike in supervised learning, different clustering algorithms may yield dramatically different outputs for the same input sets. As such, the choice of algorithm is crucial. When selecting a clustering algorithm, users tend to focus on cost-related considerations, such as running times, software purchasing costs, etc. Yet differences concerning the output of the algorithms are a more primal consideration. We propose an approach for selecting clustering algorithms based on differences in their input-output behaviour. This approach relies on identifying significant properties of clustering algorithms and classifying algorithms based on the properties that they satisfy. We begin with Kleinberg's impossibility result, which relies on concise abstract properties that are well-suited for our approach. Kleinberg showed that three specific properties cannot be satisfied by the same algorithm. We illustrate that the impossibility result is a consequence of the formalism used, proving that these properties can be formulated without leading to inconsistency in the context of clustering quality measures or algorithms whose input requires the number of clusters. Combining Kleinberg's properties with newly proposed ones, we provide an extensive property-based classification of common clustering paradigms. We use some of these properties to provide a novel characterization of the class of linkage-based algorithms. That is, we distil a small set of properties that uniquely identify this family of algorithms. Lastly, we investigate how the output of algorithms is affected by the addition of small, potentially adversarial, sets of points. We prove that given clusterable input, the output of k-means is robust to the addition of a small number of data points. On the other hand, clusterings produced by many well-known methods, including linkage-based techniques, can be changed radically by adding a small number of elements.

    A Theoretical Study of Clusterability and Clustering Quality

    Clustering is a widely used technique, with applications ranging from data mining, bioinformatics and image analysis to marketing, psychology, and city planning. Despite the practical importance of clustering, there is very limited theoretical analysis of the topic. We make a step towards building theoretical foundations for clustering by carrying out an abstract analysis of two central concepts in clustering: clusterability and clustering quality. We compare a number of notions of clusterability found in the literature. While all these notions attempt to measure the same property, and all appear to be reasonable, we show that they are pairwise inconsistent. In addition, we give the first computational complexity analysis of a few notions of clusterability. In the second part of the thesis, we discuss how the quality of a given clustering can be defined (and measured). Users often need to compare the quality of clusterings obtained by different methods. Perhaps more importantly, users need to determine whether a given clustering is good enough to be used in further data mining analysis. We analyze what a measure of clustering quality should look like. We do that by introducing a set of requirements ('axioms') of clustering quality measures. We propose a number of clustering quality measures that satisfy these requirements.

    Clustering tales from the Greek construction sector: lessons from experience

    The idea of increasing regional and national economic competitiveness through the implementation of cluster strategies is not something new. In each business sector, in each country, the creation of clusters has been used to capitalise on sector characteristics and address country-specific productivity needs. While clusters have met with significant success in many contexts, the Greek context, and in particular the Greek construction sector, has not been so fruitful. This paper, through the development of a conceptual framework, questionnaires with 92 firms and interviews with 10 key firms, sought to investigate the critical success factors for the creation of a cluster within the challenging context of the Greek construction sector. Using evidence of good practice from other European countries facing similar challenges and the empirical data, the findings indicated a series of factors which firms could adopt, mitigate against or manage to help improve the potential success of the cluster. The findings therefore have important implications for interventions not only by the state and local authorities that will encourage construction firms to participate in a cluster, but also by the managers/owners/practitioners for the creation of the required foundations for their participation in an environment where competitors cooperate.

    Social Cohesion, Structural Holes, and a Tale of Two Measures

    EMBARGOED - author can archive pre-print or post-print on any open access repository after 12 months from publication. Publication date is May 2013 so embargoed until May 2014. This is an author's accepted manuscript (deposited at arXiv, arXiv:1211.0719v2 [physics.soc-ph]), which was subsequently published in Journal of Statistical Physics, May 2013, Volume 151, Issue 3-4, pp 745-764. The final publication is available at link.springer.com: http://link.springer.com/article/10.1007/s10955-013-0722-

    A Potentiality and Conceptuality Interpretation of Quantum Physics

    We elaborate on a new interpretation of quantum mechanics which we introduced recently. The main hypothesis of this new interpretation is that quantum particles are entities interacting with matter conceptually, which means that pieces of matter function as interfaces for the conceptual content carried by the quantum particles. We explain how our interpretation was inspired by our earlier analysis of non-locality as non-spatiality and a specific interpretation of quantum potentiality, which we illustrate by means of the example of two interconnected vessels of water. We show by means of this example that philosophical realism is not in contradiction with the recent findings with respect to Leggett's inequalities and their violations. We explain our recent work on using the quantum formalism to model human concepts and their combinations and how this has given rise to the foundational ideas of our new quantum interpretation. We analyze the equivalence of meaning in the realm of human concepts and coherence in the realm of quantum particles, and how the duality of abstract and concrete leads naturally to a Heisenberg uncertainty relation. We illustrate the role played by interference and entanglement and show how the new interpretation explains the problems related to identity and individuality in quantum mechanics. We put forward a possible scenario for the emergence of the reality of macroscopic objects. Comment: 20 pages, 1 figure