15 research outputs found

    Axioms for graph clustering quality functions

    We investigate properties that intuitively ought to be satisfied by graph clustering quality functions, that is, functions that assign a score to a clustering of a graph. Graph clustering, also known as network community detection, is often performed by optimizing such a function. Two axioms tailored for graph clustering quality functions are introduced, and the four axioms introduced in previous work on distance-based clustering are reformulated and generalized for the graph setting. We show that modularity, a standard quality function for graph clustering, does not satisfy all of these six properties. This motivates the derivation of a new family of quality functions, adaptive scale modularity, which does satisfy the proposed axioms. Adaptive scale modularity has two parameters, which give greater flexibility in the kinds of clusterings that can be found. Standard graph clustering quality functions, such as normalized cut and unnormalized cut, are obtained as special cases of adaptive scale modularity. In general, the results of our investigation indicate that the considered axiomatic framework covers existing 'good' quality functions for graph clustering, and can be used to derive an interesting new family of quality functions. Comment: 23 pages. Full text and sources available on: http://www.cs.ru.nl/~T.vanLaarhoven/graph-clustering-axioms-2014
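To make the notion of a quality function concrete, the following is a minimal sketch of standard (Newman-Girvan) modularity for an unweighted, undirected graph. It is not taken from the paper: the function name `modularity`, the edge-list representation, and the two-triangle example graph are assumptions chosen for illustration, and adaptive scale modularity itself is not implemented here.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Score a clustering: sum over clusters of e_c/m - (d_c/(2m))^2.

    edges      -- list of (u, v) pairs, one per undirected edge (assumed format)
    cluster_of -- dict mapping each vertex to its cluster label
    """
    m = len(edges)
    internal = defaultdict(int)    # e_c: edges with both ends in cluster c
    degree_sum = defaultdict(int)  # d_c: sum of vertex degrees in cluster c
    for u, v in edges:
        degree_sum[cluster_of[u]] += 1
        degree_sum[cluster_of[v]] += 1
        if cluster_of[u] == cluster_of[v]:
            internal[cluster_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical example: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {v: "all" for v in range(6)}

score_two = modularity(edges, two_clusters)
score_one = modularity(edges, one_cluster)
```

In this example the two-triangle clustering scores higher than the single-cluster clustering, which always scores zero under this definition.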

    Incompatibility boundaries for properties of community partitions

    We prove the incompatibility of certain desirable properties of community partition quality functions. Our results generalize the impossibility result of [Kleinberg 2003] by considering sets of weaker properties. In particular, we use an alternative notion to solve the central issue of the consistency property. (The latter means that modifying the graph in a way consistent with a partition should not have counterintuitive effects.) Our results clearly show that community partition methods should not be expected to perfectly satisfy all ideally desired properties. We then proceed to show that this incompatibility no longer holds when slightly relaxed versions of the properties are considered, and in fact we provide examples of simple quality functions satisfying these relaxed properties. An experimental study of these quality functions shows a behavior comparable to established methods in some situations, but more debatable results in others. This suggests that defining a notion of good partition in communities probably requires imposing additional properties. Comment: 17 pages, 3 figures

    Effect of the quality factor in the Leiden algorithm (Leiden algoritmasında kalite faktörünün etkisi)

    The Leiden algorithm is widely used to cluster network graphs. It divides a given network into smaller clusters, each of which is a relatively dense subnetwork of vertices. The division is guided by quality factors. In this study, we compare the results of the Leiden algorithm under two quality factors, Modularity and the Constant Potts Model (CPM). For our analysis, we used the 3×3 knight graph. We examined resolutions from 0.1 to 4.0 for Modularity and from 0.1 to 1.0 for CPM. The maximum quality scores are 0.9 and 0.59375 for Modularity and CPM, respectively. In both cases, the quality decreased continuously as the resolution increased. Both quality factors follow similar trends, but CPM divides the graph relatively more rapidly.
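As a rough illustration of the setup described in this abstract, the sketch below builds the 3×3 knight graph and evaluates one common form of the CPM quality, the sum over clusters of e_c − γ·n_c(n_c − 1)/2. This is an assumption-laden sketch: the helper names `knight_graph` and `cpm_quality` are hypothetical, the Leiden optimization itself is not implemented, and the study may use a different normalization, so the scores here need not match the values it reports.

```python
from collections import Counter
from itertools import product


def knight_graph(rows, cols):
    """Return the edges of the knight's-move graph on a rows x cols board."""
    moves = [(1, 2), (2, 1), (-1, 2), (-2, 1),
             (1, -2), (2, -1), (-1, -2), (-2, -1)]
    edges = set()
    for r, c in product(range(rows), range(cols)):
        for dr, dc in moves:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # frozenset stores each undirected edge only once
                edges.add(frozenset({(r, c), (nr, nc)}))
    return [tuple(sorted(e)) for e in edges]


def cpm_quality(edges, cluster_of, resolution):
    """Assumed CPM form: sum_c (e_c - resolution * n_c * (n_c - 1) / 2)."""
    size = Counter(cluster_of.values())
    internal = Counter(cluster_of[u] for u, v in edges
                       if cluster_of[u] == cluster_of[v])
    return sum(internal[c] - resolution * n * (n - 1) / 2
               for c, n in size.items())


edges = knight_graph(3, 3)

# On a 3x3 board the centre square has no knight moves, so it is isolated.
ring_plus_centre = {v: "ring" for e in edges for v in e}
ring_plus_centre[(1, 1)] = "centre"
singletons = {(r, c): (r, c) for r, c in product(range(3), range(3))}

quality_ring = cpm_quality(edges, ring_plus_centre, resolution=0.1)
quality_singletons = cpm_quality(edges, singletons, resolution=0.1)
```

Sweeping `resolution` over a range of values and comparing candidate clusterings gives a feel for how the preferred partition shifts toward smaller clusters as the resolution grows.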

    Incompatibility boundaries for properties of community partitions

    We prove the incompatibility of certain desirable properties of community partition quality functions. Our results generalize the impossibility result of [Kleinberg 2003] by considering sets of weaker properties. In particular, we use an alternative notion to solve the central issue of the consistency property. (The latter means that modifying the graph in a way consistent with a partition should not have counterintuitive effects.) Our results clearly show that community partition methods should not be expected to perfectly satisfy all ideally desired properties. We then proceed to show that this incompatibility no longer holds when slightly relaxed versions of the properties are considered, and we provide examples of simple quality functions satisfying these relaxed properties. An experimental study of these quality functions shows a behavior comparable to established methods in some situations, but more debatable results in others. This suggests that defining a notion of good partition in communities probably requires imposing additional properties.

    Systematic Analysis of Cluster Similarity Indices: How to Validate Validation Measures

    Many cluster similarity indices are used to evaluate clustering algorithms, and choosing the best one for a particular task remains an open problem. We demonstrate that this problem is crucial: there are many disagreements among the indices, these disagreements do affect which algorithms are preferred in applications, and this can lead to degraded performance in real-world systems. We propose a theoretical framework to tackle this problem: we develop a list of desirable properties and conduct an extensive theoretical analysis to verify which indices satisfy them. This allows for making an informed choice: given a particular application, one can first select properties that are desirable for the task and then identify indices satisfying these. Our work unifies and considerably extends existing attempts at analyzing cluster similarity indices: we introduce new properties, formalize existing ones, and mathematically prove or disprove each property for an extensive list of validation indices. This broader and more rigorous approach leads to recommendations that considerably differ from how validation indices are currently being chosen by practitioners. Some of the most popular indices are even shown to be dominated by previously overlooked ones
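For readers unfamiliar with cluster similarity indices, here is a minimal sketch of one classic pair-counting index, the Rand index. It is chosen only as a familiar example: the abstract does not name the indices it analyzes, and the function name `rand_index` and the label lists are assumptions for illustration.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which two clusterings agree.

    A pair agrees when both clusterings put the two elements together,
    or both put them apart. Inputs are equal-length label sequences.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Same partition with renamed labels: the index ignores label names.
same_partition = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A partition that cuts across the first one.
crossed_partition = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Different indices can rank the same pair of candidate clusterings differently, which is the kind of disagreement the abstract refers to.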

    Quality indices in clustering (Indices de qualité en clustering)

    National audience. The absence of ground truth, among other factors, makes evaluating a clustering a non-trivial problem, one that requires quality indices suited to the intended goal and to the data. The talk will present the key elements for characterizing a quality index, the main internal and external indices, and an axiomatic approach to choosing an index.