5,963 research outputs found

    On morphological hierarchical representations for image processing and spatial data clustering

    Full text link
    Hierarchical data representations in the context of classi cation and data clustering were put forward during the fties. Recently, hierarchical image representations have gained renewed interest for segmentation purposes. In this paper, we briefly survey fundamental results on hierarchical clustering and then detail recent paradigms developed for the hierarchical representation of images in the framework of mathematical morphology: constrained connectivity and ultrametric watersheds. Constrained connectivity can be viewed as a way to constrain an initial hierarchy in such a way that a set of desired constraints are satis ed. The framework of ultrametric watersheds provides a generic scheme for computing any hierarchical connected clustering, in particular when such a hierarchy is constrained. The suitability of this framework for solving practical problems is illustrated with applications in remote sensing

    Statistical Mechanics of Community Detection

    Full text link
    Starting from a general \textit{ansatz}, we show how community detection can be interpreted as finding the ground state of an infinite range spin glass. Our approach applies to weighted and directed networks alike. It contains the \textit{at hoc} introduced quality function from \cite{ReichardtPRL} and the modularity QQ as defined by Newman and Girvan \cite{Girvan03} as special cases. The community structure of the network is interpreted as the spin configuration that minimizes the energy of the spin glass with the spin states being the community indices. We elucidate the properties of the ground state configuration to give a concise definition of communities as cohesive subgroups in networks that is adaptive to the specific class of network under study. Further we show, how hierarchies and overlap in the community structure can be detected. Computationally effective local update rules for optimization procedures to find the ground state are given. We show how the \textit{ansatz} may be used to discover the community around a given node without detecting all communities in the full network and we give benchmarks for the performance of this extension. Finally, we give expectation values for the modularity of random graphs, which can be used in the assessment of statistical significance of community structure

    Communities in Networks

    Full text link
    We survey some of the concepts, methods, and applications of community detection, which has become an increasingly important area of network science. To help ease newcomers into the field, we provide a guide to available methodology and open problems, and discuss why scientists from diverse backgrounds are interested in these problems. As a running theme, we emphasize the connections of community detection to problems in statistical physics and computational optimization.Comment: survey/review article on community structure in networks; published version is available at http://people.maths.ox.ac.uk/~porterm/papers/comnotices.pd

    Business-oriented Analysis of a Social Network of University Students

    Get PDF
    Despites the great interest caused by social networks in Business Science, their analysis is rarely performed both in a global and systematic way in this field: most authors focus on parts of the studied network, or on a few nodes considered individually. This could be explained by the fact that practical extraction of social networks is a difficult and costly task, since the specific relational data it requires are often difficult to access and thereby expensive. One may ask if equivalent information could be extracted from less expensive individual data, i.e. data concerning single individuals instead of several ones. In this work, we try to tackle this problem through group detection. We gather both types of data from a population of students, and estimate groups separately using individual and relational data, leading to sets of clusters and communities, respectively. We found out there is no strong overlapping between them, meaning both types of data do not convey the same information in this specific context, and can therefore be considered as complementary. However, a link, even if weak, exists and appears when we identify the most discriminant attributes relatively to the communities. Implications in Business Science include community prediction using individual data.Social Networks; Business Science; Cluster Analysis; Community Detection; Community Comparison; Individual Data; Relational Data

    Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale

    Full text link
    Notions of community quality underlie network clustering. While studies surrounding network clustering are increasingly common, a precise understanding of the realtionship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms -- Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on information recovery metrics. Our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it absolutely superior. Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters

    Interpretable Clustering using Unsupervised Binary Trees

    Get PDF
    We herein introduce a new method of interpretable clustering that uses unsupervised binary trees. It is a three-stage procedure, the first stage of which entails a series of recursive binary splits to reduce the heterogeneity of the data within the new subsamples. During the second stage (pruning), consideration is given to whether adjacent nodes can be aggregated. Finally, during the third stage (joining), similar clusters are joined together, even if they do not descend from the same node originally. Consistency results are obtained, and the procedure is used on simulated and real data sets.Comment: 25 pages, 6 figure

    Considerations about multistep community detection

    Full text link
    The problem and implications of community detection in networks have raised a huge attention, for its important applications in both natural and social sciences. A number of algorithms has been developed to solve this problem, addressing either speed optimization or the quality of the partitions calculated. In this paper we propose a multi-step procedure bridging the fastest, but less accurate algorithms (coarse clustering), with the slowest, most effective ones (refinement). By adopting heuristic ranking of the nodes, and classifying a fraction of them as `critical', a refinement step can be restricted to this subset of the network, thus saving computational time. Preliminary numerical results are discussed, showing improvement of the final partition.Comment: 12 page
    • …
    corecore