27 research outputs found

    SCI

    Persistent homology is a powerful tool in Topological Data Analysis (TDA) for succinctly capturing the topological properties of data at different spatial resolutions. For graph data, the shape and structure of the neighborhood of individual data items (nodes) are an essential means of characterizing their properties. We propose using persistent homology methods to capture the structural and topological properties of graphs, and we apply them to the problem of link prediction. We achieve encouraging results on nine different real-world datasets, which attest to the potential of persistent homology-based methods for network analysis.

    Finding and testing network communities by lumped Markov chains

    Identifying communities (or clusters), namely groups of nodes with comparatively strong internal connectivity, is a fundamental task for deeply understanding the structure and function of a network. Yet, there is a lack of formal criteria for defining communities and for testing their significance. We propose a sharp definition which is based on a significance threshold. By means of a lumped Markov chain model of a random walker, a quality measure called "persistence probability" is associated with a cluster. The cluster is then defined as an "α-community" if this probability is not smaller than α. Consistently, a partition composed of α-communities is an "α-partition". These definitions turn out to be very effective for finding and testing communities. If a set of candidate partitions is available, setting the desired α-level allows one to immediately select the α-partition with the finest decomposition. Simultaneously, the persistence probabilities quantify the significance of each single community. Because it assesses the quality of each cluster individually, this approach can also disclose single well-defined communities even in networks which overall do not possess a definite clusterized structure.
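The α-community test described in this abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the network is undirected and connected, so the random walker's stationary probability at a node is proportional to that node's strength (weighted degree), and the persistence probability of a cluster then reduces to the internal edge weight (counted in both directions) divided by the total strength of the cluster's nodes. The function names `build_adjacency`, `persistence_probability`, and `is_alpha_partition` are hypothetical and chosen only for illustration.

```python
def build_adjacency(edges):
    """Build a symmetric adjacency map {node: {neighbor: weight}} from
    (u, v, weight) triples of an undirected graph (illustrative helper)."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def persistence_probability(adj, cluster):
    """Probability that a stationary random walker currently inside
    `cluster` is still inside it after one step.

    Assumes an undirected, connected graph, where the stationary
    probability of node i is proportional to its strength s_i and the
    transition probability is w_ij / s_i.
    """
    cluster = set(cluster)
    # Weight of edges staying inside the cluster (each edge counted twice,
    # once per direction, matching how strengths count edges).
    internal = sum(
        w for i in cluster for j, w in adj[i].items() if j in cluster
    )
    # Total strength of the cluster's nodes.
    total = sum(sum(adj[i].values()) for i in cluster)
    return internal / total if total else 0.0


def is_alpha_partition(adj, partition, alpha):
    """True if every cluster has persistence probability >= alpha."""
    return all(persistence_probability(adj, c) >= alpha for c in partition)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1),
         (3, 4, 1), (4, 5, 1), (3, 5, 1),
         (2, 3, 1)]
adj = build_adjacency(edges)
partition = [{0, 1, 2}, {3, 4, 5}]
scores = [persistence_probability(adj, c) for c in partition]
```

In this toy graph each triangle has internal weight 6 (three edges, both directions) and total strength 7, so both clusters score 6/7. Under the stated assumptions the partition qualifies as an α-partition for α = 0.8 but not for α = 0.9.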

    Evolution of communities of software: using tensor decompositions to compare software ecosystems

    © 2019 The Authors. Published by Springer. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1007/s41109-019-0193-5
    Modern software development is often a collaborative effort in which many authors re-use and share code through software libraries. Modern software “ecosystems” are complex socio-technical systems which can be represented as a multilayer dynamic network. Many of these libraries and software packages are open-source and developed in the open on sites such as , so there is a large amount of data available about these networks. Studying these networks could be of interest to anyone choosing or designing a programming language. In this work, we use tensor factorisation to explore the dynamics of communities of software, and then compare these dynamics between languages on a dataset of approximately 1 million software projects. We hope to inform the debate on software dependencies, recently re-ignited by the malicious takeover of the npm package and other incidents, by giving a clearer picture of the structure of software dependency networks and by exploring how the choices of language designers (for example, the size of standard libraries, or the standards to which packages are held before admission to a language ecosystem is granted) may have shaped their language ecosystems. We establish that adjusted mutual information is a valid metric by which to assess the number of communities in a tensor decomposition. We find that there are striking differences between the communities found across different software ecosystems, and that communities do experience large and interpretable changes in activity over time. The differences between the elm and R software ecosystems, which see some communities decline over time, and the more conventional software ecosystems of Python, Java and JavaScript, which do not see many declining communities, are particularly marked.
    OAB’s work was supported as part of an Engineering and Physical Sciences Research Council (EPSRC) grant, project reference EP/I028099/1. Published version.

    Topological Characteristics of Class Collaborations


    Are refactoring practices related to clusters in java software?

    Refactoring is widely used among the practices of Agile software development. In this preliminary work we present an empirical study carried out on several releases of 5 software systems written in Java. We focus on the effect of refactoring activities on the topology of the software network. We find that refactoring activities involve classes that are linked together into clusters inside the software network, and that the refactoring activity may modify these clusters in different ways. This could lead to significant changes in the source code, and knowledge of these changes could be valuable for people involved in software development.

    Refactoring clustering in Java software networks

    We present a study on the refactoring activities performed during the evolution of 7 popular open-source Java software systems, using a complex network approach. We find that classes affected by refactorings are more likely to be interlinked than others, forming connected subgraphs. Our results show that in a software network, classes linked to refactored classes are likely to be refactored themselves. This result is meaningful because knowing how refactored classes are arranged inside a network could help developers in maintenance and refactoring activities.