
    Identification-method research for open-source software ecosystems

    In recent years, open-source software (OSS) development has grown, with many developers around the world working on different OSS projects. A variety of open-source software ecosystems have emerged, for instance, GitHub, StackOverflow, and SourceForge. GitHub, one of the most typical social-programming and code-hosting sites, has gathered numerous open-source software projects and developers on a single virtual collaboration platform. Since GitHub itself is a large open-source community, it hosts a collection of software projects that are developed together and coevolve. The great challenge here is how to identify the relationships between these projects, i.e., project relevance. Software-ecosystem identification is the basis of other studies of the ecosystem. Therefore, extracting useful information from GitHub and identifying software ecosystems is particularly important, and it is also a research area in symmetry. In this paper, a Topic-based Project Knowledge Metrics Framework (TPKMF) is proposed. By collecting the multisource dataset of an open-source ecosystem, project-relevance analysis of the open-source software is carried out on the basis of software-ecosystem identification. Then, we use our Spectral Clustering algorithm based on Core Project (CP-SC) to identify software-ecosystem projects and further identify software ecosystems. We verify that most software ecosystems contain a core software project, and that most other projects are associated with it. Furthermore, we analyze the characteristics of the ecosystem and find that interactive information has a greater impact on project relevance. Finally, we summarize the Topic-based Project Knowledge Metrics Framework

    Defining and identifying communities in networks

    The investigation of community structures in networks is an important issue in many domains and disciplines. This problem is relevant for social tasks (objective analysis of relationships on the web), biological inquiries (functional studies in metabolic, cellular or protein networks) or technological problems (optimization of large infrastructures). Several types of algorithms exist for revealing the community structure in networks, but a general and quantitative definition of community is still lacking, leading to an intrinsic difficulty in interpreting the results of the algorithms without any additional non-topological information. In this paper we address this problem by introducing two quantitative definitions of community and by showing how they are implemented in practice in the existing algorithms. In this way the algorithms for the identification of the community structure become fully self-contained. Furthermore, we propose a new local algorithm to detect communities which outperforms the existing algorithms with respect to computational cost, while keeping the same level of reliability. The new algorithm is tested on artificial and real-world graphs. In particular we show the application of the new algorithm to a network of scientific collaborations, which, because of its size, cannot be attacked with the usual methods. This new class of local algorithms could open the way to applications to large-scale technological and biological systems. Comment: RevTeX, final form, 14 pages, 6 figures
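The abstract above does not spell out its two quantitative definitions, so the following is only a hedged sketch under an assumption: it uses the degree-based "strong" and "weak" criteria commonly associated with this line of work. A node set is taken to be a strong community if every member has more links inside the set than outside it, and a weak community if the summed internal degree exceeds the summed external degree. The function names and the toy graph are hypothetical.

```python
def in_out_degrees(adj, community):
    """Map each member of `community` to (internal degree, external degree)."""
    degrees = {}
    for node in community:
        k_in = len(adj[node] & community)
        degrees[node] = (k_in, len(adj[node]) - k_in)
    return degrees


def is_strong_community(adj, community):
    """Assumed 'strong' criterion: k_in > k_out for every member."""
    return all(k_in > k_out
               for k_in, k_out in in_out_degrees(adj, community).values())


def is_weak_community(adj, community):
    """Assumed 'weak' criterion: total k_in > total k_out over the set."""
    pairs = list(in_out_degrees(adj, community).values())
    return sum(k_in for k_in, _ in pairs) > sum(k_out for _, k_out in pairs)


# Toy undirected graph (hypothetical): two triangles joined by the edge 3-4.
adj = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4},
    4: {3, 5, 6},
    5: {4, 6},
    6: {4, 5},
}

print(is_strong_community(adj, {1, 2, 3}))     # one triangle
print(is_strong_community(adj, {1, 2, 3, 4}))  # node 4 has more outside links
print(is_weak_community(adj, {1, 2, 3, 4}))    # the totals still favor inside
```

Under these assumed criteria, a set can satisfy the weak condition while failing the strong one, as the set {1, 2, 3, 4} does here.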

    Characterization of complex networks: A survey of measurements

    Each complex network (or class of networks) presents specific topological features which characterize its connectivity and highly influence the dynamics of processes executed on the network. The analysis, discrimination, and synthesis of complex networks therefore rely on the use of measurements capable of expressing the most relevant topological features. This article presents a survey of such measurements. It includes general considerations about complex network characterization, a brief review of the principal models, and the presentation of the main existing measurements. Important related issues covered in this work comprise the representation of the evolution of complex networks in terms of trajectories in several measurement spaces, the analysis of the correlations between some of the most traditional measurements, perturbation analysis, as well as the use of multivariate statistics for feature selection and network classification. Depending on the network and the analysis task one has in mind, a specific set of features may be chosen. It is hoped that the present survey will help the proper application and interpretation of measurements. Comment: A working manuscript with 78 pages, 32 figures. Suggestions of measurements for inclusion are welcomed by the authors

    A new measure for community structures through indirect social connections

    Based on an expert systems approach, the issue of community detection can be conceptualized as a clustering model for networks. Building upon this further, community structure can be measured through a clustering coefficient, which is computed as the number of existing triangles around a node divided by the number of triangles that could hypothetically be constructed. This paper provides a new definition of the clustering coefficient for weighted networks under a generalized definition of triangles. Specifically, a novel concept of triangles is introduced, based on the assumption that, should the aggregate weight of two arcs be strong enough, a link between the uncommon nodes can be induced. Beyond the intuitive meaning of such generalized triangles in the social context, we also explore their usefulness for gaining insights into the topological structure of the underlying network. Empirical experiments on the standard networks of 500 commercial US airports and on the nervous system of Caenorhabditis elegans support the theoretical framework and allow a comparison between our proposal and the standard definition of the clustering coefficient.
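As a point of reference for the ratio described above, here is a minimal sketch of the standard, unweighted local clustering coefficient: the number of closed triangles around a node divided by the number of neighbor pairs that could form one. It does not implement the paper's generalized weighted triangles, and the function name and toy graph are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Standard unweighted local clustering coefficient of `node`.

    `adj` maps each node to the set of its neighbors (undirected graph).
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        # Fewer than two neighbors: no triangle is possible.
        return 0.0
    # Count neighbor pairs that are themselves connected (closed triangles).
    closed = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    # k * (k - 1) / 2 neighbor pairs could each close a triangle.
    return 2.0 * closed / (k * (k - 1))


# Toy undirected graph (hypothetical): triangle a-b-c plus a pendant node d.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

for name in sorted(adj):
    print(name, local_clustering(adj, name))
```

In this toy graph, node a has three neighbor pairs of which only one is connected, so its coefficient is one third, while b and c each sit in a fully closed neighborhood.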

    Detection of Trending Topic Communities: Bridging Content Creators and Distributors

    The rise of a trending topic on Twitter or Facebook leads to the temporal emergence of a set of users currently interested in that topic. Given the temporary nature of the links between these users, being able to dynamically identify communities of users related to this trending topic would allow for a rapid spread of information. Indeed, individual users inside a community might receive recommendations of content generated by the other users, or the community as a whole could receive group recommendations, with new content related to that trending topic. In this paper, we tackle this challenge by identifying coherent topic-dependent user groups, linking those who generate the content (creators) and those who spread this content, e.g., by retweeting/reposting it (distributors). This is a novel problem on group-to-group interactions in the context of recommender systems. Analyses on real-world Twitter data compare our proposal with a baseline approach that considers the retweeting activity, and validate it with standard metrics. Results show the effectiveness of our approach in identifying communities interested in a topic, each of which includes content creators and content distributors, facilitating users' interactions and the spread of new information. Comment: 9 pages, 4 figures, 2 tables, Hypertext 2017 conference

    Clustering of tag-induced sub-graphs in complex networks

    We study the behavior of the clustering coefficient in tagged networks. The rich variety of tags associated with the nodes in the studied systems provides additional information about the entities represented by the nodes, which can be important for practical applications like searching in the networks. Here we examine how the clustering coefficient changes when narrowing the network to a sub-graph marked by a given tag, and how it correlates with various other properties of the sub-graph. Another interesting question addressed in the paper is how the clustering coefficient of the individual nodes is affected by the tags on the node. We believe this sort of analysis helps in acquiring a more complete description of the structure of large complex systems.
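The comparison described above, clustering in the whole network versus in the sub-graph marked by a tag, can be sketched as follows. This is an illustrative sketch only, assuming an unweighted undirected graph and the standard average local clustering coefficient. The helper names, the tag assignment, and the toy graph are hypothetical and not taken from the paper.

```python
from itertools import combinations


def induced_subgraph(adj, tags, tag):
    """Keep only nodes carrying `tag`, and only the edges among them."""
    keep = {n for n in adj if tag in tags.get(n, set())}
    return {n: adj[n] & keep for n in keep}


def average_clustering(adj):
    """Mean of the standard local clustering coefficient over all nodes."""
    if not adj:
        return 0.0
    total = 0.0
    for neighbors in adj.values():
        k = len(neighbors)
        if k >= 2:
            closed = sum(1 for u, v in combinations(neighbors, 2)
                         if v in adj[u])
            total += 2.0 * closed / (k * (k - 1))
        # Nodes with fewer than two neighbors contribute 0.
    return total / len(adj)


# Toy tagged graph (hypothetical): triangle a-b-c tagged "x", pendant d tagged "y".
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
tags = {"a": {"x"}, "b": {"x"}, "c": {"x"}, "d": {"y"}}

whole = average_clustering(adj)
tagged = average_clustering(induced_subgraph(adj, tags, "x"))
print(whole, tagged)
```

Here the sub-graph tagged "x" is a single triangle, so its average clustering is higher than that of the full graph, which also contains the loosely attached node d.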