Identification-method research for open-source software ecosystems
In recent years, open-source software (OSS) development has grown, with many developers around the world working on different OSS projects. A variety of open-source software ecosystems have emerged, for instance, GitHub, StackOverflow, and SourceForge. GitHub, one of the most typical social-programming and code-hosting sites, has amassed numerous open-source software projects and developers on a single virtual collaboration platform. Since GitHub itself is a large open-source community, it hosts a collection of software projects that are developed together and coevolve. A major challenge is identifying the relationships among these projects, i.e., project relevance. Software-ecosystem identification is the basis of other studies of the ecosystem. Therefore, extracting useful information from GitHub and identifying software ecosystems is particularly important, and it is also a research area in symmetry. In this paper, a Topic-based Project Knowledge Metrics Framework (TPKMF) is proposed. By collecting the multisource dataset of an open-source ecosystem, project-relevance analysis of the open-source software is carried out on the basis of software-ecosystem identification. Then, we used our Spectral Clustering algorithm based on Core Project (CP-SC) to identify software-ecosystem projects and further identify software ecosystems. We verified that most software ecosystems contain a core software project, and most other projects are associated with it. Furthermore, we analyzed the characteristics of the ecosystem and found that interactive information has a greater impact on project relevance. Finally, we summarize the Topic-based Project Knowledge Metrics Framework
Defining and identifying communities in networks
The investigation of community structures in networks is an important issue
in many domains and disciplines. This problem is relevant for social tasks
(objective analysis of relationships on the web), biological inquiries
(functional studies in metabolic, cellular or protein networks) or
technological problems (optimization of large infrastructures). Several types
of algorithm exist for revealing the community structure in networks, but a
general and quantitative definition of community is still lacking, leading to
an intrinsic difficulty in the interpretation of the results of the algorithms
without any additional non-topological information. In this paper we address this
problem by introducing two quantitative definitions of community and by showing
how they are implemented in practice in the existing algorithms. In this way
the algorithms for the identification of the community structure become fully
self-contained. Furthermore, we propose a new local algorithm to detect
communities which outperforms the existing algorithms with respect to the
computational cost, keeping the same level of reliability. The new algorithm is
tested on artificial and real-world graphs. In particular we show the
application of the new algorithm to a network of scientific collaborations,
which, because of its size, cannot be handled with the usual methods. This new
class of local algorithms could open the way to large-scale technological and
biological applications.
Comment: Revtex, final form, 14 pages, 6 figures
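The abstract does not state its two quantitative definitions, so the following is only a sketch under an assumption: that they are of the "strong" and "weak" kind, which compare the links a node has inside a candidate group with the links it has outside. The function names and the toy graph are hypothetical and used for illustration only.

```python
def degree_split(adj, community, node):
    """Return (k_in, k_out): links of `node` inside and outside `community`."""
    k_in = sum(1 for n in adj[node] if n in community)
    return k_in, len(adj[node]) - k_in


def is_strong_community(adj, community):
    """Assumed 'strong' test: every member has more links inside than outside."""
    return all(
        k_in > k_out
        for k_in, k_out in (degree_split(adj, community, n) for n in community)
    )


def is_weak_community(adj, community):
    """Assumed 'weak' test: total internal degree exceeds total external degree."""
    splits = [degree_split(adj, community, n) for n in community]
    return sum(k_in for k_in, _ in splits) > sum(k_out for _, k_out in splits)


# Hypothetical toy graph: two triangles joined by the single edge 3-4.
adj = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4},
    4: {3, 5, 6},
    5: {4, 6},
    6: {4, 5},
}
```

On this toy graph, the group {1, 2, 3} passes both tests, while {1, 2, 3, 4} passes only the weak test, because node 4 has more links outside the group than inside it.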
Characterization of complex networks: A survey of measurements
Each complex network (or class of networks) presents specific topological
features which characterize its connectivity and highly influence the dynamics
of processes executed on the network. The analysis, discrimination, and
synthesis of complex networks therefore rely on the use of measurements capable
of expressing the most relevant topological features. This article presents a
survey of such measurements. It includes general considerations about complex
network characterization, a brief review of the principal models, and the
presentation of the main existing measurements. Important related issues
covered in this work comprise the representation of the evolution of complex
networks in terms of trajectories in several measurement spaces, the analysis
of the correlations between some of the most traditional measurements,
perturbation analysis, as well as the use of multivariate statistics for
feature selection and network classification. Depending on the network and the
analysis task one has in mind, a specific set of features may be chosen. It is
hoped that the present survey will help the proper application and
interpretation of measurements.
Comment: A working manuscript with 78 pages, 32 figures. Suggestions of
measurements for inclusion are welcomed by the author
A new measure for community structures through indirect social connections
Based on an expert systems approach, the issue of community detection can be
conceptualized as a clustering model for networks. Building upon this further,
community structure can be measured through a clustering coefficient, which is
generated from the number of existing triangles around the nodes over the
number of triangles that can be hypothetically constructed. This paper provides
a new definition of the clustering coefficient for weighted networks under a
generalized definition of triangles. Specifically, a novel concept of triangles
is introduced, based on the assumption that, should the aggregate weight of two
arcs be strong enough, a link between the uncommon nodes can be induced. Beyond
the intuitive meaning of such generalized triangles in the social context, we
also explore their usefulness for gaining insights into the topological
structure of the underlying network. Empirical experiments on the standard
networks of 500 commercial US airports and on the nervous system of
Caenorhabditis elegans support the theoretical framework and allow a comparison
between our proposal and the standard definition of clustering coefficient
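For reference, the standard (unweighted) clustering coefficient described above, the number of existing triangles around a node over the number that could hypothetically be constructed, can be sketched as follows. This is not the generalized weighted definition the paper proposes, whose details the abstract does not give; the function name and the toy graph are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Existing triangles around `node` divided by the possible triangles."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no triangle can be formed
    # A pair of neighbours closes a triangle when the two are linked.
    closed = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return closed / (k * (k - 1) / 2)


# Hypothetical undirected toy graph stored as an adjacency map.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

Node "a" has three neighbours and therefore three possible triangles, of which only the one through "b" and "c" exists, giving a coefficient of 1/3.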
Detection of Trending Topic Communities: Bridging Content Creators and Distributors
The rise of a trending topic on Twitter or Facebook leads to the temporal
emergence of a set of users currently interested in that topic. Given the
temporary nature of the links between these users, being able to dynamically
identify communities of users related to this trending topic would allow for a
rapid spread of information. Indeed, individual users inside a community might
receive recommendations of content generated by the other users, or the
community as a whole could receive group recommendations, with new content
related to that trending topic. In this paper, we tackle this challenge, by
identifying coherent topic-dependent user groups, linking those who generate
the content (creators) and those who spread this content, e.g., by
retweeting/reposting it (distributors). This is a novel problem on
group-to-group interactions in the context of recommender systems. Analyses on
real-world Twitter data compare our proposal with a baseline approach that
considers the retweeting activity, and validate it with standard metrics.
Results show the effectiveness of our approach in identifying communities
interested in a topic, each of which includes content creators and content
distributors, facilitating users' interactions and the spread of new
information.
Comment: 9 pages, 4 figures, 2 tables, Hypertext 2017 conference
Clustering of tag-induced sub-graphs in complex networks
We study the behavior of the clustering coefficient in tagged networks. The
rich variety of tags associated with the nodes in the studied systems provides
additional information about the entities represented by the nodes, which can
be important for practical applications such as searching in the networks. Here
we examine how the clustering coefficient changes when narrowing the network to
a sub-graph marked by a given tag, and how it correlates with various other
properties of the sub-graph. Another interesting question addressed in the
paper is how the clustering coefficient of the individual nodes is affected by
the tags on the node. We believe this sort of analysis helps in acquiring a more
complete description of the structure of large complex systems
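Narrowing a network to the sub-graph marked by a tag and comparing clustering before and after can be sketched as below. This is a minimal illustration that assumes an undirected graph stored as an adjacency map and a hypothetical `tags` mapping from node to a set of tags; it is not the procedure used in the paper.

```python
from itertools import combinations


def induced_subgraph(adj, tags, tag):
    """Keep only the nodes carrying `tag` and the edges among them."""
    keep = {n for n in adj if tag in tags.get(n, set())}
    return {n: adj[n] & keep for n in keep}


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k >= 2:
            closed = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
            total += closed / (k * (k - 1) / 2)
    return total / len(adj)


# Hypothetical toy data: a triangle a-b-c plus a pendant node d.
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
tags = {"a": {"x", "y"}, "b": {"x"}, "c": {"x"}, "d": {"y"}}

full = average_clustering(adj)
only_x = average_clustering(induced_subgraph(adj, tags, "x"))
only_y = average_clustering(induced_subgraph(adj, tags, "y"))
```

In this toy case the sub-graph for tag "x" is the full triangle, so its average clustering is higher than that of the whole graph, while the sub-graph for tag "y" is a single edge with no triangles.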