Collaboration in an Open Data eScience: A Case Study of Sloan Digital Sky Survey
Current science and technology have produced more and more publicly
accessible scientific data. However, little is known about how the open data
trend impacts a scientific community, specifically in terms of its
collaboration behaviors. This paper aims to enhance our understanding of the
dynamics of scientific collaboration in the open data eScience environment via
a case study of co-author networks of an active and highly cited open data
project, called Sloan Digital Sky Survey. We visualized the co-authoring
networks and measured their properties over time at three levels: author,
institution, and country. We compared these measurements to a random
network model and also compared results across the three levels. The study
found that 1) the collaboration networks of the SDSS community transformed from
random networks to small-world networks; 2) the number of author-level
collaboration instances has not changed much over time, while the number of
collaboration instances at the other two levels has increased over time; 3)
pairwise institutional collaboration has become common in recent years. The open
data trend may have both positive and negative impacts on scientific
collaboration.
Comment: iConference 201
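The random-versus-small-world comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the graph builders, the toy sizes, and the function names are assumptions made for illustration. A network is usually called small-world when its average clustering is much higher than that of a random graph with the same number of nodes and edges, while its average path length stays comparable.

```python
from collections import deque
import random


def avg_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        nbrs = list(nbrs)
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges among the neighbours of this node.
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nbrs[j] in adj[nbrs[i]])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def avg_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def ring_lattice(n, k):
    """Toy stand-in for a clustered network: each node links to k/2 nodes per side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def random_graph(n, m, seed=0):
    """Random baseline with n nodes and m distinct undirected edges."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    edges = 0
    while edges < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    return adj


observed = ring_lattice(20, 4)   # hypothetical "observed" network
baseline = random_graph(20, 40)  # same node and edge counts
print(avg_clustering(observed), avg_clustering(baseline))
print(avg_path_length(observed), avg_path_length(baseline))
```

In a real co-author study the observed network would be built from publication records at the author, institution, or country level, and the baseline would typically be averaged over many random graphs rather than a single one.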
Global Network Alignment
Motivation: High-throughput methods for detecting molecular interactions have led to a plethora of biological network data with much more yet to come, stimulating the development of techniques for biological network alignment. Analogous to sequence alignment, efficient and reliable network alignment methods will improve our understanding of biological systems. Network alignment is computationally hard. Hence, devising efficient network alignment heuristics is currently one of the foremost challenges in computational biology. 

Results: We present a superior heuristic network alignment algorithm, called Matching-based GRAph ALigner (M-GRAAL), which can process and integrate any number and type of similarity measures between network nodes (e.g., proteins), including, but not limited to, any topological network similarity measure, sequence similarity, functional similarity, and structural similarity. The algorithm is efficient in resolving ties in similarity measures and in finding a combination of similarity measures yielding the largest biologically sound alignments. When used to align protein-protein interaction (PPI) networks of various species, M-GRAAL exposes the largest known functional and contiguous regions of network similarity. Hence, we use M-GRAAL’s alignments to predict functions of un-annotated proteins in yeast, human, and the bacteria _C. jejuni_ and _E. coli_. Furthermore, using M-GRAAL to compare PPI networks of different herpes viruses, we reconstruct their phylogenetic relationship, and our phylogenetic tree is the same as the sequence-based one.
Defining and identifying communities in networks
The investigation of community structures in networks is an important issue
in many domains and disciplines. This problem is relevant for social tasks
(objective analysis of relationships on the web), biological inquiries
(functional studies in metabolic, cellular or protein networks) or
technological problems (optimization of large infrastructures). Several types
of algorithm exist for revealing the community structure in networks, but a
general and quantitative definition of community is still lacking, leading to
an intrinsic difficulty in the interpretation of the results of the algorithms
without any additional non-topological information. In this paper we address this
problem by introducing two quantitative definitions of community and by showing
how they are implemented in practice in the existing algorithms. In this way
the algorithms for the identification of the community structure become fully
self-contained. Furthermore, we propose a new local algorithm to detect
communities which outperforms the existing algorithms with respect to the
computational cost, keeping the same level of reliability. The new algorithm is
tested on artificial and real-world graphs. In particular we show the
application of the new algorithm to a network of scientific collaborations,
which, because of its size, cannot be tackled with the usual methods. This new
class of local algorithms could open the way to large-scale technological and
biological applications.
Comment: Revtex, final form, 14 pages, 6 figure
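The abstract does not spell out its two quantitative definitions. One common formulation, assumed here for illustration, compares each node's links inside a candidate subgraph (k_in) with its links to the rest of the network (k_out): a community in the strong sense requires k_in > k_out for every member, and a community in the weak sense requires only that the sum of k_in exceeds the sum of k_out. A minimal sketch with hypothetical function names and a toy graph:

```python
def degree_split(adj, community):
    """Return {node: (k_in, k_out)} for every node of the candidate community."""
    members = set(community)
    return {v: (len(adj[v] & members), len(adj[v] - members)) for v in members}


def is_strong_community(adj, community):
    """Strong sense (assumed definition): each member has k_in > k_out."""
    return all(k_in > k_out
               for k_in, k_out in degree_split(adj, community).values())


def is_weak_community(adj, community):
    """Weak sense (assumed definition): total k_in exceeds total k_out."""
    split = list(degree_split(adj, community).values())
    return sum(k_in for k_in, _ in split) > sum(k_out for _, k_out in split)


# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(is_strong_community(adj, {0, 1, 2}), is_weak_community(adj, {0, 1, 2}))
print(is_strong_community(adj, {0, 1, 2, 3}), is_weak_community(adj, {0, 1, 2, 3}))
```

A test of this kind gives a divisive or agglomerative algorithm a purely topological stopping criterion: a split is accepted only if the resulting subgraphs still satisfy the chosen definition.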
Improving Reachability and Navigability in Recommender Systems
In this paper, we study recommender systems from a network perspective
and investigate recommendation networks, where nodes are items (e.g., movies)
and edges are constructed from top-N recommendations (e.g., related movies). In
particular, we focus on evaluating the reachability and navigability of
recommendation networks and investigate the following questions: (i) How well
do recommendation networks support navigation and exploratory search? (ii) What
is the influence of parameters, in particular different recommendation
algorithms and the number of recommendations shown, on reachability and
navigability? and (iii) How can reachability and navigability be improved in
these networks? We tackle these questions by first evaluating the reachability
of recommendation networks by investigating their structural properties.
Second, we evaluate navigability by simulating three different models of
information seeking scenarios. We find that with standard algorithms,
recommender systems are not well suited to navigation and exploration and
propose methods to modify recommendations to improve this. Our work extends
from one-click-based evaluations of recommender systems towards multi-click
analysis (i.e., sequences of dependent clicks) and presents a general,
comprehensive approach to evaluating navigability of arbitrary recommendation
networks.
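To make the notion of a recommendation network concrete, the sketch below builds a directed graph from top-N recommendation lists and measures one simple reachability statistic: the fraction of ordered item pairs (u, v) for which v can be reached from u by following recommendations. The item names, the ranked lists, and the choice of statistic are assumptions for illustration, not the evaluation used in the paper.

```python
from collections import deque


def build_network(recs, top_n):
    """Directed edges item -> each of its top_n recommended items."""
    return {item: list(ranked[:top_n]) for item, ranked in recs.items()}


def reachable_from(net, start):
    """Items reachable from start by following recommendation links (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in net.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    seen.discard(start)
    return seen


def reachability(net):
    """Fraction of ordered item pairs connected by a directed path."""
    items = set(net) | {v for targets in net.values() for v in targets}
    n = len(items)
    if n < 2:
        return 0.0
    hits = sum(len(reachable_from(net, u)) for u in items)
    return hits / (n * (n - 1))


# Hypothetical ranked recommendation lists for four items.
recs = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "D"],
    "D": ["C", "A"],
}
for n in (1, 2):
    print(n, reachability(build_network(recs, n)))
```

Varying `top_n` in this way mirrors the question of how the number of recommendations shown affects reachability; in the toy data, showing two recommendations instead of one makes every item reachable from every other.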
Economic sector identification in a set of stocks traded at the New York Stock Exchange: a comparative analysis
We review some methods recently used in the literature to detect the
existence of a certain degree of common behavior of stock returns belonging to
the same economic sector. Specifically, we discuss methods based on random
matrix theory and hierarchical clustering techniques. We apply these methods to
a set of stocks traded at the New York Stock Exchange. The investigated time
series are recorded at a daily time horizon.
All the considered methods are able to detect economic information and the
presence of clusters characterized by the economic sector of stocks. However,
different methodologies provide different information about the considered set.
Our comparative analysis suggests that the application of just a single method
may not be able to extract all the economic information present in the
correlation coefficient matrix of a set of stocks.
Comment: 13 pages, 8 figures, 2 Table
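The two families of methods mentioned above can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' procedure: it uses the distance d = sqrt(2(1 - rho)), a common choice for turning a correlation coefficient into a metric, a threshold cut that is equivalent to cutting a single-linkage dendrogram at that height, and the Marchenko-Pastur eigenvalue bounds for N uncorrelated unit-variance series of length T as the random matrix theory benchmark. The ticker names and return values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def correlation_distance(rho):
    """d = sqrt(2 (1 - rho)); clamped at 0 to absorb rounding error."""
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))


def marchenko_pastur_bounds(n_series, n_obs):
    """Eigenvalue range expected for purely random, unit-variance series."""
    q = n_series / n_obs
    return (1 - math.sqrt(q)) ** 2, (1 + math.sqrt(q)) ** 2


def single_linkage_clusters(returns, threshold):
    """Clusters obtained by linking every pair with distance <= threshold."""
    names = list(returns)
    parent = {name: name for name in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = pearson(returns[a], returns[b])
            if correlation_distance(rho) <= threshold:
                parent[find(a)] = find(b)
    clusters = {}
    for name in names:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical daily returns for four made-up tickers in two sectors.
returns = {
    "OIL_A": [0.01, -0.02, 0.03, -0.01, 0.02],
    "OIL_B": [0.02, -0.04, 0.06, -0.02, 0.04],
    "TECH_A": [0.02, 0.01, -0.01, 0.03, -0.02],
    "TECH_B": [0.04, 0.02, -0.02, 0.06, -0.04],
}
print(single_linkage_clusters(returns, threshold=0.5))
print(marchenko_pastur_bounds(n_series=100, n_obs=400))
```

Eigenvalues of an empirical correlation matrix that fall above the upper Marchenko-Pastur bound are the ones usually inspected for sector information, while the clustering step groups stocks directly from pairwise distances; the two views need not agree, which is the point the comparative analysis makes.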
Sector identification in a set of stock return time series traded at the London Stock Exchange
We compare some methods recently used in the literature to detect the
existence of a certain degree of common behavior of stock returns belonging to
the same economic sector. Specifically, we discuss methods based on random
matrix theory and hierarchical clustering techniques. We apply these methods to
a portfolio of stocks traded at the London Stock Exchange. The investigated
time series are recorded both at a daily time horizon and at a 5-minute time
horizon. The correlation coefficient matrix is very different at different time
horizons, confirming that more structured correlation coefficient matrices are
observed for long time horizons. All the considered methods are able to detect
economic information and the presence of clusters characterized by the economic
sector of stocks. However, different methods present a different degree of
sensitivity with respect to different sectors. Our comparative analysis
suggests that the application of just a single method may not be able to
extract all the economic information present in the correlation coefficient
matrix of a stock portfolio.
Comment: 28 pages, 13 figures, 3 Tables. Proceedings of the conference on
"Applications of Random Matrices to Economy and other Complex Systems",
Krakow (Poland), May 25-28 2005. Submitted for publication to Acta Phys. Po
Exploring complex networks by walking on them
We carry out a comparative study of the problem of a walker searching on
several typical complex networks. The search efficiency is evaluated for
various strategies. When the walker has no knowledge of the global properties
of the underlying networks or of the optimal path between any two given nodes,
the best search strategy is found to be the self-avoiding random walk. The
preferential self-avoiding random walk does not improve the search
efficiency further. In return, topological information about the underlying
networks may be drawn by comparing the results of the different search
strategies.
Comment: 5 pages, 5 figure
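The comparison of search strategies can be illustrated with a small simulation. This is a sketch under assumptions, not the paper's setup: the self-avoiding walker here prefers unvisited neighbours and falls back to an ordinary random step when none remain, and the toy path graph, step cap, and function names are hypothetical.

```python
import random


def search_steps(adj, start, target, self_avoiding, rng, max_steps=10_000):
    """Number of steps a walker needs to hit target (capped at max_steps)."""
    current, visited, steps = start, {start}, 0
    while current != target and steps < max_steps:
        options = adj[current]
        if self_avoiding:
            # Prefer neighbours not yet visited; otherwise step at random.
            fresh = [v for v in options if v not in visited]
            if fresh:
                options = fresh
        current = rng.choice(options)
        visited.add(current)
        steps += 1
    return steps


def mean_steps(adj, start, target, self_avoiding, trials=200, seed=0):
    """Average search cost over repeated independent walks."""
    rng = random.Random(seed)
    total = sum(search_steps(adj, start, target, self_avoiding, rng)
                for _ in range(trials))
    return total / trials


# Toy network: a path 0-1-2-3-4; neighbour lists are used so rng.choice works.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(mean_steps(path, 0, 4, self_avoiding=False))
print(mean_steps(path, 0, 4, self_avoiding=True))
```

On this path graph the self-avoiding walker never backtracks, so it reaches the target in exactly four steps, while the plain random walker needs at least four and usually more. On larger networks the gap depends on the topology, which is why comparing strategies can reveal structural information.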