Heterogeneous network embedding enabling accurate disease association predictions.
Background: Identifying the complex biological mechanisms of various diseases is an important goal of biomedical research. Recently, the rapid growth of data in genomics, epigenomics, metagenomics, proteomics, metabolomics, nutriomics, etc., has given rise to systematic biological approaches to exploring complex diseases. However, the gap between the production of such data and our capacity to analyze it has gradually widened. Furthermore, we observe that networks can represent many of the above-mentioned data types and that, based on the vector representations learned by network embedding methods, entities that lie in close proximity but currently have no direct links are very likely to be related; they are therefore promising candidates for biological investigation.
Results: We integrate six public biological databases to construct a heterogeneous biological network containing three categories of entities (i.e., genes, diseases, miRNAs) and multiple types of edges (i.e., the known relationships). To tackle the inherent heterogeneity, we develop a heterogeneous network embedding model that maps the network into a low-dimensional vector space in which the relationships between entities are well preserved. To assess the effectiveness of our method, we conduct gene-disease and miRNA-disease association predictions, whose results show the superiority of our method over several state-of-the-art approaches. Furthermore, many associations predicted by our method are verified in the latest real-world dataset.
Conclusions: We propose a novel heterogeneous network embedding method that takes full advantage of the abundant contextual information and structure of a heterogeneous network. Moreover, we illustrate how the proposed method can guide studies in biology and assist in identifying new hypotheses for biological investigation
Switcher-random-walks: a cognitive-inspired mechanism for network exploration
Semantic memory is the subsystem of human memory that stores knowledge of
concepts or meanings, as opposed to specific life experiences. The organization
of concepts within semantic memory can be understood as a semantic network,
where the concepts (nodes) are associated (linked) to others depending on
perceptions, similarities, etc. Lexical access is the complementary part of
this system and allows the retrieval of such organized knowledge. While
conceptual information is stored under a certain underlying organization (and
thus gives rise to a specific topology), it is crucial to have accurate
access to any of the information units, e.g. the concepts, for efficiently
retrieving semantic information for real-time needs. An example of an
information retrieval process occurs in verbal fluency tasks, and it is known
to involve two different mechanisms: -clustering-, or generating words within a
subcategory, and, when a subcategory is exhausted, -switching- to a new
subcategory. We extended this approach to random-walking on a network
(clustering) in combination with jumping (switching) to any node with a certain
probability, and derived its analytical expression based on Markov chains.
Results show that this dual mechanism helps optimize the exploration
of different network models in terms of the mean first passage time.
Additionally, this cognitive-inspired dual mechanism opens a new framework to
better understand and evaluate exploration, propagation and transport phenomena
in other complex systems where switching-like phenomena are feasible.
Comment: 9 pages, 3 figures. Accepted in "International Journal of
Bifurcations and Chaos": Special issue on "Modelling and Computation on
Complex Networks"
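The dual mechanism described in this abstract can be illustrated with a small Markov-chain sketch. It is not the authors' derivation: it assumes that a switch jumps to a node chosen uniformly at random (the walker's own node included) with a hypothetical probability q, and it obtains mean first passage times by fixed-point iteration rather than from a closed-form expression. The function names and the adjacency-list format are illustrative choices.

```python
def switcher_transition(adj, q):
    """Transition matrix of a random walk that, at each step, jumps to a
    uniformly random node with probability q (switching) and otherwise moves
    to a uniformly random neighbour (clustering).

    adj is a list of neighbour lists; every node needs at least one neighbour.
    The uniform jump target is an assumption of this sketch.
    """
    n = len(adj)
    matrix = []
    for i in range(n):
        row = [q / n] * n                      # switching component
        for j in adj[i]:
            row[j] += (1.0 - q) / len(adj[i])  # local random-walk component
        matrix.append(row)
    return matrix


def mean_first_passage(matrix, target, tol=1e-12, max_iter=1_000_000):
    """Mean first passage times h[i] from every node i to target.

    Iterates h[i] = 1 + sum_{k != target} P[i][k] * h[k], with h[target] = 0,
    until successive iterates differ by less than tol.
    """
    n = len(matrix)
    h = [0.0] * n
    for _ in range(max_iter):
        new_h = [
            0.0 if i == target
            else 1.0 + sum(matrix[i][k] * h[k] for k in range(n) if k != target)
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_h, h)) < tol:
            return new_h
        h = new_h
    raise RuntimeError("mean first passage iteration did not converge")


# Path graph 0 - 1 - 2, target node 2.
path = [[1], [0, 2], [1]]
pure_walk = mean_first_passage(switcher_transition(path, 0.0), target=2)
pure_jump = mean_first_passage(switcher_transition(path, 1.0), target=2)
```

On this three-node path the pure walk (q = 0) and the pure jump (q = 1) give the two extremes; intermediate values of q can be scanned to see how the mean first passage time changes for a given topology.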
Estimating graph parameters with random walks
An algorithm observes the trajectories of random walks over an unknown graph
G, all starting from the same vertex x, as well as the degrees along the
trajectories. For all finite connected graphs, one can estimate the number of
edges m up to a bounded factor in a number of steps that depends on the
relaxation time of the lazy random walk on G and on the minimum degree in G.
Alternatively, m can be estimated in a number of steps that involves n, the
number of vertices, and the uniform mixing time on G. The number of vertices n
can then be estimated up to a bounded factor with additional steps. Our
algorithms are based on counting the number of intersections of random walk
paths X and Y, i.e. the number of pairs (t, s) such that X_t = Y_s. This
improves on previous estimates which only consider collisions (i.e., times t
with X_t = Y_t). We also show that the complexity of our algorithms is optimal,
even when restricting to graphs with a prescribed relaxation time. Finally, we
show that, given either m or the mixing time of G, we can compute the
"other parameter" with a self-stopping algorithm
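The distinction between intersections and collisions can be made concrete with a short sketch. It assumes that an intersection is a pair of times (t, s) at which the two paths occupy the same vertex, and that a collision is the special case t = s. The function names are hypothetical, and the code only counts these events; it does not implement the paper's estimators.

```python
from collections import Counter


def count_intersections(path_x, path_y):
    """Number of pairs (t, s) with path_x[t] == path_y[s]."""
    visits_x = Counter(path_x)
    visits_y = Counter(path_y)
    # A vertex visited a times by X and b times by Y contributes a * b pairs.
    return sum(count * visits_y[vertex] for vertex, count in visits_x.items())


def count_collisions(path_x, path_y):
    """Number of times t with path_x[t] == path_y[t]."""
    return sum(1 for a, b in zip(path_x, path_y) if a == b)


x_path = [0, 1, 2, 1]
y_path = [1, 1, 3, 0]
n_intersections = count_intersections(x_path, y_path)
n_collisions = count_collisions(x_path, y_path)
```

Every collision is also an intersection, so the intersection count is never smaller than the collision count for paths of equal length.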
Estimating and Sampling Graphs with Multidimensional Random Walks
Estimating characteristics of large graphs via sampling is a vital part of
the study of complex networks. Current sampling methods such as (independent)
random vertex sampling and random walks are useful but have drawbacks. Random vertex
sampling may require too many resources (time, bandwidth, or money). Random
walks, which normally require fewer resources per sample, can suffer from large
estimation errors in the presence of disconnected or loosely connected graphs.
In this work we propose a new m-dimensional random walk that uses m
dependent random walkers. We show that the proposed sampling method, which we
call Frontier sampling, exhibits all of the nice sampling properties of a
regular random walk. At the same time, our simulations over large real world
graphs show that, in the presence of disconnected or loosely connected
components, Frontier sampling exhibits lower estimation errors than regular
random walks. We also show that Frontier sampling is more suitable than random
vertex sampling to sample the tail of the degree distribution of the graph
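The abstract does not spell out how the dependent walkers interact, so the following is only a sketch under a stated assumption: at each step one walker is chosen with probability proportional to the degree of its current vertex, and that walker moves along a uniformly random incident edge, which is recorded as the sample. The function name, the dictionary-based graph format, and the seed parameter are illustrative.

```python
import random


def frontier_sampling(adj, m, steps, seed=0):
    """Sample edges with m dependent walkers (assumed update rule, see above).

    adj maps each vertex to a list of its neighbours; every vertex needs at
    least one neighbour. Returns the list of sampled edges (u, v).
    """
    rng = random.Random(seed)
    vertices = list(adj)
    # Start the walkers at independently and uniformly chosen vertices.
    frontier = [rng.choice(vertices) for _ in range(m)]
    sampled_edges = []
    for _ in range(steps):
        # Choose a walker with probability proportional to its vertex degree.
        weights = [len(adj[u]) for u in frontier]
        idx = rng.choices(range(m), weights=weights, k=1)[0]
        u = frontier[idx]
        v = rng.choice(adj[u])   # uniformly random incident edge
        sampled_edges.append((u, v))
        frontier[idx] = v        # only the chosen walker moves
    return sampled_edges


# Two disconnected triangles: a single walker would stay in one component.
two_triangles = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
}
edges = frontier_sampling(two_triangles, m=4, steps=50, seed=1)
```

With several walkers started at random vertices, the frontier can hold walkers in different components, which is the situation the abstract identifies as problematic for a single random walk.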