Topological network alignment uncovers biological function and phylogeny
Sequence comparison and alignment has had an enormous impact on our
understanding of evolution, biology, and disease. Comparison and alignment of
biological networks will likely have a similar impact. Existing network
alignments use information external to the networks, such as sequence, because
no good algorithm for purely topological alignment has yet been devised. In
this paper, we present a novel algorithm based solely on network topology that
can be used to align any two networks. We apply it to biological networks to
produce by far the most complete topological alignments of biological networks
to date. We demonstrate that both species phylogeny and detailed biological
function of individual proteins can be extracted from our alignments.
Topology-based alignments have the potential to provide a completely new,
independent source of phylogenetic information. Our alignment of the
protein-protein interaction networks of two very different species--yeast and
human--indicates that even distant species share a surprising amount of network
topology with each other, suggesting broad similarities in internal cellular
wiring across all life on Earth.
Comment: Algorithm explained in more details. Additional analysis adde
The genetic architecture of type 2 diabetes
The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has long been debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of heritability. To test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 Europeans with and without diabetes, and exome sequencing in a total of 12,940 subjects from five ancestral groups. To increase statistical power, we expanded sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support a major role for lower-frequency variants in predisposition to type 2 diabetes.
Sequence data and association statistics from 12,940 type 2 diabetes cases and controls
To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1–5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D.
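The low-frequency band quoted in this abstract (MAF 0.1–5%) can be illustrated with a small sketch. It is not taken from the consortia's pipeline. The function names, the "common" and "rare" labels, and the choice to treat both band edges as inclusive are assumptions made here for illustration. Only the 0.1–5% range itself comes from the abstract.

```python
def minor_allele_frequency(alt_count: int, total_alleles: int) -> float:
    """Return the frequency of the less common of two alleles.

    alt_count is the number of observed alternate alleles and
    total_alleles is the number of alleles genotyped (two per diploid
    individual). Both names are hypothetical.
    """
    if total_alleles <= 0 or not 0 <= alt_count <= total_alleles:
        raise ValueError("counts must satisfy 0 <= alt_count <= total_alleles, total_alleles > 0")
    # Work on integer counts so the result is not affected by 1 - p rounding.
    minor_count = min(alt_count, total_alleles - alt_count)
    return minor_count / total_alleles


def frequency_class(maf: float, low: float = 0.001, high: float = 0.05) -> str:
    """Bin a MAF value; the 0.1%-5% band is the one quoted in the abstract.

    The labels "common" and "rare" and the inclusive edges are assumptions.
    """
    if maf > high:
        return "common"
    if maf >= low:
        return "low-frequency"
    return "rare"


# Example: 30 alternate alleles among 1,000 genotyped alleles.
example_maf = minor_allele_frequency(30, 1000)
example_class = frequency_class(example_maf)
```

Computing the minor count before dividing keeps a variant whose alternate allele is the major one (for example 990 of 1,000) in the same bin as its mirror image.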
Network Archaeology: Uncovering Ancient Networks from Present-day Interactions
Often questions arise about old or extinct networks. What proteins interacted
in a long-extinct ancestor species of yeast? Who were the central players in
the Last.fm social network 3 years ago? Our ability to answer such questions
has been limited by the unavailability of past versions of networks. To
overcome these limitations, we propose several algorithms for reconstructing a
network's history of growth given only the network as it exists today and a
generative model by which the network is believed to have evolved. Our
likelihood-based method finds a probable previous state of the network by
reversing the forward growth model. This approach retains node identities so
that the history of individual nodes can be tracked. We apply these algorithms
to uncover older, non-extant biological and social networks believed to have
grown via several models, including duplication-mutation with complementarity,
forest fire, and preferential attachment. Through experiments on both synthetic
and real-world data, we find that our algorithms can estimate node arrival
times, identify anchor nodes from which new nodes copy links, and can reveal
significant features of networks that have long since disappeared.
Comment: 16 pages, 10 figure
Simultaneous Optimization of Both Node and Edge Conservation in Network Alignment via WAVE
Network alignment can be used to transfer functional knowledge between
conserved regions of different networks. Typically, existing methods use a node
cost function (NCF) to compute similarity between nodes in different networks
and an alignment strategy (AS) to find high-scoring alignments with respect to
the total NCF over all aligned nodes (or node conservation). However, they then
evaluate the quality of their alignments via some other measure that is different
from the node conservation measure used to guide the alignment construction
process. Typically, one measures the amount of conserved edges, but only after
alignments are produced. Hence, a recent attempt aimed to directly maximize the
amount of conserved edges while constructing alignments, which improved
alignment accuracy. Here, we aim to directly maximize both node and edge
conservation during alignment construction to further improve alignment
accuracy. For this, we design a novel measure of edge conservation that (unlike
existing measures that treat each conserved edge the same) weighs each
conserved edge so that edges with highly NCF-similar end nodes are favored. As
a result, we introduce a novel AS, Weighted Alignment VotEr (WAVE), which can
optimize any measures of node and edge conservation, and which can be used with
any NCF or combination of multiple NCFs. Using WAVE on top of established
state-of-the-art NCFs leads to superior alignments compared to the existing
methods that optimize only node conservation or only edge conservation or that
treat each conserved edge the same. And while we evaluate WAVE in the
computational biology domain, it is easily applicable in any domain.
Comment: 12 pages, 4 figure
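The contrast drawn in this abstract, between counting every conserved edge equally and weighting each conserved edge by the NCF similarity of its end nodes, can be sketched as follows. The abstract does not give WAVE's actual formula, so the weighting used here (the mean similarity of the two aligned end-node pairs) is an assumption, as are all function and variable names. The sketch only scores a given alignment and does not implement the WAVE alignment strategy itself.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def conserved_edges(edges1: Iterable[Edge], edges2: Iterable[Edge],
                    alignment: Dict[Node, Node]) -> List[Edge]:
    """Return edges of network 1 whose aligned end nodes are adjacent in network 2."""
    # Store network 2 edges as unordered pairs so direction does not matter.
    undirected2 = {frozenset(e) for e in edges2}
    kept = []
    for u, v in edges1:
        if u in alignment and v in alignment:
            if frozenset((alignment[u], alignment[v])) in undirected2:
                kept.append((u, v))
    return kept


def weighted_edge_conservation(edges1: Iterable[Edge], edges2: Iterable[Edge],
                               alignment: Dict[Node, Node],
                               similarity: Dict[Tuple[Node, Node], float]) -> float:
    """Sum conserved edges, each weighted by the mean similarity of its end-node pairs.

    similarity maps (node in network 1, node in network 2) to an NCF-style
    score. The averaging rule is an illustrative assumption.
    """
    total = 0.0
    for u, v in conserved_edges(edges1, edges2, alignment):
        s_u = similarity[(u, alignment[u])]
        s_v = similarity[(v, alignment[v])]
        total += (s_u + s_v) / 2.0
    return total


# Toy example: two three-node paths aligned node by node.
g1 = [("a", "b"), ("b", "c")]
g2 = [("x", "y"), ("y", "z")]
aln = {"a": "x", "b": "y", "c": "z"}
sim = {("a", "x"): 1.0, ("b", "y"): 0.5, ("c", "z"): 0.0}

unweighted_score = len(conserved_edges(g1, g2, aln))
weighted_score = weighted_edge_conservation(g1, g2, aln, sim)
```

In the toy example both edges are conserved, so an unweighted count treats them identically, while the weighted score credits the edge between the more similar node pairs (a-b) more than the other (b-c).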
AlignNemo: A Local Network Alignment Method to Integrate Homology and Topology
Local network alignment is an important component of the analysis of protein-protein interaction networks that may lead to the identification of evolutionarily related complexes. We present AlignNemo, a new algorithm that, given the networks of two organisms, uncovers subnetworks of proteins that are related in biological function and topology of interactions. The discovered conserved subnetworks have a general topology and need not correspond to specific interaction patterns, so they more closely fit the models of functional complexes proposed in the literature. The algorithm is able to handle sparse interaction data with an expansion process that at each step explores the local topology of the networks beyond the proteins directly interacting with the current solution. To assess the performance of AlignNemo, we ran a series of benchmarks using statistical measures as well as biological knowledge. Based on reference datasets of protein complexes, AlignNemo shows better performance than other methods in terms of both precision and recall. We show our solutions to be biologically sound using the concept of semantic similarity applied to Gene Ontology vocabularies. The binaries of AlignNemo and supplementary details about the algorithms and the experiments are available at: sourceforge.net/p/alignnemo