From principal component to direct coupling analysis of coevolution in proteins: Low-eigenvalue modes are needed for structure prediction
Various approaches have explored the covariation of residues in
multiple-sequence alignments of homologous proteins to extract functional and
structural information. Among those are principal component analysis (PCA),
which identifies the most correlated groups of residues, and direct coupling
analysis (DCA), a global inference method based on the maximum entropy
principle, which aims at predicting residue-residue contacts. In this paper,
inspired by the statistical physics of disordered systems, we introduce the
Hopfield-Potts model to naturally interpolate between these two approaches. The
Hopfield-Potts model allows us to identify relevant 'patterns' of residues from
the knowledge of the eigenmodes and eigenvalues of the residue-residue
correlation matrix. We show how the computation of such statistical patterns
makes it possible to accurately predict residue-residue contacts with a much
smaller number of parameters than DCA. This dimensional reduction allows us to
avoid overfitting and to extract contact information from multiple-sequence
alignments of reduced size. In addition, we show that low-eigenvalue
correlation modes, discarded by PCA, are important to recover structural
information: the corresponding patterns are highly localized, that is, they are
concentrated in few sites, which we find to be in close contact in the
three-dimensional protein fold. Comment: Supporting information can be downloaded from:
http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.100317
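One way to make the notion of a "highly localized" pattern concrete is the inverse participation ratio (IPR) of a mode. This is a minimal sketch: the choice of IPR as the localization measure and the function name `inverse_participation_ratio` are assumptions for illustration, not details taken from the abstract above.

```python
def inverse_participation_ratio(pattern):
    """Return sum(x^4) / (sum(x^2))^2 for a pattern over sites.

    The value is 1/N for a pattern spread evenly over N sites and
    1.0 for a pattern concentrated on a single site.
    """
    norm_sq = sum(x * x for x in pattern)
    if norm_sq == 0:
        raise ValueError("pattern must have at least one nonzero entry")
    return sum(x ** 4 for x in pattern) / norm_sq ** 2


# Hypothetical toy patterns over 10 sites (not real alignment data).
delocalized = [1.0] * 10
localized = [0.0] * 9 + [3.0]

ipr_delocalized = inverse_participation_ratio(delocalized)
ipr_localized = inverse_participation_ratio(localized)
```

In this toy case the evenly spread pattern gives an IPR of 1/10, while the single-site pattern gives 1, so larger values indicate concentration on fewer sites.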
Large-scale analysis of disease pathways in the human interactome
Discovering disease pathways, which can be defined as sets of proteins
associated with a given disease, is an important problem that has the potential
to provide clinically actionable insights for disease diagnosis, prognosis, and
treatment. Computational methods aid the discovery by relying on
protein-protein interaction (PPI) networks. They start with a few known
disease-associated proteins and aim to find the rest of the pathway by
exploring the PPI network around the known disease proteins. However, the
success of such methods has been limited, and failure cases have not been well
understood. Here we study the PPI network structure of 519 disease pathways. We
find that 90% of pathways do not correspond to single well-connected components
in the PPI network. Instead, proteins associated with a single disease tend to
form many separate connected components/regions in the network. We then
evaluate state-of-the-art disease pathway discovery methods and show that their
performance is especially poor on diseases with disconnected pathways. Thus, we
conclude that network connectivity structure alone may not be sufficient for
disease pathway discovery. However, we show that higher-order network
structures, such as small subgraphs of the pathway, provide a promising
direction for the development of new methods.
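The connectivity question raised in this abstract can be illustrated by counting the connected components of the subgraph that a pathway induces in a PPI network. The sketch below uses only the standard library. The function name `pathway_components`, the toy edge list, and the toy pathway are hypothetical and do not come from the study.

```python
from collections import defaultdict


def pathway_components(edges, pathway):
    """Return the connected components of the subgraph induced by `pathway`.

    `edges` is an iterable of (protein, protein) pairs; only edges whose
    two endpoints both belong to the pathway are kept.
    """
    members = set(pathway)
    adjacency = defaultdict(set)
    for u, v in edges:
        if u in members and v in members:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen = set()
    components = []
    for start in sorted(members):
        if start in seen:
            continue
        # Depth-first search from an unvisited pathway protein.
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Hypothetical toy network and pathway (not real PPI data).
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F"), ("D", "G")]
toy_pathway = ["A", "B", "D", "E", "F"]

components = pathway_components(toy_edges, toy_pathway)
largest_fraction = max(len(c) for c in components) / len(toy_pathway)
```

Here the pathway splits into three components ({A, B}, {D}, {E, F}) even though A, B, and D are linked in the full network through the non-pathway protein C, which is the kind of fragmentation the abstract reports.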
Large-scale inference and graph theoretical analysis of gene-regulatory networks in B. subtilis
We present the methods and results of a two-stage modeling process that
generates candidate gene-regulatory networks of the bacterium B. subtilis from
experimentally obtained, yet mathematically underdetermined microchip array
data. By employing a computational, linear correlative procedure to generate
these networks, and by analyzing the networks from a graph theoretical
perspective, we are able to verify the biological viability of our inferred
networks, and we demonstrate that our networks' graph theoretical properties
are remarkably similar to those of other biological systems. In addition, by
comparing our inferred networks to those of a previous, noisier implementation
of the linear inference process [17], we are able to identify trends in graph
theoretical behavior that occur both in our networks and in their
perturbed counterparts. These commonalities in behavior at multiple levels of
complexity allow us to ascertain the level of complexity to which our process
is robust to noise. Comment: 22 pages, 4 figures, accepted for publication in Physica A (2006)
How to understand the cell by breaking it: network analysis of gene perturbation screens
Modern high-throughput gene perturbation screens are key technologies at the
forefront of genetic research. Combined with rich phenotypic descriptors they
enable researchers to observe detailed cellular reactions to experimental
perturbations on a genome-wide scale. This review surveys the current
state-of-the-art in analyzing perturbation screens from a network point of
view. We describe approaches to make the step from the parts list to the wiring
diagram by using phenotypes for network inference and integrating them with
complementary data sources. The first part of the review describes methods to
analyze one- or low-dimensional phenotypes like viability or reporter activity;
the second part concentrates on high-dimensional phenotypes showing global
changes in cell morphology, transcriptome or proteome. Comment: Review based on ISMB 2009 tutorial; after two rounds of revisio
Spectral analysis of gene expression profiles using gene networks
Microarrays have become extremely useful for analysing genetic phenomena, but
establishing a relation between microarray analysis results (typically a list
of genes) and their biological significance is often difficult. Currently, the
standard approach is to map a posteriori the results onto gene networks to
elucidate the functions perturbed at the level of pathways. However,
integrating a priori knowledge of the gene networks could help in the
statistical analysis of gene expression data and in their biological
interpretation. Here we propose a method to integrate a priori the knowledge of
a gene network in the analysis of gene expression data. The approach is based
on the spectral decomposition of gene expression profiles with respect to the
eigenfunctions of the graph, resulting in an attenuation of the high-frequency
components of the expression profiles with respect to the topology of the
graph. We show how to derive unsupervised and supervised classification
algorithms of expression profiles, resulting in classifiers with biological
relevance. We applied the method to the analysis of a set of expression
profiles from irradiated and non-irradiated yeast strains. It performed at
least as well as the usual classification but provides much more biologically
relevant results and allows a direct biological interpretation.
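The attenuation of high-frequency components on a gene network can be sketched without an explicit eigendecomposition. The example below repeatedly applies x <- x - step * L x, where L is the unnormalized graph Laplacian, so a component with Laplacian eigenvalue lambda is scaled by (1 - step * lambda) at each iteration. This iterative filter, the function names, the parameter values, and the toy data are all assumptions for illustration. The abstract does not specify which filter the authors use.

```python
def laplacian_smooth(edges, profile, step=0.2, iterations=10):
    """Smooth an expression profile over a gene network.

    `profile` maps gene -> expression value and `edges` lists undirected
    (gene, gene) links between genes present in `profile`. Keep
    step * (largest Laplacian eigenvalue) <= 1 so that no component
    changes sign.
    """
    values = dict(profile)
    neighbors = {gene: set() for gene in values}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    for _ in range(iterations):
        # (L x)_i equals the sum over neighbors j of (x_i - x_j).
        values = {
            gene: x - step * sum(x - values[nb] for nb in neighbors[gene])
            for gene, x in values.items()
        }
    return values


def roughness(edges, values):
    """Return x^T L x, the sum of squared differences across edges."""
    return sum((values[u] - values[v]) ** 2 for u, v in edges)


# Hypothetical toy network: a path of four genes with an alternating profile.
toy_edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
toy_profile = {"g1": 2.0, "g2": 0.0, "g3": 2.0, "g4": 0.0}

smoothed = laplacian_smooth(toy_edges, toy_profile)
```

The smoothed profile keeps the same total expression, because the constant component has eigenvalue zero, while its variation across edges shrinks.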
A General Framework for Complex Network Applications
Complex network theory has been applied to solving practical problems from
different domains. In this paper, we present a general framework for complex
network applications. The keys to a successful application are a thorough
understanding of the real system and a correct mapping of complex network
theory to practical problems in the system. Despite certain limitations
discussed in this paper, complex network theory provides a foundation on which
to develop powerful tools for analyzing and optimizing large interconnected
systems. Comment: 8 pages