Data-driven network alignment
Biological network alignment (NA) aims to find a node mapping between
species' molecular networks that uncovers similar network regions, thus
allowing for transfer of functional knowledge between the aligned nodes.
However, current NA methods do not end up aligning functionally related nodes.
A likely reason is that they assume it is topologically similar nodes that are
functionally related. However, we show that this assumption does not hold well.
So, a paradigm shift is needed in how the NA problem is approached. We
redefine NA as a data-driven framework, TARA (daTA-dRiven network Alignment),
which attempts to learn the relationship between topological relatedness and
functional relatedness without assuming that topological relatedness
corresponds to topological similarity, as traditional NA methods do. TARA
trains a classifier to predict whether two nodes from different networks are
functionally related based on their network topological patterns. We find that
TARA is able to make accurate predictions. TARA then takes each pair of nodes
that are predicted as related to be part of an alignment. Like traditional NA
methods, TARA uses this alignment for the across-species transfer of functional
knowledge. Clearly, TARA as currently implemented uses topological but not
protein sequence information for this task. We find that TARA outperforms
existing state-of-the-art NA methods that also use topological information,
WAVE and SANA, and even outperforms or complements a state-of-the-art NA method
that uses both topological and sequence information, PrimAlign. Hence, adding
sequence information to TARA, which is our future work, is likely to further
improve its performance.
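
The workflow described in this abstract (featurize a cross-network node pair, train a classifier on pairs with known functional relatedness, keep the pairs predicted as related as the alignment) can be sketched roughly as follows. This is only an illustrative sketch, not TARA's implementation: the abstract does not specify the topological features or the classifier, so the degree-and-triangle signature, the plain logistic regression, the toy networks, and the labels are all assumptions made up for the example.

```python
import math

def node_features(adj, v):
    """Toy topological signature (an assumption): degree and triangle count of v."""
    nbrs = adj[v]
    triangles = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return [float(len(nbrs)), float(triangles)]

def pair_features(adj1, u, adj2, v):
    """Concatenate both signatures so the model is not forced to reward similarity."""
    return node_features(adj1, u) + node_features(adj2, v)

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.05, epochs=200):
    """Logistic regression fitted by stochastic gradient descent (stand-in classifier)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_proba(model, x):
    """Predicted probability that the node pair is functionally related."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def align(model, adj1, adj2, candidates, threshold=0.5):
    """The alignment is every candidate pair predicted as related (many-to-many)."""
    return [(u, v) for u, v in candidates
            if predict_proba(model, pair_features(adj1, u, adj2, v)) >= threshold]

# Two toy networks as adjacency sets (hypothetical data).
net1 = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
net2 = {0: {1}, 1: {0, 2}, 2: {1}}

# Made-up training labels: 1 = functionally related, 0 = not related.
labelled = [((0, 1), 1), ((3, 0), 1), ((1, 1), 0), ((2, 2), 0)]
X = [pair_features(net1, u, net2, v) for (u, v), _ in labelled]
y = [label for _, label in labelled]

model = train_logistic(X, y)
candidates = [(u, v) for u in net1 for v in net2]
alignment = align(model, net1, net2, candidates)
```

Concatenating the two node signatures, rather than taking their difference, is one simple way to let the classifier learn which topological patterns go together instead of assuming that related nodes must look alike.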
Identification of direct residue contacts in protein-protein interaction by message passing
Understanding the molecular determinants of specificity in protein-protein
interaction is an outstanding challenge of postgenome biology. The availability
of large protein databases generated from sequences of hundreds of bacterial
genomes enables various statistical approaches to this problem. In this context
covariance-based methods have been used to identify correlation between amino
acid positions in interacting proteins. However, these methods have an
important shortcoming, in that they cannot distinguish between directly and
indirectly correlated residues. We developed a method that combines covariance
analysis with global inference analysis, adopted from use in statistical
physics. Applied to a set of >2,500 representatives of the bacterial
two-component signal transduction system, the combination of covariance with
global inference successfully and robustly identified residue pairs that are
proximal in space without resorting to ad hoc tuning parameters, both for
heterointeractions between sensor kinase (SK) and response regulator (RR)
proteins and for homointeractions between RR proteins. The spectacular success
of this approach illustrates the effectiveness of the global inference approach
in identifying direct interaction based on sequence information alone. We
expect this method to be applicable soon to interaction surfaces between
proteins present in only 1 copy per genome as the number of sequenced genomes
continues to expand. Use of this method could significantly increase the
potential targets for therapeutic intervention, shed light on the mechanism of
protein-protein interaction, and establish the foundation for the accurate
prediction of interacting protein partners.
Comment: Supplementary information available on
http://www.pnas.org/content/106/1/67.abstrac
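
The covariance step, and the shortcoming this abstract attributes to it, can be illustrated with a small sketch. It assumes mutual information between alignment columns as the covariance measure, which the abstract does not name, and it uses a made-up four-sequence alignment. It does not implement the global inference (message passing) step. All function names are hypothetical.

```python
import math
from collections import Counter

def mutual_information(col_i, col_j):
    """Mutual information (in nats) between two alignment columns."""
    n = len(col_i)
    f_i, f_j = Counter(col_i), Counter(col_j)
    f_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in f_ij.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((f_i[a] / n) * (f_j[b] / n)))
    return mi

def column(msa, k):
    """Extract column k from a list of equal-length aligned sequences."""
    return [seq[k] for seq in msa]

def score_interprotein_pairs(joint_msa, len_a):
    """Score every (column in protein A, column in protein B) pair.

    joint_msa holds concatenated sequences of interacting partners;
    the first len_a columns belong to protein A, the rest to protein B.
    """
    width = len(joint_msa[0])
    scores = {}
    for i in range(len_a):
        for j in range(len_a, width):
            scores[(i, j)] = mutual_information(column(joint_msa, i),
                                                column(joint_msa, j))
    return scores

# Made-up alignment: column 0 (protein A) covaries with column 1 (protein B),
# and column 1 covaries with column 2, so column 0 also covaries with column 2.
toy_msa = ["ADK", "ADK", "CEL", "CEL"]
scores = score_interprotein_pairs(toy_msa, len_a=1)
```

In this toy case the pairs (0, 1) and (0, 2) receive the same score, so the covariance measure alone cannot tell which pair is coupled directly and which only through the intermediate column. That is the ambiguity the global inference step is meant to resolve.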