Predicting protein functions with message passing algorithms
Motivation: In the last few years, interest in biology has been shifting
towards the problem of optimally extracting information from the huge amount
of data generated by large-scale, high-throughput techniques. One of the most
relevant issues is that of correctly and reliably predicting the functions of
observed but still functionally uncharacterized proteins, starting from
information on the network of co-observed proteins of known function.
Method: The method proposed in this article is based on a message passing
algorithm known as Belief Propagation, which takes as input the network of
physical protein interactions and a catalog of known protein functions, and
returns, for each unclassified protein, the probability of having a chosen
function. The implementation of the algorithm allows for fast on-line analysis
and can be easily generalized to more complex graph topologies that take
hyper-graphs into account, i.e., complexes of more than two interacting
proteins.
Comment: 12 pages, 9 eps figures, 1 additional html tabl
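To make the message-passing idea in the abstract above concrete, here is a minimal sketch of sum-product Belief Propagation for a single binary function label on a protein interaction graph. It is not the authors' implementation: the pairwise potential (interacting proteins are favored to share a label by a factor exp(coupling)), the uniform prior, and the names belief_propagation, coupling, prior and iters are all assumptions made for illustration.

```python
import math


def belief_propagation(edges, known, coupling=1.0, prior=0.5, iters=50):
    """Sum-product BP for one binary label on an interaction graph.

    edges: list of (u, v) undirected interactions.
    known: dict mapping a protein to 0 or 1 (1 = has the function).
    Returns a dict mapping each protein to P(has the function).
    The potentials used here are illustrative assumptions.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    def local(node):
        # Annotated proteins are clamped; others get the prior.
        if node in known:
            return (0.0, 1.0) if known[node] == 1 else (1.0, 0.0)
        return (1.0 - prior, prior)

    same, diff = math.exp(coupling), 1.0  # assumed pairwise potential
    # messages[(i, j)] is the message from i to j over the states of j.
    messages = {(i, j): (0.5, 0.5) for i in neighbors for j in neighbors[i]}

    for _ in range(iters):
        updated = {}
        for (i, j) in messages:
            prod = list(local(i))
            # Multiply in every message arriving at i except the one from j.
            for k in neighbors[i]:
                if k != j:
                    m = messages[(k, i)]
                    prod[0] *= m[0]
                    prod[1] *= m[1]
            out0 = prod[0] * same + prod[1] * diff
            out1 = prod[0] * diff + prod[1] * same
            z = out0 + out1
            updated[(i, j)] = (out0 / z, out1 / z)
        messages = updated

    beliefs = {}
    for i in neighbors:
        b = list(local(i))
        for k in neighbors[i]:
            m = messages[(k, i)]
            b[0] *= m[0]
            b[1] *= m[1]
        beliefs[i] = b[1] / (b[0] + b[1])
    return beliefs


# Toy network: B interacts with two annotated proteins, D with one
# protein known to lack the function.
edges = [("A", "B"), ("B", "C"), ("D", "E")]
known = {"A": 1, "C": 1, "E": 0}
beliefs = belief_propagation(edges, known)
```

On this toy graph the unclassified protein B ends up with a probability above 0.5 and D below 0.5. On graphs with loops the same updates are only approximate, and damping or a convergence check would normally be added.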
The Parallelism Motifs of Genomic Data Analysis
Genomic data sets are growing dramatically as the cost of sequencing
continues to decline and small sequencing devices become available. Enormous
community databases store and share this data with the research community, but
some of these genomic data analysis problems require large scale computational
platforms to meet both the memory and computational requirements. These
applications differ from scientific simulations that dominate the workload on
high end parallel systems today and place different requirements on programming
support, software libraries, and parallel architectural design. For example,
they involve irregular communication patterns such as asynchronous updates to
shared data structures. We consider several problems in high performance
genomics analysis, including alignment, profiling, clustering, and assembly for
both single genomes and metagenomes. We identify some of the common
computational patterns or motifs that help inform parallelization strategies
and compare our motifs to some of the established lists, arguing that at least
two key patterns, sorting and hashing, are missing.
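As a small illustration of why hashing and sorting recur in genomic analysis, the sketch below counts k-mers with a hash table and then ranks them by sorting. This is a generic textbook example, not code from the paper; the function names count_kmers and top_kmers and the toy reads are assumptions.

```python
from collections import Counter


def count_kmers(reads, k):
    """Hashing step: tally every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def top_kmers(counts, n):
    """Sorting step: order by descending count, then alphabetically."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


reads = ["ACGTAC", "CGTA"]  # toy input
counts = count_kmers(reads, 3)
ranked = top_kmers(counts, 2)
```

In a distributed setting the hash table is typically partitioned across nodes, which is where the irregular, asynchronous updates to shared data structures mentioned in the abstract arise.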
Disordered proteins and network disorder in network descriptions of protein structure, dynamics and function. Hypotheses and a comprehensive review
During the last decade, network approaches have become a powerful tool to describe protein structure and dynamics. Here we review the links between disordered proteins and the associated networks, and describe the consequences of local, mesoscopic and global network disorder on changes in protein structure and dynamics. We introduce a new classification of protein networks into ‘cumulus-type’, i.e., those similar to puffy (white) clouds, and ‘stratus-type’, i.e., those similar to flat, dense (dark) low-lying clouds, and relate these network types to protein disorder dynamics and to differences in energy transmission processes. In the first class, there is limited overlap between the modules, which implies higher rigidity of the individual units; there the conformational changes can be described by an ‘energy transfer’ mechanism. In the second class, the topology presents a compact structure with significant overlap between the modules; there the conformational changes can be described by ‘multi-trajectories’, that is, multiple highly populated pathways. We further propose that disordered protein regions evolved to help other protein segments reach ‘rarely visited’ but functionally related states. We also show the role of disorder in ‘spatial games’ of amino acids, highlight the effects of intrinsically disordered proteins (IDPs) on cellular networks, and list some possible studies linking protein disorder and protein structure networks.
Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems
A huge amount of genetic information is available thanks to the recent advances in sequencing technologies and the larger computational capabilities, but the interpretation of such genetic data at phenotypic level remains elusive. One of the reasons is that proteins are not acting alone, but are specifically interacting with other proteins and biomolecules, forming intricate interaction networks that are essential for the majority of cell processes and pathological conditions. Thus, characterizing such interaction networks is an important step in understanding how information flows from gene to phenotype. Indeed, structural characterization of protein–protein interactions at atomic resolution has many applications in biomedicine, from diagnosis and vaccine design, to drug discovery. However, despite the advances of experimental structural determination, the number of interactions for which there is available structural data is still very small. In this context, a complementary approach is computational modeling of protein interactions by docking, which is usually composed of two major phases: (i) sampling of the possible binding modes between the interacting molecules and (ii) scoring for the identification of the correct orientations. In addition, prediction of interface and hot-spot residues is very useful in order to guide and interpret mutagenesis experiments, as well as to understand functional and mechanistic aspects of the interaction. Computational docking is already being applied to specific biomedical problems within the context of personalized medicine, for instance, helping to interpret pathological mutations involved in protein–protein interactions, or providing modeled structural data for drug discovery targeting protein–protein interactions.
Spanish Ministry of Economy grant number BIO2016-79960-R; D.B.B. is supported by a predoctoral fellowship from CONACyT; M.R. is supported by an FPI fellowship from the Severo Ochoa program. We are grateful to the Joint BSC-CRG-IRB Programme in Computational Biology.
Peer Reviewed
Postprint (author's final draft