Historical contingency and entrenchment in protein evolution under purifying selection
The fitness contribution of an allele at one genetic site may depend on
alleles at other sites, a phenomenon known as epistasis. Epistasis can
profoundly influence the process of evolution in populations under selection,
and can shape the course of protein evolution across divergent species. Whereas
epistasis between adaptive substitutions has been the subject of extensive
study, relatively little is known about epistasis under purifying selection.
Here we use mechanistic models of thermodynamic stability in a ligand-binding
protein to explore the structure of epistatic interactions between
substitutions that fix in protein sequences under purifying selection. We find
that the selection coefficients of mutations that are nearly-neutral when they
fix are highly contingent on the presence of preceding mutations. Conversely,
mutations that are nearly-neutral when they fix are subsequently entrenched due
to epistasis with later substitutions. Our evolutionary model includes
insertions and deletions, as well as point mutations, and so it allows us to
quantify epistasis within each of these classes of mutations, and also to study
the evolution of protein length. We find that protein length remains largely
constant over time, because indels are more deleterious than point mutations.
Our results imply that, even under purifying selection, protein sequence
evolution is highly contingent on history and so it cannot be predicted by the
phenotypic effects of mutations assayed in the wild-type sequence.
Comment: 42 pages, 13 figures
Multi-omics integration accurately predicts cellular state in unexplored conditions for Escherichia coli.
A significant obstacle in training predictive cell models is the lack of integrated data sources. We develop semi-supervised normalization pipelines and perform experimental characterization (growth, transcriptional, proteome) to create Ecomics, a consistent, quality-controlled multi-omics compendium for Escherichia coli with cohesive metadata. We then use this resource to train a multi-scale model that integrates four omics layers to predict genome-wide concentrations and growth dynamics. The genetic and environmental ontology reconstructed from the omics data is substantially different from, and complementary to, the genetic and chemical ontologies. Integrating the different layers confers an incremental increase in prediction performance, as does information about known gene regulatory and protein-protein interactions. The predictive performance of the model ranges from 0.54 to 0.87 for the various omics layers, which far exceeds various baselines. This work provides an integrative framework of omics-driven predictive modelling that is broadly applicable to guide biological discovery.
The SWISS-MODEL Repository: new features and functionalities
The SWISS-MODEL Repository is a database of annotated 3D protein structure models generated by the SWISS-MODEL homology-modelling pipeline. As of September 2005, the repository contained 675 000 models for 604 000 different protein sequences of the UniProt database. Regular updates ensure that the content of the repository reflects the current state of sequence and structure databases, integrating new or modified target sequences, and making use of new template structures. Each Repository entry consists of one or more 3D models accompanied by detailed information about the target protein and the model building process: functional annotation, a detailed template selection log, target-template alignment, summary of the model building and model quality assessment. The SWISS-MODEL Repository is freely accessible at http://swissmodel.expasy.org/repositor
Systematic analysis of primary sequence domain segments for the discrimination between class C GPCR subtypes
G-protein-coupled receptors (GPCRs) are a large and diverse super-family of eukaryotic cell membrane proteins that play an important physiological role as transmitters of extracellular signals. In this paper, we investigate Class C, a member of this super-family that has attracted much attention in pharmacology. Because little is known about the complete 3D crystal structure of Class C receptors, their primary amino acid sequences must be used for analytical purposes. Here, we provide a systematic analysis of distinct receptor sequence segments with regard to their ability to differentiate between seven class C GPCR subtypes according to their topological location in the extracellular, transmembrane, or intracellular domains. We build on previous research that provided preliminary evidence of the potential use of separated domains of complete class C GPCR sequences as the basis for subtype classification. The use of the extracellular N-terminus domain alone was shown to result in only a minor decrease in subtype discrimination in comparison with the complete sequence, despite discarding much of the sequence information. In this paper, we describe the use of Support Vector Machine-based classification models to evaluate the subtype-discriminating capacity of the specific topological sequence segments.
Peer Reviewed. Postprint (author's final draft)