Potentials of Mean Force for Protein Structure Prediction Vindicated, Formalized and Generalized
Understanding protein structure is of crucial importance in science, medicine
and biotechnology. For about two decades, knowledge-based potentials based on
pairwise distances -- so-called "potentials of mean force" (PMFs) -- have been
center stage in the prediction and design of protein structure and the
simulation of protein folding. However, the validity, scope and limitations of
these potentials are still vigorously debated and disputed, and the optimal
choice of the reference state -- a necessary component of these potentials --
is an unsolved problem. PMFs are loosely justified by analogy to the reversible
work theorem in statistical physics, or by a statistical argument based on a
likelihood function. Both justifications are insightful but leave many
questions unanswered. Here, we show for the first time that PMFs can be seen as
approximations to quantities that do have a rigorous probabilistic
justification: they naturally arise when probability distributions over
different features of proteins need to be combined. We call these quantities
reference ratio distributions, as they derive from the application of the
reference ratio method. This new view is not only of theoretical relevance, but leads to
many insights that are of direct practical use: the reference state is uniquely
defined and does not require external physical insights; the approach can be
generalized beyond pairwise distances to arbitrary features of protein
structure; and it becomes clear for which purposes the use of these quantities
is justified. We illustrate these insights with two applications, involving the
radius of gyration and hydrogen bonding. In the latter case, we also show how
the reference ratio method can be iteratively applied to sculpt an energy
funnel. Our results considerably increase the understanding and scope of energy
functions derived from known biomolecular structures.
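The abstract does not spell out the reference ratio construction, so the following is only a minimal sketch under a stated assumption: that the combined distribution takes the form p(x) = q(x) · p(y(x)) / q(y(x)), where q(x) is a distribution over structures, y(x) is a coarse-grained feature of a structure, p(y) is the desired distribution of that feature, and q(y) is the feature's distribution implied by q(x). The toy states, the `compactness` feature, and all probabilities are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict


def reference_ratio(q_x, feature, p_y):
    """Combine q(x) with a target feature distribution p(y).

    Assumed form: p(x) = q(x) * p(y(x)) / q(y(x)), where q(y) is the
    marginal of the feature under q(x) (the "reference" distribution).
    """
    # Reference distribution: how q(x) distributes mass over feature values.
    q_y = defaultdict(float)
    for x, q in q_x.items():
        q_y[feature(x)] += q
    # Reweight each state by the ratio of target to reference.
    return {x: q * p_y[feature(x)] / q_y[feature(x)] for x, q in q_x.items()}


def compactness(state):
    # Hypothetical coarse-grained feature of a toy "structure".
    return "compact" if state in ("a", "b") else "extended"


# Hypothetical toy distributions (not taken from the paper).
q_x = {"a": 0.1, "b": 0.3, "c": 0.2, "d": 0.4}
p_y = {"compact": 0.8, "extended": 0.2}

p_x = reference_ratio(q_x, compactness, p_y)
for state, prob in sorted(p_x.items()):
    print(state, round(prob, 4))
```

In this toy case the reference is computed directly from q(x) rather than supplied from outside, and the resulting p(x) has exactly the requested feature distribution while keeping the relative weights that q(x) assigns within each feature value.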
Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak
In its largest outbreak, Ebola virus disease is spreading through Guinea, Liberia, Sierra Leone, and Nigeria. We sequenced 99 Ebola virus genomes from 78 patients in Sierra Leone to ~2000× coverage. We observed a rapid accumulation of interhost and intrahost genetic variation, allowing us to characterize patterns of viral transmission over the initial weeks of the epidemic. This West African variant likely diverged from central African lineages around 2004, crossed from Guinea to Sierra Leone in May 2014, and has exhibited sustained human-to-human transmission subsequently, with no evidence of additional zoonotic sources. Because many of the mutations alter protein sequences and other biologically meaningful targets, they should be monitored for impact on diagnostics, vaccines, and therapies critical to outbreak response.
Organismic and Evolutionary Biolog
Clinical Illness and Outcomes in Patients with Ebola in Sierra Leone
Background
Limited clinical and laboratory data are available on patients with Ebola virus disease (EVD). The Kenema Government Hospital in Sierra Leone, which had an existing infrastructure for research regarding viral hemorrhagic fever, has received and cared for patients with EVD since the beginning of the outbreak in Sierra Leone in May 2014.
Methods
We reviewed available epidemiologic, clinical, and laboratory records of patients in whom EVD was diagnosed between May 25 and June 18, 2014. We used quantitative reverse-transcriptase–polymerase-chain-reaction assays to assess the load of Ebola virus (EBOV, Zaire species) in a subgroup of patients.
Results
Of 106 patients in whom EVD was diagnosed, 87 had a known outcome, and 44 had detailed clinical information available. The incubation period was estimated to be 6 to 12 days, and the case fatality rate was 74%. Common findings at presentation included fever (in 89% of the patients), headache (in 80%), weakness (in 66%), dizziness (in 60%), diarrhea (in 51%), abdominal pain (in 40%), and vomiting (in 34%). Clinical and laboratory factors at presentation that were associated with a fatal outcome included fever, weakness, dizziness, diarrhea, and elevated levels of blood urea nitrogen, aspartate aminotransferase, and creatinine. Exploratory analyses indicated that patients under the age of 21 years had a lower case fatality rate than those over the age of 45 years (57% vs. 94%, P=0.03), and patients presenting with fewer than 100,000 EBOV copies per milliliter had a lower case fatality rate than those with 10 million EBOV copies per milliliter or more (33% vs. 94%, P=0.003). Bleeding occurred in only 1 patient.
Conclusions
The incubation period and case fatality rate among patients with EVD in Sierra Leone are similar to those observed elsewhere in the 2014 outbreak and in previous outbreaks. Although bleeding was an infrequent finding, diarrhea and other gastrointestinal manifestations were common. (Funded by the National Institutes of Health and others.)
J. Biomol. Struct. Dyn.
Pathway diversity and concertedness in protein folding: An ab-initio approach
Making use of an ab-initio folding simulator, we generate in vitro pathways leading to the native fold in moderate size single-domain proteins. The assessment of pathway diversity is not biased by any a-priori information on the native fold. We focus on two study cases, hyperthermophile variant of protein G domain (1gb4) and ubiquitin (1ubi), with the same topology but different context dependence in their native folds. We demonstrate that a quenching of structural fluctuations is achieved once the proteins find a stationary plateau maximizing the number of highly protected hydrogen bonds. This enables us to identify the folding nucleus and show that folding does not become expeditious until a concerted event takes place generating a topology able to prevent water attack on a maximal number of hydrogen bonds. This result is consistent with the standard nucleation mechanism postulated for two-state folders. Pathway diversity is correlated with the extent of conflict between local structural propensity and large-scale context, rather than with contact order: In highly context-dependent proteins, the success of folding cannot rely on a single fortuitous event in which local propensity is overruled by large-scale effects. We predict mutational Φ values on individual pathways, compute ensemble averages and predict extent of surface burial and percentage of hydrogen bonding on each component of the transition state ensemble, thus deconvoluting individual folding-route contributions to the averaged two-state kinetic picture. Our predicted kinetic isotopic effects find experimental support and lead to further probes. Finally, the molecular redesign potentiality of the method, aimed at increasing folding expediency, is explored.
Machine-learning Prognostic Models from the 2014-16 Ebola Outbreak: Data-harmonization Challenges, Validation Strategies, and mHealth Applications.
Ebola virus disease (EVD) plagues low-resource and difficult-to-access settings. Machine learning prognostic models and mHealth tools could improve the understanding and use of evidence-based care guidelines in such settings. However, data incompleteness and lack of interoperability limit model generalizability. This study harmonizes diverse datasets from the 2014-16 EVD epidemic and generates several prognostic models incorporated into the novel Ebola Care Guidelines app that provides informed access to recommended evidence-based guidelines.
Multivariate logistic regression was applied to investigate survival outcomes in 470 patients admitted to five Ebola treatment units in Liberia and Sierra Leone at various timepoints during 2014-16. We generated a parsimonious model (viral load, age, temperature, bleeding, jaundice, dyspnea, dysphagia, and time-to-presentation) and several fallback models for when these variables are unavailable. All were externally validated against two independent datasets and compared to further models including expert observational wellness assessments. Models were incorporated into an app highlighting the signs/symptoms with the largest contribution to prognosis.
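The idea of falling back to a reduced model when some inputs are missing can be sketched as a simple tiered lookup. This is an illustration only, not the authors' implementation: the parsimonious variable list and the age plus viral load pairing come from the abstract, while the tier names, the intermediate `clinical_only` tier, the ordering of tiers, and the record format are all assumptions made for the example.

```python
# Variables of the parsimonious model, as listed in the abstract.
PARSIMONIOUS = (
    "viral_load", "age", "temperature", "bleeding",
    "jaundice", "dyspnea", "dysphagia", "time_to_presentation",
)

# Hypothetical tiers, tried in order from richest to sparsest.
MODEL_TIERS = [
    ("parsimonious", PARSIMONIOUS),
    # Assumed fallback for settings without viral load measurements.
    ("clinical_only", tuple(v for v in PARSIMONIOUS if v != "viral_load")),
    # Minimal model built on the two features the abstract singles out.
    ("minimal", ("viral_load", "age")),
]


def select_model(record):
    """Return the name of the first tier whose inputs are all present.

    A value of None in the record is treated as "not recorded".
    Returns None when no tier can be applied.
    """
    available = {name for name, value in record.items() if value is not None}
    for tier_name, required in MODEL_TIERS:
        if set(required) <= available:
            return tier_name
    return None


# Hypothetical patient record with only two values recorded.
patient = {"age": 34, "viral_load": 6.2, "temperature": None}
print(select_model(patient))
```

Keeping the tiers in a plain ordered list makes the fallback policy explicit and easy to audit, which matters when the choice of model depends on what a given treatment unit was able to record.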
The parsimonious model approached the predictive power of observational assessments by experienced clinicians (Area-Under-the-Curve, AUC = 0.70-0.79, accuracy = 0.64-0.74) and maintained its performance across subcohorts with different healthcare seeking behaviors. Age and viral load contributed > 5-fold the weighting of other features and including them in a minimal model had a similar AUC, albeit at the cost of specificity.
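For readers less familiar with the two metrics quoted above, the sketch below shows how AUC and accuracy can be computed from predicted risk scores. It uses the pairwise-ranking definition of AUC (the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as half). The labels, scores, and the 0.5 decision threshold are hypothetical toy values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at the given threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical toy data: 1 = outcome of interest occurred, 0 = it did not.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]

print(round(roc_auc(toy_labels, toy_scores), 3))
print(round(accuracy(toy_labels, toy_scores), 3))
```

AUC depends only on how the scores rank the cases, whereas accuracy depends on the chosen threshold, which is one reason the two ranges reported for the same model can differ.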
Clinically guided prognostic models can recapitulate clinical expertise and be useful when such expertise is unavailable. Incorporating these models into mHealth tools may facilitate their interpretation and provide informed access to comprehensive clinical guidelines.
Howard Hughes Medical Institute, US National Institutes of Health, Bill & Melinda Gates Foundation, International Medical Corps, UK Department for International Development, and GOAL Global