mCSM-AB: a web server for predicting antibody-antigen affinity changes upon mutation with graph-based signatures.
Computational methods have traditionally struggled to predict the effect of mutations in antibody-antigen complexes on binding affinity. This has limited their usefulness during antibody engineering and development, and their ability to predict biologically relevant escape mutations. Here we present mCSM-AB, a user-friendly web server for accurately predicting antibody-antigen affinity changes upon mutation which relies on graph-based signatures. We show that mCSM-AB performs better than comparable methods that have been previously used for antibody engineering. The mCSM-AB web server is available at http://structure.bioc.cam.ac.uk/mcsm_ab. This is the final published version. It first appeared at http://nar.oxfordjournals.org/content/early/2016/05/23/nar.gkw458.full
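The abstract above does not spell out how a graph-based signature is built. One simple way to picture the idea is to treat atoms as labelled nodes and count labelled atom pairs that fall within a series of distance cutoffs. The sketch below is a simplified illustration under that assumption, not the mCSM-AB implementation; the function name `cutoff_signature`, the atom labels, the coordinates and the cutoff values are all hypothetical.

```python
from itertools import combinations
from math import dist


def cutoff_signature(atoms, cutoffs):
    """Count labelled atom pairs within each distance cutoff.

    atoms   -- list of (label, (x, y, z)) tuples (hypothetical toy input)
    cutoffs -- distance thresholds, in the same units as the coordinates
    Returns a dict keyed by (cutoff, label_a, label_b), with labels sorted.
    """
    signature = {}
    for (label_a, xyz_a), (label_b, xyz_b) in combinations(atoms, 2):
        distance = dist(xyz_a, xyz_b)
        pair = tuple(sorted((label_a, label_b)))
        for cutoff in cutoffs:
            # Counts are cumulative: a close pair is counted at every
            # cutoff that is at least as large as its distance.
            if distance <= cutoff:
                key = (cutoff, *pair)
                signature[key] = signature.get(key, 0) + 1
    return signature


# Toy environment: four atoms on a 3 x 4 rectangle (made-up coordinates).
atoms = [
    ("hydrophobic", (0.0, 0.0, 0.0)),
    ("hydrophobic", (3.0, 0.0, 0.0)),
    ("donor", (0.0, 4.0, 0.0)),
    ("acceptor", (3.0, 4.0, 0.0)),
]
signature = cutoff_signature(atoms, cutoffs=[3.5, 4.5])
```

A fixed-length feature vector for a learning algorithm could then be obtained by reading the counts out in a fixed key order, filling in zero for absent keys.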
DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach.
Cancer genome and other sequencing initiatives are generating extensive data on non-synonymous single nucleotide polymorphisms (nsSNPs) in human and other genomes. In order to understand the impacts of nsSNPs on the structure and function of the proteome, as well as to guide protein engineering, accurate in silico methodologies are required to study and predict their effects on protein stability. Despite the diversity of available computational methods in the literature, none has proven accurate and dependable on its own under all scenarios where mutation analysis is required. Here we present DUET, a web server for an integrated computational approach to study missense mutations in proteins. DUET consolidates two complementary approaches (mCSM and SDM) in a consensus prediction, obtained by combining the results of the separate methods in an optimized predictor using Support Vector Machines (SVM). We demonstrate that the proposed method improves overall accuracy of the predictions in comparison with either method individually and performs as well as or better than similar methods. The DUET web server is freely and openly available at http://structure.bioc.cam.ac.uk/duet
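The general idea of a consensus predictor can be illustrated with a much simpler stand-in than the SVM described above. The sketch below combines two per-mutation predictions with a single weight chosen by grid search on training data. It is not DUET's method; the function names and all numeric values are hypothetical toy data chosen only to make the example run.

```python
def consensus(first_pred, second_pred, weight):
    """Weighted average of two predictions (stand-in for the SVM combiner)."""
    return weight * first_pred + (1.0 - weight) * second_pred


def fit_weight(first_preds, second_preds, experimental, steps=100):
    """Pick the weight in [0, 1] that minimises the summed squared error."""
    best_weight, best_error = 0.0, float("inf")
    for i in range(steps + 1):
        weight = i / steps
        error = sum(
            (consensus(a, b, weight) - y) ** 2
            for a, b, y in zip(first_preds, second_preds, experimental)
        )
        if error < best_error:
            best_weight, best_error = weight, error
    return best_weight


# Made-up stability changes for four mutations from two methods,
# plus made-up "experimental" values used to fit the weight.
method_a = [1.0, -2.0, 0.5, 3.0]
method_b = [0.0, -1.0, 1.5, 1.0]
experimental = [0.25, -1.25, 1.25, 1.5]

best_weight = fit_weight(method_a, method_b, experimental)
new_prediction = consensus(2.0, 1.0, best_weight)
```

A learned combiner such as an SVM can weight the inputs non-linearly and use additional features, which a single global weight cannot do.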
Adapting Pretrained Language Models for Solving Tabular Prediction Problems in the Electronic Health Record
We propose an approach for adapting the DeBERTa model for electronic health
record (EHR) tasks using domain adaptation. We pretrain a small DeBERTa model
on a dataset consisting of MIMIC-III discharge summaries, clinical notes,
radiology reports, and PubMed abstracts. We compare this model's performance
with a DeBERTa model pre-trained on clinical texts from our institutional EHR
(MeDeBERTa) and an XGBoost model. We evaluate performance on three benchmark
tasks for emergency department outcomes using the MIMIC-IV-ED dataset. We
preprocess the data to convert it into text format and generate four versions
of the original datasets to compare data processing and data inclusion. The
results show that our proposed approach outperforms the alternative models on
two of three tasks (p<0.001) and matches performance on the third task, with
the use of descriptive columns improving performance over the original column
names.
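The conversion of tabular records into text, with either original or descriptive column names, can be sketched as follows. This is a minimal illustration under assumptions: the column names, the descriptions, the "name: value" template and the choice to skip missing values are all hypothetical and are not taken from the paper or from MIMIC-IV-ED.

```python
def row_to_text(row, descriptions=None):
    """Serialise one tabular record as a single text string.

    row          -- dict mapping column name to value
    descriptions -- optional dict mapping column name to a descriptive name;
                    columns without an entry keep their original name
    Missing values (None) are skipped, which is an assumption of this sketch.
    """
    descriptions = descriptions or {}
    parts = []
    for column, value in row.items():
        if value is None:
            continue
        name = descriptions.get(column, column)
        parts.append(f"{name}: {value}")
    return "; ".join(parts)


# Hypothetical emergency department record and column descriptions.
row = {"hr": 112, "sbp": 94, "cc": "chest pain", "o2sat": None}
descriptions = {
    "hr": "heart rate",
    "sbp": "systolic blood pressure",
    "cc": "chief complaint",
}

raw_text = row_to_text(row)
descriptive_text = row_to_text(row, descriptions)
```

The two strings correspond to the "original column names" and "descriptive columns" variants that the abstract compares; either could then be passed to a language model tokenizer.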
mCSM-lig: quantifying the effects of mutations on protein-small molecule affinity in genetic disease and emergence of drug resistance.
The ability to predict how a mutation affects ligand binding is an essential step in understanding, anticipating and improving the design of new treatments for drug resistance, and in understanding genetic diseases. Here we present mCSM-lig, a structure-guided computational approach for quantifying the effects of single-point missense mutations on affinities of small molecules for proteins. mCSM-lig uses graph-based signatures to represent the wild-type environment of mutations, and small-molecule chemical features and changes in protein stability as evidence to train a predictive model using a representative set of protein-ligand complexes from the Platinum database. We show our method provides a very good correlation with experimental data (up to ρ = 0.67) and is effective in predicting a range of chemotherapeutic, antiviral and antibiotic resistance mutations, providing useful insights for genotypic screening and to guide drug development. mCSM-lig also provides insights into Mendelian disease mutations and can serve as a tool for guiding protein design. mCSM-lig is freely available as a web server at http://structure.bioc.cam.ac.uk/mcsm_lig. Newton Fund RCUK-CONFAP Grant awarded by The Medical Research Council (MRC) and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG) [MR/M026302/1 to D.E.V.P., T.L.B. and D.B.A.]. René Rachou Research Center (CPqRR/FIOCRUZ Minas), Brazil [to D.E.V.P.]; NHMRC CJ Martin Fellowship [APP1072476 to D.B.A.]; University of Cambridge and The Wellcome Trust for facilities and support [to T.L.B.]. This is the final version of the article. It first appeared from Nature Publishing Group at http://dx.doi.org/10.1038/srep29575
pkCSM: Predicting Small-Molecule Pharmacokinetic and Toxicity Properties Using Graph-Based Signatures.
Drug development has a high attrition rate, with poor pharmacokinetic and safety properties a significant hurdle. Computational approaches may help minimize these risks. We have developed a novel approach (pkCSM) which uses graph-based signatures to develop predictive models of central ADMET properties for drug development. pkCSM performs as well or better than current methods. A freely accessible web server (http://structure.bioc.cam.ac.uk/pkcsm), which retains no information submitted to it, provides an integrated platform to rapidly evaluate pharmacokinetic and toxicity properties. Newton Fund RCUK-CONFAP grant awarded by The Medical Research Council (MRC) and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG) [to D.E.V.P., T.L.B., and D.B.A.]; Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), and Centro de Pesquisas René Rachou (CPqRR/FIOCRUZ Minas), Brazil [to D.E.V.P.]; NHMRC CJ Martin Fellowship [APP1072476 to D.B.A.]; University of Cambridge and The Wellcome Trust for facilities and support [to T.L.B.]. Funding for open access charge: The Wellcome Trust. This is the final version. It was first published by ACS at http://pubs.acs.org/doi/abs/10.1021/acs.jmedchem.5b00104
DynaMut: predicting the impact of mutations on protein conformation, flexibility and stability.
Proteins are highly dynamic molecules, whose function is intrinsically linked to their molecular motions. Despite the pivotal role of protein dynamics, their computational simulation cost has led to most structure-based approaches for assessing the impact of mutations on protein structure and function relying upon static structures. Here we present DynaMut, a web server implementing two distinct, well established normal mode approaches, which can be used to analyze and visualize protein dynamics by sampling conformations and assess the impact of mutations on protein dynamics and stability resulting from vibrational entropy changes. DynaMut integrates our graph-based signatures along with normal mode dynamics to generate a consensus prediction of the impact of a mutation on protein stability. We demonstrate our approach outperforms alternative approaches to predict the effects of mutations on protein stability and flexibility (P-value < 0.001), achieving a correlation of up to 0.70 on blind tests. DynaMut also provides a comprehensive suite for protein motion and flexibility analysis and visualization via a freely available, user friendly web server at http://biosig.unimelb.edu.au/dynamut/
pdCSM-GPCR: predicting potent GPCR ligands with graph-based signatures.
Funder: Newton Fund RCUK-CONFAP. Funder: Victorian Government’s Operational Infrastructure Support Program. MOTIVATION: G protein-coupled receptors (GPCRs) can selectively bind to many types of ligands, ranging from light-sensitive compounds, ions, hormones, pheromones and neurotransmitters, modulating cell physiology. Considering their role in many essential cellular processes, they are one of the most targeted protein families, with over a third of all approved drugs modulating GPCR signalling. Despite this, the large diversity of receptors and their multipass transmembrane architectures make the identification and development of novel, specific and safe GPCR ligands a challenge. While computational approaches have the potential to assist GPCR drug development, they have presented limited performance and generalization capabilities. Here, we explored the use of graph-based signatures to develop pdCSM-GPCR, a method capable of rapidly and accurately screening potential GPCR ligands. RESULTS: Bioactivity data (IC50, EC50, Ki and Kd) for individual GPCRs were curated. After curation, we used the data for developing predictive models for 36 major GPCR targets, across 4 classes (A, B, C and F). Our models compose the most comprehensive computational resource for GPCR bioactivity prediction to date. Across stratified 10-fold cross-validation and blind tests, our approach achieved Pearson's correlations of up to 0.89, significantly outperforming previous methods. Interpreting our results, we identified common important features of potent GPCR ligands, which tend to have bicyclic rings, leading to higher levels of aromaticity. We believe pdCSM-GPCR will be an invaluable tool to assist screening efforts, enriching compound libraries and ranking candidates for further experimental validation.
AVAILABILITY AND IMPLEMENTATION: pdCSM-GPCR predictive models and datasets used have been made available via a freely accessible and easy-to-use web server at http://biosig.unimelb.edu.au/pdcsm_gpcr/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online
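The evaluation protocol mentioned above, cross-validation scored with Pearson's correlation, can be sketched with the standard library alone. This is an illustrative sketch, not the authors' pipeline: `k_fold_indices` produces plain (not stratified) folds, and the function names and toy values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; fold i tests every k-th sample from i.

    Plain k-fold only. A stratified variant would first bin the target
    values and spread each bin evenly across folds.
    """
    for fold in range(k):
        test = list(range(fold, n_samples, k))
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


# Toy check: perfectly linear predictions give a correlation of 1.
toy_r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
folds = list(k_fold_indices(n_samples=10, k=5))
```

In a full evaluation, a model would be trained on each `train` split, predictions collected for each `test` split, and `pearson_r` computed between the pooled predictions and the measured bioactivities.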
In silico functional dissection of saturation mutagenesis: Interpreting the relationship between phenotypes and changes in protein stability, interactions and activity.
Despite interest in associating polymorphisms with clinical or experimental phenotypes, functional interpretation of mutation data has lagged behind generation of data from modern high-throughput techniques and the accurate prediction of the molecular impact of a mutation remains a non-trivial task. We present here an integrated knowledge-driven computational workflow designed to evaluate the effects of experimental and disease missense mutations on protein structure and interactions. We exemplify its application with analyses of saturation mutagenesis of DBR1 and Gal4 and show that the experimental phenotypes for over 80% of the mutations correlate well with predicted effects of mutations on protein stability and RNA binding affinity. We also show that analysis of mutations in VHL using our workflow provides valuable insights into the effects of mutations, and their links to the risk of developing renal carcinoma. Taken together the analyses of the three examples demonstrate that structural bioinformatics tools, when applied in a systematic, integrated way, can rapidly analyse a given system to provide a powerful approach for predicting structural and functional effects of thousands of mutations in order to reveal molecular mechanisms leading to a phenotype. Missense or non-synonymous mutations are nucleotide substitutions that alter the amino acid sequence of a protein. Their effects can range from modifying transcription, translation, processing and splicing, localization, changing stability of the protein, altering its dynamics or interactions with other proteins, nucleic acids and ligands, including small molecules and metal ions. The advent of high-throughput techniques including sequencing and saturation mutagenesis has provided large amounts of phenotypic data linked to mutations. However, one of the hurdles has been understanding and quantifying the effects of a particular mutation, and how they translate into a given phenotype. 
One approach to overcome this is to use robust, accurate and scalable computational methods to understand and correlate structural effects of mutations with disease. Newton Fund RCUK-CONFAP Grant awarded by The Medical Research Council (MRC) and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG) [to D.E.V.P., T.L.B. and D.B.A.]. Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and René Rachou Research Center (CPqRR/FIOCRUZ Minas), Brazil [to D.E.V.P.]; NHMRC CJ Martin Fellowship [APP1072476 to D.B.A.]; University of Cambridge and The Wellcome Trust for facilities and support [to T.L.B.]. Funding for open access charge: The Wellcome Trust. This is the final version of the article. It first appeared from Nature Publishing Group via http://dx.doi.org/10.1038/srep1984
mCSM-PPI2: predicting the effects of mutations on protein-protein interactions.
Protein-protein interactions are involved in most fundamental biological processes, with disease-causing mutations enriched at their interfaces. Here we present mCSM-PPI2, a novel machine learning computational tool designed to more accurately predict the effects of missense mutations on protein-protein interaction binding affinity. mCSM-PPI2 uses graph-based structural signatures to model effects of variations on the inter-residue interaction network, evolutionary information, complex network metrics and energetic terms to generate an optimised predictor. We demonstrate that our method outperforms previous methods, ranking first among 26 others on CAPRI blind tests. mCSM-PPI2 is freely available as a user friendly webserver at http://biosig.unimelb.edu.au/mcsm_ppi2/. This work was supported by the Australian Government Research Training Program Scholarship [to C.H.M.R and Y.M.]; the Jack Brockhoff Foundation [JBF 4186, 2016 to D.B.A.]; a Newton Fund RCUK-CONFAP Grant awarded by The Medical Research Council (MRC) and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG) [MR/M026302/1 to D.B.A. and D.E.V.P.]; the National Health and Medical Research Council of Australia [APP1072476 to D.B.A.]; the Victorian Life Sciences Computation Initiative (VLSCI), an initiative of the Victorian Government, Australia, on its Facility hosted at the University of Melbourne [UOM0017]; the Instituto René Rachou (IRR/FIOCRUZ Minas), Brazil and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) [to D.E.V.P.] and the Department of Biochemistry and Molecular Biology, University of Melbourne [to D.B.A.]. Supported in part by the Victorian Government's OIS Program