30 research outputs found

    Improving protein succinylation sites prediction using embeddings from protein language model

    Get PDF
    Protein succinylation is an important post-translational modification (PTM) responsible for many vital metabolic activities in cells, including cellular respiration, regulation, and repair. Here, we present a novel approach that combines features from supervised word embedding with embedding from a protein language model called ProtT5-XL-UniRef50 (hereafter termed, ProtT5) in a deep learning framework to predict protein succinylation sites. To our knowledge, this is one of the first attempts to employ embedding from a pre-trained protein language model to predict protein succinylation sites. The proposed model, dubbed LMSuccSite, achieves state-of-the-art results compared to existing methods, with performance scores of 0.36, 0.79, 0.79 for MCC, sensitivity, and specificity, respectively. LMSuccSite is likely to serve as a valuable resource for exploration of succinylation and its role in cellular physiology and disease

    Impact of primary kidney disease on the effects of empagliflozin in patients with chronic kidney disease: secondary analyses of the EMPA-KIDNEY trial

    Get PDF
    Background: The EMPA KIDNEY trial showed that empagliflozin reduced the risk of the primary composite outcome of kidney disease progression or cardiovascular death in patients with chronic kidney disease mainly through slowing progression. We aimed to assess how effects of empagliflozin might differ by primary kidney disease across its broad population. Methods: EMPA-KIDNEY, a randomised, controlled, phase 3 trial, was conducted at 241 centres in eight countries (Canada, China, Germany, Italy, Japan, Malaysia, the UK, and the USA). Patients were eligible if their estimated glomerular filtration rate (eGFR) was 20 to less than 45 mL/min per 1·73 m2, or 45 to less than 90 mL/min per 1·73 m2 with a urinary albumin-to-creatinine ratio (uACR) of 200 mg/g or higher at screening. They were randomly assigned (1:1) to 10 mg oral empagliflozin once daily or matching placebo. Effects on kidney disease progression (defined as a sustained ≥40% eGFR decline from randomisation, end-stage kidney disease, a sustained eGFR below 10 mL/min per 1·73 m2, or death from kidney failure) were assessed using prespecified Cox models, and eGFR slope analyses used shared parameter models. Subgroup comparisons were performed by including relevant interaction terms in models. EMPA-KIDNEY is registered with ClinicalTrials.gov, NCT03594110. Findings: Between May 15, 2019, and April 16, 2021, 6609 participants were randomly assigned and followed up for a median of 2·0 years (IQR 1·5–2·4). Prespecified subgroupings by primary kidney disease included 2057 (31·1%) participants with diabetic kidney disease, 1669 (25·3%) with glomerular disease, 1445 (21·9%) with hypertensive or renovascular disease, and 1438 (21·8%) with other or unknown causes. Kidney disease progression occurred in 384 (11·6%) of 3304 patients in the empagliflozin group and 504 (15·2%) of 3305 patients in the placebo group (hazard ratio 0·71 [95% CI 0·62–0·81]), with no evidence that the relative effect size varied significantly by primary kidney disease (pheterogeneity=0·62). The between-group difference in chronic eGFR slopes (ie, from 2 months to final follow-up) was 1·37 mL/min per 1·73 m2 per year (95% CI 1·16–1·59), representing a 50% (42–58) reduction in the rate of chronic eGFR decline. This relative effect of empagliflozin on chronic eGFR slope was similar in analyses by different primary kidney diseases, including in explorations by type of glomerular disease and diabetes (p values for heterogeneity all >0·1). Interpretation: In a broad range of patients with chronic kidney disease at risk of progression, including a wide range of non-diabetic causes of chronic kidney disease, empagliflozin reduced risk of kidney disease progression. Relative effect sizes were broadly similar irrespective of the cause of primary kidney disease, suggesting that SGLT2 inhibitors should be part of a standard of care to minimise risk of kidney failure in chronic kidney disease. Funding: Boehringer Ingelheim, Eli Lilly, and UK Medical Research Council

    An expanded evaluation of protein function prediction methods shows an improvement in accuracy

    Get PDF
    Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging.Results: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2.Conclusions: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent

    Bioinformatic Analyses of Peroxiredoxins and RF-Prx: A Random Forest-Based Predictor and Classifier for Prxs

    No full text
    Peroxiredoxins (Prxs) are a protein superfamily, present in all organisms, that play a critical role in protecting cellular macromolecules from oxidative damage but also regulate intracellular and intercellular signaling processes involving redox-regulated proteins and pathways. Bioinformatic approaches using computational tools that focus on active site-proximal sequence fragments (known as active site signatures) and iterative clustering and searching methods (referred to as TuLIP and MISST) have recently enabled the recognition of over 38,000 peroxiredoxins, as well as their classification into six functionally relevant groups. With these data providing so many examples of Prxs in each class, machine learning approaches offer an opportunity to extract additional information about features characteristic of these protein groups. In this study, we developed a novel computational method named “RF-Prx” based on a random forest (RF) approach integrated with K-space amino acid pairs (KSAAP) to identify peroxiredoxins and classify them into one of six subgroups. Our process performed in a superior manner compared to other machine learning classifiers. Thus the RF approach integrated with K-space amino acid pairs enabled the detection of class-specific conserved sequences outside the known functional centers and with potential importance. For example, drugs designed to target Prx proteins would likely suffer from cross-reactivity among distinct Prxs if targeted to conserved active sites, but this may be avoidable if remote, class-specific regions could be targeted instead

    FEPS: A Tool for Feature Extraction from Protein Sequence

    No full text
    Machine learning has become one of the most popular choices for developing computational approaches in protein structural bioinformatics. The ability to extract features from protein sequence/structure often becomes one of the crucial steps for the development of machine learning-based approaches. Over the years, various sequence, structural, and physicochemical descriptors have been developed for proteins and these descriptors have been used to predict/solve various bioinformatics problems. Hence, several feature extraction tools have been developed over the years to help researchers to generate numeric features from protein sequences. Most of these tools have some limitations regarding the number of sequences they can handle and the subsequent preprocessing that is required for the generated features before they can be fed to machine learning methods. Here, we present Feature Extraction from Protein Sequences (FEPS), a toolkit for feature extraction. FEPS is a versatile software package for generating various descriptors from protein sequences and can handle several sequences: the number of which is limited only by the computational resources. In addition, the features extracted from FEPS do not require subsequent processing and are ready to be fed to the machine learning techniques as it provides various output formats as well as the ability to concatenate these generated features. FEPS is made freely available via an online web server as well as a stand-alone toolkit. FEPS, a comprehensive toolkit for feature extraction, will help spur the development of machine learning-based models for various bioinformatics problems

    Is bleeding on probing a reliable clinical indicator of peri- implant diseases?

    Full text link
    Bleeding on probing (BOP) is regarded as an indispensable diagnostic tool for evaluating periodontal disease activity; however, its role in peri- implant disease is more intricate. Much of the confusion about the interpretation originates from drawing parallels between periodontal and peri- implant conditions. BOP can originate from two forms of probing in implants: traumatic or pathologic induction. This, in addition to the dichotomous scale of BOP can complicate diagnosis. The objective of this commentary is to discuss the following: 1) the value of BOP as a diagnostic tool for peri- implant diseases; 2) the reasons it should be distinct from value for diagnosing periodontal and peri- implant diseases; and 3) the current best evidence on how to implement it in daily clinical practice. A comprehensive bleeding index is proposed for evaluating and monitoring peri- implant conditions. BOP should be used in addition to other parameters such as visual signs of inflammation, probing depth, and progressive bone loss before a peri- implant diagnosis is established.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/171232/1/jper10765_am.pdfhttp://deepblue.lib.umich.edu/bitstream/2027.42/171232/2/jper10765.pd

    RF-Phos: A Novel General Phosphorylation Site Prediction Tool Based on Random Forest

    No full text
    Protein phosphorylation is one of the most widespread regulatory mechanisms in eukaryotes. Over the past decade, phosphorylation site prediction has emerged as an important problem in the field of bioinformatics. Here, we report a new method, termed Random Forest-based Phosphosite predictor 2.0 (RF-Phos 2.0), to predict phosphorylation sites given only the primary amino acid sequence of a protein as input. RF-Phos 2.0, which uses random forest with sequence and structural features, is able to identify putative sites of phosphorylation across many protein families. In side-by-side comparisons based on 10-fold cross validation and an independent dataset, RF-Phos 2.0 compares favorably to other popular mammalian phosphosite prediction methods, such as PhosphoSVM, GPS2.1, and Musite

    Long term comparison of the prognostic performance of PerioRisk, periodontal risk assessment, periodontal risk calculator, and staging and grading systems

    No full text
    Background: Clinicians predominantly use personal judgment for risk assessment. Periodontal risk assessment tools (PRATs) provide an effective and logical system to stratify patients based on their individual treatment needs. This retrospective longitudinal study aimed to validate the association of different risk categories of four PRATs (Staging and grading; Periodontal Risk Assessment (PRA); Periodontal Risk Calculator; and PerioRisk) with periodontal related tooth loss (TLP), and to compare their prognostic performance. Methods: Data on medical history, smoking status, and clinical periodontal parameters were retrieved from patients who received surgical and non-surgical periodontal treatment. A comparison of the rate of TLP and non-periodontal related tooth loss (TLO) within the risk tool classes were performed by means of Kruskal-Wallis test followed by post-hoc comparison with the Bonferroni test. Both univariate and multivariate Cox Proportional hazard regression models were built to analyze the prognostic significance for each single risk assessment tool class on TLP. Results: A total of 167 patients with 4321 teeth followed up for a mean period of 26 years were assigned to four PRATs. PerioRisk class 5 had a hazard ratio of 18.43, Stage 4 had a hazard ratio of 7.99, and PRA class 3 had a hazard ratio of 6.13 compared with class/stage I. With respect to prognostic performance, PerioRisk tool demonstrated the best discrimination and model fit followed by PRA. Conclusion: All PRATs displayed very good predictive capability of TLP. PerioRisk showed the best discrimination and model fit, followed by PRA
    corecore