238 research outputs found

    cDNA-detector: detection and removal of cDNA contamination in DNA sequencing libraries

    BACKGROUND: Exogenous cDNA introduced into an experimental system, either intentionally or accidentally, can appear as added read coverage over that gene in next-generation sequencing libraries derived from this system. If not properly recognized and managed, this cross-contamination with exogenous signal can lead to incorrect interpretation of research results. Yet, this problem is not routinely addressed in current sequence processing pipelines. RESULTS: We present cDNA-detector, a computational tool to identify and remove exogenous cDNA contamination in DNA sequencing experiments. We demonstrate that cDNA-detector can identify cDNAs quickly and accurately from alignment files. A source inference step attempts to separate endogenous cDNAs (retrocopied genes) from potential cloned, exogenous cDNAs. cDNA-detector provides a mechanism to decontaminate the alignment from detected cDNAs. Simulation studies show that cDNA-detector is highly sensitive and specific, outperforming existing tools. We apply cDNA-detector to several highly-cited public databases (TCGA, ENCODE, NCBI SRA) and show that contaminant genes appear in sequencing experiments where they lead to incorrect coverage peak calls. CONCLUSIONS: cDNA-detector is a user-friendly and accurate tool to detect and remove cDNA contamination in NGS libraries. This two-step design reduces the risk of true variant removal since it allows for manual review of candidates. We find that contamination with intentionally and accidentally introduced cDNAs is an underappreciated problem even in widely-used consortium datasets, where it can lead to spurious results. Our findings highlight the importance of sensitive detection and removal of contaminant cDNA from NGS libraries before downstream analysis.

    PLAS-5k: Dataset of Protein-Ligand Affinities from Molecular Dynamics for Machine Learning Applications

    Computational methods and, more recently, modern machine learning methods have played a key role in structure-based drug design. Though several benchmarking datasets are available for machine learning applications in virtual screening, accurate prediction of binding affinity for a protein-ligand complex remains a major challenge. New datasets that allow for the development of models for predicting binding affinities better than the state-of-the-art scoring functions are important. For the first time, we have developed a dataset, PLAS-5k, comprising 5000 protein-ligand complexes chosen from the PDB database. The dataset consists of binding affinities along with energy components such as electrostatic, van der Waals, polar and non-polar solvation energy, calculated from molecular dynamics simulations using the MMPBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) method. The calculated binding affinities outperformed docking scores and showed a good correlation with the available experimental values. The availability of energy components may enable optimization of desired components during machine learning-based drug design. Further, the OnionNet model has been retrained on the PLAS-5k dataset and is provided as a baseline for the prediction of binding affinities.

    The influence of metabolically engineered glucosinolates profiles in Arabidopsis thaliana on Plutella xylostella preference and performance

    The oviposition preference and larval performance of the diamondback moth (DBM), Plutella xylostella, were studied using Arabidopsis thaliana plants with modified glucosinolate (GS) profiles containing novel GSs as a result of the introduction of individual CYP79 genes. The insect parameters were determined in a series of bioassays. The GS content of the plants as well as the number of trichomes were measured. Multivariate analysis was used to determine the possible relationships among insect and plant variables. The novel GSs in the tested lines did not appear to have any unequivocal effect on the DBM. Instead, the plant characteristics that affected larval performance and larval preference did not influence oviposition preference. Trichomes did not affect oviposition, but influenced larval parameters negatively. Although the tested A. thaliana lines had earlier been shown to influence disease resistance, in this study no clear results were found for P. xylostella.

    Mass spectrometry and multivariate analysis to classify cervical intraepithelial neoplasia from blood plasma: an untargeted lipidomic study

    Cervical cancer remains an important public health issue, since it is the fourth most frequent type of cancer in women worldwide. Much effort has been dedicated to combating this cancer, in particular by the early detection of cervical pre-cancerous lesions. For this purpose, this paper reports the use of mass spectrometry coupled with multivariate analysis as an untargeted lipidomic approach to classifying 76 blood plasma samples into negative for intraepithelial lesion or malignancy (NILM, n = 42) and squamous intraepithelial lesion (SIL, n = 34). The crude lipid extract was directly analyzed with mass spectrometry for untargeted lipidomics, followed by multivariate analysis based on the principal component analysis (PCA) and genetic algorithm (GA) with support vector machines (SVM), linear (LDA) and quadratic (QDA) discriminant analysis. PCA-SVM models outperformed LDA and QDA results, achieving sensitivity and specificity values of 80.0% and 83.3%, respectively. Five types of lipids contributing to the distinction between NILM and SIL classes were identified, including prostaglandins, phospholipids, and sphingolipids for the former condition and Tetranor-PGFM and hydroperoxide lipid for the latter. These findings highlight the potential of using mass spectrometry associated with chemometrics to discriminate between healthy women and those suffering from cervical pre-cancerous lesions.
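
As a reading aid for the sensitivity and specificity figures quoted above, the following is a minimal sketch of how the two metrics are derived from a confusion matrix. The counts are hypothetical and chosen only for illustration; the abstract does not report the underlying test-set counts, and the function names are assumptions, not part of the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly positive samples (e.g. SIL) that the classifier flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly negative samples (e.g. NILM) that the classifier clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, NOT taken from the paper.
tp, fn = 8, 2   # positive-class samples: correctly flagged vs missed
tn, fp = 9, 1   # negative-class samples: correctly cleared vs falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"specificity = {specificity(tn, fp):.3f}")
```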

    Time series analysis of dengue incidence in Guadeloupe, French West Indies: Forecasting models using climate variables as predictors

    BACKGROUND: During the last decades, dengue viruses have spread throughout the Americas region, with an increase in the number of severe forms of dengue. The surveillance system in Guadeloupe (French West Indies) is currently operational for the detection of early outbreaks of dengue. The goal of the study was to improve this surveillance system by assessing a modelling tool to predict the occurrence of dengue epidemics a few months ahead and thus to support efficient dengue control. METHODS: The Box-Jenkins approach allowed us to fit a Seasonal Autoregressive Integrated Moving Average (SARIMA) model of dengue incidence from 2000 to 2006 using clinically suspected cases. Then, this model was used to calculate dengue incidence for the year 2007, which was compared with observed data, using three different approaches: 1 year-ahead, 3 months-ahead and 1 month-ahead. Finally, we assessed the impact of meteorological variables (rainfall, temperature and relative humidity) on the prediction of dengue incidence and outbreaks, incorporating them into the best-fitting model. RESULTS: The 3 months-ahead approach was the most appropriate for an effective and operational public health response, and the most accurate (Root Mean Square Error, RMSE = 0.85). Relative humidity at lag-7 weeks, minimum temperature at lag-5 weeks and average temperature at lag-11 weeks were the variables most positively correlated with dengue incidence in Guadeloupe, whereas rainfall was not. The predictive power of SARIMA models was enhanced by the inclusion of climatic variables as external regressors to forecast the year 2007. Temperature significantly affected the model for better dengue incidence forecasting (p-value = 0.03 for minimum temperature lag-5, p-value = 0.02 for average temperature lag-11) but humidity did not. Minimum temperature at lag-5 weeks was the best climatic variable for predicting dengue outbreaks (RMSE = 0.72). CONCLUSION: Temperature improves dengue outbreak forecasts more than humidity and rainfall. SARIMA models using climatic data as independent variables could be easily incorporated into an early (3 months-ahead) and reliable monitoring system for dengue outbreaks. This approach, which is practicable for a surveillance system, has public health implications in helping to predict dengue epidemics and therefore in the timely, appropriate and efficient implementation of prevention activities.
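
The forecasts in this abstract are ranked by root mean square error (RMSE), where a lower value means the forecast tracks the observed series more closely. The sketch below shows that comparison on made-up numbers; the series and the names `observed`, `forecast_a` and `forecast_b` are hypothetical and do not come from the study, and no SARIMA fitting is attempted here.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length numeric sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical incidence values, for illustration only.
observed = [1.0, 2.0, 4.0, 3.0]
forecast_a = [1.5, 2.5, 3.0, 3.0]   # e.g. a model without climate regressors
forecast_b = [1.0, 2.0, 3.5, 3.0]   # e.g. a model with a climate regressor

for name, forecast in [("A", forecast_a), ("B", forecast_b)]:
    print(f"forecast {name}: RMSE = {rmse(observed, forecast):.3f}")
```

The forecast with the smaller RMSE would be preferred, which is the sense in which the abstract compares RMSE = 0.72 against RMSE = 0.85.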

    Rhabdomyoblastic Differentiation in Head and Neck Malignancies Other Than Rhabdomyosarcoma

    Rhabdomyosarcoma is a relatively common soft tissue sarcoma that frequently affects children and adolescents and may involve the head and neck. Rhabdomyosarcoma is defined by skeletal muscle differentiation which can be suggested by routine histology and confirmed by immunohistochemistry for the skeletal muscle-specific markers myogenin or myoD1. At the same time, it must be remembered that when it comes to head and neck malignancies, skeletal muscle differentiation is not limited to rhabdomyosarcoma. A lack of awareness of this phenomenon could lead to misdiagnosis and, subsequently, inappropriate therapeutic interventions. This review focuses on malignant neoplasms of the head and neck other than rhabdomyosarcoma that may exhibit rhabdomyoblastic differentiation, with an emphasis on strategies to resolve the diagnostic dilemmas these tumors may present. Axiomatically, no primary central nervous system tumors will be discussed.

    Effect of sitagliptin on cardiovascular outcomes in type 2 diabetes

    BACKGROUND: Data are lacking on the long-term effect on cardiovascular events of adding sitagliptin, a dipeptidyl peptidase 4 inhibitor, to usual care in patients with type 2 diabetes and cardiovascular disease. METHODS: In this randomized, double-blind study, we assigned 14,671 patients to add either sitagliptin or placebo to their existing therapy. Open-label use of antihyperglycemic therapy was encouraged as required, aimed at reaching individually appropriate glycemic targets in all patients. To determine whether sitagliptin was noninferior to placebo, we used a relative risk of 1.3 as the marginal upper boundary. The primary cardiovascular outcome was a composite of cardiovascular death, nonfatal myocardial infarction, nonfatal stroke, or hospitalization for unstable angina. RESULTS: During a median follow-up of 3.0 years, there was a small difference in glycated hemoglobin levels (least-squares mean difference for sitagliptin vs. placebo, -0.29 percentage points; 95% confidence interval [CI], -0.32 to -0.27). Overall, the primary outcome occurred in 839 patients in the sitagliptin group (11.4%; 4.06 per 100 person-years) and 851 patients in the placebo group (11.6%; 4.17 per 100 person-years). Sitagliptin was noninferior to placebo for the primary composite cardiovascular outcome (hazard ratio, 0.98; 95% CI, 0.88 to 1.09; P<0.001). Rates of hospitalization for heart failure did not differ between the two groups (hazard ratio, 1.00; 95% CI, 0.83 to 1.20; P = 0.98). There were no significant between-group differences in rates of acute pancreatitis (P = 0.07) or pancreatic cancer (P = 0.32). CONCLUSIONS: Among patients with type 2 diabetes and established cardiovascular disease, adding sitagliptin to usual care did not appear to increase the risk of major adverse cardiovascular events, hospitalization for heart failure, or other adverse events
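
The noninferiority conclusion in this abstract follows from comparing the upper limit of the 95% confidence interval with the prespecified margin of 1.3. A minimal sketch of that decision rule, using only the figures quoted in the abstract, is given below; the function name `is_noninferior` is an assumption made for illustration, and the sketch does not reproduce the trial's statistical analysis.

```python
def is_noninferior(ci_upper: float, margin: float) -> bool:
    """Noninferiority holds when the CI upper bound lies below the margin."""
    return ci_upper < margin


# Values quoted in the abstract for the primary composite outcome.
hazard_ratio = 0.98
ci_lower, ci_upper = 0.88, 1.09
margin = 1.3  # prespecified upper boundary

print(f"HR {hazard_ratio} (95% CI {ci_lower} to {ci_upper}), margin {margin}")
print("noninferior:", is_noninferior(ci_upper, margin))
```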

    Haploinsufficiency of interferon regulatory factor 4 strongly protects against autoimmune diabetes in NOD mice

    Aims/hypothesis: Interferon regulatory factor (IRF)4 plays a critical role in lymphoid development and the regulation of immune responses. Genetic deletion of IRF4 has been shown to suppress autoimmune disease in several mouse models, but its role in autoimmune diabetes in NOD mice remains unknown. Methods: To address the role of IRF4 in the pathogenesis of autoimmune diabetes in NOD mice, we generated IRF4-knockout NOD mice and investigated the impact of the genetic deletion of IRF4 on diabetes, insulitis and insulin autoantibody; the effector function of T cells in vivo and in vitro; and the proportion of dendritic cell subsets. Results: Heterozygous IRF4-deficient NOD mice maintained the number and phenotype of T cells at levels similar to NOD mice. However, diabetes and autoantibody production were completely suppressed in both heterozygous and homozygous IRF4-deficient NOD mice. The level of insulitis was strongly suppressed in both heterozygous and homozygous IRF4-deficient mice, with minimal insulitis observed in heterozygous mice. An adoptive transfer study revealed that IRF4 deficiency conferred disease resistance in a gene-dose-dependent manner in recipient NOD/severe combined immunodeficiency mice. Furthermore, the proportion of migratory dendritic cells in lymph nodes was reduced in heterozygous and homozygous IRF4-deficient NOD mice in an IRF4 dose-dependent manner. These results suggest that the levels of IRF4 in T cells and dendritic cells are important for the pathogenesis of diabetes in NOD mice. Conclusions/interpretation: Haploinsufficiency of IRF4 halted disease development in NOD mice. Our findings suggest that an IRF4-targeted strategy might be useful for modulating autoimmunity in type 1 diabetes