294 research outputs found

    CPEM: Accurate cancer type classification based on somatic alterations using an ensemble of a random forest and a deep neural network

    With recent advances in DNA sequencing technologies, fast acquisition of large-scale genomic data has become commonplace. For cancer studies, in particular, there is an increasing need for the classification of cancer type based on somatic alterations detected from sequencing analyses. However, the ever-increasing size and complexity of the data make the classification task extremely challenging. In this study, we evaluate the contributions of various input features, such as mutation profiles, mutation rates, mutation spectra and signatures, and somatic copy number alterations that can be derived from genomic data, and further utilize them for accurate cancer type classification. We introduce a novel ensemble of machine learning classifiers, called CPEM (Cancer Predictor using an Ensemble Model), which is tested on 7,002 samples representing over 31 different cancer types collected from The Cancer Genome Atlas (TCGA) database. We first systematically examined the impact of the input features. Features known to be associated with specific cancers had relatively high importance in our initial prediction model. We further investigated various machine learning classifiers and feature selection methods to derive the ensemble-based cancer type prediction model achieving up to 84% classification accuracy in the nested 10-fold cross-validation. Finally, we narrowed down the target cancers to the six most common types and achieved up to 94% accuracy
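
    The abstract above does not say how CPEM combines its random forest and deep neural network, so the following is only a minimal sketch of one common option, weighted soft voting, in which the per-class probabilities of the two models are averaged and the highest-scoring class is chosen. The function names, the equal weights, and the probability values are hypothetical illustrations, not results or code from the paper.

```python
from typing import Dict, List


def soft_vote(prob_sets: List[Dict[str, float]], weights: List[float]) -> Dict[str, float]:
    """Return the weighted average of per-class probabilities from several classifiers."""
    if len(prob_sets) != len(weights):
        raise ValueError("one weight is required per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Collect every class label seen by any classifier.
    classes = set().union(*prob_sets)
    return {
        c: sum(w * p.get(c, 0.0) for p, w in zip(prob_sets, weights)) / total
        for c in classes
    }


def predict(prob_sets: List[Dict[str, float]], weights: List[float]) -> str:
    """Pick the class with the highest combined probability (ties broken alphabetically)."""
    combined = soft_vote(prob_sets, weights)
    return max(sorted(combined), key=combined.get)


# Hypothetical class probabilities for one sample (assumed values, not from the paper).
rf_probs = {"BRCA": 0.50, "LUAD": 0.30, "STAD": 0.20}   # random forest output
dnn_probs = {"BRCA": 0.20, "LUAD": 0.70, "STAD": 0.10}  # neural network output

label = predict([rf_probs, dnn_probs], [1.0, 1.0])
print(label)
```

    Soft voting lets a confident model outweigh an uncertain one, which a simple majority vote over two classifiers cannot do.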

    Computational Approach to Identify Enzymes That Are Potential Therapeutic Candidates for Psoriasis

    Psoriasis is well known as a chronic inflammatory dermatosis. The disease affects persons of all ages and is a burden worldwide. Psoriasis is associated with various diseases such as arthritis. The disease is characterized by well-demarcated lesions on the skin of the elbows and knees. Various genetic and environmental factors are related to the pathogenesis of psoriasis. In order to identify enzymes that are potential therapeutic targets for psoriasis, we utilized a computational approach, combining microarray analysis and protein interaction prediction. We found 6,437 genes (3,264 upregulated and 3,173 downregulated) that have significant differences in expression between regions with and without lesions in psoriasis patients. We identified potential candidates through protein-protein interaction predictions made using various protein interaction resources. By analyzing the hub protein of the networks with metrics such as degree and centrality, we detected 32 potential therapeutic candidates. After filtering these candidates through the ENZYME nomenclature database, we selected 5 enzymes: DNA helicase (RUVBL2), proteasome endopeptidase complex (PSMA2), nonspecific protein-tyrosine kinase (ZAP70), I-kappa-B kinase (IKBKE), and receptor protein-tyrosine kinase (EGFR). We adopted a computational approach to detect potential therapeutic targets; this approach may become an effective strategy for the discovery of new drug targets for psoriasis
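
    The hub-protein step described above can be illustrated with a small sketch that ranks nodes of an interaction network by degree centrality (the number of distinct neighbors divided by the maximum possible number). The protein labels P1 to P5 and the edges are placeholder toy data, and the helper names are hypothetical; the abstract does not specify which centrality measures or tools the authors used.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def degree_centrality(edges: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Degree centrality of each node in an undirected graph: degree / (n - 1)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def top_hubs(edges: Iterable[Tuple[str, str]], k: int) -> List[str]:
    """Return the k nodes with the highest degree centrality (ties broken by name)."""
    centrality = degree_centrality(list(edges))
    return sorted(centrality, key=lambda node: (-centrality[node], node))[:k]


# Toy interaction list with placeholder protein labels (not real interaction data).
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]

print(top_hubs(edges, 2))
```

    In practice the candidate list would then be filtered against an external annotation source, as the abstract describes for the ENZYME database.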

    A lab-on-a-disc platform enables serial monitoring of individual CTCs associated with tumor progression during EGFR-targeted therapy for patients with NSCLC

    Rationale: Unlike traditional biopsy, liquid biopsy, which is a largely non-invasive diagnostic and monitoring tool, can be performed more frequently to better track tumors and mutations over time and to validate the efficiency of a cancer treatment. Circulating tumor cells (CTCs) are considered promising liquid biopsy biomarkers; however, their use in clinical settings is limited by high costs and a low throughput of standard platforms for CTC enumeration and analysis. In this study, we used a label-free, high-throughput method for CTC isolation directly from whole blood of patients using a standalone, clinical setting-friendly platform. Methods: A CTC-based liquid biopsy approach was used to examine the efficacy of therapy and emergent drug resistance via longitudinal monitoring of CTC counts, DNA mutations, and single-cell-level gene expression in a prospective cohort of 40 patients with epidermal growth factor receptor (EGFR)-mutant non-small cell lung cancer. Results: The change ratio of the CTC counts was associated with tumor response, detected by CT scan, while the baseline CTC counts did not show association with progression-free survival or overall survival. We achieved a 100% concordance rate for the detection of EGFR mutation, including emergence of T790M, between tumor tissue and CTCs. More importantly, our data revealed the importance of the analysis of the epithelial/mesenchymal signature of individual pretreatment CTCs to predict drug responsiveness in patients. Conclusion: The fluid-assisted separation technology disc platform enables serial monitoring of CTC counts, DNA mutations, as well as unbiased molecular characterization of individual CTCs associated with tumor progression during targeted therapy

    RAMP: response-aware multi-task learning with contrastive regularization for cancer drug response prediction

    The accurate prediction of cancer drug sensitivity according to the multiomics profiles of individual patients is crucial for precision cancer medicine. However, the development of prediction models has been challenged by the complex crosstalk of input features and the resistance-dominant drug response information contained in public databases. In this study, we propose a novel multidrug response prediction framework, response-aware multitask prediction (RAMP), via a Bayesian neural network and restrict it by soft-supervised contrastive regularization. To utilize network embedding vectors as representation learning features for heterogeneous networks, we harness response-aware negative sampling, which applies cell line–drug response information to the training of network embeddings. RAMP overcomes the prediction accuracy limitation induced by the imbalance of trained response data based on the comprehensive selection and utilization of drug response features. When trained on the Genomics of Drug Sensitivity in Cancer dataset, RAMP achieved an area under the receiver operating characteristic curve > 89%, an area under the precision-recall curve > 59% and an F1 score > 52% and outperformed previously developed methods on both balanced and imbalanced datasets. Furthermore, RAMP predicted many missing drug responses that were not included in the public databases. Our results showed that RAMP will be suitable for the high-throughput prediction of cancer drug sensitivity and will be useful for guiding cancer drug selection processes. The Python implementation for RAMP is available at https://github.com/hvcl/RAMP

    Presumed Metastasis of Breast Cancer to the Abducens Nucleus Presenting as Gaze Palsy

    A 51-year-old woman with breast cancer presented with progressive diplopia. Neuro-ophthalmologic examination revealed right gaze palsy and peripheral facial nerve palsy. Brain magnetic resonance imaging (MRI) was normal. However, two months later a repeat brain MRI revealed an enhancing round nodular mass at the right facial colliculus of the lower pons, at the location of the abducens nucleus. Localized metastasis to the abducens nucleus can cause gaze palsy in a patient with breast cancer

    Depression and suicide risk prediction models using blood-derived multi-omics data

    More than 300 million people worldwide experience depression; annually, ~800,000 people die by suicide. Unfortunately, conventional interview-based diagnosis is insufficient to accurately predict a psychiatric status. We developed machine learning models to predict depression and suicide risk using blood methylome and transcriptome data from 56 suicide attempters (SAs), 39 patients with major depressive disorder (MDD), and 87 healthy controls. Our random forest classifiers showed accuracies of 92.6% in distinguishing SAs from MDD patients, 87.3% in distinguishing MDD patients from controls, and 86.7% in distinguishing SAs from controls. We also developed regression models for predicting psychiatric scales with R² values of 0.961 and 0.943 for Hamilton Rating Scale for Depression–17 and Scale for Suicide Ideation, respectively. Multi-omics data were used to construct psychiatric status prediction models for improved mental health treatment

    Microbiome of Saliva and Plaque in Children According to Age and Dental Caries Experience

    Dental caries is a chronic disease caused by organic acids produced by oral microbes. However, little is known about the oral microbiome of Korean children. The aim of this study was to analyze metagenome data of the oral microbiome obtained from Korean children and to identify bacteria strongly associated with dental caries using machine learning models. Saliva and plaque samples were collected from 120 Korean children aged below 12 years. Bacterial composition was identified using Illumina HiSeq sequencing based on the V3-V4 hypervariable region of the 16S rRNA gene. Ten major genera accounted for approximately 70% of the samples on average, including Streptococcus, Neisseria, Corynebacterium, and Fusobacterium. Differential abundance analyses revealed that Scardovia wiggsiae and Leptotrichia wadei were enriched in the caries samples, while Neisseria oralis was abundant in the non-caries samples of children aged below 6 years. The caries and non-caries samples of children aged 6-12 years were enriched in Streptococcus mutans and Corynebacterium durum, respectively. The machine learning models based on these differentially enriched taxa showed accuracies of up to 83%. These results confirmed significant alterations in the oral microbiome according to dental caries and age, and these differences can be used as diagnostic biomarkers

    Profiling age-related epigenetic markers of stomach adenocarcinoma in young and old subjects

    The purpose of our study is to identify epigenetic markers that are differentially expressed in the stomach adenocarcinoma (STAD) condition. Based on data from The Cancer Genome Atlas (TCGA), we were able to detect an age-related difference in methylation patterns and changes in gene and miRNA expression levels in young (n = 14) and old (n = 70) STAD subjects. Our analysis identified 323 upregulated and 653 downregulated genes in old STAD subjects. We also found 76 miRNAs with age-related expression patterns and 113 differentially methylated genes (DMGs). Our further analysis revealed that significantly upregulated genes (n = 35) were assigned to the cell cycle, while the muscle system process (n = 27) and cell adhesion-related genes (n = 57) were downregulated. In addition, by comparing gene and miRNA expression with methylation change, we identified that three upregulated genes (ELF3, IL1β, and MMP13) known to be involved in inflammatory responses and cell growth were significantly hypomethylated in the promoter region. We further detected target candidates for age-related, downregulated miRNAs (hsa-mir-124-3, hsa-mir-204, and hsa-mir-125b-2) in old STAD subjects. This is the first report of the results from a study exploring age-related epigenetic biomarkers of STAD using high-throughput data and provides evidence for a complex clinicopathological condition expressed by the age-related STAD progression. © the authors, publisher and licensee Libertas Academica Limited