64 research outputs found

    Novel personalized pathway-based metabolomics models reveal key metabolic pathways for breast cancer diagnosis

    Comparison of logistic regression, SVM and random forest performance in the plasma training data set. Table S2. Pathway significance and relative log fold changes in our metabolomics data and TCGA breast cancer RNA-Seq data. Table S3. Detected metabolites and their differential test results among the two models. a All-stage diagnosis model. b Early-stage diagnosis model. Table S4. Single-variate logistic analysis of metabolites or pathways selected as features in the metabolite-based or pathway-based early-stage diagnosis model. Table S5. Comparison of pathway features in the full-size (101 input pathways) and half-size (51 input pathways) pathway-based early-stage diagnosis models. (DOCX 34 kb)

    DeepImpute: an accurate, fast, and scalable deep neural network method to impute single-cell RNA-seq data

    Single-cell RNA sequencing (scRNA-seq) offers new opportunities to study gene expression of tens of thousands of single cells simultaneously. We present DeepImpute, a deep neural network-based imputation algorithm that uses dropout layers and loss functions to learn patterns in the data, allowing for accurate imputation. Overall, DeepImpute yields better accuracy than six other publicly available scRNA-seq imputation methods on experimental data, as measured by the mean squared error or Pearson’s correlation coefficient. DeepImpute is an accurate, fast, and scalable imputation tool that is suited to handle the ever-increasing volume of scRNA-seq data, and is freely available at https://github.com/lanagarmire/DeepImpute . https://deepblue.lib.umich.edu/bitstream/2027.42/152237/1/13059_2019_Article_1837.pd

    Genome-scale hypomethylation in the cord blood DNAs associated with early onset preeclampsia

    Background: Preeclampsia is one of the leading causes of fetal and maternal morbidity and mortality worldwide. Preterm babies of mothers with early onset preeclampsia (EOPE) are at higher risk for various diseases later in life, including cardiovascular diseases. We hypothesized that genome-wide epigenetic alterations occur in cord blood DNAs in association with EOPE and conducted a case-control study to compare the genome-scale methylome differences in cord blood DNAs between 12 EOPE-associated and 8 normal births. Results: Bioinformatics analysis of methylation data from the Infinium HumanMethylation450 BeadChip shows a genome-scale hypomethylation pattern in EOPE, with 51,486 hypomethylated CpG sites and 12,563 hypermethylated sites (adjusted P < 0.05). A similar trend also exists in the proximal promoters (TSS200) associated with protein-coding genes. Using summary statistics on the CpG sites in TSS200 regions, promoters of 643 and 389 genes are hypomethylated and hypermethylated, respectively. Promoter-based differential methylation (DM) analysis reveals that genes in the farnesoid X receptor and liver X receptor (FXR/LXR) pathway are enriched, indicating dysfunction of lipid metabolism in cord blood cells. Additional biological functional alterations involve inflammation, cell growth, and hematological system development. A two-way ANOVA among coupled cord blood and amniotic membrane samples shows that a group of genes involved in inflammation, lipid metabolism, and proliferation are persistently differentially methylated in both tissues, including IL12B, FAS, PIK31, and IGF1. Conclusions: These findings provide, for the first time, evidence of prominent genome-scale DNA methylation modifications in cord blood DNAs associated with EOPE. They may suggest a connection between inflammation and lipid dysregulation in EOPE-associated newborns and a higher risk of cardiovascular diseases later in adulthood.

    Challenges and perspectives in computational deconvolution in genomics data

    Deciphering cell type heterogeneity is crucial for systematically understanding tissue homeostasis and its dysregulation in diseases. Computational deconvolution is an efficient approach to estimate cell type abundances from a variety of omics data. Despite significant methodological progress in computational deconvolution in recent years, challenges remain. Here we list four significant challenges: availability of the reference data, generation of simulation data, limitations of computational methodologies, and benchmarking design and implementation. Finally, we make recommendations on reference data generation, new directions for computational methodologies, and strategies to promote rigorous benchmarking.

    A Global Clustering Algorithm to Identify Long Intergenic Non-Coding RNA - with Applications in Mouse Macrophages

    Identification of diffuse signals from the chromatin immunoprecipitation and high-throughput massively parallel sequencing (ChIP-Seq) technology poses significant computational challenges, and there are few methods currently available. We present a novel global clustering approach to enrich diffuse ChIP-Seq signals of RNA polymerase II and histone 3 lysine 4 trimethylation (H3K4Me3) and apply it to identify putative long intergenic non-coding RNAs (lincRNAs) in macrophage cells. Our global clustering method compares favorably to the local clustering method SICER, which was also designed to identify diffuse ChIP-Seq signals. The validity of the algorithm is confirmed at several levels. First, 8 out of a total of 11 selected putative lincRNA regions in primary macrophages respond to lipopolysaccharide (LPS) treatment as predicted by our computational method. Second, the genes nearest to lincRNAs are enriched with biological functions related to metabolic processes under resting conditions but with developmental and immune-related functions under LPS treatment. Third, the putative lincRNAs have conserved promoters, modestly conserved exons, and expected secondary structures by prediction. Last, they are enriched with motifs of transcription factors such as PU.1 and AP.1, previously shown to be important lineage-determining factors in macrophages, and 83% of them overlap with distal enhancer markers. In summary, GCLS based on RNA polymerase II and H3K4Me3 ChIP-Seq can effectively detect putative lincRNAs that exhibit expected characteristics, as exemplified by macrophages in the study.

    The Pediatric Cell Atlas: Defining the Growth Phase of Human Development at Single-Cell Resolution

    Single-cell gene expression analyses of mammalian tissues have uncovered profound stage-specific molecular regulatory phenomena that have changed the understanding of unique cell types and signaling pathways critical for lineage determination, morphogenesis, and growth. We discuss here the case for a Pediatric Cell Atlas as part of the Human Cell Atlas consortium to provide single-cell profiles and spatial characterization of gene expression across human tissues and organs. Such data will complement adult and developmentally focused HCA projects to provide a rich cytogenomic framework for understanding not only pediatric health and disease but also environmental and genetic impacts across the human lifespan.