53 research outputs found

    Supplementary datasets to publication about Genome wide co-expression analysis of steroid receptors in the mouse brain

    These are the supplementary datasets to our publication "Genome wide co-expression analysis of steroid receptors in the mouse brain: identification of dedicated signaling pathways and functionally coordinated regions". Steroid receptors are pleiotropic transcription factors that coordinate adaptation to different physiological states. An important target organ is the brain, but even though their effects are well studied in specific regions, brain-wide steroid receptor targets and mediators remain largely unknown due to the brain's complexity. Here, we tested the idea that novel aspects of steroid action can be identified through spatial correlation of steroid receptors with genome-wide mRNA expression across different regions in the mouse brain. First, we observed significant co-expression of receptors with sets of steroid target genes that were identified in single brain regions. These co-expression relationships were also present in distinct other brain regions, suggestive of as yet unidentified coordinate regulation of brain regions by e.g. glucocorticoids and estrogens. Second, co-expression of a set of 62 known nuclear receptor co-regulators and the six steroid receptors in 12 non-overlapping mouse brain regions revealed selective downstream pathways, such as Pak6 as a mediator of the effects of the androgen and glucocorticoid receptors on dopaminergic transmission. Third, Magel2 and Irs4 were identified and validated as strongly responsive to the estrogen diethylstilbestrol in the mouse hypothalamus. The brain- and genome-wide correlations of mRNA expression levels that we provide constitute a rich resource for further prediction and understanding of brain modulation by steroid hormones.
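
The spatial correlation idea in the abstract above can be illustrated with a minimal sketch: correlate one receptor's expression profile across brain regions with every gene's profile across the same regions. All data here are randomly generated, and the matrix sizes and variable names are assumptions made for illustration; this is not the authors' pipeline or their dataset.

```python
import numpy as np

rng = np.random.default_rng(1)

n_regions, n_genes = 12, 500  # hypothetical sizes, chosen for illustration
# Hypothetical expression matrix: rows are brain regions, columns are genes.
expression = rng.normal(size=(n_regions, n_genes))
# Hypothetical receptor profile: one expression value per region.
receptor = rng.normal(size=n_regions)
# Make gene 0 track the receptor so the toy data contain a co-expressed gene.
expression[:, 0] = receptor + 0.05 * rng.normal(size=n_regions)

# Pearson correlation of the receptor with every gene across regions.
expr_centered = expression - expression.mean(axis=0)
rec_centered = receptor - receptor.mean()
corr = (expr_centered.T @ rec_centered) / (
    np.linalg.norm(expr_centered, axis=0) * np.linalg.norm(rec_centered)
)

# Index of the gene most strongly co-expressed with the receptor.
top_gene = int(np.argmax(corr))
print(top_gene, round(float(corr[top_gene]), 3))
```

Ranking genes by `corr` gives a simple co-expression list per receptor. With only a handful of regions, chance correlations are large, so a real analysis would need a significance assessment.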

    Computational analysis of brain transcriptome atlases: Understanding molecular mechanisms

    Pattern Recognition and Bioinformatic

    Consequences and opportunities arising due to sparser single-cell RNA-seq datasets

    With the number of cells measured in single-cell RNA sequencing (scRNA-seq) datasets increasing exponentially, and sparsity increasing concurrently as more zero counts are measured for many genes, we demonstrate here that downstream analyses on binary-based gene expression give results similar to count-based analyses. Moreover, a binary representation scales up to ~50-fold more cells that can be analyzed using the same computational resources. We also highlight the possibilities provided by binarized scRNA-seq data. Development of specialized tools for bit-aware implementations of downstream analytical tasks will enable a more fine-grained resolution of biological heterogeneity.
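
As a rough illustration of the binarization idea in the abstract above, the following sketch converts a toy count matrix into a binary (detected / not detected) representation and packs it into bits. The matrix shape, the Poisson rate, and the variable names are assumptions made for illustration. This is not the authors' implementation, and the memory ratio it prints need not match the ~50-fold scaling they report.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 1,000 cells x 2,000 genes with sparse Poisson counts.
counts = rng.poisson(lam=0.1, size=(1000, 2000)).astype(np.int32)

# Binarize: True if at least one count was observed for the gene in that cell.
binary = counts > 0

# Pack eight boolean entries into each byte along the gene axis.
packed = np.packbits(binary, axis=1)

# Unpacking (trimmed to the original gene count) recovers the binary matrix.
restored = np.unpackbits(packed, axis=1, count=binary.shape[1]).astype(bool)

memory_ratio = counts.nbytes / packed.nbytes
print(f"int32 counts: {counts.nbytes} bytes, packed bits: {packed.nbytes} bytes")
print(f"memory ratio: {memory_ratio:.0f}x")
```

The ratio here follows only from storing one bit instead of a 32-bit integer per entry. Sparse storage formats or bit-aware algorithms would change the comparison.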

    Hierarchical progressive learning of cell identities in single-cell data

    Supervised methods are increasingly used to identify cell populations in single-cell data. Yet, current methods are limited in their ability to learn from multiple datasets simultaneously, are hampered by the annotation of datasets at different resolutions, and do not preserve annotations when retrained on new datasets. The latter point is especially important as researchers cannot rely on downstream analysis performed using earlier versions of the dataset. Here, we present scHPL, a hierarchical progressive learning method which allows continuous learning from single-cell data by leveraging the different resolutions of annotations across multiple datasets to learn and continuously update a classification tree. We evaluate the classification and tree learning performance using simulated as well as real datasets and show that scHPL can successfully learn known cellular hierarchies from multiple datasets while preserving the original annotations. scHPL is available at https://github.com/lcmmichielsen/scHPL.

    CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq

    The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental processes is becoming increasingly evident. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is, however, hampered by technical differences between these batches. Typically, these batch effects are resolved by matching similar cells across the different batches. Current approaches, however, do not take into account that we can constrain this matching further, as cells can also be matched on their cell type identity. We use an auto-encoder to embed two batches in the same space such that cells are matched. To accomplish this, we use a loss function that preserves: (1) cell-cell distances within each of the two batches, as well as (2) cell-cell distances between two batches when the cells are of the same cell type. The cell-type guidance is unsupervised, i.e., a cell type is defined as a cluster in the original batch. We evaluated the performance of our cluster-guided batch alignment (CBA) using pancreas and mouse cell atlas datasets, against six state-of-the-art single-cell alignment methods: Seurat v3, BBKNN, Scanorama, Harmony, LIGER, and BERMUDA. Compared to other approaches, CBA preserves the cluster separation in the original datasets while still being able to align the two datasets. We confirm that this separation is biologically meaningful by identifying relevant differential expression of genes for these preserved clusters.

    A comprehensive mouse kidney atlas enables rare cell population characterization and robust marker discovery

    The kidney's cellular diversity is on par with its physiological intricacy; yet identifying cell populations and their markers remains challenging. Here, we created a comprehensive atlas of the healthy adult mouse kidney (MKA: Mouse Kidney Atlas) by integrating 140,000 cells and nuclei from 59 publicly available single-cell and single-nuclei RNA-sequencing datasets from eight independent studies. To harmonize annotations across datasets, we built a hierarchical model of the cell populations. Our model allows the incorporation of novel cell populations and the refinement of known profiles as more datasets become available. Using MKA and the learned model of cellular hierarchies, we predicted previously missing cell annotations from several studies. The MKA allowed us to identify reproducible markers across studies for poorly understood cell types and transitional states, which we verified using existing data from micro-dissected samples and spatial transcriptomics.

    Co-expression Network Analysis of the Developing Human Brain Implicates Synaptogenesis and Mitochondrial Function as Central Mechanisms in Autism

    We analyzed the spatial-temporal co-expression relationships of 455 genes previously implicated in autism spectrum disorder (ASD) using the BrainSpan transcriptome atlas. Understanding how the heterogeneous set of ASD-related genes contributes to normal brain development helps identify cellular/molecular processes that are commonly disrupted in ASD. First, we discovered modules among ASD candidates with biologically relevant temporal co-expression dynamics. These modules were related to the processes of synaptogenesis, apoptosis, and the neurotransmitter γ-aminobutyric acid (GABA). Second, we created a transcriptome-wide co-expression network to discover significant Molecular Interaction Modules, and demonstrated that ASD candidate genes are enriched in modules related to the processes of synaptogenesis, mitochondrial function, protein translation, and ubiquitination. Finally, we identified hub genes within the ASD-enriched Molecular Interaction Modules, which may serve as additional ASD candidate genes, potential biomarkers, or therapeutic targets.

    Cell type matching across species using protein embeddings and transfer learning

    Motivation: Knowing the relation between cell types is crucial for translating experimental results from mice to humans. Establishing cell type matches, however, is hindered by the biological differences between the species. A substantial amount of evolutionary information between genes that could be used to align the species is discarded by most current methods, since they only use one-to-one orthologous genes. Some methods try to retain this information by explicitly including the relation between genes, but not without caveats. Results: In this work, we present a model to transfer and align cell types in cross-species analysis (TACTiCS). First, TACTiCS uses a natural language processing model to match genes using their protein sequences. Next, TACTiCS employs a neural network to classify cell types within a species. Afterward, TACTiCS uses transfer learning to propagate cell type labels between species. We applied TACTiCS to scRNA-seq data of the primary motor cortex of human, mouse, and marmoset. Our model can accurately match and align cell types on these datasets. Moreover, our model outperforms Seurat and the state-of-the-art method SAMap. Finally, we show that our gene matching method results in better cell type matches than BLAST in our model.

    scMoC: single-cell multi-omics clustering

    Motivation: Single-cell multi-omics assays simultaneously measure different molecular features from the same cell. A key question is how to benefit from the complementary data available and perform cross-modal clustering of cells. Results: We propose Single-Cell Multi-omics Clustering (scMoC), an approach to identify cell clusters from data with co-measurements of scRNA-seq and scATAC-seq from the same cell. We overcome the high sparsity of the scATAC-seq data by using an imputation strategy that exploits the less-sparse scRNA-seq data available from the same cell. Subsequently, scMoC identifies clusters of cells by merging clusterings derived from both data domains individually. We tested scMoC on datasets generated using different protocols with variable data sparsity levels. We show that scMoC (i) is able to generate informative scATAC-seq data due to its RNA-guided imputation strategy and (ii) results in integrated clusters based on both RNA and ATAC information that are biologically meaningful either from the RNA or from the ATAC perspective. Availability and implementation: The data used in this manuscript are publicly available, and we refer to the original manuscripts for their description and availability. For convenience, sci-CAR data is available at NCBI GEO under accession number GSE117089. SNARE-seq data is available at NCBI GEO under accession number GSE126074. The 10X multiome data is available at the following link: https://www.10xgenomics.com/resources/datasets/pbmc-from-a-healthy-donor-no-cell-sorting-3-k-1-standard-2-0-0.

    SpaGE: Spatial Gene Enhancement using scRNA-seq

    Single-cell technologies are emerging fast due to their ability to unravel the heterogeneity of biological systems. While scRNA-seq is a powerful tool that measures whole-transcriptome expression of single cells, it lacks their spatial localization. Novel spatial transcriptomics methods do retain cells' spatial information, but some methods can only measure tens to hundreds of transcripts. To resolve this discrepancy, we developed SpaGE, a method that integrates spatial and scRNA-seq datasets to predict whole-transcriptome expressions in their spatial configuration. Using five dataset-pairs, SpaGE outperformed previously published methods and showed scalability to large datasets. Moreover, SpaGE predicted new spatial gene patterns that are confirmed independently using in situ hybridization data from the Allen Mouse Brain Atlas.