46 research outputs found

    Computational analysis of brain transcriptome atlases: Understanding molecular mechanisms

    Pattern Recognition and Bioinformatic

    Hierarchical progressive learning of cell identities in single-cell data

    Supervised methods are increasingly used to identify cell populations in single-cell data. Yet, current methods are limited in their ability to learn from multiple datasets simultaneously, are hampered by the annotation of datasets at different resolutions, and do not preserve annotations when retrained on new datasets. The latter point is especially important as researchers cannot rely on downstream analysis performed using earlier versions of the dataset. Here, we present scHPL, a hierarchical progressive learning method which allows continuous learning from single-cell data by leveraging the different resolutions of annotations across multiple datasets to learn and continuously update a classification tree. We evaluate the classification and tree learning performance using simulated as well as real datasets and show that scHPL can successfully learn known cellular hierarchies from multiple datasets while preserving the original annotations. scHPL is available at https://github.com/lcmmichielsen/scHPL.
    Pattern Recognition and Bioinformatic

    CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq

    The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental processes is becoming increasingly evident. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is, however, hampered by technical differences between these batches. Typically, these batch effects are resolved by matching similar cells across the different batches. Current approaches, however, do not take into account that this matching can be constrained further, as cells can also be matched on their cell-type identity. We use an auto-encoder to embed two batches in the same space such that cells are matched. To accomplish this, we use a loss function that preserves: (1) cell-cell distances within each of the two batches, as well as (2) cell-cell distances between two batches when the cells are of the same cell type. The cell-type guidance is unsupervised, i.e., a cell type is defined as a cluster in the original batch. We evaluated the performance of our cluster-guided batch alignment (CBA) using pancreas and mouse cell atlas datasets, against six state-of-the-art single-cell alignment methods: Seurat v3, BBKNN, Scanorama, Harmony, LIGER, and BERMUDA. Compared to other approaches, CBA preserves the cluster separation in the original datasets while still being able to align the two datasets. We confirm that this separation is biologically meaningful by identifying relevant differential expression of genes for these preserved clusters.
    Pattern Recognition and Bioinformatic
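
The two-part loss described in the CBA abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the squared-error form, the equal weighting of the terms, and the assumption that cluster labels have already been matched across the two batches are all hypothetical choices made for illustration.

```python
import numpy as np

def pairwise_distances(x, y):
    """Euclidean distances between every row of x and every row of y."""
    diff = x[:, None, :] - y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def distance_preservation_loss(orig_a, orig_b, emb_a, emb_b, labels_a, labels_b):
    """Hypothetical loss: penalise changes in cell-cell distances after embedding.

    orig_* are cells in the original space, emb_* the same cells in the
    shared embedding, labels_* cluster labels assumed comparable across batches.
    """
    # Term 1: distances within each batch should be kept by the embedding.
    within_a = np.mean((pairwise_distances(orig_a, orig_a)
                        - pairwise_distances(emb_a, emb_a)) ** 2)
    within_b = np.mean((pairwise_distances(orig_b, orig_b)
                        - pairwise_distances(emb_b, emb_b)) ** 2)

    # Term 2: distances between batches, only for pairs sharing a cluster label.
    same_type = labels_a[:, None] == labels_b[None, :]
    cross_diff = (pairwise_distances(orig_a, orig_b)
                  - pairwise_distances(emb_a, emb_b)) ** 2
    between = cross_diff[same_type].mean() if same_type.any() else 0.0

    return within_a + within_b + between
```

In an auto-encoder setting, a term of this kind would be minimised with respect to the encoder weights, typically alongside a reconstruction loss; the sketch only shows how the two distance constraints could be combined.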

    Cell type matching across species using protein embeddings and transfer learning

    Motivation: Knowing the relation between cell types is crucial for translating experimental results from mice to humans. Establishing cell type matches, however, is hindered by the biological differences between the species. A substantial amount of evolutionary information between genes that could be used to align the species is discarded by most current methods, since they only use one-to-one orthologous genes. Some methods try to retain this information by explicitly including the relation between genes, but not without caveats. Results: In this work, we present a model to transfer and align cell types in cross-species analysis (TACTiCS). First, TACTiCS uses a natural language processing model to match genes using their protein sequences. Next, TACTiCS employs a neural network to classify cell types within a species. Afterward, TACTiCS uses transfer learning to propagate cell type labels between species. We applied TACTiCS to scRNA-seq data of the primary motor cortex of human, mouse, and marmoset. Our model can accurately match and align cell types on these datasets. Moreover, our model outperforms Seurat and the state-of-the-art method SAMap. Finally, we show that our gene matching method results in better cell type matches than BLAST in our model.
    Pattern Recognition and Bioinformatic

    scMoC: single-cell multi-omics clustering

    Motivation: Single-cell multi-omics assays simultaneously measure different molecular features from the same cell. A key question is how to benefit from the complementary data available and perform cross-modal clustering of cells. Results: We propose Single-Cell Multi-omics Clustering (scMoC), an approach to identify cell clusters from data with co-measurements of scRNA-seq and scATAC-seq from the same cell. We overcome the high sparsity of the scATAC-seq data by using an imputation strategy that exploits the less-sparse scRNA-seq data available from the same cell. Subsequently, scMoC identifies clusters of cells by merging clusterings derived from both data domains individually. We tested scMoC on datasets generated using different protocols with variable data sparsity levels. We show that scMoC (i) is able to generate informative scATAC-seq data due to its RNA-guided imputation strategy and (ii) results in integrated clusters based on both RNA and ATAC information that are biologically meaningful either from the RNA or from the ATAC perspective. Availability and implementation: The data used in this manuscript are publicly available, and we refer to the original manuscripts for their description and availability. For convenience, sci-CAR data is available at NCBI GEO under accession number GSE117089. SNARE-seq data is available at NCBI GEO under accession number GSE126074. The 10X multiome data is available at the following link: https://www.10xgenomics.com/resources/datasets/pbmc-from-a-healthy-donor-no-cell-sorting-3-k-1-standard-2-0-0.
    Pattern Recognition and Bioinformatic
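
As a rough illustration of what RNA-guided imputation of sparse scATAC-seq data could look like, the sketch below averages each cell's ATAC profile over its nearest neighbours in RNA space. The abstract does not specify the procedure, so the k-nearest-neighbour averaging, the Euclidean metric, the function name `rna_guided_impute`, and the parameter `k` are all assumptions made for this example.

```python
import numpy as np

def rna_guided_impute(rna, atac, k=3):
    """Smooth sparse ATAC profiles using neighbours found in RNA space.

    rna  : (n_cells, n_genes) expression matrix
    atac : (n_cells, n_peaks) accessibility matrix for the same cells
    k    : neighbourhood size (hypothetical parameter)
    """
    # Squared Euclidean distances between all pairs of cells in RNA space.
    sq_dist = ((rna[:, None, :] - rna[None, :, :]) ** 2).sum(axis=-1)

    # Indices of the k closest cells per row. A cell has distance 0 to
    # itself, so it is usually part of its own neighbourhood.
    neighbours = np.argsort(sq_dist, axis=1, kind="stable")[:, :k]

    # Average the ATAC profiles over each neighbourhood.
    return atac[neighbours].mean(axis=1)

# Toy data: two pairs of cells that are close in RNA space.
rna = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
atac = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
imputed = rna_guided_impute(rna, atac, k=2)
```

The dense pairwise distance matrix keeps the example short but scales quadratically with the number of cells; a real dataset would need an approximate neighbour search instead.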

    SpaGE: Spatial Gene Enhancement using scRNA-seq

    Single-cell technologies are emerging fast due to their ability to unravel the heterogeneity of biological systems. While scRNA-seq is a powerful tool that measures whole-transcriptome expression of single cells, it lacks their spatial localization. Novel spatial transcriptomics methods do retain cells' spatial information, but some methods can only measure tens to hundreds of transcripts. To resolve this discrepancy, we developed SpaGE, a method that integrates spatial and scRNA-seq datasets to predict whole-transcriptome expression in its spatial configuration. Using five dataset pairs, SpaGE outperformed previously published methods and showed scalability to large datasets. Moreover, SpaGE predicted new spatial gene patterns that are confirmed independently using in situ hybridization data from the Allen Mouse Brain Atlas.
    Pattern Recognition and Bioinformatic

    Brain transcriptome atlases: A computational perspective

    The immense complexity of the mammalian brain is largely reflected in the underlying molecular signatures of its billions of cells. Brain transcriptome atlases provide valuable insights into gene expression patterns across different brain areas throughout the course of development. Such atlases allow researchers to probe the molecular mechanisms which define neuronal identities, neuroanatomy, and patterns of connectivity. Despite the immense effort put into generating such atlases, to answer fundamental questions in neuroscience, an even greater effort is needed to develop methods to probe the resulting high-dimensional multivariate data. We provide a comprehensive overview of the various computational methods used to analyze brain transcriptome atlases.
    Pattern Recognition and Bioinformatic

    scTopoGAN: unsupervised manifold alignment of single-cell data

    Motivation: Single-cell technologies allow deep characterization of different molecular aspects of cells. Integrating these modalities provides a comprehensive view of cellular identity. Current integration methods rely on overlapping features or cells to link datasets measuring different modalities, limiting their application to experiments where different molecular layers are profiled in different subsets of cells. Results: We present scTopoGAN, a method for unsupervised manifold alignment of single-cell datasets with non-overlapping cells or features. We use topological autoencoders (topoAE) to obtain latent representations of each modality separately. A topology-guided Generative Adversarial Network then aligns these latent representations into a common space. We show that scTopoGAN outperforms state-of-the-art manifold alignment methods in completely unsupervised settings. Interestingly, the topoAE for individual modalities also showed better performance in preserving the original structure of the data in the low-dimensional representations when compared to other manifold projection methods. Taken together, we show that the concept of topology preservation might be a powerful tool to align multiple single-modality datasets, unleashing the potential of multi-omic interpretations of cells.
    Pattern Recognition and Bioinformatic

    How metabolic state may regulate fear: Presence of metabolic receptors in the fear circuitry

    Metabolic status impacts the emotional brain to induce behavior that maintains energy balance. While hunger suppresses the fear circuitry to promote explorative food-seeking behavior, satiety or obesity may increase fear to prevent unnecessary risk-taking. Here we aimed to unravel which metabolic factors that convey information about acute and chronic metabolic status are of primary importance in regulating fear, and to identify their sites of action within fear-related brain regions. We performed a de novo analysis of central and peripheral metabolic factors that can penetrate the blood-brain barrier using genome-wide expression data across the mouse brain from the Allen Brain Atlas (ABA). The central fear circuitry, as defined by subnuclei of the amygdala, the afferent hippocampus, the medial prefrontal cortex and the efferent periaqueductal gray, was enriched with metabolic receptors. Some of their corresponding ligands were known to modulate fear (e.g., estrogen and thyroid hormones) while others had not been associated with fear before (e.g., glucagon, ACTH). Additionally, several of these enriched metabolic receptors were coexpressed with well-described fear-modulating genes (Crh, Crhr1, or Crhr2). Co-expression analysis of monoamine markers and metabolic receptors suggested that monoaminergic nuclei have differential sensitivity to metabolic alterations. Serotonergic neurons expressed a large number of metabolic receptors (e.g., estrogen receptors, fatty acid receptors), suggesting a wide responsivity to metabolic changes. The noradrenergic system seemed to be specifically sensitive to hypocretin/orexin modulation. Taken together, we identified a number of novel metabolic factors (glucagon, ACTH) that have the potential to modulate the fear response. We additionally propose novel cerebral targets for metabolic factors (e.g., thyroid hormones) that modulate fear, but of which the sites of action are (largely) unknown.
    Pattern Recognition and Bioinformatic

    Timing and localization of myasthenia gravis-related gene expression

    Myasthenia gravis (MG) is an acquired autoimmune disorder caused by autoantibodies binding acetylcholine receptors (AChR), muscle-specific kinase (MuSK), agrin or low-density lipoprotein receptor-related protein 4 (Lrp4). These autoantibodies inhibit neuromuscular transmission by blocking the function of these proteins and thereby cause fluctuating skeletal muscle weakness. Several reports suggest that these autoantibodies might also affect the central nervous system (CNS) in MG patients. A comprehensive overview of the timing and localization of the expression of MG-related antigens in other organs is currently lacking. To investigate the spatio-temporal expression of MG-related genes outside skeletal muscle, we used in silico tools to assess public expression databases. Acetylcholine esterase, nicotinic AChR α1 subunit, agrin, collagen Q, downstream of kinase-7, Lrp4, MuSK and rapsyn were included as MG-related genes because of their well-known involvement in either congenital or autoimmune MG. We investigated expression of MG-related genes in (1) all human tissues using GTEx data, (2) specific brain regions, (3) neurodevelopmental stages, and (4) cell types using datasets from the Allen Institute for Brain Science. MG-related genes show heterogeneous spatio-temporal expression patterns in the human body as well as in the CNS. For each of these genes, several (new) tissues, brain areas and cortical cell types with (relatively) high expression were identified, suggesting a potential role for these genes outside skeletal muscle. The possible presence of MG-related antigens outside skeletal muscle suggests that autoimmune MG, congenital MG or treatments targeting the same proteins may affect MG-related protein function in other organs.
    Pattern Recognition and Bioinformatic