39 research outputs found
Recommended from our members
Transcriptomic evidence that von Economo neurons are regionally specialized extratelencephalic-projecting excitatory neurons.
von Economo neurons (VENs) are bipolar, spindle-shaped neurons restricted to layer 5 of human frontoinsula and anterior cingulate cortex that appear to be selectively vulnerable to neuropsychiatric and neurodegenerative diseases, although little is known about other VEN cellular phenotypes. Single nucleus RNA-sequencing of frontoinsula layer 5 identifies a transcriptomically-defined cell cluster that contained VENs, but also fork cells and a subset of pyramidal neurons. Cross-species alignment of this cell cluster with a well-annotated mouse classification shows strong homology to extratelencephalic (ET) excitatory neurons that project to subcerebral targets. This cluster also shows strong homology to a putative ET cluster in human temporal cortex, but with a strikingly specific regional signature. Together these results suggest that VENs are a regionally distinctive type of ET neuron. Additionally, we describe the first patch clamp recordings of VENs from neurosurgically-resected tissue that show distinctive intrinsic membrane properties relative to neighboring pyramidal neurons
A Guide to the Brain Initiative Cell Census Network Data Ecosystem
Characterizing cellular diversity at different levels of biological organization and across data modalities is a prerequisite to understanding the function of cell types in the brain. Classification of neurons is also essential to manipulate cell types in controlled ways and to understand their variation and vulnerability in brain disorders. The BRAIN Initiative Cell Census Network (BICCN) is an integrated network of data-generating centers, data archives, and data standards developers, with the goal of systematic multimodal brain cell type profiling and characterization. Emphasis of the BICCN is on the whole mouse brain with demonstration of prototype feasibility for human and nonhuman primate (NHP) brains. Here, we provide a guide to the cellular and spatial approaches employed by the BICCN, and to accessing and using these data and extensive resources, including the BRAIN Cell Data Center (BCDC), which serves to manage and integrate data across the ecosystem. We illustrate the power of the BICCN data ecosystem through vignettes highlighting several BICCN analysis and visualization tools. Finally, we present emerging standards that have been developed or adopted toward Findable, Accessible, Interoperable, and Reusable (FAIR) neuroscience. The combined BICCN ecosystem provides a comprehensive resource for the exploration and analysis of cell types in the brain
A comprehensive collection of systems biology data characterizing the host response to viral infection
The Systems Biology for Infectious Diseases Research program was established by the U.S. National Institute of Allergy and Infectious Diseases to investigate host-pathogen interactions at a systems level. This program generated 47 transcriptomic and proteomic datasets from 30 studies that investigate in vivo and in vitro host responses to viral infections. Human pathogens in the Orthomyxoviridae and Coronaviridae families, especially pandemic H1N1 and avian H5N1 influenza A viruses and severe acute respiratory syndrome coronavirus (SARS-CoV), were investigated. Study validation was demonstrated via experimental quality control measures and meta-analysis of independent experiments performed under similar conditions. Primary assay results are archived at the GEO and PeptideAtlas public repositories, while processed statistical results together with standardized metadata are publically available at the Influenza Research Database (www.fludb.org) and the Virus Pathogen Resource (www.viprbrc.org). By comparing data from mutant versus wild-type virus and host strains, RNA versus protein differential expression, and infection with genetically similar strains, these data can be used to further investigate genetic and physiological determinants of host responses to viral infection
Comparative cellular analysis of motor cortex in human, marmoset and mouse
The primary motor cortex (M1) is essential for voluntary fine-motor control and is functionally conserved across mammals1. Here, using high-throughput transcriptomic and epigenomic profiling of more than 450,000 single nuclei in humans, marmoset monkeys and mice, we demonstrate a broadly conserved cellular makeup of this region, with similarities that mirror evolutionary distance and are consistent between the transcriptome and epigenome. The core conserved molecular identities of neuronal and non-neuronal cell types allow us to generate a cross-species consensus classification of cell types, and to infer conserved properties of cell types across species. Despite the overall conservation, however, many species-dependent specializations are apparent, including differences in cell-type proportions, gene expression, DNA methylation and chromatin state. Few cell-type marker genes are conserved across species, revealing a short list of candidate genes and regulatory mechanisms that are responsible for conserved features of homologous cell types, such as the GABAergic chandelier cells. This consensus transcriptomic classification allows us to use patch-seq (a combination of whole-cell patch-clamp recordings, RNA sequencing and morphological characterization) to identify corticospinal Betz cells from layer 5 in non-human primates and humans, and to characterize their highly specialized physiology and anatomy. These findings highlight the robust molecular underpinnings of cell-type diversity in M1 across mammals, and point to the genes and regulatory pathways responsible for the functional identity of cell types and their species-specific adaptations
Machine learning for cell type classification from single nucleus RNA sequencing data
With the advent of single cell/nucleus RNA sequencing (sc/snRNA-seq), the field of cell phenotyping is now a data-driven exercise providing statistical evidence to support cell type/state categorization. However, the task of classifying cells into specific, well-defined categories with the empirical data provided by sc/snRNA-seq remains nontrivial due to the difficulty in determining specific differences between related cell types with close transcriptional similarities, resulting in challenges with matching cell types identified in separate experiments. To investigate possible approaches to overcome these obstacles, we explored the use of supervised machine learning methods-logistic regression, support vector machines, random forests, neural networks, and light gradient boosting machine (LightGBM)-as approaches to classify cell types using snRNA-seq datasets from human brain middle temporal gyrus (MTG) and human kidney. Classification accuracy was evaluated using an F-beta score weighted in favor of precision to account for technical artifacts of gene expression dropout. We examined the impact of hyperparameter optimization and feature selection methods on F-beta score performance. We found that the best performing model for granular cell type classification in both datasets is a multinomial logistic regression classifier and that an effective feature selection step was the most influential factor in optimizing the performance of the machine learning pipelines
FR-Match: robust matching of cell type clusters from single cell RNA sequencing data using the Friedman–Rafsky non-parametric test
Single cell/nucleus RNA sequencing (scRNAseq) is emerging as an essential tool to unravel the phenotypic heterogeneity of cells in complex biological systems. While computational methods for scRNAseq cell type clustering have advanced, the ability to integrate datasets to identify common and novel cell types across experiments remains a challenge. Here, we introduce a cluster-to-cluster cell type matching method-FR-Match-that utilizes supervised feature selection for dimensionality reduction and incorporates shared information among cells to determine whether two cell type clusters share the same underlying multivariate gene expression distribution. FR-Match is benchmarked with existing cell-to-cell and cell-to-cluster cell type matching methods using both simulated and real scRNAseq data. FR-Match proved to be a stringent method that produced fewer erroneous matches of distinct cell subtypes and had the unique ability to identify novel cell phenotypes in new datasets. In silico validation demonstrated that the proposed workflow is the only self-contained algorithm that was robust to increasing numbers of true negatives (i.e. non-represented cell types). FR-Match was applied to two human brain scRNAseq datasets sampled from cortical layer 1 and full thickness middle temporal gyrus. When mapping cell types identified in specimens isolated from these overlapping human brain regions, FR-Match precisely recapitulated the laminar characteristics of matched cell type clusters, reflecting their distinct neuroanatomical distributions. An R package and Shiny application are provided at https://github.com/JCVenterInstitute/FRmatch for users to interactively explore and match scRNAseq cell type clusters with complementary visualization tools
Recommended from our members
FastMix: a versatile data integration pipeline for cell type-specific biomarker inference.
MotivationFlow cytometry (FCM) and transcription profiling are the two widely used assays in translational immunology research. However, there is no data integration pipeline for analyzing these two types of assays together with experiment variables for biomarker inference. Current FCM data analysis mainly relies on subjective manual gating analysis, which is difficult to be directly integrated with other automated computational methods. Existing deconvolutional analysis of bulk transcriptomics relies on predefined marker genes in the transcriptomics data, which are unavailable for novel cell types and does not utilize the FCM data that provide canonical phenotypic definitions of the cell types.ResultsWe developed a novel analytics pipeline-FastMix-for computational immunology, which integrates flow cytometry, bulk transcriptomics and clinical covariates for identifying cell type-specific gene expression signatures and biomarker genes. FastMix addresses the 'large p, small n' problem in the gene expression and flow cytometry integration analysis via a linear mixed effects model (LMER) for both cross-sectional and longitudinal studies. Its novel moment-based estimator not only reduces bias in parameter estimation but also is more efficient than iterative optimization. The FastMix pipeline also includes a cutting-edge flow cytometry data analysis method-DAFi-for identifying cell populations of interest and their characteristics. Simulation studies showed that FastMix produced smaller type I/II errors than competing methods. Validation using real data of two vaccine studies showed that FastMix identified a consistent set of signature genes as in independent single-cell RNA-seq analysis, producing additional interesting findings.Availability and implementationSource code of FastMix is publicly available at https://github.com/terrysun0302/FastMix.Supplementary informationSupplementary data are available at Bioinformatics online
The genome and preliminary single-nuclei transcriptome of Lemna minuta reveals mechanisms of invasiveness
The ability to trace every cell in some model organisms has led to the fundamental understanding of development and cellular function. However, in plants the complexity of cell number, organ size, and developmental time makes this a challenge even in the diminutive model plant Arabidopsis (Arabidopsis thaliana). Duckweed, basal nongrass aquatic monocots, provide an opportunity to follow every cell of an entire plant due to their small size, reduced body plan, and fast clonal growth habit. Here we present a chromosome-resolved genome for the highly invasive Lesser Duckweed (Lemna minuta) and generate a preliminary cell atlas leveraging low cell coverage single-nuclei sequencing. We resolved the 360 megabase genome into 21 chromosomes, revealing a core nonredundant gene set with only the ancient tau whole-genome duplication shared with all monocots, and paralog expansion as a result of tandem duplications related to phytoremediation. Leveraging SMARTseq2 single-nuclei sequencing, which provided higher gene coverage yet lower cell count, we profiled 269 nuclei covering 36.9% (8,457) of the L. minuta transcriptome. Since molecular validation was not possible in this nonmodel plant, we leveraged gene orthology with model organism single-cell expression datasets, gene ontology, and cell trajectory analysis to define putative cell types. We found that the tissue that we computationally defined as mesophyll expressed high levels of elemental transport genes consistent with this tissue playing a role in L. minuta wastewater detoxification. The L. minuta genome and preliminary cell map provide a paradigm to decipher developmental genes and pathways for an entire plant