21 research outputs found

    Deep generative modeling for single-cell transcriptomics.

    Single-cell transcriptome measurements can reveal unexplored biological diversity, but they suffer from technical noise and bias that must be modeled to account for the resulting uncertainty in downstream analyses. Here we introduce single-cell variational inference (scVI), a ready-to-use scalable framework for the probabilistic representation and analysis of gene expression in single cells (https://github.com/YosefLab/scVI). scVI uses stochastic optimization and deep neural networks to aggregate information across similar cells and genes and to approximate the distributions that underlie observed expression values, while accounting for batch effects and limited sensitivity. We used scVI for a range of fundamental analysis tasks including batch correction, visualization, clustering, and differential expression, and achieved high accuracy for each task.

    Phenotype-driven identification of epithelial signalling clusters

    In metazoans, epithelial architecture provides a context that dynamically modulates most if not all epithelial cell responses to intrinsic and extrinsic signals, including growth or survival signalling and transforming oncogene action. Three-dimensional (3D) epithelial culture systems provide tractable models to interrogate the function of human genetic determinants in the establishment of context-dependency. We performed an arrayed genetic shRNA screen in mammary epithelial 3D cultures to identify new determinants of epithelial architecture, finding that the key phenotype-impacting shRNAs altered not only the population average but, even more noticeably, the population distribution. The broad distributions were attributable to sporadic gene silencing by shRNA in unselected populations. We employed the Maximum Mean Discrepancy concept to capture similar population distribution patterns and demonstrate here the feasibility of the test in identifying an impact of shRNA in populations of 3D structures. Integration of the clustered morphometric data with protein-protein interaction data enabled hypothesis generation of novel biological pathways underlying similar 3D phenotype alterations. The results present a new strategy for 3D phenotype-driven pathway analysis, which is expected to accelerate discovery of context-dependent gene functions in epithelial biology and tumorigenesis.
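As a rough illustration of why a Maximum Mean Discrepancy test can separate populations that share an average but differ in distribution, the following is a minimal sketch of a biased (V-statistic) squared-MMD estimate. The Gaussian kernel, the bandwidth value, and the function names are assumptions made for this example; the abstract does not state which kernel or estimator the authors used.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))
    # MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Hypothetical one-dimensional morphometric feature: both samples have mean 0,
# but the second is far more spread out.
narrow = [(-0.1,), (0.0,), (0.1,)]
broad = [(-2.0,), (0.0,), (2.0,)]
print(mmd_squared(narrow, narrow))  # identical samples
print(mmd_squared(narrow, broad))   # same mean, different spread
```

A comparison of means alone would treat the two toy samples as equal, whereas the kernel-based statistic is clearly positive for them.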

    Novel multiparameter correlates of Coxiella burnetii infection and vaccination identified by longitudinal deep immune profiling

    Q-fever is a flu-like illness caused by Coxiella burnetii (Cb), a highly infectious intracellular bacterium. There is an unmet need for a safe and effective vaccine for Q-fever. Correlates of immune protection to Cb infection are limited. We proposed that analysis by longitudinal high dimensional immune (HDI) profiling using mass cytometry combined with other measures of vaccination and protection could be used to identify novel correlates of effective vaccination and control of Cb infection. Using a vaccine-challenge model in HLA-DR transgenic mice, we demonstrated significant alterations in circulating T-cell and innate immune populations that distinguished vaccinated from naïve mice within 10 days, and persisted until at least 35 days post-vaccination. Following challenge, vaccinated mice exhibited reduced bacterial burden and splenomegaly, along with distinct effector T-cell and monocyte profiles. Correlation of HDI data to serological and pathological measurements was performed. Our data indicate a Th1-biased response to Cb, consistent with previous reports, and identify Ly6C, CD73, and T-bet expression in T-cell, NK-cell, and monocytic populations as distinguishing features between vaccinated and naïve mice. This study refines the understanding of the integrated immune response to Cb vaccine and challenge, which can inform the assessment of candidate vaccines for Cb.

    Single-cell RNA sequencing: a revolutionary technology that advances the understanding of complex diseases and promotes a personalized approach to treatment, with cutaneous melanoma as an example

    Single-cell RNA sequencing (scRNAseq) makes it possible to determine at once, with high resolution and accuracy, the full set of RNA molecules in each individual cell of a given sample or tissue. Today scRNAseq is an important tool, particularly for studying complex biological systems and tissues such as tumor tissue, where high cellular diversity is of key importance. In this article we present the example of cutaneous melanoma, one of the most common and most aggressive cancers in the developed world. Although the prognosis of melanoma has improved substantially with the recent introduction of immunotherapy, this treatment still fails in approximately 30–40% of patients. New data obtained with scRNAseq have revealed that the mechanism of resistance to immune checkpoint inhibitors is highly complex: besides the presence and phenotype of exhausted CD8+ T lymphocytes, it is also influenced by mutation in the BRAF gene, the phenotype of melanocytes, the presence and phenotype of cells of myeloid origin, the presence of fibroblasts of different phenotypes, and the interactions among all the cells that make up the tumor microenvironment. In the future, a personalized treatment approach based on the molecular and cellular characterization of the tumor and its microenvironment, and on predictive biomarkers, will therefore become increasingly important. scRNAseq can bring us much closer to the goal of personalized medicine, because it enables the identification of individual cells and cellular markers that could predict a patient's response to treatment and supports a more targeted choice of therapy for each patient. This would avoid treatment based on "trial and error" and thus substantially improve treatment efficacy. For now, however, scRNAseq is used only for research purposes and, because of certain limitations, has not been introduced into actual clinical practice.

    Model-based deep autoencoders for characterizing discrete data with application to genomic data analysis

    Deep learning techniques have achieved tremendous successes in a wide range of real applications in recent years. For dimension reduction, deep neural networks (DNNs) provide a natural choice to parameterize a non-linear transforming function that maps the original high dimensional data to a lower dimensional latent space. An autoencoder is a kind of DNN used to learn efficient feature representations in an unsupervised manner. Deep autoencoders have been widely explored and applied to the analysis of continuous data, but they remain understudied for characterizing discrete data. This dissertation focuses on developing model-based deep autoencoders for modeling discrete data. A motivating example of discrete data is the count data matrix generated by single-cell RNA sequencing (scRNA-seq) technology, which is widely used in biological and medical fields. scRNA-seq promises to provide higher resolution of cellular differences than bulk RNA sequencing and has helped researchers to better understand complex biological questions. The recent advances in sequencing technology have enabled a dramatic increase in the throughput to thousands of cells for scRNA-seq. However, analysis of scRNA-seq data remains a statistical and computational challenge. A major problem is the pervasive dropout events, caused by the shallow sequencing depth per cell, which obscure the discrete matrix with prevailing 'false' zero count observations. To make downstream analysis more effective, imputation, which recovers the missing values, is often conducted as the first step in preprocessing scRNA-seq data. Several imputation methods have been proposed. Of note is a deep autoencoder model, which proposes to explicitly characterize the count distribution, over-dispersion, and sparsity of scRNA-seq data using a zero-inflated negative binomial (ZINB) model. This dissertation introduces a model-based deep learning clustering model, scDeepCluster, for clustering analysis of scRNA-seq data.
scDeepCluster is a deep autoencoder which simultaneously learns feature representation and clustering via explicit modeling of scRNA-seq data generation using the ZINB model. Based on testing extensive simulated datasets and real datasets from different representative single-cell sequencing platforms, scDeepCluster outperformed several state-of-the-art methods under various clustering performance metrics and exhibited improved scalability, with running time increasing linearly with the sample size. Although this model-based deep autoencoder approach has demonstrated superior performance, it is over-permissive in defining the ZINB model space, which can lead to an unidentifiable model and make results unstable. Next, this dissertation proposes to impose a regularization that takes dropout events into account. The regularization uses a differentiable categorical distribution, Gumbel-Softmax, to explicitly model the dropout events, and minimizes the Maximum Mean Discrepancy (MMD) between the reconstructed randomly masked matrix and the raw count matrix. Imputation analyses showed that the proposed regularized model-based autoencoder significantly outperformed the vanilla model-based deep autoencoder.
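To make the ZINB model mentioned above more concrete, here is a minimal sketch of its log-probability for a single count, written with the commonly used mean/dispersion parameterization of the negative binomial. The parameter names (`mu`, `theta`, `pi`) and function names are assumptions for this example and are not taken from the dissertation; in a model-based autoencoder the negative of this quantity, summed over genes and cells, would typically serve as the reconstruction loss.

```python
import math

def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with mean mu > 0
    and dispersion theta > 0 (smaller theta means more over-dispersion)."""
    return (math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
            + theta * math.log(theta / (theta + mu))
            + x * math.log(mu / (theta + mu)))

def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated negative binomial.
    pi in [0, 1) is the probability of an extra ('dropout') zero."""
    if x == 0:
        # A zero arises either from dropout or from the NB component itself.
        return math.log(pi + (1.0 - pi) * math.exp(nb_log_pmf(0, mu, theta)))
    return math.log(1.0 - pi) + nb_log_pmf(x, mu, theta)

# Illustrative values only: the zero-inflated model assigns more mass to zero
# than the plain negative binomial with the same mean and dispersion.
print(math.exp(nb_log_pmf(0, 2.0, 1.5)))
print(math.exp(zinb_log_pmf(0, 2.0, 1.5, 0.3)))
```

Because `pi`, `mu`, and `theta` can all explain an observed zero, leaving them unconstrained is one way such a model can become hard to identify, which is the concern the abstract raises.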