26 research outputs found

    sc-OTGM: Single-Cell Perturbation Modeling by Solving Optimal Mass Transport on the Manifold of Gaussian Mixtures

    Influenced by breakthroughs in LLMs, single-cell foundation models are emerging. While these models show successful performance in cell type clustering, phenotype classification, and gene perturbation response prediction, it remains to be seen if a simpler model could achieve comparable or better results, especially with limited data. This is important, as the quantity and quality of single-cell data typically fall short of the standards in textual data used for training LLMs. Single-cell sequencing often suffers from technical artifacts, dropout events, and batch effects. These challenges are compounded in a weakly supervised setting, where the labels of cell states can be noisy, further complicating the analysis. To tackle these challenges, we present sc-OTGM, streamlined with fewer than 500K parameters, making it approximately 100x more compact than the foundation models and offering an efficient alternative. sc-OTGM is an unsupervised model grounded in the inductive bias that scRNA-seq data can be generated from a finite combination of multivariate Gaussian distributions. The core function of sc-OTGM is to create a probabilistic latent space utilizing a GMM as its prior distribution and to distinguish between distinct cell populations by learning their respective marginal PDFs. It uses a Hit-and-Run Markov chain sampler to determine the OT plan across these PDFs within the GMM framework. We evaluated our model against a CRISPR-mediated perturbation dataset, called CROP-seq, consisting of 57 one-gene perturbations. Our results demonstrate that sc-OTGM is effective in cell state classification, aids in the analysis of differential gene expression, and ranks genes for target identification through a recommender system. It also predicts the effects of single-gene perturbations on downstream gene regulation and generates synthetic scRNA-seq data conditioned on specific cell states. Comment: ICLR 2024, Machine Learning for Genomics Explorations Workshop

    Development of the proepicardial organ in the zebrafish.

    The epicardium is the last layer of the vertebrate heart to form, surrounding the heart muscle during embryogenesis and providing signaling cues essential to the continued growth and differentiation of the heart. This outer layer of the heart develops from a transient structure, the proepicardial organ (PEO). Despite its essential roles, the early signals required for the formation of the PEO and the epicardium remain poorly understood. The molecular markers wt1 and tcf21 are used to identify the epicardial layer in the zebrafish heart, to trace its development, and to determine genes required for its normal development. Disruption of lateral plate mesoderm (LPM) migration through knockdown of miles apart or casanova leads to cardia bifida with each bilateral heart associated with its own PEO, suggesting that the earliest progenitors of the epicardium lie in the LPM. Using a gene knockdown approach, a genetic framework for PEO development is outlined. The pandora/spt6 gene is required for multiple cardiac lineages, the zinc-finger transcription factor wt1 is required for the epicardial lineage only, and finally, the cell polarity genes heart and soul and nagie oko are required for proper PEO morphogenesis.

    olig1 Expression identifies developing oligodendrocytes in zebrafish and requires hedgehog and notch signaling.

    Myelin, the insulating sheath around large-diameter axons, is formed in the central nervous system (CNS) by oligodendrocytes. We isolated the zebrafish ortholog of olig1, a bHLH transcription factor, and describe the origin and development of oligodendrocytes in the zebrafish brain. Olig1:mem-eGFP transgenic animals demonstrate the highly dynamic nature of oligodendrocyte membrane processes, providing a tool for studying in vivo oligodendrocyte development. Formation of oligodendrocytes and initiation of olig1 expression are under the control of long-range hedgehog and notch signaling, while maintenance of olig1 expression depends only on hedgehog. Over-expression of olig1 did not affect myelin formation in the brain, and combined over-expression of olig1 and olig2 could not rescue loss of hedgehog signaling, indicating that critical factors other than olig1 and olig2 are necessary. Lastly, knockdown of Olig1 in an Olig2-sensitized background did result in defects in CNS myelination, indicating a functional overlap between Olig1 and Olig2 proteins.

    Partitioning of Tissue Expression Accompanies Multiple Duplications of the Na+/K+ ATPase α Subunit Gene

    Vertebrate genomes contain multiple copies of related genes that arose through gene duplication. In the past it has been proposed that these duplicated genes were retained because of acquisition of novel beneficial functions. A more recent model, the duplication-degeneration-complementation hypothesis (DDC), posits that the functions of a single gene may become separately allocated among the duplicated genes, rendering both duplicates essential. Thus far, empirical evidence for this model has been limited to the engrailed and sox families of developmental regulators, and it has been unclear whether it may also apply to ubiquitously expressed genes with essential functions for cell survival. Here we describe the cloning of three zebrafish α subunits of the Na(+),K(+)-ATPase and a comprehensive evolutionary analysis of this gene family. The predicted amino acid sequences are extremely well conserved among vertebrates. The evolutionary relationships and the map positions of these genes and of other α-like sequences indicate that both tandem and ploidy duplications contributed to the expansion of this gene family in the teleost lineage. The duplications are accompanied by acquisition of clear functional specialization, consistent with the DDC model of genome evolution. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AY028628, AY028629, and AY028630.]

    Genetic Control of Collective Behavior in Zebrafish

    Many animals, including humans, have evolved to live and move in groups. In humans, disrupted social interactions are a fundamental feature of many psychiatric disorders. However, we know little about how genes regulate social behavior. Zebrafish may serve as a powerful model to explore this question. By comparing the behavior of wild-type fish with 90 mutant lines, we show that mutations of genes associated with human psychiatric disorders can alter the collective behavior of adult zebrafish. We identify three categories of behavioral variation across mutants: “scattered,” in which fish show reduced cohesion; “coordinated,” in which fish swim more in aligned schools; and “huddled,” in which fish form dense but disordered groups. Changes in individual interaction rules can explain these differences. This work demonstrates how emergent patterns in animal groups can be altered by genetic changes in individuals and establishes a framework for understanding the fundamentals of social information processing.

    ALKALs are in vivo ligands for ALK family receptor tyrosine kinases in the neural crest and derived cells

    Mutations in anaplastic lymphoma kinase (ALK) are implicated in somatic and familial neuroblastoma, a pediatric tumor of neural crest-derived tissues. Recently, biochemical analyses have identified secreted small ALKAL proteins (FAM150, AUG) as potential ligands for human ALK and the related leukocyte tyrosine kinase (LTK). In the zebrafish Danio rerio, DrLtk, which is similar to human ALK in sequence and domain structure, controls the development of iridophores, neural crest-derived pigment cells. Hence, the zebrafish system allows studying the involvement of Alk/Ltk and Alkals in neural crest regulation in vivo. Using zebrafish pigment pattern formation, Drosophila eye patterning, and cell culture-based assays, we show that zebrafish Alkals potently activate zebrafish Ltk and human ALK, driving downstream signaling events. Overexpression of the three DrAlkals causes ectopic iridophore development, whereas loss-of-function alleles lead to spatially distinct patterns of iridophore loss in zebrafish larvae and adults. alkal loss-of-function triple mutants completely lack iridophores and are larval lethal, as is the case for ltk null mutants. Our results provide in vivo evidence of (i) activation of ALK/LTK family receptors by ALKALs and (ii) an involvement of these ligand-receptor complexes in neural crest development.