24 research outputs found

Recommended from our members

Translator: A Transfer Learning Approach to Facilitate Single-Cell ATAC-Seq Data Analysis from Reference Dataset.

Author: Dai Yi
Girgenti Matthew
Hwang Ahyeon
Lee Cheyu
Skarica Mario
Xu Siwei
Zhang Jing
Publication venue: eScholarship, University of California
Publication date: 01/07/2022

Recent advances in single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) have allowed simultaneous epigenetic profiling over thousands of individual cells to dissect the cellular heterogeneity and elucidate regulatory mechanisms at the finest possible resolution. However, scATAC-seq is challenging to model computationally due to the ultra-high dimensionality, low signal-to-noise ratio, complex feature interactions, and high vulnerability to various confounding factors. In this study, we present Translator, an efficient transfer learning approach to capture generalizable chromatin interactions from high-quality (HQ) reference scATAC-seq data to obtain robust cell representations in low-to-moderate quality target scATAC-seq data. We applied Translator to various simulated and real scATAC-seq datasets and demonstrated that Translator could learn more biologically meaningful cell representations than other methods by incorporating information learned from the reference data, thus facilitating various downstream analyses such as clustering and motif enrichment measurements. Moreover, Translator's block-wise deep learning framework can handle nonlinear relationships with restricted connections using fewer parameters to boost computational efficiency through Graphics Processing Unit (GPU) parallelism. Finally, we have implemented Translator as a free software package available for the community to leverage large-scale, HQ reference data to study target scATAC-seq data.

eScholarship - University of California

InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification.

Author: Girgenti Matthew
Gong Yanwen
Hwang Ahyeon
Srinivasan Shushrruth
Xu Min
Xu Siwei
Zhang Jing
Publication venue: eScholarship, University of California
Publication date: 30/03/2022

Mapping chromatin insulator loops is crucial to investigating genome evolution, elucidating critical biological functions, and ultimately quantifying variant impact in diseases. However, chromatin conformation profiling assays are usually expensive and time-consuming, and may report fuzzy insulator annotations with low resolution. Therefore, we propose a weakly supervised deep learning method, InsuLock, to address these challenges. Specifically, InsuLock first utilizes a Siamese neural network to predict the existence of insulators within a given region (up to 2000 bp). Then, it uses an object detection module for precise insulator boundary localization via gradient-weighted class activation mapping (~40 bp resolution). Finally, it quantifies variant impacts by comparing the insulator score differences between the wild-type and mutant alleles. We applied InsuLock to various bulk and single-cell datasets for performance testing and benchmarking. We showed that it outperformed existing methods with an AUROC of ~0.96 and condensed insulator annotations to ~2.5% of their original size while still demonstrating higher conservation scores and better motif enrichments. Finally, we utilized InsuLock to quantify cell-type-specific variant impacts from brain scATAC-seq data and identified a schizophrenia GWAS variant disrupting an insulator loop proximal to a known risk gene, indicating a possible new mechanism of action for the disease.
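
The variant-impact step described in the abstract above (comparing insulator scores between the wild-type and mutant alleles) can be illustrated with a minimal sketch. This is not InsuLock's code: the function names and the toy scoring function are hypothetical, and a real application would substitute a trained model's insulator score for `toy_score`.

```python
from typing import Callable


def variant_impact(score_fn: Callable[[str], float],
                   wild_type: str, position: int, alt_base: str) -> float:
    """Return score(mutant) - score(wild type) for a single-base substitution.

    score_fn is any callable mapping a DNA sequence to a score (assumption:
    in practice this would be a trained insulator predictor).
    """
    if not 0 <= position < len(wild_type):
        raise ValueError("position is outside the sequence")
    # Build the mutant allele by replacing one base in the wild-type sequence.
    mutant = wild_type[:position] + alt_base + wild_type[position + 1:]
    return score_fn(mutant) - score_fn(wild_type)


def toy_score(seq: str) -> float:
    # Hypothetical stand-in for a model: fraction of G/C bases in the sequence.
    return sum(base in "GC" for base in seq) / len(seq)


impact = variant_impact(toy_score, "GGGG", 1, "A")
print(impact)
```

Under this convention, a negative value means the substitution lowers the score relative to the wild-type allele.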

eScholarship - University of California

InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification

Author: Ahyeon Hwang
Jing Zhang
Matthew J. Girgenti
Min Xu
Shushrruth Sai Srinivasan
Siwei Xu
Yanwen Gong
Publication venue: MDPI AG
Publication date: 01/03/2022

Directory of Open Access Journals

InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification

Author: Ahyeon Hwang
Jing Zhang
Matthew J. Girgenti
Min Xu
Shushrruth Sai Srinivasan
Siwei Xu
Yanwen Gong
Publication venue: MDPI AG
Publication date: 30/03/2022

Multidisciplinary Digital Publishing Institute

iHerd: an integrative hierarchical graph representation learning framework to quantify network changes and prioritize risk genes in disease.

Author: Dai Yi
Duan Ziheng
Girgenti Matthew
Hwang Ahyeon
Lee Cheyu
Xiao Chutong
Xie Kaichi
Xu Min
Zhang Jing
Publication venue: eScholarship, University of California
Publication date: 01/09/2023

Different genes form complex networks within cells to carry out critical cellular functions, while network alterations in this process can potentially introduce downstream transcriptome perturbations and phenotypic variations. Therefore, developing efficient and interpretable methods to quantify network changes and pinpoint driver genes across conditions is crucial. We propose a hierarchical graph representation learning method, called iHerd. Given a set of networks, iHerd first hierarchically generates a series of coarsened sub-graphs in a data-driven manner, representing network modules at different resolutions (e.g., the level of signaling pathways). Then, it sequentially learns low-dimensional node representations at all hierarchical levels via efficient graph embedding. Lastly, iHerd projects separate gene embeddings onto the same latent space in its graph alignment module to calculate a rewiring index for driver gene prioritization. To demonstrate its effectiveness, we applied iHerd to a tumor-to-normal GRN rewiring analysis and cell-type-specific GCN analysis using single-cell multiome data of the brain. We showed that iHerd can effectively pinpoint novel and well-known risk genes in different diseases. Distinct from existing models, iHerd's graph coarsening for hierarchical learning allows us to successfully classify network driver genes into early and late divergent genes (EDGs and LDGs), emphasizing genes with extensive network changes across and within signaling pathway levels. This unique approach for driver gene classification can provide us with deeper molecular insights. The code is freely available at https://github.com/aicb-ZhangLabs/iHerd. All other relevant data are within the manuscript and supporting information files.
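
The last step in the abstract above (scoring each gene once its embeddings from two conditions share a latent space) can be sketched as follows. The abstract does not say how the rewiring index is computed, so this sketch assumes the embeddings are already aligned and uses plain Euclidean distance. The function names and toy embeddings are hypothetical and are not taken from the iHerd repository.

```python
import math
from typing import Dict, List, Sequence


def rewiring_index(emb_a: Dict[str, Sequence[float]],
                   emb_b: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Per-gene Euclidean distance between two already-aligned embeddings.

    Assumption: both embeddings live in the same latent space; only genes
    present in both conditions are scored.
    """
    shared = emb_a.keys() & emb_b.keys()
    return {gene: math.dist(emb_a[gene], emb_b[gene]) for gene in shared}


def prioritize(index: Dict[str, float], top_k: int = 3) -> List[str]:
    # Rank genes from the largest embedding shift to the smallest.
    return sorted(index, key=index.get, reverse=True)[:top_k]


# Toy embeddings for illustration only.
control = {"G1": [0.0, 0.0], "G2": [1.0, 1.0]}
disease = {"G1": [3.0, 4.0], "G2": [1.0, 1.0]}
index = rewiring_index(control, disease)
print(prioritize(index))
```

In this toy input, G1 moves a distance of 5.0 between conditions and G2 does not move, so G1 ranks first.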

eScholarship - University of California

iHerd: an integrative hierarchical graph representation learning framework to quantify network changes and prioritize risk genes in disease.

Author: Ahyeon Hwang
Cheyu Lee
Chutong Xiao
Jing Zhang
Kaichi Xie
Matthew J Girgenti
Min Xu
Yi Dai
Ziheng Duan
Publication venue: Public Library of Science (PLoS)
Publication date: 01/09/2023

Directory of Open Access Journals

Characterizing dysregulations via cell-cell communications in Alzheimer's brains using single-cell transcriptomes.

Author: Duan Ziheng
Hwang Ahyeon
Lee Che
Lei Yutong
Momtaz Nadia
Pariser Joseph
Riffle Dylan
Sikdar Diptanshu
Xiong Yifeng
Zhang Jing
Publication venue: eScholarship, University of California
Publication date: 13/05/2024

BACKGROUND: Alzheimer's disease (AD) is a devastating neurodegenerative disorder affecting 44 million people worldwide, leading to cognitive decline, memory loss, and significant impairment in daily functioning. Recent single-cell sequencing technology has revolutionized genetic and genomic resolution by enabling scientists to explore the diversity of gene expression patterns at the finest resolution. Most existing studies have solely focused on molecular perturbations within each cell, but cells live in microenvironments rather than as isolated entities. Here, we leveraged the large-scale and publicly available single-nucleus RNA sequencing (snRNA-seq) data in the human prefrontal cortex to investigate cell-to-cell communication in healthy brains and their perturbations in AD. We uniformly processed the snRNA-seq data with strict QCs and labeled canonical cell types consistent with the definitions from the BRAIN Initiative Cell Census Network. From ligand and receptor gene expression, we built a high-confidence cell-to-cell communication network to investigate signaling differences between AD and healthy brains. RESULTS: Specifically, we first performed broad communication pattern analyses to highlight that biologically related cell types in normal brains rely on largely overlapping signaling networks and that the AD brain exhibits the irregular inter-mixing of cell types and signaling pathways. Secondly, we performed a more focused cell-type-centric analysis and found that excitatory neurons in AD have significantly increased their communications to inhibitory neurons, while inhibitory neurons and other non-neuronal cells globally decreased theirs to all cells. Then, we delved deeper with a signaling-centric view, showing that canonical signaling pathways CSF, TGFβ, and CX3C are significantly dysregulated in their signaling to the cell type microglia/PVM and from endothelial to neuronal cells for the WNT pathway. Finally, after extracting 23 known AD risk genes, our intracellular communication analysis revealed a strong connection of extracellular ligand genes APP, APOE, and PSEN1 to intracellular AD risk genes TREM2, ABCA1, and APP in the communication from astrocytes and microglia to neurons. CONCLUSIONS: In summary, with the novel advances in single-cell sequencing technologies, we show that cellular signaling is regulated in a cell-type-specific manner and that improper regulation of extracellular signaling genes is linked to intracellular risk genes, giving a mechanistic intra- and inter-cellular picture of AD.

eScholarship - University of California

The parameter tuning for iHerd.

Author: Ahyeon Hwang (16965074)
Cheyu Lee (16965077)
Chutong Xiao (16965083)
Jing Zhang (23775)
Kaichi Xie (16965080)
Matthew J. Girgenti (14076794)
Min Xu (15203)
Yi Dai (47974)
Ziheng Duan (14020445)
Publication date: 11/09/2023

(a) The bar plot of the number of nodes per level for controls and disease samples under excitatory neurons and microglia. (b) The line plot of running time with different embedding dimensions and different learning frameworks for controls under excitatory neurons and microglia. (c) The line plot of network modality with different coarsen times (zero coarsen times indicates the initial state).

FigShare

Simulated GRN experiments.

Author: Ahyeon Hwang (16965074)
Cheyu Lee (16965077)
Chutong Xiao (16965083)
Jing Zhang (23775)
Kaichi Xie (16965080)
Matthew J. Girgenti (14076794)
Min Xu (15203)
Yi Dai (47974)
Ziheng Duan (14020445)
Publication date: 11/09/2023

(a) Simulation scheme on GRNs. (b) The violin plot of the false positive test. (c) The distributions of the node change distance for the false positive test.

FigShare