5 research outputs found

    TGMI: an efficient algorithm for identifying pathway regulators through evaluation of triple-gene mutual interaction

    Despite their important roles, the regulators of most metabolic pathways and biological processes remain elusive, and methods for identifying them are actively sought. We developed a novel algorithm called triple-gene mutual interaction (TGMI) for identifying these regulators from high-throughput gene expression data. It first calculates the regulatory interactions within triple gene blocks (two pathway genes and one transcription factor (TF)) using conditional mutual information, and then identifies significantly interacting triple gene blocks using a novel mutual interaction measure (MIM), which was shown to reflect the strength of regulatory interactions within each triple gene block. TGMI calculates the MIM for each triple gene block and then assesses its statistical significance using the bootstrap. Finally, the frequencies of all TFs present in the significantly interacting triple gene blocks are calculated and ranked. We showed that the TFs with higher frequencies were usually genuine pathway regulators when evaluating multiple pathways in plants, animals and yeast. Comparison of TGMI with several other algorithms demonstrated its higher accuracy. Therefore, TGMI will be a valuable tool that can help biologists identify regulators of metabolic pathways and biological processes from the rapidly growing volume of high-throughput gene expression data in public repositories.
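
The pipeline outlined in this abstract (score each TF plus gene-pair block, test its significance, then rank TFs by how often they appear in significant blocks) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the TGMI implementation: the abstract does not define the MIM, so the hypothetical `score` below substitutes I(G1;G2) - I(G1;G2|TF), the drop in mutual information between two genes once the TF is conditioned on. Significance is assessed with a permutation null rather than the bootstrap procedure the abstract mentions, and the tercile discretization, `alpha`, and all function names are choices made for this example.

```python
import itertools
from collections import Counter

import numpy as np


def discretize(x, bins=3):
    """Map a continuous expression vector to equal-frequency bin indices."""
    edges = np.quantile(x, np.linspace(0, 1, bins + 1)[1:-1])
    return np.searchsorted(edges, x, side="right")


def entropy(*cols):
    """Shannon entropy (in nats) of the joint distribution of discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())


def mi(x, y):
    """Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def cmi(x, y, z):
    """Conditional mutual information I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def score(g1, g2, tf):
    """Stand-in block score (NOT the paper's MIM): I(G1;G2) - I(G1;G2|TF)."""
    return mi(g1, g2) - cmi(g1, g2, tf)


def rank_tfs(tf_expr, gene_expr, n_perm=200, alpha=0.05, seed=0):
    """Rank TFs by the number of significant (TF, gene, gene) blocks they appear in."""
    rng = np.random.default_rng(seed)
    tfs = {name: discretize(v) for name, v in tf_expr.items()}
    genes = {name: discretize(v) for name, v in gene_expr.items()}
    hits = Counter({name: 0 for name in tfs})
    for tf_name, t in tfs.items():
        for (_, a), (_, b) in itertools.combinations(genes.items(), 2):
            observed = score(a, b, t)
            # Permuting the TF across samples breaks its link to the gene pair.
            null = np.array([score(a, b, rng.permutation(t)) for _ in range(n_perm)])
            p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
            if p_value < alpha:
                hits[tf_name] += 1
    return hits.most_common()
```

Calling `rank_tfs` with two dictionaries that map TF names and pathway-gene names to expression vectors over the same samples returns `(TF, count)` pairs sorted by count, mirroring the frequency ranking described in the abstract.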

    Bacterium-Enabled Transient Gene Activation by Artificial Transcription Factor for Resolving Gene Regulation in Maize

    Cellular functions are diversified through intricate transcriptional regulation, and an understanding of gene regulatory networks is essential to elucidating many developmental processes and environmental responses. Here, we employed Transcription Activator-Like effectors (TALes), a family of transcription factors that are synthesized by members of the γ-proteobacterium genus Xanthomonas and secreted into host cells to activate targeted host genes. Delivered by the maize pathogen Xanthomonas vasicola pv. vasculorum, designer TALes (dTALes), which are synthetic TALes, were used to induce the expression of the maize gene glossy3 (gl3), a MYB transcription factor gene involved in cuticular wax biosynthesis. RNA-Seq analysis of leaf samples identified 146 gl3 downstream genes. Eight of the nine genes known to be involved in cuticular wax biosynthesis were up-regulated by at least one dTALe. A top-down Gaussian graphical model predicted that 68 gl3 downstream genes were directly regulated by GL3. A chemically induced mutant of Zm00001d017418, one of the gl3 downstream genes, which encodes an aldehyde dehydrogenase, exhibited a typical glossy leaf phenotype and reduced epicuticular waxes. Bacterial protein delivery of artificial transcription factors, dTALes, proved to be a straightforward and powerful approach for revealing gene regulation in plants.

    Construction of a hierarchical gene regulatory network centered around a transcription factor

    No full text
    We have modified a multitude of transcription factors (TFs) in numerous plant species and some animal species, and obtained transgenic lines that exhibit phenotypic alterations. Whenever we observe phenotypic changes in a TF's transgenic lines, we are eager to identify its target genes, its collaborative regulators and even its higher-level upstream regulators. This issue can be addressed by establishing a multilayered hierarchical gene regulatory network (ML-hGRN) centered around a given TF. This article describes a practical approach for constructing an ML-hGRN centered on a TF by combining top-down and bottom-up network construction methods. Strategies for constructing ML-hGRNs are vitally important, as these networks provide key information to advance our understanding of how biological processes are regulated.

    Statistical methods for gene selection and genetic association studies

    This dissertation includes five chapters, briefly described as follows. In Chapter One, we propose a signed bipartite genotype and phenotype network (GPN) that links phenotypes and genotypes based on their statistical associations. It provides new insight into the genetic architecture among multiple correlated phenotypes and into where phenotypes might be related at a higher level of cellular and organismal organization. We show that multiple-phenotype association studies are improved by using the proposed network to incorporate genetic information into the phenotype clustering. In Chapter Two, we first illustrate the proposed GPN using GWAS summary statistics. Then, we assess how to construct a well-defined GPN with a clear representation of genetic associations by comparing its network properties, including connectivity, centrality, and community structure, with those of a random network. The network topology annotations based on the sparse representations of the GPN can be used to understand disease heritability for highly correlated phenotypes. In applications to phenome-wide association studies, the proposed GPN can identify more significant pairs of genetic variants and phenotype categories. In Chapter Three, a powerful and computationally efficient gene-based association test is proposed, which aggregates information from different gene-based association tests and also incorporates expression quantitative trait locus information. We show that the proposed method controls the type I error rate well and has higher power in simulation studies, and that it identifies more significant genes in real data analyses. In Chapter Four, we develop six statistical selection methods based on penalized regression for inferring target genes of a transcription factor (TF). The proposed selection methods combine statistics, machine learning, and convex optimization, and are highly effective in identifying the true target genes. The methods fill the gap left by the lack of appropriate methods for predicting the target genes of a TF, and are instrumental for validating experimental results obtained from ChIP-seq and DAP-seq and, conversely, for the selection and annotation of TFs based on their target genes. In Chapter Five, we propose a gene selection approach that captures gene-level signals in network-based regression for case-control association studies with DNA sequence data or DNA methylation data, inspired by the popular gene-based association tests that use a weighted combination of genetic variants to capture their combined effect within a gene. We show that the proposed gene selection approach has higher true positive rates than traditional dimension reduction techniques in simulation studies and selects potentially rheumatoid arthritis-related genes that are missed by existing methods.
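
The idea in Chapter Four of selecting a TF's target genes by penalized regression can be illustrated with a plain lasso. This is a generic sketch, not one of the dissertation's six methods, which the abstract does not specify. It assumes that the TF's expression is the response, that the candidate genes' expression profiles are the predictors, and that genes with non-zero coefficients are reported as candidate targets; the names `lasso_cd` and `select_targets` and the penalty `lam` are hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    return np.sign(v) * max(abs(v) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()  # residual y - X @ beta for the current beta
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]   # remove feature j's contribution
            rho = X[:, j] @ resid / n    # its correlation with the partial residual
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]   # add the updated contribution back
    return beta


def select_targets(gene_expr, tf_expr, lam=0.2):
    """Return column indices of genes given a non-zero lasso coefficient.

    gene_expr: (samples x genes) matrix; tf_expr: vector of TF expression.
    Treating the TF as the response is an assumption made for this sketch.
    """
    X = (gene_expr - gene_expr.mean(axis=0)) / gene_expr.std(axis=0)
    y = tf_expr - tf_expr.mean()
    beta = lasso_cd(X, y, lam)
    return [int(j) for j in np.flatnonzero(beta)]
```

A larger `lam` yields a sparser gene set; in practice it would be chosen by cross-validation or a stability criterion rather than fixed in advance.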