10 research outputs found

    DEVELOPMENT OF BIOINFORMATICS TOOLS AND ALGORITHMS FOR IDENTIFYING PATHWAY REGULATORS, INFERRING GENE REGULATORY RELATIONSHIPS AND VISUALIZING GENE EXPRESSION DATA

    In the era of genetics and genomics, the advent of big data is transforming the field of biology into a data-intensive discipline. Novel computational algorithms and software tools are in demand to address the data analysis challenges in this growing field. This dissertation comprises the development of a novel algorithm, web-based data analysis tools, and a data visualization platform. The Triple Gene Mutual Interaction (TGMI) algorithm, presented in Chapter 2, is an innovative approach to identifying key regulatory transcription factors (TFs) that govern a particular biological pathway or process through interaction among three genes in a triple gene block, which consists of a pair of pathway genes and a TF. Identifying the key TFs controlling a biological pathway or process allows biologists to understand the complex regulatory mechanisms in living organisms. TF-Miner, presented in Chapter 3, is a high-throughput gene expression data analysis web application that was developed by integrating two highly efficient algorithms: TF-Cluster and TF-Finder. TF-Cluster can be used to obtain collaborative TFs that coordinately control a biological pathway or process using genome-wide expression data. TF-Finder, on the other hand, can identify regulatory TFs involved in or associated with a specific biological pathway or process using Adaptive Sparse Canonical Correlation Analysis (ASCCA). Chapter 4 presents ExactSearch, a suffix-tree-based motif search algorithm implemented in a web-based tool. This tool can identify the locations of a set of motif sequences in a set of target promoter sequences. ExactSearch can also search for a set of motif sequences in flanking regions from 50 plant genomes, which we have incorporated into the web tool. Chapter 5 presents STTM JBrowse, a web-based RNA-Seq data visualization system built using the JBrowse open source platform. STTM JBrowse is a unified repository for sharing and producing visualizations created from large RNA-Seq datasets generated from a variety of model and crop plants in which miRNAs were destroyed using Short Tandem Target Mimic (STTM) technology.
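
The motif search task described for ExactSearch (locating a set of motifs in a set of promoter sequences) can be illustrated with a minimal sketch. This is not the ExactSearch implementation: it uses naive exact matching with Python's `str.find` rather than a suffix tree, and the function name `find_motif_locations` and the input layout (a list of motifs, a dict of named promoters) are assumptions made for illustration only.

```python
def find_motif_locations(motifs, promoters):
    """Return {(promoter_name, motif): [0-based start positions]}.

    Naive exact matching (NOT a suffix tree); overlapping hits are reported.
    Hypothetical helper for illustration, not part of ExactSearch.
    """
    hits = {}
    for name, sequence in promoters.items():
        sequence = sequence.upper()
        for motif in motifs:
            pattern = motif.upper()
            if not pattern:
                continue  # skip empty motifs
            start = sequence.find(pattern)
            while start != -1:
                hits.setdefault((name, motif), []).append(start)
                # Advance by one so overlapping occurrences are found too.
                start = sequence.find(pattern, start + 1)
    return hits


if __name__ == "__main__":
    demo = find_motif_locations(["ACGT", "TT"], {"p1": "ACGTACGTTT"})
    for key, positions in sorted(demo.items()):
        print(key, positions)
```

A suffix tree becomes worthwhile when many motifs are searched against the same large sequence set, because the index is built once and each query then costs time proportional to the motif length.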

    TGMI: an efficient algorithm for identifying pathway regulators through evaluation of triple-gene mutual interaction

    Despite their important roles, the regulators for most metabolic pathways and biological processes remain elusive. Methods for identifying metabolic pathway and biological process regulators are therefore intensively sought after. We developed a novel algorithm called triple-gene mutual interaction (TGMI) for identifying these regulators using high-throughput gene expression data. It first calculates the regulatory interactions among triple gene blocks (two pathway genes and one transcription factor (TF)) using conditional mutual information, and then identifies significantly interacting triple gene blocks using a novel mutual interaction measure (MIM), which was shown to reflect the strength of regulatory interactions within each triple gene block. TGMI calculates the MIM for each triple gene block and then examines its statistical significance using the bootstrap. Finally, the frequencies of all TFs present in all significantly interacting triple gene blocks are calculated and ranked. We showed that the TFs with higher frequencies were usually genuine pathway regulators upon evaluating multiple pathways in plants, animals and yeast. Comparison of TGMI with several other algorithms demonstrated its higher accuracy. Therefore, TGMI will be a valuable tool that can help biologists to identify regulators of metabolic pathways and biological processes from the rapidly growing body of high-throughput gene expression data in public repositories.
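
Two of the building blocks named in this abstract, conditional mutual information and TF frequency ranking, can be sketched as follows. This is a hedged illustration, not the TGMI code: the abstract does not define the MIM or the bootstrap procedure, so neither is reproduced here. The plug-in estimator assumes expression values have already been discretized, and the function names and the `(tf, gene_a, gene_b)` block layout are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits for discrete sequences.

    Which gene of a triple block is conditioned on is left to the caller;
    the abstract does not specify it.
    """
    n = len(x)
    if n == 0 or not (n == len(y) == len(z)):
        raise ValueError("x, y and z must be non-empty and of equal length")
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), count in n_xyz.items():
        # p(x,y,z) * log2[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ], written in counts
        ratio = (n_z[zi] * count) / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        cmi += (count / n) * log2(ratio)
    return cmi


def rank_tfs_by_frequency(significant_blocks):
    """Count TF occurrences across (tf, gene_a, gene_b) blocks, most frequent first."""
    return Counter(tf for tf, _, _ in significant_blocks).most_common()


if __name__ == "__main__":
    # X and Y are identical within every level of Z: 1 bit of shared information.
    print(conditional_mutual_information([0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]))
    print(rank_tfs_by_frequency([("TF1", "g1", "g2"), ("TF2", "g1", "g3"), ("TF1", "g2", "g3")]))
```

Plug-in estimates of this kind are biased upward on small samples, which is one reason a resampling-based significance test, such as the bootstrap mentioned in the abstract, is needed before counting a block as significant.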

    DNA methylation at a nutritionally sensitive region of the PAX8 gene is associated with thyroid volume and function in Gambian children.

    Funder: Wellcome Trust
    PAX8 is a key thyroid transcription factor implicated in thyroid gland differentiation and function, and PAX8 gene methylation is reported to be sensitive to the periconceptional environment. Using a novel recall-by-epigenotype study in Gambian children, we found that PAX8 hypomethylation at age 2 years is associated with a 21% increase in thyroid volume and an increase in free thyroxine (T4) at 5 to 8 years, the latter equivalent to 8.4% of the normal range. Free T4 was associated with a decrease in DXA-derived body fat and bone mineral density. Furthermore, offspring PAX8 methylation was associated with periconceptional maternal nutrition, and methylation variability was influenced by genotype, suggesting that sensitivity to environmental exposures may be under partial genetic control. Together, our results demonstrate a possible link between early environment, PAX8 gene methylation and thyroid gland development and function, with potential implications for early embryonic programming of thyroid-related health and disease.

    A genomic atlas of systemic interindividual epigenetic variation in humans.

    BACKGROUND: DNA methylation is thought to be an important determinant of human phenotypic variation, but its inherent cell type specificity has impeded progress on this question. At exceptional genomic regions, interindividual variation in DNA methylation occurs systemically. Like genetic variants, systemic interindividual epigenetic variants are stable, can influence phenotype, and can be assessed in any easily biopsiable DNA sample. We describe an unbiased screen for human genomic regions at which interindividual variation in DNA methylation is not tissue-specific. RESULTS: For each of 10 donors from the NIH Genotype-Tissue Expression (GTEx) program, CpG methylation is measured by deep whole-genome bisulfite sequencing of genomic DNA from tissues representing the three germ layer lineages: thyroid (endoderm), heart (mesoderm), and brain (ectoderm). We develop a computational algorithm to identify genomic regions at which interindividual variation in DNA methylation is consistent across all three lineages. This approach identifies 9926 correlated regions of systemic interindividual variation (CoRSIVs). These regions, comprising just 0.1% of the human genome, are inter-correlated over long genomic distances, associated with transposable elements and subtelomeric regions, conserved across diverse human ethnic groups, sensitive to periconceptional environment, and associated with genes implicated in a broad range of human disorders and phenotypes. CoRSIV methylation in one tissue can predict expression of associated genes in other tissues. CONCLUSIONS: In addition to charting a previously unexplored molecular level of human individuality, this atlas of human CoRSIVs provides a resource for future population-based investigations into how interindividual epigenetic variation modulates risk of disease

    A new era for epigenetic epidemiology


    Systemic interindividual epigenetic variation in humans is associated with transposable elements and under strong genetic control

    BACKGROUND: Genetic variants can modulate phenotypic outcomes via epigenetic intermediates, for example at methylation quantitative trait loci (mQTL). We present the first large-scale assessment of mQTL at human genomic regions selected for interindividual variation in CpG methylation, which we call correlated regions of systemic interindividual variation (CoRSIVs). These can be assayed in blood DNA and do not reflect interindividual variation in cellular composition. RESULTS: We use target-capture bisulfite sequencing to assess DNA methylation at 4086 CoRSIVs in multiple tissues from each of 188 donors in the NIH Genotype-Tissue Expression (GTEx) program. At CoRSIVs, DNA methylation in peripheral blood correlates with methylation and gene expression in internal organs. We also discover unprecedented mQTL at these regions. Genetic influences on CoRSIV methylation are extremely strong (median R² = 0.76), cumulatively comprising over 70-fold more human mQTL than detected in the most powerful previous study. Moreover, mQTL beta coefficients at CoRSIVs are highly skewed (i.e., the major allele predicts higher methylation). Both surprising findings are independently validated in a cohort of 47 non-GTEx individuals. Genomic regions flanking CoRSIVs show long-range enrichments for LINE-1 and LTR transposable elements; the skewed beta coefficients may therefore reflect evolutionary selection of genetic variants that promote their methylation and silencing. Analyses of GWAS summary statistics show that mQTL polymorphisms at CoRSIVs are associated with metabolic and other classes of disease. CONCLUSIONS: A focus on systemic interindividual epigenetic variants, clearly enhanced in mQTL content, should likewise benefit studies attempting to link human epigenetic variation to the risk of disease.

    Co-expression analysis aids in the identification of genes in the cuticular wax pathway in maize.

    Epicuticular waxes provide a hydrophobic barrier that protects land plants from environmental stresses. To elucidate the molecular functions of maize glossy mutants that reduce the accumulation of epicuticular waxes, eight non-allelic glossy mutants were subjected to transcriptomic comparisons with their respective wild-type siblings. Transcriptomic comparisons identified 2279 differentially expressed (DE) genes. Other glossy genes tended to be down-regulated in glossy mutants; by contrast, stress-responsive pathways were induced in mutants. Gene co-expression network (GCN) analysis found that glossy genes were clustered, suggestive of co-regulation. Genes that potentially regulate the accumulation of glossy gene transcripts were identified via a pathway-level co-expression analysis. Expression data from diverse organs showed that maize glossy genes are generally active in young leaves, silks, and tassels, while largely inactive in seeds and roots. Through reverse genetics, a DE gene homologous to Arabidopsis CER8 and co-expressed with known glossy genes was confirmed to participate in epicuticular wax accumulation. A GCN data-informed forward genetics approach enabled cloning of the gl14 gene, which encodes a putative membrane-associated protein. Our results deepen understanding of the transcriptional regulation of the genes involved in the accumulation of epicuticular wax, and provide two maize glossy genes and a number of candidate genes for further characterization.

    Additional file 1: of A genomic atlas of systemic interindividual epigenetic variation in humans

    DNA methylation is thought to be an important determinant of human phenotypic variation, but its inherent cell type specificity has impeded progress on this question. At exceptional genomic regions, interindividual variation in DNA methylation occurs systemically. Like genetic variants, systemic interindividual epigenetic variants are stable, can influence phenotype, and can be assessed in any easily biopsiable DNA sample. We describe an unbiased screen for human genomic regions at which interindividual variation in DNA methylation is not tissue-specific

    Bottom-up GGM algorithm for constructing multilayered hierarchical gene regulatory networks that govern biological pathways or processes

    BACKGROUND: Multilayered hierarchical gene regulatory networks (ML-hGRNs) are very important for understanding the genetic regulation of biological pathways. However, there are currently no computational algorithms available for directly building ML-hGRNs that regulate biological pathways. RESULTS: A bottom-up graphic Gaussian model (GGM) algorithm was developed for constructing an ML-hGRN operating above a biological pathway using small- to medium-sized microarray or RNA-seq data sets. The algorithm first places the genes of a pathway at the bottom layer and begins to construct an ML-hGRN by evaluating all combined triple genes: two pathway genes and one regulatory gene. The algorithm retains all triple genes in which a regulatory gene significantly interferes with the two paired pathway genes. The regulatory genes with the highest interference frequency are kept as the second layer, and the number kept is based on an optimization function. Thereafter, the algorithm is applied recursively to build an ML-hGRN in a layer-by-layer fashion until the defined number of layers is obtained or the procedure terminates automatically. CONCLUSIONS: We validated the algorithm and demonstrated its high efficiency in constructing ML-hGRNs governing biological pathways. The algorithm is instrumental for biologists to learn the hierarchical regulators associated with a given biological pathway from even small-sized microarray or RNA-seq data sets. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0981-1) contains supplementary material, which is available to authorized users.
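
One common way to quantify how much a third gene "interferes" with a gene pair in a Gaussian graphical setting is the first-order partial correlation. The sketch below is an assumption about what such a test could look like, not the published algorithm: the abstract does not give the exact statistic, significance test, or optimization function. The function names `pearson` and `partial_correlation` are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y given z.

    A large drop from pearson(x, y) to this value suggests that z accounts
    for much of the x-y association (illustrative reading only).
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


if __name__ == "__main__":
    regulator = [1, 1, -1, -1]
    gene_a = [2, 0, 0, -2]   # regulator plus noise orthogonal to it
    gene_b = [2, 0, -2, 0]   # regulator plus different orthogonal noise
    print(pearson(gene_a, gene_b))                         # marginal correlation
    print(partial_correlation(gene_a, gene_b, regulator))  # near zero after conditioning
```

In this toy data the two pathway genes are correlated only through the shared regulator, so conditioning on it removes the association.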