7 research outputs found

    Efficient decoding algorithms for generalized hidden Markov model gene finders

    BACKGROUND: The Generalized Hidden Markov Model (GHMM) has proven a useful framework for the task of computational gene prediction in eukaryotic genomes, due to its flexibility and probabilistic underpinnings. As the focus of the gene finding community shifts toward the use of homology information to improve prediction accuracy, extensions to the basic GHMM model are being explored as possible ways to integrate this homology information into the prediction process. Particularly prominent among these extensions are those techniques which call for the simultaneous prediction of genes in two or more genomes at once, thereby increasing significantly the computational cost of prediction and highlighting the importance of speed and memory efficiency in the implementation of the underlying GHMM algorithms. Unfortunately, the task of implementing an efficient GHMM-based gene finder is already a nontrivial one, and it can be expected that this task will only grow more onerous as our models increase in complexity. RESULTS: As a first step toward addressing the implementation challenges of these next-generation systems, we describe in detail two software architectures for GHMM-based gene finders, one comprising the common array-based approach, and the other a highly optimized algorithm which requires significantly less memory while achieving virtually identical speed. We then show how both of these architectures can be accelerated by a factor of two by optimizing their content sensors. We finish with a brief illustration of the impact these optimizations have had on the feasibility of our new homology-based gene finder, TWAIN. CONCLUSIONS: In describing a number of optimizations for GHMM-based gene finders and making available two complete open-source software systems embodying these methods, it is our hope that others will be more enabled to explore promising extensions to the GHMM framework, thereby improving the state-of-the-art in gene prediction techniques.

    Enhancing gene detection with computer generated intergenic regions

    An evaluation of contemporary hidden Markov model genefinders with a predicted exon taxonomy

    We present an independent evaluation of six recent hidden Markov model (HMM) genefinders. Each was tested on the new dataset (FSH298), the results of which showed no dramatic improvement over the genefinders tested five years ago. In addition, we introduce a comprehensive taxonomy of predicted exons and classify each resulting exon accordingly. These results are useful in measuring (with finer granularity) the effects of changes in a genefinder. We present an analysis of these results and identify four patterns of inaccuracy common in all HMM-based results.

    Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome

    The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome has shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena's germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum.

    Roles of Wrky proteins in mediating the crosstalk of hormone signaling pathways: An approach integrating bioinformatics and experimental biology

    The goal of my research is to understand the molecular mechanism by which hormones control seed germination. Gibberellins (GA) promote, while abscisic acid (ABA) and salicylic acid (SA) inhibit seed germination. Key molecules in these signaling pathways include receptors, secondary messengers, protein kinases and phosphatases, and transcription factors. My study focuses on how WRKY transcription factors modulate the expression of an alpha-amylase gene (Amy32b), which is up-regulated by GA, but down-regulated by ABA and SA in the aleurone cells of germinating seeds. Chapter 2 starts with the annotation and phylogenetic analyses of the WRKY gene superfamily, followed by functional studies of WRKY proteins in mediating ABA responses. Eighty-one WRKY genes were identified in the rice genome through computational analyses. Phylogenetic analyses based on WRKY domain sequences suggest that extensive duplications and losses of the WRKY domain occurred during evolution of this gene family. Transient expression studies suggest that among four WRKY genes that are ABA-inducible in aleurone cells, OsWRKY72 and OsWRKY77 function as transcription activators while OsWRKY24 and OsWRKY45 are repressors of the ABA-inducible HVA22 gene. Chapter 3 presents cellular and biochemical data to support a novel model that two transcription repressors, OsWRKY51 and OsWRKY71, mediate the crosstalk of GA and ABA signaling. Both genes are ABA-inducible, but GA-repressible in the embryos and aleurone cells of germinating rice seeds. The interaction of OsWRKY51 and OsWRKY71 synergistically represses the GA-induced Amy32b expression, likely by functionally interfering with the GA-inducible transactivator, GAMYB. The mechanism underlying the crosstalk of SA, ABA and GA is reported in Chapter 4. Similar to ABA, SA blocks seed germination, likely through repressing the expression of alpha-amylase genes such as Amy32b. Over-expression of the SA-inducible and GA-repressible HvWRKY38 in aleurone cells blocks GA-induced Amy32b expression. Therefore, HvWRKY38 might mediate the crosstalk of SA, ABA and GA signaling in regulating alpha-amylase expression and seed germination. My research demonstrates the high efficiency of the approach integrating bioinformatics and experimental biology in addressing the cell-biological signaling network. The findings derived from this research help advance our long-term goal: to improve the yield and quality of rice and other cereals.