Leveraging EST Evidence to Automatically Predict Alternatively Spliced Genes, Master's Thesis, December 2006

Author: Zimmermann Robert
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2007

Current methods for high-throughput automatic annotation of newly sequenced genomes are largely limited to tools that predict only one transcript per gene locus. Evidence suggests that 20-50% of genes in higher eukaryotic organisms are alternatively spliced. This leaves the remainder of the transcripts to be annotated by hand, an expensive, time-consuming process. Genomes are being sequenced at a much higher rate than they can be annotated. We present three methods for using the alignments of inexpensive Expressed Sequence Tags (ESTs) in combination with HMM-based gene prediction with N-SCAN EST to recreate the vast majority of hand annotations in the D. melanogaster genome. In our first method, we "piece together" N-SCAN EST predictions with clustered EST alignments to increase the number of transcripts predicted per locus. This is shown to be a sensitive and accurate method, predicting the vast majority of known transcripts in the D. melanogaster genome. We then present an approach that uses these clusters of EST alignments to construct a Multi-Pass gene prediction phase, again piecing it together with clusters of EST alignments. While time-consuming, Multi-Pass gene prediction is very accurate and more sensitive than single-pass prediction. Finally, we present a new Hidden Markov Model instance, which augments the current N-SCAN EST HMM and predicts multiple splice forms in a single pass of prediction. This method is less time-consuming and performs nearly as well as the multi-pass approach.

Washington University St. Louis: Open Scholarship

The role of linker histone globular domains in chromatosome formation

Author: Shen Chang-Hui
Publication venue: The University of Edinburgh
Publication date: 01/01/1998

Quantitative genome-wide studies of RNA metabolism in yeast

Author: Eser Philipp
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 02/05/2016

Gene expression and its regulation are fundamental processes in every living cell and organism. RNA molecules play a central role in these processes by translating the genetic information into proteins, by regulating gene activity, and by forming structural components. The kinetics of RNA metabolism differ widely between genes and conditions and play an important role in cellular processes, but how this is achieved remains poorly understood. Here, we used a novel experimental protocol that allows profiling of newly transcribed RNAs, in conjunction with an advanced computational modeling pipeline, to explore the kinetics of RNA metabolism and the underlying genetic determinants. In the first study, we investigated cell-cycle-regulated gene expression and the contributions of synthesis and degradation to mRNA levels in S. cerevisiae. During the cell cycle, the levels of hundreds of mRNAs change in a periodic manner, but how this is carried out by alterations in the rates of mRNA synthesis and degradation has not been studied systematically. We were able to derive mRNA synthesis and degradation rates every 5 minutes during the cell cycle, and thus provide for the first time a high-resolution time series of RNA metabolism during the cell cycle. A novel statistical model identified 479 genes that show periodic changes in mRNA synthesis and generally also periodic changes in their mRNA degradation rates. Peaks of mRNA degradation follow peaks of mRNA synthesis, resulting in sharp and high peaks of mRNA levels at defined times during the cell cycle. Whereas the timing of mRNA synthesis is set by upstream DNA motifs and their associated transcription factors (TFs), the synthesis rate of a periodically expressed gene is apparently set by its core promoter. In the second study, we developed metabolic labeling with RNA-Seq (4tU-Seq) and novel computational methods to gain further insights into the kinetics of RNA metabolism and its regulation.
To decrypt the regulatory code of the genome, sequence elements must be defined that determine RNA turnover and thus gene expression. Here we attempt such decryption in a eukaryotic model organism, the fission yeast S. pombe. We first derived an improved genome annotation that redefines borders of 36% of expressed mRNAs and adds 487 non-coding RNAs (ncRNAs). We then combined RNA labeling in vivo with mathematical modeling to obtain rates of RNA synthesis and degradation for 5,484 expressed RNAs and splicing rates for 4,958 introns. We identified functional sequence elements in DNA and RNA that control RNA metabolic rates, and quantified the contributions of individual nucleotides to RNA synthesis, splicing, and degradation. Our approach reveals distinct kinetics of mRNA and ncRNA metabolism, separates antisense regulation by transcription interference from RNA interference, and provides a general tool for studying the regulatory code of genomes.

Organization and evolution of information within eukaryotic genomes.

Author: Links Matthew Graham
Publication venue: University of Windsor Leddy Library
Publication date: 01/01/2007

Use of wavelet-packet transforms to develop an engineering model for multifractal characterization of mutation dynamics in pathological and nonpathological gene sequences

Author: Walker David Lee
Publication venue: The Research Repository @ WVU
Publication date: 01/05/1999

This study uses dynamical analysis to examine quantitatively the information coding mechanism in DNA sequences. This goes beyond simply modeling the mechanism by comparing DNA sequence walks to Fractal Brownian Motion (fBm) processes. The 2-D mappings of the DNA sequences for this research are Iterated Function System (IFS) mappings, also known as the Chaos Game Representation (CGR). This technique converts a 1-D sequence into a 2-D representation that preserves subsequence structure and provides a visual representation. The second step of this analysis involves the application of Wavelet Packet Transforms, a recently developed technique from the field of signal processing. A multi-fractal model is built by using wavelet transforms to estimate the Hurst exponent, H. The Hurst exponent is a non-parametric measurement of the dynamism of a system. This procedure is used to evaluate gene-coding events in the DNA sequence of cystic fibrosis mutations. The H exponent is calculated for various mutation sites in this gene. The results of this study indicate the presence of anti-persistent, random-walk, and persistent sub-periods in the sequence. This indicates that the hypothesis of a multi-fractal model of DNA information encoding warrants further consideration. This work examines the model's behavior in both pathological (mutated) and non-pathological (healthy) base pair sequences of the cystic fibrosis gene. These mutations, both natural and synthetic, were introduced by computer manipulation of the original base pair text files. The results show that disease severity and system information dynamics correlate. These results have implications for genetic engineering as well as for mathematical biology. They suggest that there is scope for more multi-fractal models to be developed.
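
The Chaos Game Representation named in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function name `cgr_points`, the assignment of nucleotides to corners of the unit square, and the choice to skip ambiguous symbols are all assumptions made for illustration. Each base moves the current point halfway toward its corner, so sequences that end in the same subsequence land in the same sub-square.

```python
# Minimal Chaos Game Representation (CGR) sketch.
# Assumption: this corner layout is one common convention; the thesis may use another.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Return the 2-D CGR points visited while reading a DNA sequence."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # assumption: skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the corner of this base.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("ACGT"):
        print(point)
```

Plotting the returned points for a long sequence gives the 2-D image on which further analysis, such as the wavelet-based step described in the abstract, could operate.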

The Research Repository @ WVU (West Virginia University)