Computational prediction of transcription-factor binding site locations
Identifying the genomic locations of transcription-factor binding sites, particularly in higher eukaryotic genomes, has been an enormous challenge. Various experimental and computational approaches have been used to detect these sites; methods involving computational comparisons of related genomes have been particularly successful.
Phylogenetic reconstruction of Phalaenopsis (Orchidaceae) using nuclear and chloroplast DNA sequence data and using Phalaenopsis as a natural system for assessing methods to reconstruct hybrid evolution in phylogenetic analyses
Two phylogenies of Phalaenopsis (Orchidaceae) are presented, one from combined chloroplast DNA data and one from a nuclear actin gene. We used these phylogenies to assess and modify the classification of Phalaenopsis and to examine several morphological characters and geographical distribution patterns. Our results support Christenson’s (2001) treatment of Phalaenopsis as a broadly defined genus that includes the species previously placed in the genera Doritis and Kingidium. Some of Christenson’s subgeneric groups needed to be recircumscribed to reflect a natural classification. We recognized four subgenera and six sections: subgenera Aphyllae, Parishianae (with sections Conspicuum, Delisiosae, Esmeralda, and Parishianae), Phalaenopsis, and Polychilos (with sections Fuscatae and Polychilos). To find a set of universally amplifiable, phylogenetically informative, single-copy nuclear regions, we conducted a whole-genome comparison of the rice (Oryza sativa) and Arabidopsis thaliana genomes. We constructed a database of both genomes and searched for pairs of sequences using criteria we felt would yield primers that amplify reliably under standard PCR protocols. We tested the most promising 142 primer pairs in the lab on eighteen taxa and found four potentially informative markers in Phalaenopsis and one in Helianthus. Our results indicated that it will be difficult to find universal nuclear markers; however, our database provides an important tool for finding informative nuclear markers within specific groups. The full set of primer combinations is available online at The Conserved Primer Pair Project, http://aug.csres.utexas.edu:8080/cpp/index.html. We used fourteen Phalaenopsis species and seven horticultural hybrids to create a real dataset with which to test phylogenetic network reconstruction methods. We tested the performance of Neighbor-Net, implemented in SplitsTree, under four categories of complexity: one hybrid, two independent hybrids (hybrids with no parents in common), three independent hybrids, and two non-independent hybrids (one parent shared between the hybrids). Neighbor-Net accurately predicted the parents of hybrids in only about half of the datasets we tested, and it produced so many false positives that the hybrids could not be distinguished from the species. We plan to use this dataset to test methods, such as RIATA and RGNet, when they become available.
A survey of DNA motif finding algorithms
Background: Unraveling the mechanisms that regulate gene expression is a major challenge in biology. An important task in this challenge is to identify regulatory elements, especially the binding sites in deoxyribonucleic acid (DNA) for transcription factors. These binding sites are short DNA segments that are called motifs. Recent advances in genome sequence availability and in high-throughput gene expression analysis technologies have allowed for the development of computational methods for motif finding. As a result, a large number of motif finding algorithms have been implemented and applied to various motif models over the past decade. This survey reviews the latest developments in DNA motif finding algorithms. Results: Earlier algorithms use promoter sequences of coregulated genes from a single genome and search for statistically overrepresented motifs. Recent algorithms are designed to use phylogenetic footprinting or orthologous sequences, or an integrated approach that combines promoter sequences of coregulated genes with phylogenetic footprinting. All the algorithms studied have been reported to correctly detect motifs that had previously been detected by laboratory experimental approaches, and some algorithms were able to find novel motifs. However, most of these motif finding algorithms have been shown to work successfully in yeast and other lower organisms but perform significantly worse in higher organisms. Conclusion: Despite considerable efforts to date, DNA motif finding remains a complex challenge for biologists and computer scientists. Researchers have taken many different approaches in developing motif discovery tools, and the progress made in this area of research is very encouraging. Comparing the performance of different motif finding tools and identifying the best ones has proven difficult, because the tools are built on diverse and complex algorithms and motif models, and because our incomplete understanding of the biology of regulatory mechanisms does not always allow an adequate evaluation of the underlying algorithms over motif models.
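The earliest class of algorithms mentioned in this abstract, searching the promoters of coregulated genes for statistically overrepresented motifs, can be illustrated with a minimal Python sketch. It is not taken from the survey: it simply ranks k-mers by the ratio of their frequency in a foreground set to their frequency in a background set, with a pseudocount. The function names, the toy sequences, and the planted site `TGACTC` are all made up for illustration, and the sketch ignores the reverse strand, degenerate positions, and significance testing that real tools address.

```python
from collections import Counter


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of DNA strings."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def enriched_kmers(foreground, background, k=6, pseudocount=1.0):
    """Rank foreground k-mers by foreground/background frequency ratio.

    Hypothetical helper: a crude enrichment score, not a statistical test.
    """
    fg = kmer_counts(foreground, k)
    bg = kmer_counts(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    n_possible = 4 ** k  # number of distinct DNA k-mers, used for smoothing
    scores = {}
    for kmer, c in fg.items():
        fg_freq = (c + pseudocount) / (fg_total + pseudocount * n_possible)
        # Counter returns 0 for k-mers never seen in the background
        bg_freq = (bg[kmer] + pseudocount) / (bg_total + pseudocount * n_possible)
        scores[kmer] = fg_freq / bg_freq
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (made up): each foreground "promoter" carries the site TGACTC.
foreground = ["AATGACTCGG", "CCTGACTCAA", "GTTGACTCTT"]
background = ["ACGTACGTAC", "GGCCTTAAGG", "TTAACCGGTA"]
top = enriched_kmers(foreground, background, k=6)
print(top[0][0])  # the highest-scoring 6-mer
```

On this toy input the planted 6-mer is the only one present in all three foreground sequences, so it ranks first. On real promoters, a ratio of raw frequencies is far too noisy to use alone, which is one reason the surveyed tools rely on explicit motif models and background statistics.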
Evaluation of phylogenetic footprint discovery for predicting bacterial cis-regulatory elements and revealing their evolution
The detection of conserved motifs in promoters of orthologous genes (phylogenetic footprints) has become a common strategy to predict cis-acting regulatory elements. Several software tools are routinely used to raise hypotheses about regulation. However, these tools are generally used as black boxes, with default parameters. A systematic evaluation of optimal parameters for a footprint discovery strategy can bring a sizeable improvement to the predictions.
Evolutionary Approaches to the Study of Small Noncoding Regulatory RNA Pathways: A Dissertation
Short noncoding RNAs play roles in regulating nearly every biological process, in nearly every organism, yet the exact function and importance of these molecules remain a subject of some debate. In order to gain a better understanding of the contexts in which these regulators have evolved, I have undertaken a variety of approaches to study the evolutionary history of the components that make up these pathways, in the form of two main research efforts. In the first chapter, I have used a combination of population genetics and molecular evolution techniques to show that proteins involved in the piRNA pathway are rapidly evolving, and that different components of the pathway seem to be evolving rapidly on different timescales. These rapidly evolving piRNA pathway proteins can be loosely separated into two groups. The first group appears to evolve quickly at the species level, perhaps in response to transposons that invade across species lines, while the second group appears to evolve quickly at the level of individual populations, perhaps in response to transposons that are paternally present yet novel to the maternal genome. In the second chapter of my research, I have used molecular evolution techniques and carefully devised controls to show that the binding sites of well-conserved miRNAs are among the most slowly changing short motifs in the genome, consistent with a conserved function for these short RNAs in regulatory pathways that are ancient and extremely slow to change. I have additionally discovered a major flaw in an existing approach to motif turnover calculations, which may lead to systematic biases in the published literature toward the false inference of increased regulatory complexity over time. I have implemented a revised approach to motif turnover that addresses this flaw.
Computational Methods for the Pharmacogenetic Interpretation of Next Generation Sequencing Data
Up to half of all patients do not respond to pharmacological treatment as intended. A substantial fraction of these inter-individual differences is due to heritable factors, and a growing number of associations between genetic variations and drug response phenotypes have been identified. Importantly, the rapid progress in Next Generation Sequencing technologies in recent years has unveiled the true complexity of the genetic landscape in pharmacogenes, with tens of thousands of rare genetic variants. As each individual was found to harbor numerous such rare variants, they are anticipated to be important contributors to the genetically encoded inter-individual variability in drug effects. The fundamental challenge, however, is their functional interpretation, because the sheer scale of the problem currently renders systematic experimental characterization of these variants unfeasible. Here, we review concepts and important progress in the development of computational prediction methods that make it possible to evaluate the effect of amino acid sequence alterations in drug metabolizing enzymes and transporters. In addition, we discuss recent advances in the interpretation of functional effects of non-coding variants, such as variations in splice sites, regulatory regions and miRNA binding sites. We anticipate that these methodologies will provide a useful toolkit to facilitate the integration of the vast extent of rare genetic variability into drug response predictions in a precision medicine framework.