58,437 research outputs found

Analysis of Sequence Conservation at Nucleotide Resolution

Author: Asthana Saurabh
Roytberg Mikhail
Stamatoyannopoulos John
Sunyaev Shamil
Publication venue: Public Library of Science
Publication date: 01/01/2007

One of the major goals of comparative genomics is to understand the evolutionary history of each nucleotide in the human genome sequence, and the degree to which it is under selective pressure. Ascertainment of selective constraint at nucleotide resolution is particularly important for predicting the functional significance of human genetic variation and for analyzing the sequence substructure of cis-regulatory sequences and other functional elements. Current methods for analysis of sequence conservation are focused on delineation of conserved regions comprising tens or even hundreds of consecutive nucleotides. We therefore developed a novel computational approach designed specifically for scoring evolutionary conservation at individual base-pair resolution. Our approach estimates the rate at which each nucleotide position is evolving, computes the probability of neutrality given this rate estimate, and summarizes the result in a Sequence CONservation Evaluation (SCONE) score. We computed SCONE scores in a continuous fashion across 1% of the human genome for which high-quality sequence information from up to 23 genomes is available. We show that SCONE scores are clearly correlated with the allele frequency of human polymorphisms in both coding and noncoding regions. We find that the majority of noncoding conserved nucleotides lie outside of longer conserved elements predicted by other conservation analyses, and are experiencing ongoing selection in modern humans, as is evident from the allele frequency spectrum of human polymorphism. We also applied SCONE to analyze the distribution of conserved nucleotides within functional regions. These regions are markedly enriched in individually conserved positions and short (<15 bp) conserved “chunks.” Our results collectively suggest that the majority of functionally important noncoding conserved positions are highly fragmented and reside outside of canonically defined long conserved noncoding sequences. A small subset of these fragmented positions may be identified with high confidence.

CiteSeerX

Public Library of Science (PLOS)

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

Author: Du Yushen
Gong Danyang
Jiang Lin
Shu Sara
Sun Ren
Wu Nicholas C
Wu Ting-Ting
Zhang Tianhao
Publication venue: eScholarship, University of California
Publication date: 01/11/2016

Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. Importance: To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available.

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Comprehensive sequence-to-function mapping of cofactor-dependent RNA catalysis in the glmS ribozyme.

Author: Andreasson Johan OL
Block Steven M
Greenleaf William J
Savinov Andrew
Publication venue: eScholarship, University of California
Publication date: 01/04/2020

Massively parallel, quantitative measurements of biomolecular activity across sequence space can greatly expand our understanding of RNA sequence-function relationships. We report the development of an RNA-array assay to perform such measurements and its application to a model RNA: the core glmS ribozyme riboswitch, which performs a ligand-dependent self-cleavage reaction. We measure the cleavage rates for all possible single and double mutants of this ribozyme across a series of ligand concentrations, determining kcat and KM values for active variants. These systematic measurements suggest that evolutionary conservation in the consensus sequence is driven by maintenance of the cleavage rate. Analysis of double-mutant rates and associated mutational interactions produces a structural and functional mapping of the ribozyme sequence, revealing the catalytic consequences of specific tertiary interactions and allowing us to infer structural rearrangements that permit certain sequence variants to maintain activity.
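
To give a sense of what "all possible single and double mutants" implies combinatorially, the following is a minimal Python sketch that enumerates such variants for an RNA sequence. It is an illustration only, not the authors' library design; the function names and the toy sequence are hypothetical, and only substitutions (no insertions or deletions) are assumed.

```python
from itertools import combinations, product
from math import comb

BASES = "ACGU"  # RNA alphabet


def single_mutants(seq):
    """Yield every sequence that differs from seq by one substitution."""
    for i, wild_type in enumerate(seq):
        for base in BASES:
            if base != wild_type:
                yield seq[:i] + base + seq[i + 1:]


def double_mutants(seq):
    """Yield every sequence that differs from seq by substitutions at two positions."""
    for i, j in combinations(range(len(seq)), 2):
        for base_i, base_j in product(BASES, repeat=2):
            if base_i != seq[i] and base_j != seq[j]:
                yield seq[:i] + base_i + seq[i + 1:j] + base_j + seq[j + 1:]


def expected_counts(length):
    """Closed-form counts: 3 substitutions per site, 9 per pair of sites."""
    return 3 * length, 9 * comb(length, 2)


if __name__ == "__main__":
    toy = "GACU"  # hypothetical toy sequence, not the glmS core
    print(len(list(single_mutants(toy))), len(list(double_mutants(toy))))
    print(expected_counts(len(toy)))
```

Because the number of double mutants grows quadratically with sequence length, a generator-based enumeration like this keeps memory use low when the variants are only streamed to a file or counted.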

eScholarship - University of California

A high-resolution map of human evolutionary constraint using 29 mammals.

Author: Alföldi Jessica
Baldwin Jen
Baylor College of Medicine Human Genome Sequencing Center Sequencing Team
Beal Kathryn
Birney Ewan
Bloom Toby
Broad Institute Sequencing Platform and Whole Genome Assembly Team
Chang Jean
Chin Chee Whye
Clamp Michele
Clawson Hiram
Cree Andrew
Cuff James
Delehaunty Kim
Di Palma Federica
Dihn Huyen H
Dooling David
Ernst Jason
Fitzgerald Stephen
Flicek Paul
Fowler Gerald
Fronik Catrina
Fulton Bob
Fulton Lucinda
Garber Manuel
Genome Institute at Washington University
Gibbs Richard A
Gnerre Sante
Goldman Nick
Graves Tina
Green Eric D
Guttman Mitchell
Haussler David
Heiman Dave
Herrero Javier
Holloway Alisha K
Hubisz Melissa J
Jaffe David B
Jhangiani Shalili
Jordan Gregory
Joshi Vandita
Jungreis Irwin
Kellis Manolis
Kent W James
Kheradpour Pouya
Kostka Dennis
Kovar Christie L
Lander Eric S
Lara Marcia
Lee Sandra
Lewis Lora R
Lin Michael F
Lindblad-Toh Kerstin
Lowe Craig B
Mardis Elaine R
Margulies Elliott H
Martins Andre L
Massingham Tim
Mauceli Evan
Minx Patrick
Moltke Ida
Muzny Donna M
Nazareth Lynne V
Nicol Robert
Nusbaum Chad
Okwuonu Geoffrey
Parker Brian J
Pedersen Jakob S
Pollard Katherine S
Raney Brian J
Rasmussen Matthew D
Robinson Jim
Santibanez Jireh
Siepel Adam
Sodergren Erica
Stark Alexander
Vilella Albert J
Ward Lucas D
Warren Wesley C
Washietl Stefan
Weinstock George M
Wen Jiayu
Wilkinson Jane
Wilson Richard K
Worley Kim C
Xie Xiaohui
Young Sarah
Zody Michael C
Zuk Or
Publication venue: eScholarship, University of California
Publication date: 01/10/2011

The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.

eScholarship - University of California

REPARATION : ribosome profiling assisted (re-)annotation of bacterial genomes

Author: Giess Adam
Jonckheere Veronique
Menschaert Gerben
Ndah Elvis
Valen Eivind
Van Damme Petra
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2017

Prokaryotic genome annotation is highly dependent on automated methods, as manual curation cannot keep up with the exponential growth of sequenced genomes. Current automated methods depend heavily on sequence composition and often underestimate the complexity of the proteome. We developed RibosomeE Profiling Assisted (re-)AnnotaTION (REPARATION), a de novo machine learning algorithm that takes advantage of experimental protein synthesis evidence from ribosome profiling (Ribo-seq) to delineate translated open reading frames (ORFs) in bacteria, independent of genome annotation (https://github.com/Biobix/REPARATION). REPARATION evaluates all possible ORFs in the genome and estimates minimum thresholds based on a growth curve model to screen for spurious ORFs. We applied REPARATION to three annotated bacterial species to obtain a more comprehensive mapping of their translation landscape in support of experimental data. In all cases, we identified hundreds of novel (small) ORFs, including variants of previously annotated ORFs, and >70% of all (variants of) annotated protein coding ORFs were predicted by REPARATION to be translated. Our predictions are supported by matching mass spectrometry proteomics data, sequence composition and conservation analysis. REPARATION is unique in that it makes use of experimental translation evidence to intrinsically perform a de novo ORF delineation in bacterial genomes irrespective of the sequence features linked to open reading frames.
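
The step of evaluating all possible ORFs can be pictured with a naive six-frame scan. The Python sketch below is only an illustration under stated assumptions and is not taken from the REPARATION code base: the function names, the start-codon set (ATG, GTG, TTG) and the `min_codons` cutoff are hypothetical choices, and no Ribo-seq evidence or thresholding is modeled.

```python
# Illustrative sketch only: a naive ORF enumerator, not the REPARATION algorithm.
START_CODONS = {"ATG", "GTG", "TTG"}  # assumption: common bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=10):
    """Return (strand, start, end, orf) for every start-to-stop ORF.

    Coordinates are 0-based, end-exclusive, and relative to the scanned
    strand. Every in-frame start codon upstream of a stop is reported,
    so nested variants of the same ORF are all included.
    """
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            open_starts = []  # start codons seen since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if codon in START_CODONS:
                    open_starts.append(i)
                elif codon in STOP_CODONS:
                    for start in open_starts:
                        if (i + 3 - start) // 3 >= min_codons:
                            orfs.append((strand, start, i + 3, s[start:i + 3]))
                    open_starts = []
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTGTGCCCTAAGG", min_codons=2):
        print(orf)
```

Reporting every in-frame start, rather than only the longest ORF per stop codon, mirrors the idea that alternative start sites ("variants" of annotated ORFs) are candidates too; a real pipeline would then have to rank or filter them with experimental evidence.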

Ghent University Academic Bibliography

Conserved substitution patterns around nucleosome footprints in eukaryotes and Archaea derive from frequent nucleosome repositioning through evolution.

Author: Becker Erin
Facciotti Marc
Lehner Ben
Nislow Corey
Warnecke Tobias
Publication venue: eScholarship, University of California
Publication date: 01/01/2013

Nucleosomes, the basic repeat units of eukaryotic chromatin, have been suggested to influence the evolution of eukaryotic genomes, both by altering the propensity of DNA to mutate and by selection acting to maintain or exclude nucleosomes in particular locations. Contrary to the popular idea that nucleosomes are unique to eukaryotes, histone proteins have also been discovered in some archaeal genomes. Archaeal nucleosomes, however, are quite unlike their eukaryotic counterparts in many respects, including their assembly into tetramers (rather than octamers) from histone proteins that lack N- and C-terminal tails. Here, we show that despite these fundamental differences the association between nucleosome footprints and sequence evolution is strikingly conserved between humans and the model archaeon Haloferax volcanii. In light of this finding we examine whether selection or mutation can explain concordant substitution patterns in the two kingdoms. Unexpectedly, we find that neither the mutation nor the selection model is sufficient to explain the observed association between nucleosomes and sequence divergence. Instead, we demonstrate that nucleosome-associated substitution patterns are more consistent with a third model where sequence divergence results in frequent repositioning of nucleosomes during evolution. Indeed, we show that nucleosome repositioning is both necessary and largely sufficient to explain the association between current nucleosome positions and biased substitution patterns. This finding highlights the importance of considering the direction of causality between genetic and epigenetic change.

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Review of genetic markers in aquaculture

Author: Guarniero Ilaria
Publication venue
Publication date: 15/03/2012

Almae Matris Studiorum Campus

A flexible integrative approach based on random forest improves prediction of transcription factor binding sites

Author: Bart Hooghe
Frans van Roy
Pieter De Bleser
Stefan Broos
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2012

Transcription factor binding sites (TFBSs) are DNA sequences of 6-15 base pairs. Interaction of these TFBSs with transcription factors (TFs) is largely responsible for most spatiotemporal gene expression patterns. Here, we evaluate to what extent sequence-based prediction of TFBSs can be improved by taking into account the positional dependencies of nucleotides (NPDs) and the nucleotide sequence-dependent structure of DNA. We make use of the random forest algorithm to flexibly exploit both types of information. Results in this study show that both the structural method and the NPD method can be valuable for the prediction of TFBSs. Moreover, their predictive values seem to be complementary, even to the widely used position weight matrix (PWM) method. This led us to combine all three methods. Results obtained for five eukaryotic TFs with different DNA-binding domains show that our method improves classification accuracy for all five eukaryotic TFs compared with other approaches. Additionally, we contrast the results of seven smaller prokaryotic sets with high-quality data and show that with the use of high-quality data we can significantly improve prediction performance. Models developed in this study can be of great use for gaining insight into the mechanisms of TF binding.
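
For readers unfamiliar with the position weight matrix (PWM) baseline mentioned above, here is a minimal Python sketch of log-odds PWM construction and scanning. It illustrates only the generic PWM technique, not the random forest, NPD or structural models of the paper; the function names, the pseudocount of 1, the uniform background of 0.25 and the toy binding sites are assumptions made for the example.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned binding sites.

    Each column maps a base to log2(frequency / background), with a
    pseudocount so that unseen bases get a finite (negative) score.
    """
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds for a window as long as the PWM."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, seq):
    """Return (score, offset) of the highest-scoring window in seq."""
    width = len(pwm)
    return max((score(pwm, seq[i:i + width]), i) for i in range(len(seq) - width + 1))


if __name__ == "__main__":
    toy_sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]  # hypothetical sites
    pwm = build_pwm(toy_sites)
    print(best_hit(pwm, "GGCGTATAATGCC"))
```

A PWM scores each position independently, which is exactly the limitation that models of positional dependencies between nucleotides aim to address.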

Crossref

Ghent University Academic Bibliography

PubMed Central

TRIP13 is a protein-remodeling AAA+ ATPase that catalyzes MAD2 conformation switching.

Author: Corbett Kevin D
Moeller Arne
Rosenberg Scott C
Speir Jeffrey A
Su Tiffany Y
Ye Qiaozhen
Publication venue: eScholarship, University of California
Publication date: 01/04/2015

The AAA+ family ATPase TRIP13 is a key regulator of meiotic recombination and the spindle assembly checkpoint, acting on signaling proteins of the conserved HORMA domain family. Here we present the structure of the Caenorhabditis elegans TRIP13 ortholog PCH-2, revealing a new family of AAA+ ATPase protein remodelers. PCH-2 possesses a substrate-recognition domain related to those of the protein remodelers NSF and p97, while its overall hexameric architecture and likely structural mechanism bear close similarities to the bacterial protein unfoldase ClpX. We find that TRIP13, aided by the adapter protein p31(comet), converts the HORMA-family spindle checkpoint protein MAD2 from a signaling-active 'closed' conformer to an inactive 'open' conformer. We propose that TRIP13 and p31(comet) collaborate to inactivate the spindle assembly checkpoint through MAD2 conformational conversion and disassembly of mitotic checkpoint complexes. A parallel HORMA protein disassembly activity likely underlies TRIP13's critical regulatory functions in meiotic chromosome structure and recombination.

eScholarship - University of California

The helicase Ded1p controls use of near-cognate translation initiation codons in 5' UTRs.

Author: Bartel David P
Brar Gloria A
Guenther Ulf-Peter
Jankowsky Eckhard
Licatalosi Donny D
Stawicki Brittany N
Tedeschi Frank A
Weinberg David E
Weissman Jonathan S
Zagore Leah L
Zubradt Meghan M
Publication venue: eScholarship, University of California
Publication date: 01/07/2018

The conserved and essential DEAD-box RNA helicase Ded1p from yeast and its mammalian orthologue DDX3 are critical for the initiation of translation. Mutations in DDX3 are linked to tumorigenesis and intellectual disability, and the enzyme is targeted by a range of viruses. How Ded1p and its orthologues engage RNAs during the initiation of translation is unknown. Here we show, by integrating transcriptome-wide analyses of translation, RNA structure and Ded1p-RNA binding, that the effects of Ded1p on the initiation of translation are connected to near-cognate initiation codons in 5' untranslated regions. Ded1p associates with the translation pre-initiation complex at the mRNA entry channel, and repressing the activity of Ded1p leads to the accumulation of RNA structure in 5' untranslated regions, the initiation of translation from near-cognate start codons immediately upstream of these structures and decreased protein synthesis from the corresponding main open reading frames. The data reveal a program for the regulation of translation that links Ded1p, the activation of near-cognate start codons and mRNA structure. This program has a role in meiosis, in which a marked decrease in the levels of Ded1p is accompanied by the activation of the alternative translation initiation sites that are seen when the activity of Ded1p is repressed. Our observations indicate that Ded1p affects translation initiation by controlling the use of near-cognate initiation codons that are proximal to mRNA structure in 5' untranslated regions.

eScholarship - University of California