19 research outputs found

Recommended from our members

Transfer RNA genes experience exceptionally elevated mutation rates.

Author: Corbett-Detig Russell B
Gong Henry
Hough Josh
Lowe Todd M
Roger Jacquelyn M
Thornlow Bryan P
Publication venue: eScholarship, University of California
Publication date: 01/09/2018

Transfer RNAs (tRNAs) are a central component for the biological synthesis of proteins, and they are among the most highly conserved and frequently transcribed genes in all living things. Despite their clear significance for fundamental cellular processes, the forces governing tRNA evolution are poorly understood. We present evidence that transcription-associated mutagenesis and strong purifying selection are key determinants of patterns of sequence variation within and surrounding tRNA genes in humans and diverse model organisms. Remarkably, the mutation rate at broadly expressed cytosolic tRNA loci is likely between 7 and 10 times greater than the nuclear genome average. Furthermore, evolutionary analyses provide strong evidence that tRNA genes, but not their flanking sequences, experience strong purifying selection acting against this elevated mutation rate. We also find a strong correlation between tRNA expression levels and the mutation rates in their immediate flanking regions, suggesting a simple method for estimating individual tRNA gene activity. Collectively, this study illuminates the extreme competing forces in tRNA gene evolution and indicates that mutations at tRNA loci contribute disproportionately to mutational load and have unexplored fitness consequences in human populations

eScholarship - University of California

Stability of SARS-CoV-2 phylogenies.

Author: Borges Rui
Corbett-Detig Russell
De Maio Nicola
Fernandes Jason D
Goldman Nick
Gozashti Landen
Haussler David
Hinrichs Angie S
Lanfear Robert
Slodkowicz Greg
Thornlow Bryan
Turakhia Yatish
Walker Conor R
Weilguny Lukas
Publication venue: PLoS Genet
Publication date: 01/11/2020

Funder: Alfred P. Sloan Foundation; funder-id: http://dx.doi.org/10.13039/100000879
Funder: European Molecular Biology Laboratory (EMBL)

The SARS-CoV-2 pandemic has led to unprecedented, nearly real-time genetic tracing due to the rapid community sequencing response. Researchers immediately leveraged these data to infer the evolutionary relationships among viral samples and to study key biological questions, including whether host viral genome editing and recombination are features of SARS-CoV-2 evolution. This global sequencing effort is inherently decentralized and must rely on data collected by many labs using a wide variety of molecular and bioinformatic techniques. There is thus a strong possibility that systematic errors associated with lab- or protocol-specific practices affect some sequences in the repositories. We find that some recurrent mutations in reported SARS-CoV-2 genome sequences have been observed predominantly or exclusively by single labs, co-localize with commonly used primer binding sites and are more likely to affect the protein-coding sequences than other similarly recurrent mutations. We show that their inclusion can affect phylogenetic inference on scales relevant to local lineage tracing, and make it appear as though there has been an excess of recurrent mutation or recombination among viral lineages. We suggest how samples can be screened and problematic variants removed, and we plan to regularly inform the scientific community with our updated results as more SARS-CoV-2 genome sequences are shared (https://virological.org/t/issues-with-sars-cov-2-sequencing-data/473 and https://virological.org/t/masking-strategies-for-sars-cov-2-alignments/480). We also develop tools for comparing and visualizing differences among very large phylogenies and we show that consistent clade- and tree-based comparisons can be made between phylogenies produced by different groups. These will facilitate evolutionary inferences and comparisons among phylogenies produced for a wide array of purposes. Building on the SARS-CoV-2 Genome Browser at UCSC, we present a toolkit to compare, analyze and combine SARS-CoV-2 phylogenies, find and remove potential sequencing errors and establish a widely shared, stable clade structure for a more accurate scientific inference and discourse.

Directory of Open Access Journals

eScholarship - University of California

The Australian National University

Apollo (Cambridge)

Online phylogenetics using parsimony produces slightly better trees and is dramatically more efficient for large SARS-CoV-2 phylogenies than de novo and maximum-likelihood approaches

Author: Thornlow Bryan
Publication venue: Dryad
Publication date: 01/01/2021

Phylogenetics has been foundational to SARS-CoV-2 research and public health policy, assisting in genomic surveillance, contact tracing, and assessing emergence and spread of new variants. However, phylogenetic analyses of SARS-CoV-2 have often relied on tools designed for de novo phylogenetic inference, in which all data are collected before any analysis is performed and the phylogeny is inferred once from scratch. SARS-CoV-2 datasets do not fit this mould. There are currently over 5 million sequenced SARS-CoV-2 genomes in public databases, with tens of thousands of new genomes added every day. Continuous data collection, combined with the public health relevance of SARS-CoV-2, invites an "online" approach to phylogenetics, in which new samples are added to existing phylogenetic trees every day. The extremely dense sampling of SARS-CoV-2 genomes also invites a comparison between Likelihood and Parsimony approaches to phylogenetic inference. Maximum Likelihood (ML) methods are more accurate when there are multiple changes at a single site on a single branch, but this accuracy comes at a large computational cost, and the dense sampling of SARS-CoV-2 genomes means that these instances will be extremely rare. Therefore, it may be that approaches based on Maximum Parsimony (MP) are sufficiently accurate for reconstructing phylogenies of SARS-CoV-2, and their simplicity means that they can be applied to much larger datasets. Here, we evaluate the performance of de novo and online phylogenetic approaches, and ML and MP frameworks, for inferring large and dense SARS-CoV-2 phylogenies. Overall, we find that online phylogenetics produces similar phylogenetic trees to de novo analyses for SARS-CoV-2, and that MP optimizations produce more accurate SARS-CoV-2 phylogenies than do ML optimizations. 
Since MP is thousands of times faster than presently available implementations of ML and online phylogenetics is faster than de novo, we therefore propose that, in the context of comprehensive genomic epidemiology of SARS-CoV-2, MP online phylogenetics approaches should be favored.

All details for this dataset can be found at https://github.com/bpt26/parsimony. The attached protobuf file is the outcome of the commands described in subrepository 1.

Funding provided by: NHGRI
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000051
Award Number: F31HG010584

All details for data collection and processing are described at https://github.com/bpt26/parsimony. In March 2021, we developed a phylogeny consisting of 364,427 SARS-CoV-2 whole genomes, pruned of long branches and sequences with multiple ambiguous nucleotides. We assessed several phylogenetic inference and optimization methods using this dataset, as described in our manuscript. Here we include all necessary starting materials for running our analyses.
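For readers unfamiliar with the criterion, the following is a minimal sketch of what a maximum-parsimony score measures for a single alignment site, using the classic Fitch small-parsimony pass. It is an illustration only and not the authors' implementation (their tooling is described at the linked repository). The nested-tuple tree encoding, the function name `fitch_score`, and the sample names `s1` to `s4` are assumptions made for this example.

```python
def fitch_score(tree, leaf_states):
    """Minimum number of state changes needed to explain one site on a
    rooted binary tree (Fitch small-parsimony pass).

    tree        -- nested 2-tuples of leaf names, e.g. (("s1", "s2"), ("s3", "s4"))
                   (hypothetical encoding chosen for this sketch)
    leaf_states -- dict mapping each leaf name to its nucleotide at the site
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            # A leaf contributes exactly its observed state.
            return {leaf_states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            # Children agree on at least one state: no change needed here.
            return common
        # Children disagree: one mutation must have occurred below this node.
        changes += 1
        return left | right

    visit(tree)
    return changes


tree = (("s1", "s2"), ("s3", "s4"))
clustered = {"s1": "A", "s2": "A", "s3": "G", "s4": "G"}
scattered = {"s1": "A", "s2": "G", "s3": "A", "s4": "G"}
print(fitch_score(tree, clustered))  # one change explains the site
print(fitch_score(tree, scattered))  # the same tree needs two changes
```

A full parsimony method sums this score over all sites and searches for the tree that minimizes the total. With densely sampled genomes, most branches carry at most one change per site, which is the regime in which the abstract argues parsimony is sufficiently accurate.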

Ezid

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Evolutionary Genomics of Transfer RNA Genes and SARS-CoV-2

Author: Thornlow Bryan
Publication venue: eScholarship, University of California
Publication date: 01/01/2021

Transfer RNAs (tRNAs) are essential components of translation across all domains of life. The importance of this function is reflected in the strength of their conservation at the genome level, as well as their presence in hundreds of copies across each eukaryotic genome. Their strong conservation and high copy number at the genome level, in conjunction with their extensive post-transcriptional modifications and extreme variation in transcriptional activity by locus, make tRNA genes an enticing but as yet understudied model gene family. The requirement of tRNA transcripts in exceptionally large quantities causes tRNA loci to experience among the highest rates of transcription in the genome. Consequently, transcription-associated mutagenesis (TAM) and natural selection leave distinct genomic signatures at highly transcribed tRNA loci, such that tRNA genes are strongly conserved despite elevated mutation rates, and their immediate flanking regions are among the most variable sites in the genome. Here, I characterize the relationship between expression, mutation, and selection at tRNA loci in detail by using population genetics, comparative genomics, epigenetics, and transcriptomic data. I then use these findings to engineer a random-forest model to predict tRNA gene transcriptional activity using only DNA data. In the second half of this dissertation, I use the comparative genomics skills developed in the first part to help develop a novel phylogenetics toolkit. I identify the effects of sequencing errors on large SARS-CoV-2 phylogenies at global and local scales, demonstrate a novel method to quickly add samples to phylogenies, and explore recombination events in SARS-CoV-2 data, finding an excess in the region surrounding the Spike protein. In this dissertation, I use publicly available DNA, RNA, and epigenetic data to develop novel bioinformatic analysis methods. 
Together, the conclusions drawn in this dissertation for both tRNA biology and SARS-CoV-2 answer fundamental evolutionary questions.

eScholarship - University of California

Online phylogenetics with matOptimize for SARS-CoV-2

Author: Thornlow Bryan
Publication venue: Dryad
Publication date: 01/01/2022

Ezid

Transfer RNA genes experience exceptionally elevated mutation rates

Author: Thornlow Bryan P
Publication date: 21/05/2020

Ezid

Estimating the timing of multiple admixture pulses during local ancestry inference

Author: Corbett-Detig Russell
Medina Paloma
Nielsen Rasmus
Thornlow Bryan
Publication venue: 'Genetics Society of America'
Publication date: 01/01/2018

Admixture, the mixing of genetically distinct populations, is increasingly recognized as a fundamental biological process. One major goal of admixture analyses is to estimate the timing of admixture events. Whereas most methods today can only detect the most recent admixture event, here, we present coalescent theory and associated software that can be used to estimate the timing of multiple admixture events in an admixed population. We extensively validate this approach and evaluate the conditions under which it can successfully distinguish one- from two-pulse admixture models. We apply our approach to real and simulated data of Drosophila melanogaster. We find evidence of a single very recent pulse of cosmopolitan ancestry contributing to African populations, as well as evidence for more ancient admixture among genetically differentiated populations in sub-Saharan Africa. These results suggest our method can quantify complex admixture histories involving genetic material introduced by multiple discrete admixture pulses. The new method facilitates the exploration of admixture and its contribution to adaptation, ecological divergence, and speciation.

Copenhagen University Research Information System

eScholarship - University of California

Eukaryotic tRNA sequences present conserved and amino acid-specific structural signatures

Author: Chan Patricia P
Lowe Todd M
Thornlow Bryan
Westhof Eric
Publication venue: eScholarship, University of California
Publication date: 22/04/2022

Metazoan organisms have many tRNA genes responsible for decoding amino acids. The set of all tRNA genes can be grouped in sets of common amino acids and isoacceptor tRNAs that are aminoacylated by corresponding aminoacyl-tRNA synthetases. Analysis of tRNA alignments shows that, despite the high number of tRNA genes, specific tRNA sequence motifs are highly conserved across multicellular eukaryotes. The conservation often extends throughout the isoacceptors and isodecoders with, in some cases, two sets of conserved isodecoders. This study is focused on non-Watson-Crick base pairs in the helical stems, especially GoU pairs. Each of the four helical stems may contain one or more conserved GoU pairs. Some are amino acid specific and could represent identity elements for the cognate aminoacyl tRNA synthetases. Other GoU pairs are found in more than a single amino acid and could be critical for native folding of the tRNAs. Interestingly, some GoU pairs are anticodon-specific, and others are found in phylogenetically-specific clades. Although the distribution of conservation likely reflects a balance between accommodating isotype-specific functions as well as those shared by all tRNAs essential for ribosomal translation, such conservations may indicate the existence of specialized tRNAs for specific translation targets, cellular conditions, or alternative functions

eScholarship - University of California

Estimating the Timing of Multiple Admixture Pulses During Local Ancestry Inference

Author: Bryan Thornlow
Lachance
Paloma Medina
Rasmus Nielsen
Russell Corbett-Detig
Publication venue: 'Genetics Society of America'

Crossref

Transposable elements drive intron gain in diverse eukaryotes

Author: Ares Manuel
Corbett-Detig Russell
Gozashti Landen
Kramer Alexander
Roy Scott W
Thornlow Bryan
Publication venue: eScholarship, University of California
Publication date: 29/11/2022

There is massive variation in intron numbers across eukaryotic genomes, yet the major drivers of intron content during evolution remain elusive. Rapid intron loss and gain in some lineages contrast with long-term evolutionary stasis in others. Episodic intron gain could be explained by recently discovered specialized transposons called Introners, but so far Introners are only known from a handful of species. Here, we performed a systematic search across 3,325 eukaryotic genomes and identified 27,563 Introner-derived introns in 175 genomes (5.2%). Species with Introners span remarkable phylogenetic diversity, from animals to basal protists, representing lineages whose last common ancestor dates to over 1.7 billion years ago. Aquatic organisms were 6.5 times more likely to contain Introners than terrestrial organisms. Introners exhibit mechanistic diversity but most are consistent with DNA transposition, indicating that Introners have evolved convergently hundreds of times from nonautonomous transposable elements. Transposable elements and aquatic taxa are associated with high rates of horizontal gene transfer, suggesting that this combination of factors may explain the punctuated and biased diversity of species containing Introners. More generally, our data suggest that Introners may explain the episodic nature of intron gain across the eukaryotic tree of life. These results illuminate the major source of ongoing intron creation in eukaryotic genomes

eScholarship - University of California