Search CORE

22 research outputs found

Benchmarking multi-rate codon models

Author: Delport Wayne
Gravenor Mike B.
Muse Spencer V.
Pond Sergei Kosakovsky
Scheffler Konrad
Publication venue: Public Library of Science (PLoS)
Publication date: 21/07/2010

CITATION: Delport, W. et al. 2010. Benchmarking multi-rate codon models. PLoS ONE, 5(7): e11587, doi:10.1371/journal.pone.0011587.
The original publication is available at http://journals.plos.org/plosone

The single rate codon model of non-synonymous substitution is ubiquitous in phylogenetic modeling. Indeed, the use of a non-synonymous to synonymous substitution rate ratio parameter has facilitated the interpretation of selection pressure on genomes. Although the single rate model has achieved wide acceptance, we argue that the assumption of a single rate of non-synonymous substitution is biologically unreasonable, given observed differences in substitution rates evident from empirical amino acid models. Some have attempted to incorporate amino acid substitution biases into models of codon evolution and have shown improved model performance versus the single rate model. Here, we show that the single rate model of non-synonymous substitution is easily outperformed by a model with multiple non-synonymous rate classes, yet in which amino acid substitution pairs are assigned randomly to these classes. We argue that, since the single rate model is so easy to improve upon, new codon models should not be validated entirely on the basis of improved model fit over this model. Rather, we should strive to both improve on the single rate model and to approximate the general time-reversible model of codon substitution, with as few parameters as possible, so as to reduce model over-fitting. We hint at how this can be achieved with a Genetic Algorithm approach in which rate classes are assigned on the basis of sequence information content. © 2010 Delport et al.

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0011587
Publisher's version

Stellenbosch University SUNScholar Repository

Correcting the Bias of Empirical Frequency Parameter Estimators in Codon Models

Author: C Kosiol
C Seoighe
G Schwarz
GC Conant
Konrad Scheffler
M Anisimova
M Lacerda
N Goldman
S Whelan
Sergei Kosakovsky Pond
SL Kosakovsky Pond
Spencer V. Muse
SV Muse
Thomas Mailund
W Delport
Wayne Delport
Publication venue: Public Library of Science
Publication date: 01/01/2010

Markov models of codon substitution are powerful inferential tools for studying biological processes such as natural selection and preferences in amino acid substitution. The equilibrium character distributions of these models are almost always estimated using nucleotide frequencies observed in a sequence alignment, primarily as a matter of historical convention. In this note, we demonstrate that a popular class of such estimators are biased, and that this bias has an adverse effect on goodness of fit and estimates of substitution rates. We propose a “corrected” empirical estimator that begins with observed nucleotide counts, but accounts for the nucleotide composition of stop codons. We show via simulation that the corrected estimates outperform the de facto standard estimates not just by providing better estimates of the frequencies themselves, but also by leading to improved estimation of other parameters in the evolutionary models. On a curated collection of sequence alignments, our estimators show a significant improvement in goodness of fit compared to the approach. Maximum likelihood estimation of the frequency parameters appears to be warranted in many cases, albeit at a greater computational cost. Our results demonstrate that there is little justification, either statistical or computational, for continued use of the -style estimators
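The source of the bias described above can be illustrated with a small sketch. It assumes a product-of-positions estimator (the kind often called F3×4 in the codon-model literature), in which each codon frequency is the product of three position-specific nucleotide frequencies, renormalized after stop codons are excluded. The function names and the uniform input are hypothetical, the standard genetic code is assumed, and this is not the corrected estimator proposed in the abstract; it only shows that the nucleotide frequencies implied by the resulting codon distribution no longer match the ones that were fed in.

```python
from itertools import product

NUCS = "ACGT"
STOPS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code (assumption)


def product_codon_freqs(pos_freqs):
    """Sense-codon frequencies as renormalized products of position-specific
    nucleotide frequencies. pos_freqs is a list of three dicts (one per codon
    position) mapping nucleotide -> frequency."""
    raw = {}
    for codon in map("".join, product(NUCS, repeat=3)):
        if codon in STOPS:
            continue  # stop codons are excluded from the state space
        raw[codon] = (pos_freqs[0][codon[0]]
                      * pos_freqs[1][codon[1]]
                      * pos_freqs[2][codon[2]])
    total = sum(raw.values())
    return {codon: f / total for codon, f in raw.items()}


def implied_position_freqs(codon_freqs):
    """Marginal nucleotide frequencies at each codon position implied by a
    codon frequency distribution."""
    out = [dict.fromkeys(NUCS, 0.0) for _ in range(3)]
    for codon, f in codon_freqs.items():
        for pos, nuc in enumerate(codon):
            out[pos][nuc] += f
    return out


# Hypothetical input: uniform nucleotide frequencies at all three positions.
uniform = [dict.fromkeys(NUCS, 0.25) for _ in range(3)]
codon_freqs = product_codon_freqs(uniform)
implied = implied_position_freqs(codon_freqs)

# All three stop codons start with T, so T at the first position is
# under-represented relative to the 0.25 that was supplied.
print(round(implied[0]["T"], 4), "vs input", uniform[0]["T"])
```

With uniform inputs, each of the 61 sense codons receives frequency 1/61, so T at the first position ends up at 13/61 rather than 0.25. A corrected estimator has to choose nucleotide parameters so that the implied marginals, not the raw inputs, match the observed counts.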

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Stellenbosch University SUNScholar Repository

CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences

Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged. Recent developments have shown that models which allow for amino acid pairs to have independent rates of substitution offer improved fit over single rate models. However, these approaches have been limited by the necessity for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into rate classes, dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of rate classes, where is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene and organism specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Cronfa at Swansea University

Stellenbosch University SUNScholar Repository

Swept Under the Rug? A Historiography of Gender and Black Colleges

Author: Alexander William T
Allen-Castellitto Anita
Anderson Eric
Anderson K
Anderson Margaret L
Aptheker Bettina
Avery Vida
Baham Eva
Barry Keith
Beamon Harry
bell hooks
Bell-Scott Patricia
Bethel Leonard
Blackwell Barbara
Bond Horace Mann
Bond Jean Carey
Bowles Frank
Bracey John
Brazzell Johnetta Cross
Brown Linda Beatrice
Brown M Christopher
Browning Jane E Smith
Butchart Ronald E
Cade Toni
Campbell Clarice
Carroll Constance
Cash Gail
Chancer Lynn
Chapel Cynthia
Chivers Walter
Clary George
Cole Johnnetta Betsch
Collier-Thomas Bettye
Collins Alicia C
Collins Patricia Hill
Comminey Shawn
Crawford Vicki
Cuthbert Marian
Davis Angela Y
Davis Lenwood
Davis Leroy
Dill Bonnie Thorton
Drewry Henry
Du Bois WEB
Dyson Walter
Edwards Ishmell
Engs Robert
Evans Stephanie Y
Evans-Herring Cassandra
Fancher Evelyn
Favors Jelani
Foster Terry
Fox-Genovese Elizabeth
Francis Valera T
Frazier E Franklin
Freeman Kassie
Gallot Mildred
Garrett R Thomas
Gasman Marybeth
Gibson De Lois
Giddings Paula
Gilkes Cheryl Townsend
Gilpin Patrick J
Glenn Evelyn Nakano
Graham Frances D
Gurin Patricia
Guy-Sheftal Beverly
Hansen Joyce A
Harley Sharon
Harper Shaun R
Hill ST
Hine Darlene Clark
Holmes Dwight OW
Hunter Wilma
Hutcheson Philo A
Ihle Elizabeth L
Irving John
Isaacs Barbara
Jabs Albert
Jackson Gretchen
Jaffe Abram J
James Joy
Jenkins Clara
Johnson Alandus
Johnson Karen Ann
Jones Charisse
Kates Susan
Keller FR
Kittrell Flemmie
Klein Arthur L
Kobena Korang-Arthur
Lefhever Harry G
Lemert C
Leone Janice
Liberti Rita
Linsin C
Lockwood Nadine
Logan Rayford
Marybeth Gasman
Matthews Lamoyne
McCluskey Audrey T
McGinnis Frederick
McGrath Earl J
McJamerson Jimmy
McKinney Theophilus
McKinny Richard I
McMillian Joseph
Morton Patricia
Moses Yolanda
Moynihan Daniel Patrick
Muse Clifford
Myers Lena Wright
National Association of College Women
Nidiffer Jana
Noble Jeanne L
O’Brien G
Palmieri Patricia
Parker Marjorie
Parsons M
Perkins Linda M
Player Willa
Poole H Randall
Ramsey Berkley
Read Florence
Richardson Frederick
Roane Florence
Robinson William
Romelle Charlestine
Rosenblum Karen
Rothman Norman
Rovaris Dereck
Sanders Katrina
Seymour Elaine
Slowe Lucy Diggs
Smith Valerie
Solomon Barbara
Spencer Chonita Robinson
Sterling Dorothy
Stringer Patricia S
Sutton Virginia Ann
Taliaferro Cecil
Thompson Daniel C
Thompson Kathleen
Thurgood Marshall Fund
Thurman Frances
Vanlandingham Karen
Venkatesan Madhavi
Wadelington Charles Weldon
Wallace Michelle
Warren-Christian Christiane
Washington Elsie
Watkins William
Watson Yolanda
Watson-Moore Yolanda
Williams Juan
Williams Zachery
Willie Charles
Willie Charles V
Wimbish Jerrold
Wolters Raymond
Woodson Carter G
Woody Thomas
Wright Earl
Yellin Jean Fagan
Young Jacqueline
Publication venue: American Educational Research Association (AERA)

Crossref

Equiprobable discrete models of site-specific substitution rates underestimate the extent of rate variability.

Author: Frank Mannino
Sadie Wisotsky
Sergei L Kosakovsky Pond
Spencer V Muse
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2020

It is standard practice to model site-to-site variability of substitution rates by discretizing a continuous distribution into a small number, K, of equiprobable rate categories. We demonstrate that the variance of this discretized distribution has an upper bound determined solely by the choice of K and the mean of the distribution. This bound can introduce biases into statistical inference, especially when estimating parameters governing site-to-site variability of substitution rates. Applications to two large collections of sequence alignments demonstrate that this upper bound is often reached in analyses of real data. When parameter estimation is of primary interest, additional rate categories or more flexible modeling methods should be considered
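The existence of such a bound can be checked with a short sketch. It assumes only that the K category rates are nonnegative and equally weighted; under that assumption the variance is largest when K − 1 rates are zero and one rate equals K times the mean, which gives (K − 1) × mean². The abstract does not state the bound's exact form, so this expression and the function names below are a derivation under the stated assumption, not a quotation from the paper.

```python
import random


def variance_bound(k, mean=1.0):
    """Largest variance attainable by k equiprobable, nonnegative rates
    with the given mean: (k - 1) * mean**2."""
    return (k - 1) * mean ** 2


def category_variance(rates):
    """Variance of a discrete distribution that puts weight 1/k on each rate."""
    k = len(rates)
    mean = sum(rates) / k
    return sum((r - mean) ** 2 for r in rates) / k


def random_rates(k, mean=1.0, rng=random):
    """Hypothetical helper: k random nonnegative rates rescaled to the given mean."""
    raw = [rng.random() for _ in range(k)]
    scale = mean * k / sum(raw)
    return [r * scale for r in raw]


# The extreme configuration attains the bound for K = 4 and mean 1.
print(category_variance([0.0, 0.0, 0.0, 4.0]), variance_bound(4))
```

For K = 4 categories and a mean rate of 1, no choice of rates can produce a variance above 3, however variable the underlying continuous distribution is. Raising K loosens the bound, which is consistent with the abstract's recommendation to consider additional rate categories.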

Directory of Open Access Journals

HyPhy: hypothesis testing using phylogenies

Author: Sergei L. Kosakovsky Pond
Simon D. W. Frost
Spencer V. Muse

Summary: The HyPhy package is designed to provide a flexible and unified platform for carrying out likelihood-based analyses on multiple alignments of molecular sequence data, with the emphasis on studies of rates and patterns of sequence evolution.

CiteSeerX

Genome Architecture Drives Protein Evolution in Ciliates

Author: Casey L. McGrath
Laura A. Katz
Rebecca A. Zufall
Spencer V. Muse
Publication venue: Oxford University Press (OUP)

Crossref

Comparison of frequency parameterizations fitted to simulated alignments.

Author: Konrad Scheffler (49776)
Sergei Kosakovsky Pond (29193)
Spencer V. Muse (243400)
Wayne Delport (243393)

The top row (A,B) shows the comparison of scores on simulated data obtained with different corrected frequency estimates; C) Bias in the estimate of the substitution rate in near-asymptotic regime () is apparent under , but does not exist for the other two estimators; D) variance of the estimate for is reduced with increasing sample size.

FigShare