107 research outputs found

A two-sample Bayesian t-test for microarray data

Author: Dimmic Matthew W
Fox Richard J
Publication venue: BioMed Central
Publication date: 01/03/2006

BACKGROUND: Determining whether a gene is differentially expressed in two different samples remains an important statistical problem. Prior work in this area has featured the use of t-tests with pooled estimates of the sample variance based on similarly expressed genes. These methods do not display consistent behavior across the entire range of pooling and can be biased when the prior hyperparameters are specified heuristically. RESULTS: A two-sample Bayesian t-test is proposed for use in determining whether a gene is differentially expressed in two different samples. The test method is an extension of earlier work that made use of point estimates for the variance. The method proposed here explicitly calculates in analytic form the marginal distribution for the difference in the mean expression of two samples, obviating the need for point estimates of the variance without recourse to posterior simulation. The prior distribution involves a single hyperparameter that can be calculated in a statistically rigorous manner, making clear the connection between the prior degrees of freedom and prior variance. CONCLUSION: The test is easy to understand and implement and application to both real and simulated data shows that the method has equal or greater power compared to the previous method and demonstrates consistent Type I error rates. The test is generally applicable outside the microarray field to any situation where prior information about the variance is available and is not limited to cases where estimates of the variance are based on many similar observations

Directory of Open Access Journals

PubMed Central

Telomere-associated endonuclease-deficient Penelope-like retroelements in diverse eukaryotes

Author: Arkhipova
Baird
Blackburn
Danilevskaya
Dimmic
Doolittle
Doulatov
E. A. Gladyshev
Eickbush
Evgen'ev
Fujiwara
Gascuel
I. R. Arkhipova
Jobb
Jurka
Kazazian
Kulpa
Kumar
Lue
Martin
Martinez
Moran
Morrish
Morrish
Nakamura
Rashkova
Schmidt
Welch
Whelan
Zhong
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 27/02/2007

Author Posting. © The Author(s), 2007. This is the author's version of the work. It is posted here by permission of National Academy of Sciences of the USA for personal use, not for redistribution. The definitive version was published in Proceedings of the National Academy of Sciences of the United States of America 104 (2007): 9352-9357, doi:10.1073/pnas.0702741104.

The evolutionary origin of telomerases, enzymes that maintain the ends of linear chromosomes in most eukaryotes, is a subject of debate. Penelope-like elements (PLEs) are a recently described class of eukaryotic retroelements characterized by a GIY-YIG endonuclease domain and by a reverse transcriptase domain with similarity to telomerases and group II introns. Here we report that a subset of PLEs found in bdelloid rotifers, basidiomycete fungi, stramenopiles, and plants, representing four different eukaryotic kingdoms, lack the endonuclease domain and are located at telomeres. The 5' truncated ends of these elements are telomere-oriented and typically capped by species-specific telomeric repeats. Most of them also carry several shorter stretches of telomeric repeats at or near their 3’ ends, which could facilitate utilization of the telomeric G-rich 3’ overhangs to prime reverse transcription. Many of these telomere-associated PLEs occupy a basal phylogenetic position close to the point of divergence from the telomerase-PLE common ancestor, and may descend from the missing link between early eukaryotic retroelements and present-day telomerases.

Financial support from NIH and the U.S. National Science Foundation (MCB-0614142

Crossref

Woods Hole Open Access Server

PubMed Central

Selective Constraints on Amino Acids Estimated by a Mechanistic Codon Substitution Model with Multiple Nucleotide Changes

Author: A Doron-Faigenboim
A Schneider
AL Halpern
AR Kinjo
C Kosiol
Darren Martin
DT Jones
G Bazykin
GC Conant
H Akaike
I Keller
J Adachi
J Adachi
JP Huelsenbeck
K Tamura
L Jin
M Anisimova
M Averof
M Hasegawa
M Kimura
MA Larkin
MO Dayhoff
MW Dimmic
N Goldman
N Rodrigue
N Takahata
NGC Smith
R Grantham
S Guindon
S Miyazawa
S Whelan
S Whelan
S Whelan
Sanzo Miyazawa
SC Choi
SQ Le
SV Muse
T Miyata
T Miyata
TK Seo
TK Seo
W Delport
W Delport
Z Yang
Z Yang
Z Yang
Z Yang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 18/03/2011

Empirical substitution matrices represent the average tendencies of substitutions over various protein families by sacrificing gene-level resolution. We develop a codon-based model, in which mutational tendencies of codon, a genetic code, and the strength of selective constraints against amino acid replacements can be tailored to a given gene. First, selective constraints averaged over proteins are estimated by maximizing the likelihood of each 1-PAM matrix of empirical amino acid (JTT, WAG, and LG) and codon (KHG) substitution matrices. Then, selective constraints specific to given proteins are approximated as a linear function of those estimated from the empirical substitution matrices. Akaike information criterion (AIC) values indicate that a model allowing multiple nucleotide changes fits the empirical substitution matrices significantly better. Also, the ML estimates of transition-transversion bias obtained from these empirical matrices are not so large as previously estimated. The selective constraints are characteristic of proteins rather than species. However, their relative strengths among amino acid pairs can be approximated not to depend very much on protein families but amino acid pairs, because the present model, in which selective constraints are approximated to be a linear function of those estimated from the JTT/WAG/LG/KHG matrices, can provide a good fit to other empirical substitution matrices including cpREV for chloroplast proteins and mtREV for vertebrate mitochondrial proteins. 
The present codon-based model, with the ML estimates of selective constraints and with adjustable nucleotide mutation rates, would be useful as a simple substitution model in ML and Bayesian inferences of molecular phylogenetic trees, and enables us to obtain biologically meaningful information at both nucleotide and amino acid levels from codon and protein sequences. Comment: Table 9 in this article includes corrections for errata in the Table 9 published in 10.1371/journal.pone.0017244. Supporting information is attached at the end of the article, and a computer-readable dataset of the ML estimates of selective constraints is available from 10.1371/journal.pone.001724
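The AIC comparison mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard definition AIC = 2k - 2 ln L; the model names, log-likelihoods, and parameter counts below are hypothetical placeholders chosen for illustration, not values from the paper.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def delta_aic(models):
    """Return each model's AIC difference from the best (lowest) AIC.

    `models` maps a model name to a (log_likelihood, n_params) pair.
    """
    scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}


# Hypothetical fits: the numbers are made up for illustration only.
example_models = {
    "single_nucleotide_changes": (-1050.0, 10),
    "multiple_nucleotide_changes": (-1000.0, 14),
}
example_deltas = delta_aic(example_models)
```

In this made-up case the model with more parameters still has the lower AIC, because its gain in log-likelihood outweighs the penalty of 2 per extra parameter.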

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Genetic diversity of simian lentivirus in wild De Brazza’s monkeys (Cercopithecus neglectus) in Equatorial Africa

Author: A. Ayouba
A. F. Aghokeng
Aghokeng
Bailes
Barlow
Beer
Beer
Bibollet-Ruche
Clewley
Courgnaud
Courgnaud
Courgnaud
Dimmic
E. Delaporte
E. Mpoudi-Ngole
F. Liegeois
Hahn
Huelsenbeck
J.-J. Muyembe
Kimura
Lole
M. Peeters
Ndongmo
P. Mbala
Peeters
S. Ahuka
Souquiere
Thompson
Van Der Kuyl
van der Kuyl
Van Heuverswyn
Yang
Yang
Publication venue: Society for General Microbiology
Publication date: 01/01/2010

De Brazza’s monkeys (Cercopithecus neglectus) are non-human primates (NHP) living in Equatorial Africa from South Cameroon through the Congo Basin to Uganda. Like most NHP living in sub-Saharan Africa, they are naturally infected with their own simian lentivirus, SIVdeb. Previous studies confirmed this infection for De Brazza’s monkeys from East Cameroon and Uganda. In this report, we studied the genetic diversity of SIVdeb in De Brazza’s monkeys from different geographical areas in South Cameroon and from the Democratic Republic of Congo (DRC). SIVdeb strains from east, central and western equatorial Africa form a species-specific monophyletic lineage. Phylogeographic clustering was observed among SIVdeb strains from Cameroon, the DRC and Uganda, but also among primates from distinct areas in Cameroon. These observations suggest a longstanding virus–host co-evolution. SIVdeb prevalence is high in wild De Brazza’s populations and thus represents a current risk for humans exposed to these primates in central Africa

Crossref

PubMed Central

Horizon / Pleins textes

The extraordinary evolutionary history of the reticuloendotheliosis viruses

Author: A Katzourakis
A Katzourakis
A Stamatakis
AC van der Kuyl
AD Yoder
AL Hughes
AM Awad
AM Fadly
Anna Maria Niewiadomska
Bill Sugden
BR Burmester
C Feschotte
C Hertig
CG Ludford
CG Ludford
CN Dren
CN Dren
CN Dren
CR Parrish
CY Kang
CY Lin
DC Nickle
DH Huson
DH Ley
DJ McGeoch
DS Arathy
DW Trampel
E Prukner-Radovcic
EH Dearborn
F Abascal
F Rodriguez
FR Robinson
H Koyama
HC Carlson
HG Purchase
HM Koo
I Davidson
IS Diallo
IS Diallo
J Hanson
J Li
J Martin
JJ Solomon
JP Stoye
JP Vanderberg
JS McDougall
K Nyakatura
KA Schat
KM Moore
LA Terzian
LE Hayes
LJ Yu
LT Coggeshall
M Barbacid
M Garcia
MJ Peterson
MK Cook
ML Drew
MR Patel
MW Dimmic
MX Motha
N Ratnamohan
N Yuasa
P Singh
PE Miller
PS Paul
PS Sarma
Q Liu
R Crespo
R Gifford
R Isfort
RA Weiss
RB Franklin
RB Franklin
RJ Isfort
RL Witter
RL Witter
RL Witter
RM Corwin
Robert J. Gifford
SB Hitchner
SK Biswas
SL Kosakovsky Pond
T Barbosa
T Tadese
TJ Kim
TM Grimes
VN Kewalramani
W Trager
WG Conway
WH Cheng
Y Wang
Z Cheng
Z Cui
Z Yang
É Brumpt
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

The reticuloendotheliosis viruses (REVs) comprise several closely related amphotropic retroviruses isolated from birds. These viruses exhibit several highly unusual characteristics that have not so far been adequately explained, including their extremely close relationship to mammalian retroviruses, and their presence as endogenous sequences within the genomes of certain large DNA viruses. We present evidence for an iatrogenic origin of REVs that accounts for these phenomena. Firstly, we identify endogenous retroviral fossils in mammalian genomes that share a unique recombinant structure with REVs—unequivocally demonstrating that REVs derive directly from mammalian retroviruses. Secondly, through sequencing of archived REV isolates, we confirm that contaminated Plasmodium lophurae stocks have been the source of multiple REV outbreaks in experimentally infected birds. Finally, we show that both phylogenetic and historical evidence support a scenario wherein REVs originated as mammalian retroviruses that were accidentally introduced into avian hosts in the late 1930s, during experimental studies of P. lophurae, and subsequently integrated into the fowlpox virus (FWPV) and gallid herpesvirus type 2 (GHV-2) genomes, generating recombinant DNA viruses that now circulate in wild birds and poultry. Our findings provide a novel perspective on the origin and evolution of REV, and indicate that horizontal gene transfer between virus families can expand the impact of iatrogenic transmission events

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Enlighten

Full-length genome sequence of a simian immunodeficiency virus (SIV) infecting a captive agile mangabey (Cercocebus agilis) is closely related to SIVrcm infecting wild red-capped mangabeys (Cercocebus torquatus) in Cameroon

Author: A. Ayouba
Aghokeng
Bailes
Beer
Bibollet-Ruche
Bibollet-Ruche
Courgnaud
Dimmic
E. Delaporte
E. Nerrienet
F. Liegeois
Georges-Courbot
Hahn
Huelsenbeck
Jin
Kumar
Lole
M. Peeters
S. Ahuka-Mundeke
Santiago
Van Der Kuyl
Van Heuverswyn
Van Heuverswyn
van Rensburg
VandeWoude
Wertheim
Y. Foupouapouognini
Yang
Publication venue: Society for General Microbiology

Simian immunodeficiency viruses (SIVs) are lentiviruses that infect an extensive number of wild African primate species. Here we describe for the first time SIV infection in a captive agile mangabey (Cercocebus agilis) from Cameroon. Phylogenetic analysis of the full-length genome sequence of SIVagi-00CM312 showed that this novel virus fell into the SIVrcm lineage and was most closely related to a newly characterized SIVrcm strain (SIVrcm-02CM8081) from a wild-caught red-capped mangabey (Cercocebus torquatus) from Cameroon. In contrast to red-capped mangabeys, no 24 bp deletion in CCR5 has been observed in the agile mangabey. Further studies on wild agile mangabeys are needed to determine whether agile and red-capped mangabeys are naturally infected with the same SIV lineage, or whether this agile mangabey became infected with an SIVrcm strain in captivity. However, our study shows that agile mangabeys are susceptible to SIV infection

Crossref

PubMed Central

INDELible: A Flexible Simulator of Biological Sequence Evolution

Author: Adachi
Arndt
Benner
Bishop
Blanchette
Cartwright
Chang
Dimmic
Ehrlich
Felsenstein
Galtier
Gaut
Goldman
Goldman
Gu
Gu
Hasegawa
Henikoff
Hillis
Kimura
Kimura
Müller
Nickle
Nielsen
Ogurtsov
Pedersen
Silva
Stoye
Tamura
Tamura
Tavaré
Thorne
Varadarajan
W. Fletcher
Waterston
Whelan
Whelan
Yang
Yang
Yang
Yang
Yang
Yang
Yang
Yang
Yang
Yang
Yang
Z. Yang
Zhang
Publication venue: Oxford University Press

Many methods exist for reconstructing phylogenies from molecular sequence data, but few phylogenies are known and can be used to check their efficacy. Simulation remains the most important approach to testing the accuracy and robustness of phylogenetic inference methods. However, current simulation programs are limited, especially concerning realistic models for simulating insertions and deletions. We implement a portable and flexible application, named INDELible, for generating nucleotide, amino acid and codon sequence data by simulating insertions and deletions (indels) as well as substitutions. Indels are simulated under several models of indel-length distribution. The program implements a rich repertoire of substitution models, including the general unrestricted model and nonstationary nonhomogeneous models of nucleotide substitution, mixture, and partition models that account for heterogeneity among sites, and codon models that allow the nonsynonymous/synonymous substitution rate ratio to vary among sites and branches. With its many unique features, INDELible should be useful for evaluating the performance of many inference methods, including those for multiple sequence alignment, phylogenetic tree inference, and ancestral sequence, or genome reconstruction

Crossref

PubMed Central

A diversity of uncharacterized reverse transcriptases in bacteria

Author: Altschul
Baltimore
Barrangou
Blocker
Boeke
Boissinot
Brouns
Chang
Dawn M. Simon
Dimmic
Doulatov
Durmaz
Eickbush
Eickbush
Eickbush
Forde
Fortier
Frankel
Greider
Haberer
Hall
Herzer
Hua-Van
Ichiyanagi
Inouye
Inouye
Knoop
Kojima
Lambowitz
Lampson
Lampson
Lampson
Levis
Lim
Liu
Lynch
Lynch
Makarova
Malik
Medhekar
Mohr
Nakamura
Nakamura
Odegrip
Pasyukova
Rice
Shub
Simon
Sorek
Stamatakis
Steven Zimmerly
Swofford
Temin
Temin
Thompson
Xiong
Yamanaka
Zhong
Zimmerly
Zimmerly
Publication venue: Oxford University Press
Publication date: 01/01/2008

Retroelements are usually considered to be eukaryotic elements because of the large number and variety in eukaryotic genomes. By comparison, reverse transcriptases (RTs) are rare in bacteria, with only three characterized classes: retrons, group II introns and diversity-generating retroelements (DGRs). Here, we present the results of a bioinformatic survey that aims to define the landscape of RTs across eubacterial, archaeal and phage genomes. We identify and categorize 1021 RTs, of which the majority are group II introns (73%). Surprisingly, a plethora of novel RTs are found that do not belong to characterized classes. The RTs have 11 domain architectures and are classified into 20 groupings based on sequence similarity, phylogenetic analyses and open reading frame domain structures. Interestingly, group II introns are the only bacterial RTs to exhibit clear evidence for independent mobility, while five other groups have putative functions in defense against phage infection or promotion of phage infection. These examples suggest that additional beneficial functions will be discovered among uncharacterized RTs. The study lays the groundwork for experimental characterization of these highly diverse sequences and has implications for the evolution of retroelements

CiteSeerX

Crossref

PubMed Central

An Endogenous Foamy-like Viral Element in the Coelacanth Genome

Author: A Katzourakis
A Katzourakis
A Marchler-Bauer
AJ Drummond
C Gilbert
C Gilbert
CD Meiering
CT Amemiya
D Posada
DJ Griffiths
EC Holmes
F Ronquist
F Sievers
FH Leendertz
G Han
G Talavera
Guan-Zhu Han
GZ Han
H Brinkmann
J Thézé
JP Noonan
M Linial
M Nikaido
Michael Emerman
Michael Worobey
MR Patel
MW Dimmic
N Takezaki
ND Wolfe
R Zardoya
RC Edgar
RJ Gifford
RW Meredith
S Kumar
S Kumar
SM Murray
W Heneine
WE Johnson
WM Switzer
WM Switzer
Y Shan
Z Johanson
Publication venue: Public Library of Science
Publication date: 28/06/2012

Little is known about the origin and long-term evolutionary mode of retroviruses. Retroviruses can integrate into their hosts' genomes, providing a molecular fossil record for studying their deep history. Here we report the discovery of an endogenous foamy virus-like element, which we designate ‘coelacanth endogenous foamy-like virus’ (CoeEFV), within the genome of the coelacanth (Latimeria chalumnae). Phylogenetic analyses place CoeEFV basal to all known foamy viruses, strongly suggesting an ancient ocean origin of this major retroviral lineage, which had previously been known to infect only land mammals. The discovery of CoeEFV reveals the presence of foamy-like viruses in species outside the Mammalia. We show that foamy-like viruses have likely codiverged with their vertebrate hosts for more than 407 million years and underwent an evolutionary transition from water to land with their vertebrate hosts. These findings suggest an ancient marine origin of retroviruses and have important implications in understanding foamy virus biology

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Non-Negative Matrix Factorization for Learning Alignment-Specific Models of Protein Evolution

Author: Ben Murrell
C Kosiol
D Posada
D Posada
D Robinson
Daniel Kaliski
DC Nickle
DD Lee
DJ Lipman
DT Jones
F Abascal
Gerdus Benade
J Adachi
J Felsenstein
J Felsenstein
Jan Buys
K Devarajan
Konrad Scheffler
KP Burnham
KP Burnham
L Stanfel
Lise du Buisson
MO Dayhoff
MO Dayhoff
MW Dimmic
N Goldman
N Lartillot
Robert Ketteringham
S Whelan
S Whelan
S Zoller
SA Guindon
Sasha Moola
SL Kosakovsky Pond
SL Kosakovsky Pond
SQ Le
SQ Le
Thomas Mailund
Thomas Weighill
Tristan Hands
W Delport
Y Cao
Z Yang
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2011

Models of protein evolution currently come in two flavors: generalist and specialist. Generalist models (e.g. PAM, JTT, WAG) adopt a one-size-fits-all approach, where a single model is estimated from a number of different protein alignments. Specialist models (e.g. mtREV, rtREV, HIVbetween) can be estimated when a large quantity of data are available for a single organism or gene, and are intended for use on that organism or gene only. Unsurprisingly, specialist models outperform generalist models, but in most instances there simply are not enough data available to estimate them. We propose a method for estimating alignment-specific models of protein evolution in which the complexity of the model is adapted to suit the richness of the data. Our method uses non-negative matrix factorization (NNMF) to learn a set of basis matrices from a general dataset containing a large number of alignments of different proteins, thus capturing the dimensions of important variation. It then learns a set of weights that are specific to the organism or gene of interest and for which only a smaller dataset is available. Thus the alignment-specific model is obtained as a weighted sum of the basis matrices. Having been constrained to vary along only as many dimensions as the data justify, the model has far fewer parameters than would be required to estimate a specialist model. We show that our NNMF procedure produces models that outperform existing methods on all but one of 50 test alignments. The basis matrices we obtain confirm the expectation that amino acid properties tend to be conserved, and allow us to quantify, on specific alignments, how the strength of conservation varies across different properties. We also apply our new models to phylogeny inference and show that the resulting phylogenies are different from, and have improved likelihood over, those inferred under standard models
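The idea of an alignment-specific model as a non-negative weighted sum of basis matrices can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the basis matrices are already learned and flattened into vectors, it fits only the weights, and it uses a least-squares objective with the classic multiplicative update, whereas the paper may optimize a different criterion. The function name `fit_weights` and the toy data are hypothetical.

```python
def fit_weights(basis, target, iterations=500, eps=1e-12):
    """Fit non-negative weights w so that sum_j w[j] * basis[j] ~ target.

    `basis` is a list of flattened non-negative basis matrices and `target`
    is a flattened non-negative matrix of the same length. Uses the
    multiplicative update w <- w * (B^T v) / (B^T B w), which keeps every
    weight non-negative when the inputs are non-negative.
    """
    k = len(basis)
    weights = [1.0] * k
    for _ in range(iterations):
        # Current reconstruction from the weights at the start of this pass.
        approx = [
            sum(weights[j] * basis[j][i] for j in range(k))
            for i in range(len(target))
        ]
        for j in range(k):
            numer = sum(b * t for b, t in zip(basis[j], target))
            denom = sum(b * a for b, a in zip(basis[j], approx))
            weights[j] *= numer / (denom + eps)
    return weights


# Toy data: the target is exactly 2 * basis[0] + 3 * basis[1].
toy_basis = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
toy_target = [2.0, 5.0, 3.0]
toy_weights = fit_weights(toy_basis, toy_target)
```

Because only as many weights as there are basis matrices are fitted, the number of free parameters stays small even when each basis matrix is large, which is the point the abstract makes about limited data.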

Public Library of Science (PLOS)

Cape Town University OpenUCT

Crossref

Directory of Open Access Journals

PubMed Central

Stellenbosch University SUNScholar Repository