67 research outputs found
Selective Constraints on Amino Acids Estimated by a Mechanistic Codon Substitution Model with Multiple Nucleotide Changes
Empirical substitution matrices represent the average tendencies of
substitutions over various protein families by sacrificing gene-level
resolution. We develop a codon-based model in which the mutational tendencies
of codons, the genetic code, and the strength of selective constraints against
amino acid replacements can be tailored to a given gene. First, selective constraints
averaged over proteins are estimated by maximizing the likelihood of each 1-PAM
matrix of empirical amino acid (JTT, WAG, and LG) and codon (KHG) substitution
matrices. Then, selective constraints specific to given proteins are
approximated as a linear function of those estimated from the empirical
substitution matrices.
Akaike information criterion (AIC) values indicate that a model allowing
multiple nucleotide changes fits the empirical substitution matrices
significantly better. Also, the ML estimates of transition-transversion bias
obtained from these empirical matrices are not as large as previously
estimated. The selective constraints are characteristic of proteins rather than
species. However, their relative strengths among amino acid pairs can be
approximated as depending mainly on the amino acid pair rather than on the
protein family,
because the present model, in which selective constraints are approximated to
be a linear function of those estimated from the JTT/WAG/LG/KHG matrices, can
provide a good fit to other empirical substitution matrices including cpREV for
chloroplast proteins and mtREV for vertebrate mitochondrial proteins.
The present codon-based model with the ML estimates of selective constraints
and with adjustable nucleotide mutation rates would be useful as a simple
substitution model in ML and Bayesian inferences of molecular phylogenetic
trees, and enables us to obtain biologically meaningful information at both
nucleotide and amino acid levels from codon and protein sequences.
Comment: Table 9 in this article includes corrections for errata in the Table
9 published in 10.1371/journal.pone.0017244. Supporting information is
attached at the end of the article, and a computer-readable dataset of the ML
estimates of selective constraints is available from
10.1371/journal.pone.001724
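The AIC comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard definition AIC = 2k - 2 ln L; the function name `aic` and all numbers below are hypothetical and are not taken from the article.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion; lower values indicate a better fit/complexity trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical log-likelihoods and parameter counts (illustration only).
single_change = aic(log_likelihood=-1050.0, n_params=10)  # single nucleotide changes only
multi_change = aic(log_likelihood=-1020.0, n_params=12)   # multiple nucleotide changes allowed

# A positive difference means the richer model is preferred despite its extra parameters.
delta = single_change - multi_change
```

The two extra parameters cost 4 AIC units here, so the richer model is preferred only if it gains more than 2 units of log-likelihood.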
Comparative structural bioinformatics analysis of Bacillus amyloliquefaciens chemotaxis proteins within Bacillus subtilis group
Chemotaxis is a process in which bacteria sense their chemical environment and move towards more favorable conditions. Since plant colonization by bacteria is a multifaceted process which requires a response to the complex chemical environment, a finely tuned and sensitive chemotaxis system is needed. Members of the Bacillus subtilis group including Bacillus amyloliquefaciens are industrially important, for example, as bio-pesticides. The group exhibits plant growth-promoting characteristics, with different specificity towards certain host plants. Therefore, we hypothesize that while the principal molecular mechanisms of bacterial chemotaxis may be conserved, the bacterial chemotaxis system may need an evolutionary tweaking to adapt it to specific requirements, particularly in the process of evolution of free-living soil organisms, towards plant colonization behaviour. To date, almost nothing is known about what parts of the chemotaxis proteins are subjected to positive amino acid substitutions, involved in adjusting the chemotaxis system of bacteria during speciation. In this novel study, positively selected and purified sites of chemotaxis proteins were calculated, and these residues were mapped onto homology models that were built for the chemotaxis proteins, in an attempt to understand the spatial evolution of the chemotaxis proteins. Various positively selected amino acids were identified in semi-conserved regions of the proteins away from the known active sites
Phylogenomics of Unusual Histone H2A Variants in Bdelloid Rotifers
Rotifers of Class Bdelloidea are remarkable in having evolved for millions of years, apparently without males and meiosis. In addition, they are unusually resistant to desiccation and ionizing radiation and are able to repair hundreds of radiation-induced DNA double-strand breaks per genome with little effect on viability or reproduction. Because specific histone H2A variants are involved in DSB repair and certain meiotic processes in other eukaryotes, we investigated the histone H2A genes and proteins of two bdelloid species. Genomic libraries were built and probed to identify histone H2A genes in Adineta vaga and Philodina roseola, species representing two different bdelloid families. The expressed H2A proteins were visualized on SDS-PAGE gels and identified by tandem mass spectrometry. We find that neither the core histone H2A, present in nearly all other eukaryotes, nor the H2AX variant, a ubiquitous component of the eukaryotic DSB repair machinery, are present in bdelloid rotifers. Instead, they are replaced by unusual histone H2A variants of higher mass. In contrast, a species of rotifer belonging to the facultatively sexual, desiccation- and radiation-intolerant sister class of bdelloid rotifers, the monogononts, contains a canonical core histone H2A and appears to lack the bdelloid H2A variant genes. Applying phylogenetic tools, we demonstrate that the bdelloid-specific H2A variants arose as distinct lineages from canonical H2A separate from those leading to the H2AX and H2AZ variants. The replacement of core H2A and H2AX in bdelloid rotifers by previously uncharacterized H2A variants with extended carboxy-terminal tails is further evidence for evolutionary diversity within this class of histone H2A genes and may represent adaptation to unusual features specific to bdelloid rotifers
Advantages of a Mechanistic Codon Substitution Model for Evolutionary Analysis of Protein-Coding Sequences
A mechanistic codon substitution model, in which each codon substitution rate is proportional to the product of a codon mutation rate and the average fixation probability depending on the type of amino acid replacement, has advantages over nucleotide, amino acid, and empirical codon substitution models in evolutionary analysis of protein-coding sequences. It can approximate a wide range of codon substitution processes. If no selection pressure on amino acids is taken into account, it will become equivalent to a nucleotide substitution model. If mutation rates are assumed not to depend on the codon type, then it will become essentially equivalent to an amino acid substitution model. Mutation at the nucleotide level and selection at the amino acid level can be separately evaluated. The present scheme for single nucleotide mutations is equivalent to the general time-reversible model, but multiple nucleotide changes in infinitesimal time are allowed. Selective constraints on the respective types of amino acid replacements are tailored to each gene in a linear function of a given estimate of selective constraints. Their good estimates are those calculated by maximizing the respective likelihoods of empirical amino acid or codon substitution frequency matrices. Akaike and Bayesian information criteria indicate that the present model performs far better than the other substitution models for all five phylogenetic trees of highly-divergent to highly-homologous sequences of chloroplast, mitochondrial, and nuclear genes. It is also shown that multiple nucleotide changes in infinitesimal time are significant in long branches, although they may be caused by compensatory substitutions or other mechanisms. The variation of selective constraint over sites fits the datasets significantly better than variable mutation rates, except for 10 slow-evolving nuclear genes of 10 mammals. A critical finding for phylogenetic analysis is that assuming variable mutation rates over sites leads to the overestimation of branch lengths
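The core idea of this abstract, a codon substitution rate given by a mutation rate multiplied by a fixation factor for the amino acid replacement, can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's parameterization: the function `codon_rate`, the callables `nuc_rate` and `fixation`, and the choice to multiply per-position nucleotide rates for multiple nucleotide changes are all hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, with codons enumerated in TCAG order.
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_rate(src, dst, nuc_rate, fixation):
    """Relative rate src -> dst as (mutation rate) x (fixation factor).

    nuc_rate(a, b): hypothetical relative mutation rate from base a to base b.
    fixation(x, y): hypothetical relative fixation factor for amino acid x -> y.
    """
    if src == dst:
        return 0.0
    aa_src, aa_dst = GENETIC_CODE[src], GENETIC_CODE[dst]
    if "*" in (aa_src, aa_dst):
        return 0.0  # changes to or from stop codons are disallowed in this sketch
    mutation = 1.0
    for a, b in zip(src, dst):
        if a != b:
            mutation *= nuc_rate(a, b)  # assumption: independent per-position changes
    selection = 1.0 if aa_src == aa_dst else fixation(aa_src, aa_dst)
    return mutation * selection


uniform = lambda a, b: 0.1      # same rate for every nucleotide change
neutral = lambda x, y: 1.0      # no selection on amino acid replacements
constrained = lambda x, y: 0.5  # every replacement is half as likely to fix

synonymous = codon_rate("TTT", "TTC", uniform, constrained)     # Phe -> Phe
nonsynonymous = codon_rate("TTT", "TTA", uniform, constrained)  # Phe -> Leu
double_change = codon_rate("TTT", "CCT", uniform, neutral)      # two positions differ
```

With the `neutral` fixation factor the rate depends only on the nucleotide changes, which mirrors the abstract's remark that the model reduces to a nucleotide substitution model when selection on amino acids is ignored.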
Inference of Co-Evolving Site Pairs: an Excellent Predictor of Contact Residue Pairs in Protein 3D structures
Residue-residue interactions that fold a protein into a unique
three-dimensional structure and make it play a specific function impose
structural and functional constraints on each residue site. Selective
constraints on residue sites are recorded in amino acid orders in homologous
sequences and also in the evolutionary trace of amino acid substitutions. A
challenge is to extract direct dependences between residue sites by removing
indirect dependences through other residues within a protein or even through
other molecules. Recent attempts of disentangling direct from indirect
dependences of amino acid types between residue positions in multiple sequence
alignments have revealed that the strength of inferred residue pair couplings
is an excellent predictor of residue-residue proximity in folded structures.
Here, we report an alternative attempt of inferring co-evolving site pairs from
concurrent and compensatory substitutions between sites in each branch of a
phylogenetic tree. First, branch lengths of a phylogenetic tree inferred by the
neighbor-joining method are optimized as well as other parameters by maximizing
a likelihood of the tree in a mechanistic codon substitution model. Mean
changes of quantities, which are characteristic of concurrent and compensatory
substitutions, accompanied by substitutions at each site in each branch of the
tree are estimated with the likelihood of each substitution. Partial
correlation coefficients of the characteristic changes along branches between
sites are calculated and used to rank co-evolving site pairs. Accuracy of
contact prediction based on the present co-evolution score is comparable to
that achieved by a maximum entropy model of protein sequences for 15 protein
families taken from the Pfam release 26.0. Besides, this excellent accuracy
indicates that compensatory substitutions are significant in protein evolution.
Comment: 17 pages, 4 figures, and 4 tables with supplementary information of 5
figure
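The role of partial correlation in separating direct from indirect dependences can be illustrated with a small sketch. It uses the textbook first-order formula for two variables given a single third one, which is an assumption made for brevity; the abstract does not state how many sites are conditioned on, and the function name `partial_corr` is hypothetical.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y after removing the effect of z."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom


# Sites x and y appear correlated (0.4), but only through a third site z.
indirect = partial_corr(r_xy=0.4, r_xz=0.8, r_yz=0.5)

# Sites x and y are correlated and z is unrelated to both.
direct = partial_corr(r_xy=0.5, r_xz=0.0, r_yz=0.0)
```

In the first case the partial correlation vanishes because the observed correlation is fully explained by z, so the pair would rank low as a candidate for direct co-evolution; in the second case the correlation is unchanged.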
CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences
Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged. Recent developments have shown that models which allow for amino acid pairs to have independent rates of substitution offer improved fit over single rate models. However, these approaches have been limited by the necessity for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into rate classes, dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of rate classes, where the number of classes is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene and organism specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes
Modeling of Human Prokineticin Receptors: Interactions with Novel Small-Molecule Binders and Potential Off-Target Drugs
The Prokineticin receptor (PKR) 1 and 2 subtypes are novel members of family A GPCRs, which exhibit an unusually high degree of sequence similarity. Prokineticins (PKs), their cognate ligands, are small secreted proteins of ∼80 amino acids; however, non-peptidic low-molecular weight antagonists have also been identified. PKs and their receptors play important roles under various physiological conditions such as maintaining circadian rhythm and pain perception, as well as regulating angiogenesis and modulating immunity. Identifying binding sites for known antagonists and for additional potential binders will facilitate studying and regulating these novel receptors. Blocking PKRs may serve as a therapeutic tool for various diseases, including acute pain, inflammation and cancer. Ligand-based pharmacophore models were derived from known antagonists, and virtual screening performed on the DrugBank dataset identified potential human PKR (hPKR) ligands with novel scaffolds. Interestingly, these included several HIV protease inhibitors for which endothelial cell dysfunction is a documented side effect. Our results suggest that the side effects might be due to inhibition of the PKR signaling pathway. Docking of known binders to a 3D homology model of hPKR1 is in agreement with the well-established canonical TM-bundle binding site of family A GPCRs. Furthermore, the docking results highlight residues that may form specific contacts with the ligands. These contacts provide structural explanation for the importance of several chemical features that were obtained from the structure-activity analysis of known binders. With the exception of a single loop residue that might be pursued in the future for obtaining subtype-specific regulation, the results suggest an identical TM-bundle binding site for hPKR1 and hPKR2. In addition, analysis of the intracellular regions highlights variable regions that may provide subtype specificity
Positive Darwinian Selection in the Piston That Powers Proton Pumps in Complex I of the Mitochondria of Pacific Salmon
The mechanism of oxidative phosphorylation is well understood, but evolution of the proteins involved is not. We combined phylogenetic, genomic, and structural biology analyses to examine the evolution of twelve mitochondrial encoded proteins of closely related, yet phenotypically diverse, Pacific salmon. Two separate analyses identified the same seven positively selected sites in ND5. A strong signal was also detected at three sites of ND2. An energetic coupling analysis revealed several structures in the ND5 protein that may have co-evolved with the selected sites. These data implicate Complex I, specifically the piston arm of ND5 where it connects the proton pumps, as important in the evolution of Pacific salmon. Lastly, the lineage to Chinook experienced rapid evolution at the piston arm
Evidence for a Fourteenth mtDNA-Encoded Protein in the Female-Transmitted mtDNA of Marine Mussels (Bivalvia: Mytilidae)
BACKGROUND: A novel feature for animal mitochondrial genomes has been recently established: i.e., the presence of additional, lineage-specific, mtDNA-encoded proteins with functional significance. This feature has been observed in freshwater mussels with doubly uniparental inheritance of mtDNA (DUI). The latter unique system of mtDNA transmission, which also exists in some marine mussels and marine clams, is characterized by one mt genome inherited from the female parent (F mtDNA) and one mt genome inherited from the male parent (M mtDNA). In freshwater mussels, the novel mtDNA-encoded proteins have been shown to be mt genome-specific (i.e., one novel protein for F genomes and one novel protein for M genomes). It has been hypothesized that these novel, F- and M-specific, mtDNA-encoded proteins (and/or other F- and/or M-specific mtDNA sequences) could be responsible for the different modes of mtDNA transmission in bivalves but this remains to be demonstrated. METHODOLOGY/PRINCIPAL FINDINGS: We investigated all complete (or nearly complete) female- and male-transmitted marine mussel mtDNAs previously sequenced for the presence of ORFs that could have functional importance in these bivalves. Our results confirm the presence of a novel F genome-specific mt ORF, of significant length (>100aa) and located in the control region, that most likely has functional significance in marine mussels. The identification of this ORF in five Mytilus species suggests that it has been maintained in the mytilid lineage (subfamily Mytilinae) for ∼13 million years. Furthermore, this ORF likely has a homologue in the F mt genome of Musculista senhousia, a DUI-containing mytilid species in the subfamily Crenellinae. We present evidence supporting the functionality of this F-specific ORF at the transcriptional, amino acid and nucleotide levels. 
CONCLUSIONS/SIGNIFICANCE: Our results offer support for the hypothesis that "novel F genome-specific mitochondrial genes" are involved in key biological functions in bivalve species with DUI
Assessing parallel gene histories in viral genomes
Background: The increasing abundance of sequence data has exacerbated a long known problem: gene trees and species trees for the same terminal taxa are often incongruent. Indeed, genes within a genome have not all followed the same evolutionary path due to events such as incomplete lineage sorting, horizontal gene transfer, gene duplication and deletion, or recombination. Considering conflicts between gene trees as an obstacle, numerous methods have been developed to deal with these incongruences and to reconstruct consensus evolutionary histories of species despite the heterogeneity in the history of their genes. However, inconsistencies can also be seen as a source of information about the specific evolutionary processes that have shaped genomes.
Results: The goal of the approach here proposed is to exploit this conflicting information: we have compiled eleven variables describing phylogenetic relationships and evolutionary pressures and submitted them to dimensionality reduction techniques to identify genes with similar evolutionary histories. To illustrate the applicability of the method, we have chosen two viral datasets, namely papillomaviruses and Turnip mosaic virus (TuMV) isolates, largely dissimilar in genome, evolutionary distance and biology. Our method pinpoints viral genes with common evolutionary patterns. In the case of papillomaviruses, gene clusters match well our knowledge on viral biology and life cycle, illustrating the potential of our approach. For the less known TuMV, our results trigger new hypotheses about viral evolution and gene interaction.
Conclusions: The approach here presented allows turning phylogenetic inconsistencies into evolutionary information, detecting gene assemblies with similar histories, and could be a powerful tool for comparative pathogenomics.
IGB was funded by the disappeared Spanish Ministry for Science and Innovation (CGL2010-16713). Work in Valencia was supported by grant BFU2012-30805 from the Spanish Ministry of Economy and Competitiveness (MINECO) to SFE. BMC is the recipient of an IDIBELL PhD fellowship.
Mengual-Chuliá, B.; Bedhomme, S.; Lafforgue, G.; Elena Fito, SF.; Bravo, IG. (2016). Assessing parallel gene histories in viral genomes. BMC Evolutionary Biology. 16:1-15. https://doi.org/10.1186/s12862-016-0605-4
Low- and high-risk human papillomavirus e7 proteins regulate p130 differently. Virology. 2010;400:233–9.White EA, Sowa ME, Tan MJ, Jeudy S, Hayes SD, Santha S, et al. Systematic identification of interactions between host cell proteins and e7 oncoproteins from diverse human papillomaviruses. Proc Natl Acad Sci U S A. 2012;109:E260–7.Nomine Y, Masson M, Charbonnier S, Zanier K, Ristriani T, Deryckere F, et al. Structural and functional analysis of e6 oncoprotein: Insights in the molecular pathways of human papillomavirus-mediated pathogenesis. Mol Cell. 2006;21:665–78.Zanier K, ould M’hamed ould Sidi A, Boulade-Ladame C, Rybin V, Chappelle A, Atkinson A, et al. Solution structure analysis of the hpv16 e6 oncoprotein reveals a self-association mechanism required for e6-mediated degradation of p53. Structure. 2012;20:604–17.Briddon RW, Patil BL, Bagewadi B, Nawaz-ul-Rehman MS, Fauquet CM. Distinct evolutionary histories of the DNA-a and DNA-b components of bipartite begomoviruses. BMC Evol Biol. 2010;10:97.Chen JM, Sun YX, Chen JW, Liu S, Yu JM, Shen CJ, et al. Panorama phylogenetic diversity and distribution of type a influenza viruses based on their six internal gene sequences. J Virol. 2009;6:137
- …