
    Top 65 predominant antigenicity associated sites for H3N2 influenza A viruses.

    No full text
    <p>Weight denotes the importance of each single or co-evolutionary site in shaping antigenic evolution. As suggested by the parameter tuning process (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0106660#pone.0106660.s012" target="_blank">Table S2</a>), the sites are generated by feature type “sinco+EvolT4” and Lasso parameter 2<sup>4</sup>.</p>

    Comparing eight feature types and 11 Lasso parameters.

    No full text

    The prediction RMSE curves comparing eight feature types.

    No full text
    <p>A sequential prediction was applied to viruses spanning 1985 to 2003. The eight feature types are “single”, “sinco+Struct6A”, “sinco+Struct10A”, “sinco+EvolT4”, “sinco+EvolT8”, “sinco+EvolT10”, “sinco+EvolT16”, and “sinco+Struct10A+EvolT2”.</p>

    Comparing four methods in predicting antigenic variants.

    No full text
    <p>Five accuracy measures, “Pred1”, “Pred2”, “Pred3”, “Pred4” and “Pred5”, report prediction accuracy for 1, 2, 3, 4 and 5 seasons, respectively. “Pred1” predicted the pairwise distances between viruses in each pair of consecutive years <i>k</i> and <i>k</i>−1, using viruses in [1968, <i>k</i>−1] as training data. “Pred2” predicted the distances between viruses in years <i>k</i> and <i>k</i>−1, and between viruses in year <i>k</i>−2 and those in years <i>k</i> and <i>k</i>−1, using viruses in [1968, <i>k</i>−2] as training data. Similar definitions hold for “Pred3”, “Pred4”, and “Pred5”.</p>

    HI-based and sequence-based cartographies on H3N2 68-07 data.

    No full text
    <p>Each ball denotes a single influenza virus, and each color denotes a specific antigenic cluster.</p>

    Four simulation cartographies of antigenic drifts and mutants of positions driving the drifts.

    No full text
    <p>The four antigenic drift events are “BE89-BE92,” “BE92-WU95,” “WU95-SY97,” and “SY97-FU02”. The mutants listed in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0106660#pone-0106660-t002" target="_blank">Table 2</a> from the four wild strains “BE/352/1989,” “JO/33/1994,” “NA/933/1995,” and “SY/5/1997” are also marked in the cartographies.</p>

    Sequence-based cartographies on 1415 H3N2 influenza viruses from 2002 to 2013 downloadable from NCBI.

    No full text
    <p>Each colored ball represents a virus, and its color marks the collection year. The five vaccine strains “Fujian/411/2002,” “California/07/2004,” “Wisconsin/67/2005,” “Brisbane/10/2007,” and “Perth/16/2007” are shown as large balls. We also mark the year of a representative virus in other years.</p>

    Single and co-evolutionary sites driving the 12 antigenic drift events between successive clusters from HK68, EN72, VI75, TX77, BK79, SI87, BE89, BE92, WU95, SY97, FU02, CA04 and BR07.

    No full text
    <p>As suggested by the parameter tuning process (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0106660#pone.0106660.s013" target="_blank">Table S3</a> and <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0106660#pone.0106660.s014" target="_blank">S4</a>), the sites are generated by feature type “sinco+EvolT8” and Lasso parameter 1; the top numbers are selected from the prediction RMSE curve. For simplicity, all top numbers are set to 10, except for drifts EN72-VI75 and BK79-SI87 (both 15), CA04-BR07 (3), and SY97-FU02 (20).</p>

    PR curves of five methods on the Data63 dataset.

    No full text

    An Alignment-Free Algorithm in Comparing the Similarity of Protein Sequences Based on Pseudo-Markov Transition Probabilities among Amino Acids

    No full text
    <div><p>In this paper, we propose a novel alignment-free method for comparing the similarity of protein sequences. We first encode a protein sequence into a 440-dimensional feature vector consisting of a 400-dimensional Pseudo-Markov transition probability vector among the 20 amino acids, a 20-dimensional content ratio vector, and a 20-dimensional position ratio vector of the amino acids in the sequence. We then compare the similarity of protein sequences by evaluating the Euclidean distances among their representing vectors. We apply this method to the ND5 dataset, consisting of the ND5 protein sequences of 9 species, and to the F10 and G11 datasets, representing two of the xylanase-containing glycoside hydrolase families, i.e., families 10 and 11. Our method achieves a correlation coefficient of 0.962 with the canonical protein sequence aligner ClustalW on the ND5 dataset, much higher than those of five other popular alignment-free methods. In addition, we successfully separate the xylanase sequences in the F10 family from those in the G11 family and illustrate that the F10 family is more heat stable than the G11 family, consistent with a few previous studies. Moreover, we prove mathematically an identity involving the Pseudo-Markov transition probability vector and the amino acid content ratio vector.</p></div>
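    The encoding described in this abstract can be illustrated with a minimal sketch. The abstract gives only the vector sizes (400 + 20 + 20) and the use of Euclidean distance, so the exact formulas below are assumptions, not the authors' definitions: transition probabilities are taken as row-normalised counts of adjacent residue pairs, content ratio as residue count over sequence length, and position ratio as the sum of a residue's 1-based positions over the sum of all positions. The names `encode` and `distance` are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(seq):
    """Encode a protein sequence as a 440-dimensional vector (assumed formulas)."""
    # Keep only the 20 standard residues; anything else is skipped.
    residues = [c for c in seq.upper() if c in INDEX]
    n = len(residues)
    if n < 2:
        raise ValueError("need at least two standard residues")

    # 400 dims: counts of adjacent pairs (a, b), normalised per row.
    # ASSUMPTION: a first-order, row-normalised transition estimate.
    pair_counts = [[0] * 20 for _ in range(20)]
    for a, b in zip(residues, residues[1:]):
        pair_counts[INDEX[a]][INDEX[b]] += 1
    transition = []
    for row in pair_counts:
        total = sum(row)
        transition.extend(c / total if total else 0.0 for c in row)

    # 20 dims each: content ratio and position ratio.
    # ASSUMPTION: position ratio = sum of 1-based positions / (n(n+1)/2).
    content = [0.0] * 20
    position = [0.0] * 20
    position_total = n * (n + 1) / 2
    for pos, aa in enumerate(residues, start=1):
        content[INDEX[aa]] += 1 / n
        position[INDEX[aa]] += pos / position_total

    return transition + content + position


def distance(u, v):
    """Euclidean distance between two feature vectors."""
    return math.dist(u, v)
```

    A call such as `distance(encode(seq_a), encode(seq_b))` then yields a dissimilarity score for two sequences, with smaller values meaning more similar composition and ordering under these assumed definitions.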