18 research outputs found

    Evolutionary Interactions between N-Linked Glycosylation Sites in the HIV-1 Envelope

    The addition of asparagine (N)-linked polysaccharide chains (i.e., glycans) to the gp120 and gp41 glycoproteins of human immunodeficiency virus type 1 (HIV-1) envelope is not only required for correct protein folding, but also may provide protection against neutralizing antibodies as a “glycan shield.” As a result, strong host-specific selection is frequently associated with codon positions where nonsynonymous substitutions can create or disrupt potential N-linked glycosylation sites (PNGSs). Moreover, empirical data suggest that the individual contribution of PNGSs to the neutralization sensitivity or infectivity of HIV-1 may be critically dependent on the presence or absence of other PNGSs in the envelope sequence. Here we evaluate how glycan–glycan interactions have shaped the evolution of HIV-1 envelope sequences by analyzing the distribution of PNGSs in a large-sequence alignment. Using a “covarion”-type phylogenetic model, we find that the rates at which individual PNGSs are gained or lost vary significantly over time, suggesting that the selective advantage of having a PNGS may depend on the presence or absence of other PNGSs in the sequence. Consequently, we identify specific interactions between PNGSs in the alignment using a new paired-character phylogenetic model of evolution, and a Bayesian graphical model. Despite the fundamental differences between these two methods, several interactions are jointly identified by both. Mapping these interactions onto a structural model of HIV-1 gp120 reveals that negative (exclusive) interactions occur significantly more often between colocalized glycans, while positive (inclusive) interactions are restricted to more distant glycans. Our results imply that the adaptive repertoire of alternative configurations in the HIV-1 glycan shield is limited by functional interactions between the N-linked glycans. 
This represents a potential vulnerability of rapidly evolving HIV-1 populations that may provide useful glycan-based targets for neutralizing antibodies.

    Comparison between Suitable Priors for Additive Bayesian Networks

    Additive Bayesian networks (ABNs) are graphical models that extend the usual Bayesian generalized linear model to multiple dependent variables through the factorisation of the joint probability distribution of the underlying variables. When fitting an ABN model, the choice of prior for the parameters is of crucial importance. If an inadequate prior is used, such as one that is too weakly informative, data separation and data sparsity lead to issues in the model selection process. In this work, a simulation study comparing two weakly informative priors and one strongly informative prior is presented. The first weakly informative prior is a zero-mean Gaussian with a large variance, currently implemented in the R package abn. The second is a Student's t-distribution prior specifically designed for logistic regressions. Finally, the strongly informative prior is again Gaussian, with mean equal to the true parameter value and a small variance. We compare the impact of these priors on the accuracy of the learned additive Bayesian network as a function of different parameters. We create a simulation study to illustrate Lindley's paradox based on the prior choice. We then conclude by highlighting the good performance of the informative Student's t-prior and the limited impact of Lindley's paradox. Finally, suggestions for further developments are provided. Comment: 8 pages, 4 figures.

    The Expected Fitness Cost of a Mutation Fixation under the One-dimensional Fisher Model

    This paper employs Fisher’s model of adaptation to understand the expected fitness effect of fixing a mutation in a natural population. Fisher’s model in one dimension admits a closed-form solution for this expected fitness effect. A combination of different parameters, including the distribution of mutation lengths, the population size, and the initial state of the population, is examined to see how they affect the expected fitness effect of state transitions. The results show that the expected fitness change due to the fixation of a mutation is always positive, regardless of the distributional shape of mutation lengths, the effective population size, and the initial state of the population. The further away the initial state of a population is from the optimal state, the slower the population returns to the optimal state. Effective population size (except when very small) has little effect on the expected fitness change due to mutation fixation. The always-positive expected fitness change suggests that small populations may not necessarily be doomed by a runaway process of fixation of deleterious mutations.

    Usefulness of Bayesian networks in epidemiological studies

    Introduction: Bayesian networks are a form of statistical modelling that has been widely used in fields such as clinical decision-making, systems biology, human immunodeficiency virus (HIV) and influenza research, analyses of complex disease systems, interactions between multiple diseases, and disease diagnosis. The present study aimed to show the usefulness of Bayesian networks (BNs) in epidemiological studies. Material and Methods: The data set comprises 3,993 subjects (1,758 men, 2,235 women) from the public productive sector of the Balearic Islands (Spain), all of whom were active workers. Results: A BN was built from a dataset composed of twelve relevant features in cardiovascular disease epidemiology. The structure and parameters were learnt with the GeNIe 2.0 tool. Taking into account the main topological properties, some features were optimized, yielding a hypothesized scenario in which the likelihoods of the different features were updated and the appropriate conclusions were established. Conclusions: Bayesian networks allow us to obtain a hypothetical scenario in which the probabilities of the different features are updated according to the evidence that is introduced. This makes Bayesian networks a very attractive tool that also allows various conclusions to be drawn.

    Improving epidemiologic data analyses through multivariate regression modelling

    Regression modelling is one of the most widely utilized approaches in epidemiological analyses. It provides a method of identifying statistical associations, from which potential causal associations relevant to disease control may then be investigated. Multivariable regression – a single dependent variable (outcome, usually disease) with multiple independent variables (predictors) – has long been the standard model. Generalizing multivariable regression to multivariate regression – all variables potentially statistically dependent – offers a far richer modelling framework. Through a series of simple illustrative examples we compare and contrast these approaches. The technical methodology used to implement multivariate regression – Bayesian network structure discovery – is well established and, while a relative newcomer to the epidemiological literature, has a long history in computing science. Applications of multivariate analysis in epidemiological studies can provide a greater understanding of disease processes at the population level, leading to the design of better disease control and prevention programs.

    The role of the humoral immune response in the molecular evolution of the envelope C2, V3 and C3 regions in chronically HIV-2 infected patients

    Background: This study was designed to investigate, for the first time, the short-term molecular evolution of the HIV-2 C2, V3 and C3 envelope regions and its association with the immune response. Clonal sequences of the env C2V3C3 region were obtained from a cohort of eighteen HIV-2 chronically infected patients followed prospectively for 2–4 years. Genetic diversity, divergence, positive selection and glycosylation in the C2V3C3 region were analysed as a function of the number of CD4+ T cells and the anti-C2V3C3 IgG and IgA antibody reactivity. Results: The mean intra-host nucleotide diversity was 2.1% (SD, 1.1%), increasing along the course of infection in most patients. Diversity at the amino acid level was significantly lower for the V3 region and higher for the C2 region. The average divergence rate was 0.014 substitutions/site/year, which is similar to that reported in chronic HIV-1 infection. The number and position of positively selected sites were highly variable, except for codons 267 and 270 in C2, which were under strong and persistent positive selection in most patients. N-glycosylation sites located in C2 and V3 were conserved in all patients along the course of infection. Intra-host variation of the C2V3C3-specific IgG response over time was inversely associated with the variation in nucleotide and amino acid diversity of the C2V3C3 region. Variation of the C2V3C3-specific IgA response was inversely associated with variation in the number of N-glycosylation sites. Conclusion: The evolutionary dynamics of the HIV-2 envelope during chronic aviremic infection are similar to those of HIV-1, implying that the virus should be actively replicating in cellular compartments. Convergent evolution of N-glycosylation in C2 and V3, and the limited diversification of V3, indicate that there are important functional constraints on the potential diversity of the HIV-2 envelope. C2V3C3-specific IgG antibodies are effective at reducing viral population size, limiting the number of virus escape mutants. The C3 region seems to be a target for IgA antibodies, and increasing N-linked glycosylation may prevent HIV-2 envelope recognition by these antibodies. Our results provide new insights into the biology of HIV-2 and its relation with the human host and may have important implications for vaccine design.

    Dynamic features of the selective pressure on the human immunodeficiency virus type 1 (HIV-1) gp120 CD4-binding site in a group of long term non progressor (LTNP) subjects.

    The characteristics of intra-host human immunodeficiency virus type 1 (HIV-1) env evolution were evaluated in untreated HIV-1-infected subjects with different patterns of disease progression, including 2 normal progressor (NP) and 5 long-term non-progressor (LTNP) patients. High-resolution phylogenetic analysis of the C2-C5 env gene sequences of the replicating HIV-1 was performed on sequential samples collected over a 3–5 year period; overall, 301 HIV-1 genomic RNA sequences were amplified from plasma samples, cloned, sequenced and analyzed. First, the evolutionary rate was calculated separately for the 3 codon positions. In all LTNPs, the mutation rate at the 3rd codon position was equal to or even lower than that observed at the 1st and 2nd positions (p = 0.016), suggesting strong ongoing positive selection. A Bayesian approach was used to estimate the rate of virus evolution within each subject, and a maximum-likelihood (ML) method was used to detect positively selected sites. A great number of N-linked glycosylation sites under positive selection were identified in both NP and LTNP subjects. Viral sequences from 4 of the 5 LTNPs showed extensive positive selective pressure on the CD4-binding site (CD4bs). In addition, localized pressure in the area of the epitope of IgG-b12, a broadly neutralizing human monoclonal antibody targeting the CD4bs, was documented in one LTNP subject using a colour-graded 3-dimensional visualization. Overall, the data shown here, documenting high selective pressure on the HIV-1 CD4bs in a group of LTNP subjects, offer important insights for planning novel strategies for the immune control of HIV-1 infection.

    An Evolutionary-Network Model Reveals Stratified Interactions in the V3 Loop of the HIV-1 Envelope

    The third variable loop (V3) of the human immunodeficiency virus type 1 (HIV-1) envelope is a principal determinant of antibody neutralization and progression to AIDS. Although it is undoubtedly an important target for vaccine research, extensive genetic variation in V3 remains an obstacle to the development of an effective vaccine. Comparative methods that exploit the abundance of sequence data can detect interactions between residues of rapidly evolving proteins such as the HIV-1 envelope, revealing biological constraints on their variability. However, previous studies have relied implicitly on two biologically unrealistic assumptions: (1) that founder effects in the evolutionary history of the sequences can be ignored, and (2) that statistical associations between residues occur exclusively in pairs. We show that comparative methods that neglect the evolutionary history of extant sequences are susceptible to a high rate of false positives (20%–40%). Therefore, we propose a new method to detect interactions that relaxes both of these assumptions. First, we reconstruct the evolutionary history of extant sequences by maximum likelihood, shifting focus from extant sequence variation to the underlying substitution events. Second, we analyze the joint distribution of substitution events among positions in the sequence as a Bayesian graphical model, in which each branch in the phylogeny is a unit of observation. We perform extensive validation of our models using both simulations and a control case of known interactions in HIV-1 protease, and apply this method to detect interactions within V3 from a sample of 1,154 HIV-1 envelope sequences. Our method greatly reduces the number of false positives due to founder effects, while capturing several higher-order interactions among V3 residues. By mapping these interactions to a structural model of the V3 loop, we find that the loop is stratified into distinct evolutionary clusters. We extend our model to detect interactions between the V3 and C4 domains of the HIV-1 envelope, and account for the uncertainty in mapping substitutions to the tree with a parametric bootstrap.