27 research outputs found

    scripts

    No full text
    Python and MATLAB scripts to analyse protease sequence data

    protease_sequences_1998_to_2006

    No full text
    HIV-1 protease sequences collected from the HIV Stanford Database on 17 September 2013. The data span 9 years (1998–2006); sequences from treated patients (i.e. those receiving one or more protease inhibitors) and from untreated patients are in separate files

    Figure_generator

    No full text
    IPython notebook to parse files and generate figures

    Allele frequencies as fitness for type 3 (aa) is varied.

    No full text
    <p>In this valley-crossing landscape, <i>w</i><sub>0</sub> is always 1 and <i>w</i><sub>1</sub> = <i>w</i><sub>2</sub> = 0. The plot shows allele frequencies <i>p</i><sub><i>i</i></sub> at mutation rate <i>μ</i> = 0.1 as a function of <i>w</i><sub>3</sub>. Because the intermediate types aA and Aa have zero fitness, they occur only at the rate at which mutation produces them.</p>

    Average per-site entropies at every position of the HIV-1 protease.

    No full text
    <p>Untreated (top panel) and treated (bottom panel) datasets at the earliest (year 1998, red) and latest (year 2006, blue) time points of our analysis. For each year, 300 sequences are resampled from the data, and the average entropy at each position is calculated over 10 such resampled datasets. Site-specific variation generally increased across the protein following treatment. Entropy (variability) also increased from 1998 to 2006 for several positions. Error bars denote ±1 SE.</p>
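The resampling procedure described in this caption can be sketched as follows. This is a minimal illustration, not the authors' script: the function names are hypothetical, entropy is reported in bits (the original analysis may use a different logarithm base), and sampling with replacement is an assumption, since the caption only says that 300 sequences are resampled 10 times.

```python
import math
import random
from collections import Counter

def site_entropy(column):
    """Shannon entropy (in bits, an assumed unit) of one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def per_site_entropies(seqs):
    """Entropy at every position of equal-length, aligned sequences."""
    return [site_entropy(col) for col in zip(*seqs)]

def resampled_entropies(seqs, n_seqs=300, n_resamples=10, seed=0):
    """Mean and standard error of per-site entropy over resampled datasets.

    Sampling is done with replacement (an assumption, see the lead-in).
    """
    rng = random.Random(seed)
    runs = [per_site_entropies(rng.choices(seqs, k=n_seqs))
            for _ in range(n_resamples)]
    means, ses = [], []
    for values in zip(*runs):  # one tuple of n_resamples values per position
        m = sum(values) / n_resamples
        var = sum((v - m) ** 2 for v in values) / (n_resamples - 1)
        means.append(m)
        ses.append(math.sqrt(var / n_resamples))
    return means, ses
```

Calling `resampled_entropies(sequences)` on a list of aligned protease sequences for one year would give the per-position means and the ±1 SE values of the kind plotted as error bars.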

    Changes in protease entropy and physico-chemical properties.

    No full text
    <p>Changes in per-site entropies (top panel), residue isoelectric point (middle panel), and residue weights (bottom panel) due to treatment. The property difference at each site is obtained by subtracting the property value (entropy, pI, or residue weight) of the untreated data from that of the treated data. Average values are obtained by sampling sequence data from all years (1998–2006, 10 subsamples/year of 300 sequences each). Error bars represent ±1 SE. Red dots represent positions known to be primary drug resistance loci, while black dots mark positions of compensatory or accessory mutations [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005960#pgen.1005960.ref052" target="_blank">52</a>]. Resistance loci are shaded in red and accessory loci are shaded in black.</p>

    Increase in epistasis in HIV-1 protease over time.

    No full text
    <p>Pairwise interactions in the HIV-1 protease are shown for years 1998, 2002, and 2006 in the drug-free (top row) and drug (bottom row) environments. Each heatmap shows the mutual information for each pair of residues. Pairwise information (and thus epistatic effects) is fairly constant in the drug-free environment, but gradually increases in the treated group.</p>
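The pairwise quantity shown in each heatmap can be illustrated with a short sketch. It uses the standard identity I(i;j) = H(i) + H(j) − H(i,j) on plug-in frequencies from aligned sequences; the function names are hypothetical, bits are an assumed unit, and any bias correction or resampling in the original analysis is not reproduced here.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (bits) of the empirical distribution of items."""
    counts = Counter(items)
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mutual_information(col_i, col_j):
    """I(i;j) = H(i) + H(j) - H(i,j) for two alignment columns."""
    joint = list(zip(col_i, col_j))
    return entropy(col_i) + entropy(col_j) - entropy(joint)

def pairwise_mi(seqs):
    """Symmetric matrix of mutual information for every pair of positions."""
    cols = list(zip(*seqs))
    length = len(cols)
    mi = [[0.0] * length for _ in range(length)]
    for i in range(length):
        for j in range(i + 1, length):
            mi[i][j] = mi[j][i] = mutual_information(cols[i], cols[j])
    return mi
```

Two perfectly covarying binary positions, as in `["AA", "CC", "AA", "CC"]`, carry 1 bit of mutual information, while two independent positions carry none.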

    Two-loci two-allele model.

    No full text
    <p>The left panel shows the fitness landscapes and epistasis given by <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005960#pgen.1005960.e013" target="_blank">Eq (9)</a> in the first and second half of the simulation (updates 0–499: <i>w</i><sub>0</sub> = 1 and <i>w</i><sub>1</sub> = <i>w</i><sub>2</sub> = <i>w</i><sub>3</sub> = 10<sup>−5</sup> ≈ 0; updates 500–1000: <i>w</i><sub>0</sub> = <i>w</i><sub>3</sub> = 1 and <i>w</i><sub>1</sub> = <i>w</i><sub>2</sub> = 10<sup>−5</sup> ≈ 0). The xy-plane shows the four genotypes while the z-axis shows genotype fitness. The middle panel shows the genotype probabilities while the right panel shows the mutual information during the course of the simulation. Note that the increase in epistasis at the 500th update is reflected in the increase in mutual information. The mutation rate was 0.1 and starting population frequencies were <i>p</i><sub>0</sub> = 1 and <i>p</i><sub>1</sub> = <i>p</i><sub>2</sub> = <i>p</i><sub>3</sub> = 0.</p>
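A simulation of this kind can be sketched with the parameters quoted in the caption. The update rule below is an assumption, because the listing does not state it: a deterministic, infinite-population step in which selection reweights genotype frequencies by fitness and each locus then mutates independently with probability μ. The genotype ordering (0 = AA, 1 = Aa, 2 = aA, 3 = aa) and all function names are likewise illustrative, and the epistasis of Eq (9) is not computed because its form is not given here.

```python
import math

def step(p, w, mu):
    """One update: fitness-proportional selection, then per-locus mutation."""
    mean_w = sum(pi * wi for pi, wi in zip(p, w))
    selected = [pi * wi / mean_w for pi, wi in zip(p, w)]
    new = [0.0] * 4
    for g, freq in enumerate(selected):
        for h in range(4):
            flips = bin(g ^ h).count("1")  # number of loci that differ
            new[h] += freq * mu ** flips * (1 - mu) ** (2 - flips)
    return new

def mutual_information(p):
    """Mutual information (bits) between the two loci of a genotype distribution."""
    m0 = [p[0] + p[2], p[1] + p[3]]  # marginal of locus 0 (low bit of index)
    m1 = [p[0] + p[1], p[2] + p[3]]  # marginal of locus 1 (high bit of index)
    return sum(pg * math.log2(pg / (m0[g & 1] * m1[g >> 1]))
               for g, pg in enumerate(p) if pg > 0)

def simulate(mu=0.1, updates=1000, switch=500):
    """Run the two-phase landscape; return the MI trajectory and final frequencies."""
    first = [1.0, 1e-5, 1e-5, 1e-5]   # w0 = 1, all other types nearly lethal
    second = [1.0, 1e-5, 1e-5, 1.0]   # w0 = w3 = 1, intermediates nearly lethal
    p = [1.0, 0.0, 0.0, 0.0]
    mi = []
    for t in range(updates):
        p = step(p, first if t < switch else second, mu)
        mi.append(mutual_information(p))
    return mi, p
```

Under these assumptions the mutual information stays close to zero while only one genotype is fit and rises after the switch at update 500, which is the qualitative behaviour the caption describes.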

    Estimates of the information content of the HIV-1 protease.

    No full text
    <p>Filled black circles represent data from untreated subjects and blue triangles represent data from treated individuals. <i>I</i><sub>1</sub> [see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005960#pgen.1005960.e017" target="_blank">Eq (12)</a>] is consistently low in treated sequence data over the years, indicating high sequence variability in the drug environment (top panel). The middle panel shows that the sum of pairwise mutual information significantly increases upon treatment (<i>p</i> ≤ 0.001). On adding the sum of pairwise mutual information to <i>I</i><sub>1</sub>, we obtain a comprehensive measure of information that considers pairwise interactions between residues [<i>I</i><sub>2</sub>, <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005960#pgen.1005960.e018" target="_blank">Eq (13)</a>]. <i>I</i><sub>2</sub> is comparable between the treated and untreated data and unchanging over the years. We use data only for positions 15–90, as residues 1–14 as well as 91–99 have missing sequence data, leading to error-prone estimates of entropy, as evidenced in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005960#pgen.1005960.s002" target="_blank">S2 Fig</a>. Error bars represent ±1 SD.</p>
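The relationship between the three plotted quantities can be sketched as below. The caption states only that I2 is I1 plus the sum of pairwise mutual information; the form of I1 used here (number of sites minus the sum of per-site entropies, with logarithms taken to base 20 so that each site contributes at most one unit) is an assumption standing in for Eq (12), which is not reproduced in this listing. The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

ALPHABET_SIZE = 20  # assumption: amino-acid alphabet sets the logarithm base

def entropy(items, base=ALPHABET_SIZE):
    """Entropy of the empirical distribution of items, in units of log base."""
    counts = Counter(items)
    n = len(items)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())

def information_estimates(seqs, first=15, last=90):
    """Return (I1, sum of pairwise MI, I2) over positions first..last (1-based)."""
    cols = list(zip(*seqs))[first - 1:last]
    # Assumed form of I1: maximal entropy (one unit per site) minus observed entropy.
    i1 = len(cols) - sum(entropy(c) for c in cols)
    mi_sum = sum(entropy(a) + entropy(b) - entropy(list(zip(a, b)))
                 for a, b in combinations(cols, 2))
    return i1, mi_sum, i1 + mi_sum
```

The default `first=15, last=90` mirrors the restriction to positions 15–90 mentioned in the caption; a more variable alignment lowers I1, while covariation between sites raises the pairwise term.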

    Correlation between epistasis and information.

    No full text
    <p>Each point corresponds to the information and the absolute value of epistasis calculated for one of the 10,000 combinations of <i>w</i><sub>0</sub>, <i>w</i><sub>1</sub>, <i>w</i><sub>2</sub>, and <i>w</i><sub>3</sub>. Here, <i>w</i><sub>0</sub> is always 1, and the other fitness values are assigned uniformly at random between 0 and 1. The inset shows the percentage of points with negligible information (&lt;0.001 bits) as a function of epistasis.</p>