16 research outputs found

    Impact of the shedding level on transmission of persistent infections in Mycobacterium avium subspecies paratuberculosis (MAP)

    Super-shedders are infectious individuals that contribute a disproportionate share of the infectious pathogen load in the environment. A super-shedder host may produce up to 10 000 times more pathogens than other infectious hosts. Super-shedders have been reported for multiple human and animal diseases. If their contribution to infection dynamics were linear in the pathogen load, they would dominate those dynamics. Here we quantify the effect of super-shedders on the spread of infection in natural environments, to test whether such an effect actually occurs in Mycobacterium avium subspecies paratuberculosis (MAP). We study a case where the infection dynamics and the bacterial load shed by each host at every point in time are known. Using a maximum likelihood approach, we estimate the parameters of a model with multiple transmission routes, including direct contact, indirect contact and a background infection risk. We use longitudinal data from persistent infections (MAP), where infectious individuals have a wide distribution of infectious loads, ranging upward of three orders of magnitude. Based on these parameters, we show that the effect of super-shedders for MAP is limited: the relationship between bacterial load and infectiousness is highly concave, so that a 1000-fold increase in the bacterial contribution is equivalent to up to a 2–3 fold increase in infectiousness. https://doi.org/10.1186/s13567-016-0323-
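
To give a feel for how concave this relationship is, suppose that infectiousness scales as a power of the bacterial load, infectiousness ∝ load^γ. This functional form is an assumption made here for illustration only and is not claimed to be the model fitted in the paper; the function name is hypothetical. Under that assumption, the reported equivalence between a 1000-fold load increase and a 2–3 fold infectiousness increase corresponds to γ between roughly 0.10 and 0.16, far below the linear case γ = 1.

```python
import math


def implied_exponent(load_fold: float, infect_fold: float) -> float:
    """Return gamma such that load_fold ** gamma == infect_fold.

    Assumes a hypothetical power-law link: infectiousness ~ load ** gamma.
    """
    return math.log(infect_fold) / math.log(load_fold)


# A 1000-fold load increase mapped to a 2-fold and a 3-fold infectiousness increase.
gamma_low = implied_exponent(1000, 2)
gamma_high = implied_exponent(1000, 3)
print(f"gamma between {gamma_low:.3f} and {gamma_high:.3f}")
```

With a linear relationship (γ = 1) the same 1000-fold load increase would raise infectiousness 1000-fold, which is the scenario in which super-shedders would dominate transmission.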

    Power Laws for Heavy-Tailed Distributions: Modeling Allele and Haplotype Diversity for the National Marrow Donor Program

    Measures of allele and haplotype diversity, which are fundamental properties in population genetics, often follow heavy-tailed distributions. These measures are of particular interest in the field of hematopoietic stem cell transplant (HSCT). Donor/recipient suitability for HSCT is determined by Human Leukocyte Antigen (HLA) similarity. Match predictions rely upon a precise description of HLA diversity, yet classical estimates are inaccurate given the heavy-tailed nature of the distribution. This directly affects HSCT matching, as well as diversity measures in broader fields such as species richness. We have therefore developed a power-law based estimator of allele and haplotype diversity that accommodates heavy tails, using the concepts of regular variation and occupancy distributions. Application of our estimator to 6.59 million donors in the Be The Match Registry revealed that haplotypes follow a heavy-tailed distribution across all ethnicities: for example, 44.65% of the European American haplotypes are represented by only 1 individual. Indeed, our discovery rate of all U.S. European American haplotypes is estimated at 23.45% based upon sampling 3.97% of the population, leaving a large number of unobserved haplotypes. Population coverage, however, is much higher at 99.4%, given that 90% of European Americans carry one of the 4.5% most frequent haplotypes. Alleles were found to be less diverse, suggesting the current registry represents most alleles in the population. Thus, for HSCT registries, haplotype discovery will remain high with continued recruitment to a very deep level of sampling, but population coverage will not. Finally, we compared the convergence of our power-law estimator with that of classical diversity estimators such as the capture-recapture, Chao, ACE and Jackknife methods. When fit to the haplotype data, our estimator displayed favorable properties in terms of convergence (with respect to sampling depth) and accuracy (with respect to diversity estimates). This suggests that power-law based estimators offer a valid alternative to classical diversity estimators and may have broad applicability in the field of population genetics.
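
For readers unfamiliar with the classical estimators named in this abstract, the following is a minimal sketch of one of them, the Chao1 richness estimator, which extrapolates the number of unseen categories from the counts of categories observed exactly once (f1) and exactly twice (f2). The haplotype labels are made up for illustration and the function name is hypothetical; this is not the power-law estimator developed in the paper.

```python
from collections import Counter


def chao1(labels):
    """Chao1 estimate of the total number of distinct categories.

    Uses S_obs + f1^2 / (2 * f2), falling back to the bias-corrected
    form S_obs + f1 * (f1 - 1) / 2 when no category was seen exactly twice.
    """
    counts = Counter(labels)
    s_obs = len(counts)                                # distinct categories observed
    f1 = sum(1 for c in counts.values() if c == 1)     # singletons
    f2 = sum(1 for c in counts.values() if c == 2)     # doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    return s_obs + f1 * (f1 - 1) / 2


# Toy sample (illustrative only): 5 observed haplotypes, 3 of them singletons.
sample = ["h1", "h1", "h1", "h2", "h2", "h3", "h4", "h5"]
estimate = chao1(sample)
print(estimate)
```

Because the estimate depends only on f1 and f2, a sample dominated by singletons, as in the heavy-tailed haplotype data described here, keeps pushing the estimate upward as sampling deepens.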

    Properties for haplotypes frequency distribution for 5 populations

    Populations: Native Americans (NAM), European Americans (CAU), Hispanic (HIS), African-Americans (AFA) and Asian Pacific Islanders (API). (A) Histogram of the haplotype relative frequency distribution for the five combined populations: Native Americans (NAM, black circles), European Americans (CAU, gray triangles), Hispanic (HIS, green diamonds), African-Americans (AFA, purple circles) and Asian Pacific Islanders (API, red circles). (B) Estimated (black) and currently observed (white) total number of haplotypes. (C) Fraction of non-covered population for the five combined populations. (D) Current sample size (white) and estimated sample size required to reach coverage similar to that of the European American population (black) for the five combined populations. Note that if the fit to a truncated power law were exact, the required and observed population sizes would be exactly equal for the European American (CAU) population. However, there are some limited deviations in the fit: the required population from the theoretical analysis is 7.76E+06, while the observed population size is 7.82E+06 (a difference of less than 1%).

    Estimation of α and of haplotype number for different sample sizes.

    A. Convergence of estimates of the exponent α of the distribution to its real value, in samples from a simulated scale-free distribution with a lower cutoff. The real value is the black line with the 'x' signs. The estimate using our method quickly converges to the proper value, 1.5 (purple line with diamonds). The estimates using either the discrete (red line with squares) or continuous (blue line with circles) Clauset estimates, or the Ohannessian et al. estimate (green line with circles), do not converge, even when more than half the distribution is sampled. A Clauset discrete estimate with a minimal cutoff (orange line with circles) converges almost as well as our algorithm. B. Comparison of the haplotype number estimate as a function of the sample size, using a capture-recapture method (red squares), Jackknife estimators and the parametric estimate proposed here (blue diamonds), using the same simulation as above. The real number is the solid black line with 'x' signs. The parametric estimate developed here clearly converges to a good estimate, even for a very small sample.
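
The continuous Clauset estimate mentioned in panel A is the standard maximum-likelihood estimator for a power law with a lower cutoff, α̂ = 1 + n / Σ ln(x_i / x_min). The sketch below applies it to synthetic values drawn directly from a continuous power-law density with α = 1.5. The sampler, seed, sample size and function names are illustrative assumptions; the sketch does not reproduce the simulation behind the figure, which samples from a population, and therefore does not exhibit the non-convergence described in the caption.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n values from p(x) ~ x**(-alpha), x >= x_min, by inverse transform."""
    # 1 - rng.random() lies in (0, 1], so the base is never zero.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def clauset_alpha(xs, x_min):
    """Continuous maximum-likelihood estimate of alpha for values >= x_min."""
    tail = [x for x in xs if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


rng = random.Random(0)  # fixed seed so the run is repeatable
xs = sample_power_law(20000, alpha=1.5, x_min=1.0, rng=rng)
alpha_hat = clauset_alpha(xs, x_min=1.0)
print(f"estimated alpha: {alpha_hat:.3f}")
```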

    Schematic figure.

    (A) We observe a population with different haplotypes. From the population we extract two measures: the haplotype frequency distribution (B) and the number of unique haplotypes as a function of the sample size (C). We assume that the frequency distribution (B) is a scale-free distribution with lower and upper cutoff values, $X_{\min} \le x \le X_{\max}$. For the sake of simplicity, we assume a zero probability of observing haplotypes with frequencies beyond these values. We constrain the lower cutoff by the total population size, and the upper cutoff by an upper estimate derived from the total population size (Eq B6). As initial guesses we use $X_{\min}$ at its boundary, the Clauset estimate for the slope, and the observed highest frequency $X^{0}_{\max}$. We then fit the observed unique haplotype curve (C) with an analytical formula for the expected shape (D), using the cost function $\sum_{R'} \left(\log U(R') - \log \mathrm{observed}(R')\right)^{2} \cdot \log \mathrm{observed}(R')$, summed over different sample sizes $R'$. Finally, we extract the optimal parameters (E) and produce an estimate of the number of unique haplotypes for any population size N, where N is the target population size.
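
The cost function in this caption appears to be a squared difference of logarithms, weighted by the logarithm of the observed count and summed over the sample sizes R'. A minimal sketch under that reading follows; it assumes the predicted curve U(R') and the observed unique-haplotype counts are supplied as equal-length sequences of positive numbers. The function name and the numbers are hypothetical, and the analytical formula for U itself is not reproduced here.

```python
import math


def weighted_log_cost(predicted, observed):
    """Sum over sample sizes of (log U - log obs)^2 * log obs.

    predicted: model values U(R') for each sample size R' (assumed positive)
    observed:  observed unique-haplotype counts at the same R' (assumed positive)
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum(
        (math.log(u) - math.log(o)) ** 2 * math.log(o)
        for u, o in zip(predicted, observed)
    )


# Illustrative numbers only: three sample sizes with increasing observed counts.
observed = [120.0, 900.0, 5200.0]
predicted = [110.0, 950.0, 5000.0]
cost = weighted_log_cost(predicted, observed)
print(cost)
```

The log weighting gives more influence to points with larger observed counts, which typically correspond to deeper sampling; an observed count of 1 contributes nothing because log 1 = 0.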

    Alleles results.

    (left) Known (colored) vs. total alleles (transparent) for the five combined populations. The fraction of known alleles is between 50 and 100%. (right) Fraction of uncovered population per allele and combined population. These fractions are much lower than for haplotypes and never reach more than 0.1%.

    The values of α obtained from the 21 detailed US populations studied here.

    The second column is the number of samples used for these populations. The following columns are α estimates, based on the Clauset estimator (third column) and our parametric method, using half the sample or the full sample (last two columns).