57 research outputs found

    CNV-seq, a new method to detect copy number variation using high-throughput sequencing

    Background: DNA copy number variation (CNV) has been recognized as an important source of genetic variation. Array comparative genomic hybridization (aCGH) is commonly used for CNV detection, but the microarray platform has a number of inherent limitations. Results: Here, we describe CNV-seq, a method to detect copy number variation using shotgun sequencing. The method is based on a robust statistical model that describes the complete analysis procedure and allows the computation of essential confidence values for the detection of CNV. Our results show that the number of reads, not the length of the reads, is the key factor determining the resolution of detection. This favors next-generation sequencing methods, which rapidly produce large numbers of short reads. Conclusion: Simulations of various sequencing methods with coverage between 0.1× and 8× show overall specificity between 91.7% and 99.9%, and sensitivity between 72.2% and 96.5%. We also show the results for the assessment of CNV between two individual human genomes.
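
The abstract does not spell out its statistical model, so the following is only a rough sketch of the general idea of comparing read counts between two genomes. It assumes fixed-size windows, a global depth normalisation, and a log2 ratio per window; the function names, the window size, and the pseudocount are hypothetical and are not taken from the CNV-seq paper.

```python
import math
from collections import Counter


def window_counts(read_starts, window_size):
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(pos // window_size for pos in read_starts)


def log2_ratios(test_starts, ref_starts, window_size, pseudocount=0.5):
    """Per-window log2 ratio of test to reference read counts.

    Assumption: read positions are start coordinates on one chromosome,
    and the two samples differ only in sequencing depth and copy number.
    """
    if not test_starts or not ref_starts:
        raise ValueError("both samples need at least one read")
    test = window_counts(test_starts, window_size)
    ref = window_counts(ref_starts, window_size)
    # Rescale the test counts so both samples have the same total depth.
    scale = len(ref_starts) / len(test_starts)
    ratios = {}
    for w in sorted(set(test) | set(ref)):
        # The pseudocount (an illustrative choice) avoids division by zero.
        t = test[w] * scale + pseudocount
        r = ref[w] + pseudocount
        ratios[w] = math.log2(t / r)
    return ratios


if __name__ == "__main__":
    reference = list(range(0, 3000, 10))
    # Toy test sample: the middle window carries twice as many reads.
    sample = reference + list(range(1000, 2000, 10))
    for window, ratio in log2_ratios(sample, reference, 1000).items():
        print(window, round(ratio, 2))
```

Because the counts are what enter the ratio, more reads per window give a steadier estimate regardless of read length, which is in line with the abstract's remark that the number of reads determines resolution.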

    The Newcomb-Benford Law in Its Relation to Some Common Distributions

    An often reported, but nevertheless persistently striking observation, formalized as the Newcomb-Benford law (NBL), is that the frequencies with which the leading digits of numbers occur in a large variety of data are far from uniform. Most striking is that in many data sets the leading digit 1 occurs in nearly one third of all cases. Explanations proposed for this uneven distribution of the leading digits include scale- and base-invariance. Little attention, however, has been paid to the interrelation between the distribution of the significant digits and the distribution of the observed variable. It is shown here by simulation that long right-tailed distributions of a random variable are compatible with the NBL, and that for distributions of the ratio of two random variables the fit generally improves. Distributions that do not put most of their mass on small values of the random variable (e.g. symmetric distributions) fail to fit. Hence, the validity of the NBL requires the predominance of small values and, in real-world data, a majority of small entities. Analyses of data on stock prices, the areas and numbers of inhabitants of countries, and the starting page numbers of papers from a bibliography support this conclusion. In all, these findings may help to understand the mechanisms behind the NBL and the conditions needed for its validity. That this law is not only of scientific interest per se but also has substantial implications can be seen from the fields where it has been suggested for practical use. These range from the detection of irregularities in data (e.g. economic fraud) to optimizing the architecture of computers regarding number representation, storage, and round-off errors.
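
The kind of simulation the abstract describes can be sketched in a few lines. This is a minimal illustration, not the authors' experiment: the four distributions, their parameters, the sample size, and the helper names are all assumptions chosen here. It compares observed leading-digit frequencies with the NBL probabilities log10(1 + 1/d).

```python
import math
import random
from collections import Counter

# Newcomb-Benford probabilities: P(d) = log10(1 + 1/d) for d = 1..9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """First significant digit of a nonzero number, read from its scientific notation."""
    return int(f"{abs(x):.15e}"[0])


def digit_frequencies(samples):
    """Relative frequency of each leading digit 1..9 (zeros are skipped)."""
    counts = Counter(leading_digit(x) for x in samples if x != 0)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


def max_deviation(samples):
    """Largest absolute gap between observed and NBL digit frequencies."""
    freqs = digit_frequencies(samples)
    return max(abs(freqs[d] - BENFORD[d]) for d in range(1, 10))


def simulate(n=20000, seed=1):
    """Deviation from the NBL for a few illustrative distributions."""
    rng = random.Random(seed)
    return {
        # Long right tail: most mass on small values.
        "lognormal": max_deviation([rng.lognormvariate(0.0, 2.0) for _ in range(n)]),
        # Symmetric around 50: leading digits are almost all 4 or 5.
        "normal": max_deviation([rng.gauss(50.0, 5.0) for _ in range(n)]),
        "uniform": max_deviation([rng.random() for _ in range(n)]),
        # Ratio of two uniforms; the denominator lies in (0, 1], so it is never zero.
        "uniform_ratio": max_deviation(
            [rng.random() / (1.0 - rng.random()) for _ in range(n)]
        ),
    }


if __name__ == "__main__":
    for name, dev in simulate().items():
        print(f"{name:14s} max |freq - NBL| = {dev:.3f}")
```

Under these assumptions the log-normal sample tracks the NBL closely, the symmetric normal sample does not, and the ratio of two uniforms fits better than a single uniform, which mirrors the pattern the abstract reports.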

    Assuring finite moments for willingness to pay in random coefficients models

    Random coefficient models such as mixed logit are increasingly being used to allow for random heterogeneity in willingness to pay (WTP) measures. In the most commonly used specifications, the distribution of WTP for an attribute is derived from the distribution of the ratio of individual coefficients. Since the cost coefficient enters the denominator, its distribution plays a major role in the distribution of WTP. Depending on the choice of distribution for the cost coefficient, and its implied range, the distribution of WTP may or may not have finite moments. In this paper, we identify a criterion to determine whether, with a given distribution for the cost coefficient, the distribution of WTP has finite moments. Using this criterion, we show that some popular distributions used for the cost coefficient in random coefficient models, including normal, truncated normal, uniform and triangular, imply infinite moments for the distribution of WTP, even if truncated or bounded at zero. We also point out that relying on simulation approaches to obtain moments of WTP from the estimated distribution of the cost and attribute coefficients can mask the issue by giving finite moments when the true ones are infinite.
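
Why the denominator matters can be seen in a simple worked case. This is an illustration under stated assumptions, not the criterion derived in the paper: take a fixed attribute coefficient β > 0 and let c, the magnitude of the cost coefficient, be uniformly distributed on (0, b], so that WTP = β/c. The symbols β, c and b are introduced here for illustration only.

```latex
\begin{align*}
\mathbb{E}[\mathrm{WTP}]
  &= \beta \,\mathbb{E}\!\left[\frac{1}{c}\right]
   = \frac{\beta}{b}\int_{0}^{b}\frac{1}{c}\,dc
   = \frac{\beta}{b}\lim_{\varepsilon \to 0^{+}}\ln\frac{b}{\varepsilon}
   = \infty .
\end{align*}
```

The density of c stays positive arbitrarily close to zero, so the integral diverges even though c is bounded at zero. Any finite set of simulated draws still has a finite average, which is consistent with the abstract's warning that simulation can mask the problem.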

    Change Point Estimation in Two-Phase Regression
