A Faster Circular Binary Segmentation Algorithm for the Analysis of Array CGH Data
Motivation: Array CGH technologies enable the simultaneous measurement of DNA copy number for thousands of sites on a genome. We developed the circular binary segmentation (CBS) algorithm to divide the genome into regions of equal copy number (Olshen et al., 2004). The algorithm tests for change-points using a maximal t-statistic with a permutation reference distribution to obtain the corresponding p-value. The number of computations required for the maximal test statistic is O(N^2), where N is the number of markers. This makes the full permutation approach computationally prohibitive for the newer arrays that contain tens of thousands of markers and highlights the need for a faster algorithm.
Results: We present a hybrid approach to obtain the p-value of the test statistic in linear time. We also introduce a rule for stopping early when there is strong evidence for the presence of a change. We show through simulations that the hybrid approach provides a substantial gain in speed with only a negligible loss in accuracy and that the stopping rule further increases speed. We also present the analysis of array CGH data from a breast cancer cell line to show the impact of the new approaches on the analysis of real data.
Availability: An R (R Development Core Team, 2006) version of the CBS algorithm has been implemented in the "DNAcopy" package of the Bioconductor project (Gentleman et al., 2004). The proposed hybrid method for the p-value is available in version 1.2.1 or higher, and the stopping rule for declaring a change early is available in version 1.5.1 or higher.
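To make the cost argument in this abstract concrete, here is a minimal Python sketch of a maximal t-type statistic scanned over all circular arcs, with a full permutation p-value. It is an illustration only, not the DNAcopy implementation: the function names, the use of the overall sample variance as the scale estimate, and the default of 200 permutations are all assumptions made for the example. The hybrid p-value and the early-stopping rule described above are not shown.

```python
import math
import random


def max_circular_t(x):
    """Largest |t|-type statistic over all arcs (i, j] with 0 <= i < j <= n.

    Each arc is compared with its complement. Prefix sums make every
    comparison O(1), so the whole scan is O(n^2).
    """
    n = len(x)
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
    total = prefix[-1]
    mean = total / n
    # Assumption: scale by the overall sample variance (a simplification).
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    if var == 0.0:
        return 0.0
    best = 0.0
    for i in range(n):
        for j in range(i + 1, n + 1):
            k = j - i
            if k == n:  # the arc must leave a non-empty complement
                continue
            inside = (prefix[j] - prefix[i]) / k
            outside = (total - prefix[j] + prefix[i]) / (n - k)
            se = math.sqrt(var * (1.0 / k + 1.0 / (n - k)))
            best = max(best, abs(inside - outside) / se)
    return best


def permutation_p_value(x, n_perm=200, seed=0):
    """Full permutation p-value for the maximal statistic (the slow baseline)."""
    rng = random.Random(seed)
    observed = max_circular_t(x)
    shuffled = list(x)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_circular_t(shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Every permutation repeats the O(N^2) scan, which is why the full permutation approach becomes impractical for arrays with tens of thousands of markers.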
Comparing ROC Curves Derived From Regression Models
In constructing predictive models, investigators frequently assess the incremental value of a predictive marker by comparing the ROC curve generated from the predictive model including the new marker with the ROC curve from the model excluding the new marker. Many commentators have noticed empirically that a test of the two ROC areas often produces a non-significant result when a corresponding Wald test from the underlying regression model is significant. A recent article showed, using simulations, that the widely used ROC area test [1] produces exceptionally conservative test size and extremely low power [2]. In this article we show why the ROC area test is invalid in this context. We demonstrate how a valid test of the ROC areas can be constructed that has comparable statistical properties to the Wald test. We conclude that using the Wald test to assess the incremental contribution of a marker remains the best strategy. We also examine the use of derived markers from non-nested models and the use of validation samples. We show that comparing ROC areas is invalid in these contexts as well.
Relativistic spin precession in the binary PSR J1141-6545
PSR J1141-6545 is a precessing binary pulsar that has the rare potential to
reveal the two-dimensional structure of a non-recycled pulsar emission cone. It
has undergone of relativistic spin precession in the
years since its discovery. In this paper, we present a detailed Bayesian
analysis of the precessional evolution of the width of the total intensity
profile, to understand the changes to the line-of-sight impact angle ()
of the pulsar using four different physically motivated prior distribution
models. Although we cannot statistically differentiate between the models with
confidence, the temporal evolution of the linear and circular polarisations
strongly argues that our line-of-sight crossed the magnetic pole around MJD
54000 and that only two models remain viable. For both these models, it appears
likely that the pulsar will precess out of our line-of-sight in the next
years, assuming a simple beam geometry. Marginalising over suggests
that the pulsar is a near-orthogonal rotator and provides the first
polarization-independent estimate of the scale factor () that
relates the pulsar beam opening angle () to its rotational period ()
as : we find it to be at 1.4
GHz with 99% confidence. If all pulsars emit from opposite poles of a dipolar
magnetic field with comparable brightness, we might expect to see evidence of
an interpulse arising in PSR J1141-6545, unless the emission is patchy.
Comment: Accepted for publication in Astrophysical Journal Letters
Statistical Evaluation of Evidence for Clonal Allelic Alterations in array-CGH Experiments
In recent years numerous investigators have conducted genetic studies of pairs of tumor specimens from the same patient to determine whether the tumors share a clonal origin. These studies have the potential to be of considerable clinical significance, especially in clinical settings where the distinction between a new primary cancer and metastatic spread of a previous cancer would lead to radically different indications for treatment. Studies of clonality have typically involved comparison of the patterns of somatic mutations in the tumors at candidate genetic loci to see if the patterns are sufficiently similar to indicate a clonal origin. More recently, some investigators have explored the use of array CGH for this purpose. Standard clustering approaches have been used to analyze the data, but these existing statistical methods are not suited to this problem due to the paired nature of the data and the fact that there exists no “gold standard” diagnosis to provide a definitive determination of which pairs are clonal and which pairs are of independent origin. In this article we propose a new statistical method that focuses on the individual allelic gains or losses that have been identified in both tumors, and we develop a statistical test that assesses the degree of matching of the locations of the markers that indicate the endpoints of the allelic change. The validity and statistical power of the test are evaluated, and it is shown to be a promising approach for establishing clonality in tumor samples.
Inferential Methods to Assess the Difference in the Area Under the Curve From Nested Binary Regression Models
The area under the curve (AUC) is the most common statistical approach to evaluate the discriminatory power of a set of factors in a binary regression model. A nested model framework is used to ascertain whether the AUC increases when new factors enter the model. Two statistical tests are proposed for the difference in the AUC parameters from these nested models. The asymptotic null distributions for the two test statistics are derived under two scenarios: (A) the difference in the AUC parameters is zero and the new factors are not associated with the binary outcome, and (B) the difference in the AUC parameters is less than a strictly positive value. A confidence interval for the difference in AUC parameters is developed. Simulations are generated to determine the finite sample operating characteristics of the tests, and a pancreatic cancer data example is used to illustrate this approach.
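As a point of reference for the quantity being compared, the following Python sketch computes the empirical AUC as the Mann-Whitney proportion of correctly ordered case-control pairs, and the raw difference in AUC between a full and a reduced model. It covers the point estimate only; the two tests and the confidence interval proposed in the abstract are not reproduced. The function names and the assumption that each model's fitted scores are already available as lists are hypothetical.

```python
def auc(scores, labels):
    """Empirical AUC: fraction of (case, control) pairs ranked correctly.

    Ties count one half. Labels are 1 for cases and 0 for controls.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def auc_difference(scores_full, scores_reduced, labels):
    """AUC of the full model minus AUC of the nested (reduced) model."""
    return auc(scores_full, labels) - auc(scores_reduced, labels)
```

For example, `auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])` orders three of the four case-control pairs correctly.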
Polarization studies of Rotating Radio Transients
We study the polarization properties of 22 known rotating radio transients
(RRATs) with the 64-m Parkes radio telescope and present the Faraday rotation
measures (RMs) for the 17 with linearly polarized flux exceeding the off-pulse
noise by 3σ. Each RM was estimated using a brute-force search over trial
RMs that spanned the maximum measurable range (in steps of 1 rad m^-2), followed by an
iterative refinement algorithm. The measured RRAT RMs are in the range |RM|
to rad m^-2 with an average linear polarization
fraction of per cent. Individual single pulses are observed to be up
to 100 per cent linearly polarized. The RMs of the RRATs and the corresponding
inferred average magnetic fields (parallel to the line-of-sight and weighted by
the free electron density) are observed to be consistent with the Galactic
plane pulsar population. Faraday rotation analyses are typically performed on
accumulated pulsar data, for which hundreds to thousands of pulses have been
integrated, rather than on individual pulses. Therefore, we verified the
iterative refinement algorithm by performing Monte Carlo simulations of
artificial single pulses over a wide range of S/N and RM. At and above a S/N of
17 in linearly polarized flux, the iterative refinement recovers the simulated
RM value 100 per cent of the time with a typical mean uncertainty of
rad m^-2. The method described and validated here has also been
successfully used to determine reliable RMs of several fast radio bursts (FRBs)
discovered at Parkes.
Comment: Submitted to MNRAS, 10 pages, 6 figures
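The brute-force stage of an RM search can be sketched in a few lines of Python. This is a noise-free toy, not the authors' pipeline: the function names, the channel layout, and the intrinsic position angle `chi0` are assumptions, and the iterative refinement step is omitted. It relies only on the standard Faraday rotation relation, in which the linear polarization position angle varies as chi0 + RM λ², so the complex polarization Q + iU is de-rotated at each trial RM and the trial that maximizes the summed linear polarization is kept.

```python
import cmath

C = 299792458.0  # speed of light in m/s


def simulate_qu(freqs_hz, rm, chi0=0.3, lin=1.0):
    """Noise-free complex linear polarization Q + iU in each channel."""
    return [lin * cmath.exp(2j * (chi0 + rm * (C / f) ** 2)) for f in freqs_hz]


def brute_force_rm(freqs_hz, pol, rm_min, rm_max, step=1.0):
    """Return the trial RM (rad m^-2) that maximizes |sum of de-rotated Q + iU|."""
    lam2 = [(C / f) ** 2 for f in freqs_hz]
    n_steps = int(round((rm_max - rm_min) / step))
    best_rm, best_power = rm_min, -1.0
    for k in range(n_steps + 1):
        trial = rm_min + k * step
        # Undo the Faraday rotation implied by this trial RM in every channel.
        summed = sum(p * cmath.exp(-2j * trial * l2) for p, l2 in zip(pol, lam2))
        power = abs(summed)
        if power > best_power:
            best_rm, best_power = trial, power
    return best_rm


# Hypothetical band: 64 channels between about 1.24 and 1.50 GHz.
freqs = [1.241e9 + k * 4.0e6 for k in range(64)]
recovered = brute_force_rm(freqs, simulate_qu(freqs, rm=57.0), -200.0, 200.0)
```

With real single pulses the peak is broadened and perturbed by noise, which is where a refinement step and the Monte Carlo validation described above come in.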
Estimating the Empirical Lorenz Curve and Gini Coefficient in the Presence of Error
The Lorenz curve is a graphical tool that is widely used to characterize the concentration of a measure in a population, such as wealth. It is frequently the case that the measure of interest used to rank experimental units when estimating the empirical Lorenz curve, and the corresponding Gini coefficient, is subject to random error. This error can result in an incorrect ranking of experimental units, which inevitably leads to a curve that exaggerates the degree of concentration (variation) in the population. We explore this bias and discuss several widely available statistical methods that have the potential to reduce or remove the bias in the empirical Lorenz curve. The properties of these methods are examined and compared in a simulation study. This work is motivated by a health outcomes application which seeks to assess the concentration of black patient visits among primary care physicians. The methods are illustrated on data from this study.
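For readers unfamiliar with the two quantities, this is a minimal Python sketch of the empirical Lorenz curve and the Gini coefficient computed from it by the trapezoid rule. The function names are hypothetical, the measure is assumed to be non-negative with a positive total, and none of the bias-reduction methods discussed in the abstract are implemented.

```python
def lorenz_curve(values):
    """Points (cumulative share of units, cumulative share of the measure).

    Units are ranked from smallest to largest observed value.
    """
    ordered = sorted(values)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("the measure must have a positive total")
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / n, running / total))
    return points


def gini(values):
    """Gini coefficient: 1 minus twice the area under the Lorenz curve."""
    pts = lorenz_curve(values)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid between adjacent points
    return 1.0 - 2.0 * area
```

A population in which every unit has the same true value has a Gini coefficient of zero, yet ranking on error-prone observations of those same values yields a positive estimate, which is the direction of bias the abstract describes.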
Quantum interference of tunneling paths under a double-well barrier
The tunnel effect, a hallmark of the quantum realm, involves motion across a
classically forbidden region. In a driven nonlinear system, two or more
tunneling paths may coherently interfere, enhancing or cancelling the tunnel
effect. Since individual quantum systems are difficult to control, this
interference effect has only been studied for the lowest energy states of
many-body ensembles. In our experiment, we show a coherent cancellation of the
tunneling amplitude in the ground and excited state manifold of an individual
squeeze-driven Kerr oscillator, a consequence of the destructive interference
of tunneling paths in the classically forbidden region. The tunnel splitting
vanishes periodically in the spectrum as a function of the frequency of the
squeeze-drive, with the periodicity given by twice the Kerr coefficient. This
resonant cancellation, combined with an overall exponential reduction of
tunneling as a function of both amplitude and frequency of the squeeze-drive,
reduces drastically the well-switching rate under incoherent
environment-induced evolution. The control of tunneling via interference
effects can be applied to quantum computation and to molecular and nuclear physics.
A Metastasis or a Second Independent Cancer? Evaluating the Clonal Origin of Tumors Using Array-CGH Data
When a cancer patient develops a new tumor, it is necessary to determine if this is a recurrence (metastasis) of the original cancer or an entirely new occurrence of the disease. This is accomplished by assessing the histo-pathology of the lesions, and it is frequently relatively straightforward. However, there are many clinical scenarios in which this pathological diagnosis is difficult. Since each tumor is characterized by a genetic fingerprint of somatic mutations, a more definitive diagnosis is possible in principle in these difficult clinical scenarios by comparing the fingerprints. In this article we develop and evaluate a statistical strategy for this comparison when the data are derived from array comparative genomic hybridization, a technique designed to identify all of the somatic allelic gains and losses across the genome. Our method involves several stages. First, a segmentation algorithm is used to estimate the regions of allelic gain and loss. Then the broad correlation in these patterns between the two tumors is assessed, leading to an initial likelihood ratio for the two diagnoses. This is then further refined by comparing in detail each plausibly clonal mutation within individual chromosome arms, and the results are aggregated to determine a final likelihood ratio. The method is employed to diagnose patients from several clinical scenarios, and the results show that in many cases a strong clonal signal emerges, occasionally contradicting the clinical diagnosis. The “quality” of the arrays can be summarized by a parameter that characterizes the clarity with which allelic changes are detected. Sensitivity analyses show that most of the diagnoses are robust when the data are of high quality.