1,944 research outputs found

A Faster Circular Binary Segmentation Algorithm for the Analysis of Array CGH Data

Author: Olshen Adam
Venkatraman E S
Publication venue: Collection of Biostatistics Research Archive
Publication date: 07/06/2006

Motivation: Array CGH technologies enable the simultaneous measurement of DNA copy number for thousands of sites on a genome. We developed the circular binary segmentation (CBS) algorithm to divide the genome into regions of equal copy number (Olshen et al., 2004). The algorithm tests for change-points using a maximal $t$-statistic with a permutation reference distribution to obtain the corresponding $p$-value. The number of computations required for the maximal test statistic is $O(N^2)$, where $N$ is the number of markers. This makes the full permutation approach computationally prohibitive for the newer arrays that contain tens of thousands of markers and highlights the need for a faster algorithm. Results: We present a hybrid approach to obtain the $p$-value of the test statistic in linear time. We also introduce a rule for stopping early when there is strong evidence for the presence of a change. We show through simulations that the hybrid approach provides a substantial gain in speed with only a negligible loss in accuracy and that the stopping rule further increases speed. We also present the analysis of array CGH data from a breast cancer cell line to show the impact of the new approaches on the analysis of real data. Availability: An R (R Development Core Team, 2006) version of the CBS algorithm has been implemented in the "DNAcopy" package of the Bioconductor project (Gentleman et al., 2004). The proposed hybrid method for the $p$-value is available in version 1.2.1 or higher and the stopping rule for declaring a change early is available in version 1.5.1 or higher.
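The quadratic cost described in this abstract can be made concrete with a minimal sketch. It is not the DNAcopy implementation and does not include the hybrid p-value or the early stopping rule. It assumes a simplified statistic (difference of arc mean and complement mean, scaled by one overall standard deviation rather than a pooled within-segment one), and the function names `max_arc_statistic` and `permutation_p_value` are hypothetical.

```python
import numpy as np


def max_arc_statistic(x):
    """Largest standardized mean difference between an arc x[i:i+k] and the rest."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s = np.concatenate(([0.0], np.cumsum(x)))  # s[k] = x[0] + ... + x[k-1]
    total = s[-1]
    # Assumption: a single overall SD is used as the scale. It is unchanged by
    # permutation, so it does not affect the permutation p-value.
    scale = x.std(ddof=1)
    best = 0.0
    for i in range(n):  # O(N) arc starts times O(N) arc lengths: O(N^2) overall
        k = np.arange(1, n - i + 1)  # lengths of arcs starting at index i
        k = k[k < n]  # the complement of the arc must be non-empty
        arc_sum = s[i + k] - s[i]
        diff = arc_sum / k - (total - arc_sum) / (n - k)
        z = np.abs(diff) / (scale * np.sqrt(1.0 / k + 1.0 / (n - k)))
        if z.size:
            best = max(best, float(z.max()))
    return best


def permutation_p_value(x, n_perm=99, seed=0):
    """Full permutation reference distribution for the maximal statistic."""
    rng = np.random.default_rng(seed)
    observed = max_arc_statistic(x)
    exceed = sum(
        max_arc_statistic(rng.permutation(x)) >= observed for _ in range(n_perm)
    )
    return (exceed + 1) / (n_perm + 1)
```

Each permutation repeats the full quadratic scan, which is why this brute-force form becomes impractical for arrays with tens of thousands of markers.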

Collection Of Biostatistics Research Archive

Comparing ROC Curves Derived From Regression Models

Author: Begg Colin B
Gonen Mithat
Seshan Venkatraman E
Publication venue: Collection of Biostatistics Research Archive
Publication date: 08/06/2011

In constructing predictive models, investigators frequently assess the incremental value of a predictive marker by comparing the ROC curve generated from the predictive model including the new marker with the ROC curve from the model excluding the new marker. Many commentators have noticed empirically that a test of the two ROC areas often produces a non-significant result when a corresponding Wald test from the underlying regression model is significant. A recent article showed using simulations that the widely-used ROC area test [1] produces exceptionally conservative test size and extremely low power [2]. In this article we show why the ROC area test is invalid in this context. We demonstrate how a valid test of the ROC areas can be constructed that has comparable statistical properties to the Wald test. We conclude that using the Wald test to assess the incremental contribution of a marker remains the best strategy. We also examine the use of derived markers from non-nested models and the use of validation samples. We show that comparing ROC areas is invalid in these contexts as well

Crossref

Collection Of Biostatistics Research Archive

Relativistic spin precession in the binary PSR J1141$-$6545

Author: Bailes M.
Bhat N. D. R.
Flynn C.
Keane E. F.
Kramer M.
Krishnan V. Venkatraman
Osłowski S.
van Straten W.
Publication venue: American Astronomical Society
Publication date: 25/02/2019

PSR J1141$-$6545 is a precessing binary pulsar that has the rare potential to reveal the two-dimensional structure of a non-recycled pulsar emission cone. It has undergone $\sim 25^\circ$ of relativistic spin precession in the $\sim 18$ years since its discovery. In this paper, we present a detailed Bayesian analysis of the precessional evolution of the width of the total intensity profile, to understand the changes to the line-of-sight impact angle ($\beta$) of the pulsar using four different physically motivated prior distribution models. Although we cannot statistically differentiate between the models with confidence, the temporal evolution of the linear and circular polarisations strongly argues that our line-of-sight crossed the magnetic pole around MJD 54000 and that only two models remain viable. For both these models, it appears likely that the pulsar will precess out of our line-of-sight in the next 3-5 years, assuming a simple beam geometry. Marginalising over $\beta$ suggests that the pulsar is a near-orthogonal rotator and provides the first polarization-independent estimate of the scale factor ($\mathbb{A}$) that relates the pulsar beam opening angle ($\rho$) to its rotational period ($P$) as $\rho = \mathbb{A}P^{-0.5}$: we find it to be $> 6~\mathrm{deg~s^{0.5}}$ at 1.4 GHz with 99% confidence. If all pulsars emit from opposite poles of a dipolar magnetic field with comparable brightness, we might expect to see evidence of an interpulse arising in PSR J1141$-$6545, unless the emission is patchy. Comment: Accepted for publication in Astrophysical Journal Letters
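As a small worked illustration of the scaling relation quoted in this abstract, the sketch below evaluates $\rho = \mathbb{A}P^{-0.5}$ numerically. The function name is hypothetical, and the period used in the note that follows is an illustrative value, not a measured property of this pulsar.

```python
def beam_opening_angle_deg(scale_factor, period_s):
    """Beam opening angle rho (deg) from rho = A * P**-0.5.

    scale_factor: A in deg s^0.5; period_s: rotational period P in seconds.
    """
    if period_s <= 0:
        raise ValueError("period must be positive")
    return scale_factor * period_s ** -0.5
```

With the quoted lower bound of 6 deg s^0.5 and a hypothetical period of 0.25 s, the relation gives an opening angle of at least 12 deg, since 0.25 raised to the power -0.5 is 2.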

arXiv.org e-Print Archive

MPG.PuRe

Statistical Evaluation of Evidence for Clonal Allelic Alterations in array-CGH Experiments

Author: Begg Colin B
Eng Kevin
Olshen Adam
Venkatraman E S
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/03/2007

In recent years numerous investigators have conducted genetic studies of pairs of tumor specimens from the same patient to determine whether the tumors share a clonal origin. These studies have the potential to be of considerable clinical significance, especially in clinical settings where the distinction of a new primary cancer and metastatic spread of a previous cancer would lead to radically different indications for treatment. Studies of clonality have typically involved comparison of the patterns of somatic mutations in the tumors at candidate genetic loci to see if the patterns are sufficiently similar to indicate a clonal origin. More recently, some investigators have explored the use of array CGH for this purpose. Standard clustering approaches have been used to analyze the data, but these existing statistical methods are not suited to this problem due to the paired nature of the data, and the fact that there exists no “gold standard” diagnosis to provide a definitive determination of which pairs are clonal and which pairs are of independent origin. In this article we propose a new statistical method that focuses on the individual allelic gains or losses that have been identified in both tumors, and a statistical test is developed that assesses the degree of matching of the locations of the markers that indicate the endpoints of the allelic change. The validity and statistical power of the test are evaluated, and it is shown to be a promising approach for establishing clonality in tumor samples.

Collection Of Biostatistics Research Archive

Inferential Methods to Assess the Difference in the Area Under the Curve From Nested Binary Regression Models

Author: Gonen Mithat
Heller Glenn
Moskowitz Chaya S
Seshan Venkatraman E
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/01/2015

The area under the curve (AUC) is the most common statistical approach to evaluate the discriminatory power of a set of factors in a binary regression model. A nested model framework is used to ascertain whether the AUC increases when new factors enter the model. Two statistical tests are proposed for the difference in the AUC parameters from these nested models. The asymptotic null distributions for the two test statistics are derived from the scenarios: (A) the difference in the AUC parameters is zero and the new factors are not associated with the binary outcome, (B) the difference in the AUC parameters is less than a strictly positive value. A confidence interval for the difference in AUC parameters is developed. Simulations are generated to determine the finite sample operating characteristics of the tests and a pancreatic cancer data example is used to illustrate this approach
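For readers unfamiliar with the quantity being compared, the empirical AUC can be sketched as the fraction of (positive, negative) pairs that the model score ranks correctly, counting ties as one half. This is only the standard estimate of the AUC itself, not either of the two tests proposed in the paper, and the function name `empirical_auc` is hypothetical.

```python
import numpy as np


def empirical_auc(scores, labels):
    """Empirical AUC: P(score of a positive > score of a negative), ties count 1/2."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative outcome")
    diff = pos[:, None] - neg[None, :]  # every positive compared with every negative
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))
```

In a nested-model comparison, one would compute this estimate once with the fitted scores from the smaller model and once with those from the larger model. The difference of the two values is the quantity whose null distribution the abstract discusses.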

Collection Of Biostatistics Research Archive

Polarization studies of Rotating Radio Transients

Author: Bailes M.
Barr E. D.
Caleb M.
Flynn C.
Ilie C. D.
Jameson A.
Keane E. F.
Krishnan V. Venkatraman
Petroff E.
Rogers A.
Stappers B. W.
van Straten W.
Weltevrede P.
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2019

We study the polarization properties of 22 known rotating radio transients (RRATs) with the 64-m Parkes radio telescope and present the Faraday rotation measures (RMs) for the 17 with linearly polarized flux exceeding the off-pulse noise by 3$\sigma$. Each RM was estimated using a brute-force search over trial RMs that spanned the maximum measurable range $\pm1.18 \times 10^5 \, \mathrm{rad \, m^{-2}}$ (in steps of 1 $\mathrm{rad \, m^{-2}}$), followed by an iterative refinement algorithm. The measured RRAT RMs are in the range |RM| $\sim 1$ to $\sim 950$ rad m$^{-2}$ with an average linear polarization fraction of $\sim 40$ per cent. Individual single pulses are observed to be up to 100 per cent linearly polarized. The RMs of the RRATs and the corresponding inferred average magnetic fields (parallel to the line-of-sight and weighted by the free electron density) are observed to be consistent with the Galactic plane pulsar population. Faraday rotation analyses are typically performed on accumulated pulsar data, for which hundreds to thousands of pulses have been integrated, rather than on individual pulses. Therefore, we verified the iterative refinement algorithm by performing Monte Carlo simulations of artificial single pulses over a wide range of S/N and RM. At and above a S/N of 17 in linearly polarized flux, the iterative refinement recovers the simulated RM value 100 per cent of the time with a typical mean uncertainty of $\sim5$ rad m$^{-2}$. The method described and validated here has also been successfully used to determine reliable RMs of several fast radio bursts (FRBs) discovered at Parkes. Comment: Submitted to MNRAS, 10 pages, 6 figures
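The brute-force stage mentioned in this abstract can be sketched as follows, under stated assumptions. The sketch derotates the complex linear polarization Q + iU in each frequency channel by every trial RM and keeps the trial that maximizes the summed linearly polarized flux. This is a generic form of such a search, not the authors' pipeline, and it omits their iterative refinement step. The function name and the channel layout used in any demonstration are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def brute_force_rm(freq_hz, q, u, trial_rms):
    """Return the trial RM (rad m^-2) that maximizes the derotated linear flux."""
    lam2 = (C / np.asarray(freq_hz, dtype=float)) ** 2  # wavelength squared per channel
    p = np.asarray(q, dtype=float) + 1j * np.asarray(u, dtype=float)
    trial_rms = np.asarray(trial_rms, dtype=float)
    # Faraday rotation turns the polarization angle by RM * lambda^2, so the
    # complex polarization picks up a phase of 2 * RM * lambda^2. Undo it per trial.
    derotated = p[None, :] * np.exp(-2j * trial_rms[:, None] * lam2[None, :])
    linear_flux = np.abs(derotated.sum(axis=1))  # coherent sum across channels
    return float(trial_rms[np.argmax(linear_flux)])
```

At the correct RM the channels add in phase, so the summed flux peaks there. The grid step sets the resolution of this stage, which is presumably why a refinement step follows it in the paper.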

arXiv.org e-Print Archive

The University of Manchester - Institutional Repository

UvA-DARE

MPG.PuRe

International Migration, Integration and Social Cohesion online publications

Estimating the Empirical Lorenz Curve and Gini Coefficient in the Presence of Error

Author: Begg Colin B.
Moskowitz Chaya S
Riedel Elyn
Venkatraman E. S.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 15/01/2007

The Lorenz curve is a graphical tool that is widely used to characterize the concentration of a measure in a population, such as wealth. It is frequently the case that the measure of interest used to rank experimental units when estimating the empirical Lorenz curve, and the corresponding Gini coefficient, is subject to random error. This error can result in an incorrect ranking of experimental units which inevitably leads to a curve that exaggerates the degree of concentration (variation) in the population. We explore this bias and discuss several widely available statistical methods that have the potential to reduce or remove the bias in the empirical Lorenz curve. The properties of these methods are examined and compared in a simulation study. This work is motivated by a health outcomes application which seeks to assess the concentration of black patient visits among primary care physicians. The methods are illustrated on data from this study
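The two quantities named in this abstract can be sketched in a few lines. The code below builds the empirical Lorenz curve by ranking units on the measure and accumulating shares, then takes the Gini coefficient as one minus twice the area under that curve. It does not implement any of the bias-reduction methods the paper compares, and the function names are hypothetical.

```python
import numpy as np


def lorenz_curve(values):
    """Return (population share, cumulative share of the measure), both starting at 0."""
    v = np.sort(np.asarray(values, dtype=float))  # rank units from smallest to largest
    if v.size == 0 or v[0] < 0 or v.sum() <= 0:
        raise ValueError("values must be non-negative with a positive total")
    cum_share = np.concatenate(([0.0], np.cumsum(v) / v.sum()))
    pop_share = np.linspace(0.0, 1.0, v.size + 1)
    return pop_share, cum_share


def gini_coefficient(values):
    """Gini coefficient as 1 - 2 * (trapezoid area under the empirical Lorenz curve)."""
    pop, cum = lorenz_curve(values)
    area = np.sum((cum[1:] + cum[:-1]) / 2.0 * np.diff(pop))
    return float(1.0 - 2.0 * area)
```

Because the curve depends on the ranking, random error in the measured values reorders units and adds spread, which is the source of the upward bias the abstract describes.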

Collection Of Biostatistics Research Archive

Quantum interference of tunneling paths under a double-well barrier

Author: Cortinas Rodrigo G.
Devoret Michel H.
Frattini Nicholas E.
Venkatraman Jayameenakshi
Xiao Xu
Publication venue
Publication date: 08/11/2022

The tunnel effect, a hallmark of the quantum realm, involves motion across a classically forbidden region. In a driven nonlinear system, two or more tunneling paths may coherently interfere, enhancing or cancelling the tunnel effect. Since individual quantum systems are difficult to control, this interference effect has only been studied for the lowest energy states of many-body ensembles. In our experiment, we show a coherent cancellation of the tunneling amplitude in the ground and excited state manifold of an individual squeeze-driven Kerr oscillator, a consequence of the destructive interference of tunneling paths in the classically forbidden region. The tunnel splitting vanishes periodically in the spectrum as a function of the frequency of the squeeze-drive, with the periodicity given by twice the Kerr coefficient. This resonant cancellation, combined with an overall exponential reduction of tunneling as a function of both amplitude and frequency of the squeeze-drive, reduces drastically the well-switching rate under incoherent environment-induced evolution. The control of tunneling via interference effects can be applied to quantum computation, molecular, and nuclear physics

arXiv.org e-Print Archive

A Metastasis or a Second Independent Cancer? Evaluating the Clonal Origin of Tumors Using Array-CGH Data

Author: Albertson D G
Begg Colin B
Olshen Adam
Orlow Irene
Ostrovnaya Irina
Seshan Venkatraman E
Publication venue: Collection of Biostatistics Research Archive
Publication date: 12/08/2008

When a cancer patient develops a new tumor it is necessary to determine if this is a recurrence (metastasis) of the original cancer, or an entirely new occurrence of the disease. This is accomplished by assessing the histo-pathology of the lesions, and it is frequently relatively straightforward. However, there are many clinical scenarios in which this pathological diagnosis is difficult. Since each tumor is characterized by a genetic fingerprint of somatic mutations, a more definitive diagnosis is possible in principle in these difficult clinical scenarios by comparing the fingerprints. In this article we develop and evaluate a statistical strategy for this comparison when the data are derived from array comparative genomic hybridization, a technique designed to identify all of the somatic allelic gains and losses across the genome. Our method involves several stages. First a segmentation algorithm is used to estimate the regions of allelic gain and loss. Then the broad correlation in these patterns between the two tumors is assessed, leading to an initial likelihood ratio for the two diagnoses. This is then further refined by comparing in detail each plausibly clonal mutation within individual chromosome arms, and the results are aggregated to determine a final likelihood ratio. The method is employed to diagnose patients from several clinical scenarios, and the results show that in many cases a strong clonal signal emerges, occasionally contradicting the clinical diagnosis. The “quality” of the arrays can be summarized by a parameter that characterizes the clarity with which allelic changes are detected. Sensitivity analyses show that most of the diagnoses are robust when the data are of high quality

Collection Of Biostatistics Research Archive