    Some Observations on the Wilcoxon Rank Sum Test

    This manuscript presents some general comments about the Wilcoxon rank sum test. Even the most casual reader will gather that I am not too impressed with the scientific usefulness of the Wilcoxon test. However, the actual motivation is more to illustrate differences between parametric, semiparametric, and nonparametric (distribution-free) inference, and to use this example to show how many misconceptions have been propagated through a focus on (semi)parametric probability models as the basis for evaluating commonly used statistical analysis models. The document itself arose as a teaching tool for courses aimed at graduate students in biostatistics and statistics, with parts of the document originally written for applied biostatistics classes and parts written for a course in mathematical statistics. Hence, some of the material is also meant to provide an illustration of common methods of deriving moments of distributions, etc.
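
    For readers following the moment derivations, a minimal Python sketch of the normal approximation built from the exact null moments of the rank-sum statistic: with group sizes n and m and no ties, the sum of one group's ranks has null mean n(n+m+1)/2 and variance nm(n+m+1)/12. The data below are hypothetical and the tie correction is omitted.

        import numpy as np
        from scipy.stats import rankdata, norm

        def wilcoxon_rank_sum(x, y):
            """Rank-sum statistic for group x, with its exact null moments."""
            n, m = len(x), len(y)
            ranks = rankdata(np.concatenate([x, y]))  # midranks if ties occur
            w = ranks[:n].sum()                       # sum of x's ranks
            mu = n * (n + m + 1) / 2.0                # null mean
            var = n * m * (n + m + 1) / 12.0          # null variance (no ties)
            z = (w - mu) / np.sqrt(var)
            return w, z, 2 * norm.sf(abs(z))          # two-sided approximate p-value

        rng = np.random.default_rng(1)
        x, y = rng.normal(0.0, 1.0, 30), rng.normal(0.5, 1.0, 40)
        print(wilcoxon_rank_sum(x, y))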

    The Importance of Statistical Theory in Outlier Detection

    We explore the performance of the outlier-sum statistic (Tibshirani and Hastie, Biostatistics 2007 8:2-8), a proposed method for identifying genes for which only a subset of a group of samples or patients exhibits differential expression levels. Our discussion focuses on this method as an example of how inattention to standard statistical theory can lead to approaches that exhibit some serious drawbacks. In contrast to the results presented by those authors, when comparing this method to several variations of the t-test, we find that the proposed method offers little benefit even in the most idealized scenarios, and suffers from a number of limitations including difficulty of calibration, high false positive rates owing to its asymmetric treatment of groups, poor power or discriminatory ability under many alternatives, and poorly defined application to one-sample settings. Further issues in the Tibshirani and Hastie paper concern the presentation and accuracy of their simulation results; we were unable to reproduce their findings, and we discuss several undesirable and implausible aspects of their results.
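
    As orientation for the discussion, here is a sketch of an outlier-sum computation in the spirit of Tibshirani and Hastie (2007): values are standardized by the pooled median and MAD, and the statistic sums the standardized case values falling beyond the 75th percentile plus one IQR. Published versions of the method differ in the exact centering and scaling conventions, so this is one plausible reading, not the authors' implementation.

        import numpy as np

        def outlier_sum(x_control, x_case):
            """Outlier-sum statistic for one gene (one reading of the method)."""
            pooled = np.concatenate([x_control, x_case])
            med = np.median(pooled)
            mad = np.median(np.abs(pooled - med)) or 1.0  # guard against zero MAD
            z_case = (x_case - med) / mad                 # standardized case values
            z_all = (pooled - med) / mad
            q25, q75 = np.percentile(z_all, [25, 75])
            cutoff = q75 + (q75 - q25)                    # 75th percentile + IQR
            return z_case[z_case > cutoff].sum()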

    Exploring the Benefits of Adaptive Sequential Designs in Time-to-Event Endpoint Settings

    Sequential analysis is frequently employed to address ethical and financial issues in clinical trials. Sequential analysis may be performed using standard group sequential designs, or, more recently, with adaptive designs that use estimates of treatment effect to modify the maximal statistical information to be collected. In the general setting in which statistical information and clinical trial costs are functions of the number of subjects used, it has yet to be established whether there is any major efficiency advantage to adaptive designs over traditional group sequential designs. In survival analysis, however, statistical information (and hence efficiency) is most closely related to the observed number of events, while trial costs still depend on the number of patients accrued. As the number of subjects may dominate the cost of a trial, an adaptive design that specifies a reduced maximal possible sample size when an extreme treatment effect has been observed may allow early termination of accrual and therefore a more cost-efficient trial. We investigate and compare the tradeoffs between efficiency (as measured by average number of observed events required), power, and cost (a function of the number of subjects accrued and length of observation) for standard group sequential methods and an adaptive design that allows for early termination of accrual. We find that when certain trial design parameters are constrained, an adaptive approach to terminating subject accrual may improve upon the cost efficiency of a group sequential clinical trial investigating time-to-event endpoints. However, when the spectrum of group sequential designs considered is broadened, the advantage of the adaptive designs is less clear.
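
    The cost trade-off described above can be made concrete with a toy calculation: under uniform accrual and exponential event times, the same target number of events (and hence roughly the same statistical information) can be reached by accruing fewer subjects and following them longer. All rates, targets, and unit costs below are hypothetical.

        import numpy as np
        from scipy.optimize import brentq

        def expected_events(t, n_subjects, accrual_time, lam):
            """Expected events by calendar time t >= accrual_time, with uniform
            accrual on [0, accrual_time] and exponential(lam) event times."""
            a = n_subjects / accrual_time
            return n_subjects - (a / lam) * np.exp(-lam * t) * np.expm1(lam * accrual_time)

        def trial_cost(n_subjects, target_events, accrual_rate, lam,
                       cost_per_subject=1.0, cost_per_year=50.0):
            accrual_time = n_subjects / accrual_rate
            duration = brentq(lambda t: expected_events(t, n_subjects, accrual_time, lam)
                              - target_events, accrual_time, 100.0)
            return cost_per_subject * n_subjects + cost_per_year * duration

        # Same number of events from fewer subjects followed longer:
        for n in (400, 600, 800):
            print(n, round(trial_cost(n, target_events=300, accrual_rate=200, lam=0.2), 1))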

    Constrained Boundary Monitoring for Group Sequential Clinical Trials

    Group sequential stopping rules are often used during the conduct of clinical trials in order to attain more ethical treatment of patients and to better address efficiency concerns. Because the use of such stopping rules materially affects the frequentist operating characteristics of the hypothesis test, it is necessary to choose an appropriate stopping rule during the planning of the study. It is often the case, however, that the number and timing of interim analyses are not precisely known at the time of trial design, and thus the implementation of a particular stopping rule must allow for flexible determination of the schedule of interim analyses. In this paper we consider the use of constrained stopping boundaries in the implementation of stopping rules. We compare this approach when used on various scales for the test statistic. When implemented on the scale of boundary crossing probabilities, this approach is identical to the error spending function approach of Lan & DeMets (1983).
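
    For reference, the two most familiar members of the error spending family have simple closed forms; the O'Brien-Fleming-type and Pocock-type spending functions below are the standard forms associated with the Lan & DeMets (1983) approach, evaluated at an arbitrary schedule of information fractions.

        import numpy as np
        from scipy.stats import norm

        alpha = 0.05  # overall two-sided type I error

        def obf_spending(t):
            """O'Brien-Fleming-type spending: 2(1 - Phi(z_{1-alpha/2} / sqrt(t)))."""
            return 2.0 * norm.sf(norm.ppf(1 - alpha / 2) / np.sqrt(t))

        def pocock_spending(t):
            """Pocock-type spending: alpha * ln(1 + (e - 1) t)."""
            return alpha * np.log(1 + (np.e - 1) * t)

        # Cumulative type I error "spent" at information fractions that need not
        # be fixed in advance; both functions reach alpha at t = 1.
        for t in (0.25, 0.5, 0.75, 1.0):
            print(f"t={t:.2f}  OBF: {obf_spending(t):.5f}  Pocock: {pocock_spending(t):.5f}")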

    Evaluating a Group Sequential Design in the Setting of Nonproportional Hazards

    Group sequential methods have been widely described and implemented in clinical trial settings where parametric and semiparametric models are deemed suitable. In these situations, the evaluation of the operating characteristics of a group sequential stopping rule remains relatively straightforward. However, when survival data exhibit nonproportional hazards, nonparametric methods are often used, and the evaluation of stopping rules is no longer a trivial task. Specifically, nonparametric test statistics do not necessarily correspond to a parameter of clinical interest, thus making it difficult to characterize alternatives at which operating characteristics are to be computed. We describe an approach for constructing alternatives under nonproportional hazards using pre-existing pilot data, allowing one to evaluate various operating characteristics of candidate group sequential stopping rules. The method is illustrated via a case study in which testing is based upon a weighted logrank statistic.
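
    As background for the case study, a minimal sketch of a two-sample weighted logrank statistic with Fleming-Harrington G(rho, gamma) weights, one common weighting family under nonproportional hazards. This is a generic implementation for illustration, not the specific weight function used in the paper.

        import numpy as np

        def weighted_logrank(time, event, group, rho=0.0, gamma=1.0):
            """Weighted logrank z-statistic; weights w(t) = S(t-)^rho (1 - S(t-))^gamma
            use the pooled Kaplan-Meier estimate S."""
            time, event, group = map(np.asarray, (time, event, group))
            s_prev, num, var = 1.0, 0.0, 0.0
            for t in np.unique(time[event == 1]):        # event times, ascending
                at_risk = time >= t
                y, y1 = at_risk.sum(), (at_risk & (group == 1)).sum()
                d = ((time == t) & (event == 1)).sum()
                d1 = ((time == t) & (event == 1) & (group == 1)).sum()
                w = s_prev**rho * (1.0 - s_prev)**gamma  # Fleming-Harrington weight
                num += w * (d1 - y1 * d / y)             # observed minus expected
                if y > 1:                                # hypergeometric variance
                    var += w**2 * d * (y1 / y) * (1 - y1 / y) * (y - d) / (y - 1)
                s_prev *= 1.0 - d / y                    # pooled Kaplan-Meier update
            return num / np.sqrt(var)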

    Robustness of approaches to ROC curve modeling under misspecification of the underlying probability model

    The receiver operating characteristic (ROC) curve is a tool of particular use in disease status classification with a continuous medical test (marker). A variety of statistical regression models have been proposed for the comparison of ROC curves for different markers across covariate groups. Full parametric modeling of the marker distribution has generally been found to be overly reliant on strong parametric assumptions. Pepe (2003) has instead developed parametric models for the ROC curve that induce a semiparametric model for the marker distributions. The estimating equations proposed for use in these ROC-GLM models may differ from commonly used estimating equations in those same probability models. In this paper, we investigate the analysis of the power ROC curve when based on the parametric exponential model and the broader semiparametric proportional hazards probability model. In the case of the latter, we consider estimating equations derived from the usual partial likelihood methods in time-to-event analyses and the ROC-GLM approach of Pepe et al. In exploring the robustness of these ROC analysis approaches to violations of the distributional assumptions, we find that the ROC-GLM estimating equation provides an extra measure of robustness when compared to the Cox proportional hazards estimating equation.
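
    The power ROC curve referenced above has the form ROC(t) = t^theta, and when that form holds the case placement values S0(Y_case) are Beta(theta, 1), which yields a closed-form estimate. The plug-in estimator and simulated data below are illustrative only; they are not the ROC-GLM or partial likelihood estimating equations compared in the paper.

        import numpy as np

        def power_roc_theta(controls, cases):
            """Estimate theta in ROC(t) = t^theta from case placement values,
            which are Beta(theta, 1) under the power model."""
            s0 = lambda y: np.mean(controls > y)  # empirical control survivor fn
            u = np.array([s0(y) for y in cases])  # placement values
            u = np.clip(u, 1e-6, 1.0)             # guard log(0) for extreme cases
            return -len(u) / np.log(u).sum()      # Beta(theta, 1) MLE

        rng = np.random.default_rng(2)
        controls = rng.exponential(1.0, 200)
        cases = rng.exponential(2.0, 200)         # proportional hazards, ratio 0.5
        print(power_roc_theta(controls, cases))   # roughly 0.5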

    Bio-Creep in Non-Inferiority Clinical Trials

    After a non-inferiority clinical trial, a new therapy may be accepted as effective, even if its treatment effect is slightly smaller than the current standard. It is therefore possible that, after a series of trials where the new therapy is slightly worse than the preceding drugs, an ineffective or harmful therapy might be incorrectly declared efficacious; this is known as “bio-creep.” Several factors may influence the rate at which bio-creep occurs, including the distribution of the effects of the new agents being tested and how that changes over time, the choice of active comparator, the method used to model the variability of the estimate of the effect of the active comparator, and changes in the effect of the active comparator from one trial to the next (violations of the constancy assumption). We performed a simulation study to examine which of these factors might lead to bio-creep and found that bio-creep was rare, except when the constancy assumption was violated.
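
    A toy version of this kind of simulation shows the mechanism: each generation, a candidate is tested against the current comparator with a fixed non-inferiority margin and, if declared non-inferior, becomes the next comparator. Every numeric input below (initial effect, drift, margin, standard errors) is hypothetical and chosen only to illustrate how small losses can accumulate.

        import numpy as np

        rng = np.random.default_rng(3)

        def bio_creep_sim(n_generations=10, margin=0.1, se=0.05,
                          drift=-0.02, effect_sd=0.05, n_sims=2000):
            """Fraction of simulated trial sequences that end with an
            ineffective comparator (true effect vs placebo <= 0)."""
            bad = 0
            for _ in range(n_sims):
                comparator = 0.3                      # true effect vs placebo
                for _ in range(n_generations):
                    candidate = comparator + rng.normal(drift, effect_sd)
                    est_diff = candidate - comparator + rng.normal(0, se)
                    if est_diff > -margin:            # declared non-inferior
                        comparator = candidate        # becomes next comparator
                bad += comparator <= 0
            return bad / n_sims

        print(bio_creep_sim())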

    Estimates of Information Growth in Longitudinal Clinical Trials

    In group sequential clinical trials, it is necessary to estimate the amount of information present at interim analysis times relative to the amount of information that would be present at the final analysis. If only one measurement is made per individual, this is often the ratio of sample sizes available at the interim and final analyses. However, as discussed by Wu and Lan (1992), when the statistic of interest is a change over time, as with longitudinal data, such an approach overstates the information. In this paper, we discuss other problems that can result in overestimating the information, such as heteroscedasticity and correlated observations. We demonstrate that when using an inefficient estimator on unbalanced data, the true information growth can be nonmonotonic across interim analyses.
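
    A small numerical illustration of the phenomenon: with every subject already enrolled at the interim but follow-up incomplete, the naive subject ratio is 1 while the Fisher information for the slope is well below its final value. The measurement schedule and the AR(1)-type within-subject correlation below are assumptions made for illustration.

        import numpy as np

        def slope_information(times_per_subject, rho=0.5, sigma2=1.0):
            """Fisher information for the slope in a GLS fit of y ~ 1 + time,
            with within-subject correlation rho^|t_i - t_j|."""
            total = np.zeros((2, 2))
            for t in times_per_subject:
                t = np.asarray(t, dtype=float)
                X = np.column_stack([np.ones_like(t), t])      # intercept + slope
                V = sigma2 * rho ** np.abs(np.subtract.outer(t, t))
                total += X.T @ np.linalg.solve(V, X)
            return 1.0 / np.linalg.inv(total)[1, 1]            # info about the slope

        # Interim: all 100 subjects enrolled, but half observed only to time 1.
        interim = [[0, 1, 2]] * 50 + [[0, 1]] * 50
        final = [[0, 1, 2]] * 100
        print("information fraction:", slope_information(interim) / slope_information(final))
        print("naive subject ratio: 1.00")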

    A Comparison of Parametric and Coarsened Bayesian Interval Estimation in the Presence of a Known Mean-Variance Relationship

    While Bayesian methods of analysis have become increasingly common, classical frequentist hypothesis testing still holds sway in medical research, especially clinical trials. One major difference between a standard frequentist approach and the most common Bayesian approaches is that even when a frequentist hypothesis test is derived from parametric models, the interpretation and operating characteristics of the test may be considered in a distribution-free manner. Bayesian inference, on the other hand, is often conducted in a parametric setting where the interpretation of the results is dependent on the parametric model. Here we consider a Bayesian counterpart to the most standard frequentist approach to inference. Instead of specifying a sampling distribution for the data, we specify an approximate distribution of a summary statistic, thereby resulting in a “coarsening” of the data. This approach is robust in that it provides some protection against model misspecification and allows one to account for the possibility of a specified mean-variance relationship. Notably, the method also allows one to place prior mass directly on the quantity of interest or, alternatively, to employ a noninformative prior, a counterpart to the standard frequentist approach. We explore interval estimation of a population location parameter in the presence of a mean-variance relationship, a problem that is not well addressed by standard nonparametric frequentist methods. We find that the method has performance comparable to that of the correct parametric model, and performs notably better than some plausible but incorrect models. Finally, we apply the method to a real data set and compare our results to those previously reported.
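
    A minimal sketch of the coarsening idea: rather than modeling the full sample, model only the approximate distribution of the sample mean, with the mean-variance relationship specified. The flat prior on the grid and the relationship V(theta) = theta^2 below are hypothetical choices for illustration, not the paper's application.

        import numpy as np
        from scipy.stats import norm

        def coarsened_posterior(xbar, n, theta_grid, var_fn, prior=None):
            """Grid posterior when only the sample mean is modeled:
            xbar | theta ~ approximately Normal(theta, var_fn(theta)/n)."""
            like = norm.pdf(xbar, loc=theta_grid,
                            scale=np.sqrt(var_fn(theta_grid) / n))
            post = like if prior is None else like * prior(theta_grid)
            dx = theta_grid[1] - theta_grid[0]
            return post / (post.sum() * dx)                  # normalize on the grid

        theta = np.linspace(0.1, 5, 2000)
        post = coarsened_posterior(xbar=2.0, n=40, theta_grid=theta,
                                   var_fn=lambda th: th**2)  # assumed V(theta) = theta^2
        cdf = np.cumsum(post) * (theta[1] - theta[0])
        lo, hi = theta[np.searchsorted(cdf, 0.025)], theta[np.searchsorted(cdf, 0.975)]
        print(f"95% credible interval: ({lo:.2f}, {hi:.2f})")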

    Nonparametric and Semiparametric Group Sequential Methods for Comparing Accuracy of Diagnostic Tests

    Comparison of the accuracy of two diagnostic tests based on their receiver operating characteristic (ROC) curves has typically been conducted using fixed sample designs. On the other hand, the human experimentation inherent in a comparison of diagnostic modalities argues for periodic monitoring of the accruing data to address many issues related to the ethics and efficiency of the medical study. To date, very little research has been done on the use of sequential sampling plans for comparative ROC studies, even though these studies may involve expensive and potentially unsafe diagnostic procedures. In this paper, we propose a nonparametric group sequential design plan. The method combines a nonparametric family of weighted area under the ROC curve statistics (Wieand et al., Biometrika 76:585-592, 1989) with a group sequential sampling plan. We illustrate the implementation of this nonparametric approach for sequentially comparing ROC curves in the context of diagnostic screening for non-small cell lung cancer. We also describe a semiparametric sequential method based on proportional hazards models. We compare the statistical properties of the nonparametric approach to alternative semiparametric and parametric analyses in simulation studies. The results show the nonparametric approach is robust to model misspecification and has excellent finite sample performance.
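
    For orientation, a sketch of the fixed-sample building block only: the Mann-Whitney (empirical AUC) estimate for each marker and a standardized paired difference, with a bootstrap standard error that preserves the pairing of markers within subjects. The weighted AUC family of Wieand et al. and the group sequential boundaries themselves are not implemented here.

        import numpy as np

        def empirical_auc(controls, cases):
            """Mann-Whitney estimate of the area under the ROC curve."""
            c = np.asarray(controls)[:, None]
            d = np.asarray(cases)[None, :]
            return (c < d).mean() + 0.5 * (c == d).mean()

        def auc_difference_z(x0, x1, y0, y1, n_boot=2000, seed=0):
            """Standardized AUC difference for markers X and Y measured on the
            same controls (x0, y0) and cases (x1, y1)."""
            x0, x1, y0, y1 = map(np.asarray, (x0, x1, y0, y1))
            rng = np.random.default_rng(seed)
            diff = empirical_auc(x0, x1) - empirical_auc(y0, y1)
            boots = np.empty(n_boot)
            for b in range(n_boot):
                i = rng.integers(0, len(x0), len(x0))  # resample control subjects
                j = rng.integers(0, len(x1), len(x1))  # resample case subjects
                boots[b] = empirical_auc(x0[i], x1[j]) - empirical_auc(y0[i], y1[j])
            return diff / boots.std(ddof=1)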