
    A P-value model for theoretical power analysis and its applications in multiple testing procedures

    Background: Power analysis is a critical aspect of the design of experiments to detect an effect of a given size. When multiple hypotheses are tested simultaneously, multiplicity adjustments to p-values should be taken into account in power analysis. There are a limited number of studies on power analysis in multiple testing procedures. For some methods, the theoretical analysis is difficult and extensive numerical simulations are often needed, while other methods oversimplify the information under the alternative hypothesis. To this end, this paper aims to develop a new statistical model for power analysis in multiple testing procedures. Methods: We propose a step-function-based p-value model under the alternative hypothesis, which is simple enough to perform power analysis without simulations, yet not so simple that it loses the information from the alternative hypothesis. The first step is to transform distributions of different test statistics (e.g., t, chi-square or F) to distributions of the corresponding p-values. We then use a step function to approximate each p-value distribution by matching its mean and variance. Lastly, the step-function-based p-value model can be used for theoretical power analysis. Results: The proposed model is applied to problems in multiple testing procedures. We first show how the most powerful critical constants can be chosen using the step-function-based p-value model. Our model is then applied to the field of multiple testing procedures to explain the assumption of monotonicity of the critical constants. Lastly, we apply our model to a behavioral weight loss and maintenance study to select the optimal critical constants. Conclusions: The proposed model is easy to implement and preserves the information from the alternative hypothesis.
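The first step described in this abstract, mapping the distribution of a test statistic to the distribution of its p-value, can be illustrated numerically. The sketch below is not the paper's step-function model. It only estimates, by Monte Carlo, the mean and variance of one-sided z-test p-values, which are the two moments such a model would be matched to. The one-sided z-test, the effect size `delta`, and the function names are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean, pvariance


def simulate_p_values(delta, n_draws=20000, seed=0):
    """Draw one-sided z-test p-values when the statistic is N(delta, 1).

    delta = 0 corresponds to the null hypothesis; delta > 0 is a
    hypothetical alternative chosen only for this illustration.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # p = P(Z >= z_obs) under the null, i.e. 1 - Phi(z_obs)
    return [1.0 - std_normal.cdf(rng.gauss(delta, 1.0)) for _ in range(n_draws)]


def p_value_moments(delta, **kwargs):
    """Return the (mean, variance) of the simulated p-values."""
    p_values = simulate_p_values(delta, **kwargs)
    return fmean(p_values), pvariance(p_values)


null_mean, null_var = p_value_moments(0.0)
alt_mean, alt_var = p_value_moments(2.0)
print(f"null:        mean={null_mean:.3f}, variance={null_var:.4f}")
print(f"alternative: mean={alt_mean:.3f}, variance={alt_var:.4f}")
```

Under the null the p-values are uniform on [0, 1], so the estimates should sit near a mean of 1/2 and a variance of 1/12. Under the alternative both moments shrink, and that shift is the information about the alternative that a moment-matched approximation is meant to retain.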

    Clinical Utilization Pattern of Liquid Biopsies (LB) to Detect Actionable Driver Mutations, Guide Treatment Decisions and Monitor Disease Burden During Treatment of 33 Metastatic Colorectal Cancer (mCRC) Patients (pts) at a Fox Chase Cancer Center GI Oncology Subspecialty Clinic

    Background: Liquid biopsy (LB) captures dynamic genomic alterations (alts) across metastatic colorectal cancer (mCRC) therapy and may complement tissue biopsy (TB). We sought to describe the utility of LB and better understand mCRC biology during therapy. Methods: Thirty-three patients (pts) with mCRC underwent LB. We used permutation-based t-tests to assess associations between alts and clinical variables, and used Kendall's tau to measure correlations. Results: Of 33 pts, 15 were women; 22 had colon cancer, and the rest rectal cancer. Pts received a median of two lines of therapy before LB. Nineteen pts had limited testing on TB (RAS/RAF/TP53/APC), 11 extended NGS, and 3 no TB. Maxpct and alts correlated with CEA (p < 0.001, respectively). In 3/5 pts with serial LB, CEA correlated with maxpct trend and CT tumor burden. In 6 pts, mutant RAS was seen in LB and not TB; 5/6 had received anti-EGFR therapy prior to LB, suggesting RAS alts developed post-therapy. In two pts RAS-mutated by TB, no RAS alts were detected on LB; these pts had low disease burden on CT at the time of LB, which also did not reveal APC or TP53 alts. In six patients who were KRAS wt based on TB, post anti-EGFR LB revealed subclonal KRAS mutations, likely a treatment effect. The median number of alts was higher in post anti-EGFR LB (n = 12) vs. anti-EGFR naïve LB (n = 22) (9.5 vs. 5.5, p = 0.059), but the difference was not statistically significant. More alts were also noted in post anti-EGFR therapy LB vs. KRAS wt anti-EGFR-naïve LB (n = 6) (9.5 vs. 5) among patients with KRAS wild-type tumors, although the difference was not significant (p = 0.182). Conclusions: LB across mCRC therapy detects driver mutations, monitors disease burden, and identifies sub-clonal alts that reflect drug resistance, tumor evolution, and heterogeneity. Interpretation of LB results is impacted by clinical context.
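The two statistical tools named in the Methods of this abstract can be sketched in a few lines. This is a minimal illustration and not the study's analysis code. The permutation test below uses the absolute difference in group means as its statistic, rather than the t statistic the abstract mentions, and the tau shown is the tie-ignoring tau-a variant. All function names and data are hypothetical.

```python
import random
from itertools import combinations
from statistics import fmean


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs contribute zero to the numerator.
    """
    score = 0
    pairs = 0
    for i, j in combinations(range(len(x)), 2):
        pairs += 1
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += (product > 0) - (product < 0)
    return score / pairs


def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled values
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        hits += diff >= observed
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Made-up values, used only to exercise the two functions.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
p_value = permutation_p_value([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
print(f"tau-a = {tau:.3f}, permutation p = {p_value:.4f}")
```

Both procedures rely only on ranks or relabelling, which is one reason they suit small samples such as a 33-patient cohort.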

    Solar Ring Mission: Building a Panorama of the Sun and Inner-heliosphere

    Solar Ring (SOR) is a proposed space science mission to monitor and study the Sun and inner heliosphere from a full 360° perspective in the ecliptic plane. It will deploy three 120°-separated spacecraft on the 1-AU orbit. The first spacecraft, S1, is located 30° upstream of the Earth, the second, S2, 90° downstream, and the third, S3, completes the configuration. This design with necessary science instruments, e.g., the Doppler-velocity and vector magnetic field imager, wide-angle coronagraph, and in-situ instruments, will allow us to establish many unprecedented capabilities: (1) provide simultaneous Doppler-velocity observations of the whole solar surface to understand the deep interior, (2) provide vector magnetograms of the whole photosphere, the inner boundary of the solar atmosphere and heliosphere, (3) provide information on the whole lifetime evolution of solar featured structures, and (4) provide the whole view of solar transients and space weather in the inner heliosphere. With these capabilities, the Solar Ring mission aims to address outstanding questions about the origin of the solar cycle, the origin of solar eruptions and the origin of extreme space weather events. The successful accomplishment of the mission will construct a panorama of the Sun and inner heliosphere, and therefore advance our understanding of the star and the space environment that holds our life. Comment: 41 pages, 6 figures, 1 table, to be published in Advances in Space Research
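The position of S3 follows from the stated geometry by simple arithmetic. The sketch below works it out, assuming a sign convention that is not in the abstract: angles are heliocentric longitudes measured from the Earth, with "upstream" taken as positive and "downstream" as negative. The helper name `wrap_degrees` is hypothetical.

```python
def wrap_degrees(angle):
    """Map an angle in degrees to the interval (-180, 180]."""
    wrapped = angle % 360.0
    return wrapped - 360.0 if wrapped > 180.0 else wrapped


SEPARATION = 120.0  # angular spacing between neighbouring spacecraft

s1 = 30.0    # 30 degrees upstream of the Earth (positive by assumption)
s2 = -90.0   # 90 degrees downstream of the Earth

# S3 must sit 120 degrees from both S1 and S2; both routes should agree.
s3_from_s1 = wrap_degrees(s1 + SEPARATION)
s3_from_s2 = wrap_degrees(s2 - SEPARATION)

print(f"S1-S2 separation: {wrap_degrees(s1 - s2):.0f} degrees")
print(f"S3 longitude relative to the Earth: {s3_from_s1:.0f} degrees")
```

Under this convention S3 lies 150° from the Earth on the upstream side, which places it on the far side of the Sun from the Earth's viewpoint.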

    RNA-seq Data and Alternative Splicing 1


    Experience Simpson's Paradox in the Classroom

    Simpson's paradox is a challenging topic to teach in an introductory statistics course. To motivate students to understand this paradox both intuitively and statistically, this article introduces several new ways to teach Simpson's paradox. We design a paper toss activity between instructors and students in class to engage students in the learning process. We show that Simpson's paradox widely exists in basketball statistics, and thus instructors may consider looking for Simpson's paradox in their own school basketball teams as examples to motivate students’ interest. A new probabilistic explanation of Simpson's paradox is provided, which helps foster students’ statistical understanding. Supplementary materials for this article are available online.
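A small numerical example shows how Simpson's paradox can arise in basketball shooting percentages. The shot counts below are made up for illustration and are not taken from the article. Player A shoots better than Player B on both two-point and three-point attempts, yet Player B has the higher overall percentage because the two players attempt very different mixes of shots.

```python
from fractions import Fraction

# Hypothetical (made, attempted) shot counts, invented for this example.
shots = {
    "Player A": {"two_point": (9, 10), "three_point": (30, 90)},
    "Player B": {"two_point": (80, 90), "three_point": (3, 10)},
}


def rate(made, attempted):
    """Exact shooting percentage as a fraction."""
    return Fraction(made, attempted)


def overall_rate(player):
    """Shooting percentage pooled over both shot types."""
    made = sum(m for m, _ in shots[player].values())
    attempted = sum(a for _, a in shots[player].values())
    return Fraction(made, attempted)


a_wins_each_type = all(
    rate(*shots["Player A"][kind]) > rate(*shots["Player B"][kind])
    for kind in shots["Player A"]
)
b_wins_overall = overall_rate("Player B") > overall_rate("Player A")

for player in shots:
    by_type = {kind: float(rate(*c)) for kind, c in shots[player].items()}
    print(player, by_type, "overall:", float(overall_rate(player)))
print("A better on every shot type:", a_wins_each_type)
print("B better overall:", b_wins_overall)
```

The reversal comes from the weights: Player A takes 90 of 100 shots from three-point range, where percentages are low for both players, while Player B takes 90 of 100 from two-point range.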

    Application of a new dietary pattern analysis method in nutritional epidemiology

    Background: Diet plays an important role in chronic disease, and the use of dietary pattern analysis has grown rapidly as a way of deconstructing the complexity of nutritional intake and its relation to health. Pattern analysis methods, such as principal component analysis (PCA), have been used to investigate various dimensions of diet. Existing analytic methods, however, do not fully utilize the predictive potential of dietary assessment data. In particular, these methods are often suboptimal at predicting clinically important variables. Methods: We propose a new dietary pattern analysis method using the advanced LASSO (Least Absolute Shrinkage and Selection Operator) model to improve the prediction of disease-related risk factors. Despite the potential advantages of LASSO, this is the first time that the model has been adapted for dietary pattern analysis. Hence, the systematic evaluation of the LASSO model as applied to dietary data and health outcomes is highly innovative and novel. Using Food Frequency Questionnaire data from NHANES 2005–2006, we apply PCA and LASSO to identify dietary patterns related to cardiovascular disease risk factors in healthy US adults (n = 2609) after controlling for confounding variables (e.g., age and BMI). Both analyses account for the sampling weights. Model performance in terms of prediction accuracy is evaluated using an independent test set. Results: PCA yields 10 principal components (PCs) that together account for 65% of the variation in the data set and represent distinct dietary patterns. These PCs are then used as predictors in a regression model to predict cardiovascular disease risk factors. We find that LASSO better predicts levels of triglycerides, LDL cholesterol, HDL cholesterol, and total cholesterol (adjusted R² = 0.861, 0.899, 0.890, and 0.935 respectively) than does the traditional, linear-regression-based, dietary pattern analysis method (adjusted R² = 0.163, 0.005, 0.235, and 0.024 respectively) when the latter is applied to components derived from PCA. Conclusions: The proposed method is shown to be an appropriate and promising statistical means of deriving dietary patterns predictive of cardiovascular disease risk. Future studies, involving different diseases and risk factors, will be necessary before LASSO’s broader usefulness in nutritional epidemiology can be established.

    A gatekeeping procedure to test a primary and a secondary endpoint in a group sequential design with multiple interim looks

    Glimm et al. (2010) and Tamhane et al. (2010) studied the problem of testing a primary and a secondary endpoint, subject to a gatekeeping constraint, using a group sequential design (GSD) with K=2 looks. In this article, we greatly extend the previous results to multiple (K>2) looks. If the familywise error rate (FWER) is to be controlled at a preassigned α level, then it is clear that the primary boundary must be of level α. We show under what conditions one α-level primary boundary is uniformly more powerful than another. Based on this result, we recommend the choice of the O'Brien and Fleming (1979) boundary over the Pocock (1977) boundary for the primary endpoint. For the secondary endpoint the choice of the boundary is more complicated since under certain conditions the secondary boundary can be refined to have a nominal level α′>α, while still controlling the FWER at level α, thus boosting the secondary power. We carry out secondary power comparisons via simulation between different choices of primary-secondary boundary combinations. The methodology is applied to the data from the RALES study (Pitt et al., 1999; Wittes et al., 2001). An R library package gsrsb to implement the proposed methodology is made available on CRAN.

    Extending Multiple Testing with Unknown Test Dependency via the CoCo Test: With Applications to Cancer Studies.

    Multiple testing problems are ubiquitous in clinical and scientific investigations, from testing multiple endpoints in clinical trials, to examining hundreds of thousands of brain voxels in brain imaging research and millions of single nucleotide polymorphisms (SNPs) in genetic studies. Central to multiple testing is to control for the type I error. The behavior of multiple testing procedures for alpha-control when the tests are independent or dependent but with a known joint distribution is relatively well known. When the joint distribution of test statistics is unknown, one can still guarantee the α-control, if the positive dependency through stochastic ordering (PDS) condition is satisfied. Despite the frequent occurrence of unknown test dependency in multiple testing and the importance of the PDS condition in endorsing its validity, little do we know about how to verify the condition. Here, we develop a new nonparametric statistical test, called the CoCo test, based on ranked correlation coefficients and a simple, yet effective, algebraic arrangement of the Spearman's rho and Kendall's tau, that can validate the condition of PDS, through which one can control for alpha regardless of the prior knowledge of the dependency between test statistics. Simulation studies show that the CoCo test can faithfully detect the violation of the PDS condition or lack thereof. To further evaluate the efficacy of the CoCo test, we apply it to investigate two meta-analyses: 72 trials on patients with metastatic breast cancer and 12 trials on patients with advanced solid tumors. Our simulation studies and data analyses strongly encourage one to evaluate the PDS condition during multiple testing, especially when one is uncertain about the relationship between tests, and the proposed CoCo test provides both methodological insights into and a technical device for doing so. An R package cocotest to implement the proposed methodology is available at CRAN.