
    SPARC: Statistical Performance Analysis With Relevance Conclusions

    The performance of one computer relative to another is traditionally characterized through benchmarking, a practice that often lacks statistical rigor. Performance is frequently reduced to simplified measures of central tendency, but doing so risks losing sight of the variability and non-determinism of modern computer systems. Authentic performance evaluations are derived from statistical methods that accurately interpret and assess data. The methods currently used in performance comparison frameworks are of limited efficacy; statistical inference is either oversimplified or avoided altogether. A prevalent criticism in the computer performance literature is that results from difference hypothesis testing lack substance. To address this problem, we propose a new framework, SPARC, which pioneers a synthesis of difference and equivalence hypothesis testing to provide relevant conclusions. It is a union of three key components: (i) identifying either superiority or similarity through difference and equivalence hypotheses, (ii) a methodology that scales with the number of benchmarks, and (iii) a conditional feedback loop from test outcomes that produces informative conclusions of relevance, equivalence, triviality, or indeterminacy. We present an experimental analysis characterizing the performance of three open-source RISC-V processors to evaluate SPARC and its efficacy compared to similar frameworks.
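    A minimal sketch of the core idea under simple assumptions: combine a Welch difference test with two one-sided tests (TOST) for equivalence, then map the outcomes onto the four conclusion labels named above. The data, the equivalence margin, and the outcome mapping are illustrative readings of the abstract, not SPARC's published decision rule:

```python
# Sketch: classify one benchmark comparison by combining a difference
# test with a TOST equivalence test. Margin and labels are illustrative.
import numpy as np
from scipy import stats

def classify(a, b, margin=0.05, alpha=0.05):
    """a, b: per-run times of two systems on one benchmark (seconds).
    margin: largest mean difference still considered negligible."""
    # Difference hypothesis: H0 is mean(a) == mean(b) (Welch t-test).
    t_diff, p_diff = stats.ttest_ind(a, b, equal_var=False)

    # Equivalence via TOST: reject both one-sided nulls that the mean
    # difference lies outside (-margin, +margin).
    _, p_lo = stats.ttest_ind(a, b - margin, equal_var=False, alternative="greater")
    _, p_hi = stats.ttest_ind(a, b + margin, equal_var=False, alternative="less")
    p_tost = max(p_lo, p_hi)

    if p_diff < alpha and p_tost < alpha:
        return "trivial"         # detectable but within the margin
    if p_diff < alpha:
        return "relevant"        # meaningful superiority/inferiority
    if p_tost < alpha:
        return "equivalent"      # difference bounded by the margin
    return "indeterminate"       # neither test is conclusive

rng = np.random.default_rng(0)
a = rng.normal(1.00, 0.02, 30)   # system A run times (simulated)
b = rng.normal(1.01, 0.02, 30)   # system B run times (simulated)
print(classify(a, b))
```

    Here `margin` plays the role of the smallest performance difference considered practically relevant; choosing it is the analyst's responsibility, not the test's.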

    An evaluation of the quality of statistical design and analysis of published medical research: results from a systematic survey of general orthopaedic journals

    Background: The application of statistics in reported research in trauma and orthopaedic surgery has become ever more important and complex. Despite the extensive use of statistical analysis, it is still a subject which is often not conceptually well understood, resulting in clear methodological flaws and inadequate reporting in many papers. Methods: A detailed statistical survey sampled 100 representative orthopaedic papers using a validated questionnaire that assessed the quality of the trial design and statistical analysis methods. Results: The survey found evidence of failings in study design, statistical methodology and presentation of the results. Overall, in 17% (95% confidence interval: 10–26%) of the studies investigated, the conclusions were not clearly justified by the results; in 39% (30–49%) of studies a different analysis should have been undertaken; and in 17% (10–26%) a different analysis could have made a difference to the overall conclusions. Conclusion: It is only by an improved dialogue between statistician, clinician, reviewer and journal editor that the failings in design methodology and analysis highlighted by this survey can be addressed.
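    For reference, intervals like the quoted 17% (10–26%) can be reproduced with an exact (Clopper–Pearson) binomial confidence interval; whether the survey used this exact method is an assumption of this sketch:

```python
# Sketch: 95% exact (Clopper-Pearson) confidence interval for a
# proportion, e.g. 17 flagged studies out of 100 sampled papers.
from scipy import stats

k, n = 17, 100
lo = stats.beta.ppf(0.025, k, n - k + 1)        # lower bound
hi = stats.beta.ppf(0.975, k + 1, n - k)        # upper bound
print(f"{k}/{n}: 95% CI {lo:.0%} to {hi:.0%}")  # -> 10% to 26%
```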

    Errors in statistical analysis and questionable randomization lead to unreliable conclusions

    Dear Editor,

    We read with interest the paper, “The effect of food service system modifications on staff body mass index in an industrial organization” [1]. We noticed several substantial issues with the data and calculations, calling into question the randomized nature of the study and the validity of the analyses.

    The distribution of baseline weight was significantly different between groups (p-value = “0.00”). We replicated the test using the reported means and standard deviations (SDs) and obtained a p-value of approximately 1.9 × 10⁻¹⁷. It is extraordinarily unlikely that any variable would differ that much between two groups if allocation was truly random. Even if it was truly random, the stated method, “the samples were randomly divided into two groups” [1], does not describe the “method used to generate the random allocation sequence” or the “type of randomization; details of any restriction (such as blocking and block size)” specified by the Consolidated Standards of Reporting Trials (CONSORT) [2].

    Given the large difference in baseline weights, it is unusual that the difference in baseline body mass index (BMI) between groups is not more significant (p = 0.032), raising the question of what the groups’ distributions of height were. Both groups have 30 males (58.8%), so sex differences are unlikely to explain this discrepancy. Height was not explicitly reported, but it was possible to estimate height from body weight and BMI using geometric means [3,4]. We calculated the baseline control group geometric mean as 2.04 cm taller than the test group. These calculations also suggest the control group shrank by 1.26 cm while the test group grew by 1.52 cm over the study. Neither change is explained by rounding error, nor does either seem plausible for adult subjects over 40 days.

    Because no SDs of the change scores were reported, we were unable to replicate the reported p-value (0.318) for the between-group test of weight change exactly. However, we were able to take the pre- and post-intervention SDs and calculate the possible SDs of the within-group change scores for a range of pre–post correlations. The largest possible p-value was 0.1282, obtained when each group had a perfect negative pre–post correlation (correlation = -1), which is unlikely. If the correlation was zero or positive, the p-value would be much smaller (p = 0.0449 when correlation = 0 for each group) and would plausibly indicate a significant difference between groups. Therefore, although the published results are impossible, the correct analysis could make the intervention appear more effective than reported.

    The results section describes an initial sample size of 116 with 14 dropping out (p. 115). The tables report the remaining sample size as 102, but the body of the text reports that 101 subjects remained until study completion. It is unclear which values are correct; this lack of clarity also fails the CONSORT guidelines [2].

    Considering that the reported findings are essentially impossible given the stated study design, we encourage the authors to explain the treatment allocation and make the raw data available, or the journal to act according to the Committee on Publication Ethics guidance [5] in situations where findings are unreliable.
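    The checks described in the letter can be reproduced from summary statistics alone. The sketch below uses placeholder numbers, not the paper's values, to show the method: a two-sample t-test from reported means and SDs, the algebraic bound on the change-score SD as a function of the pre–post correlation, and the height implied by weight and BMI:

```python
# Sketch of the letter's three checks, with placeholder summary
# statistics (NOT the paper's actual values).
import numpy as np
from scipy import stats

# 1) Two-sample t-test from reported means, SDs, and group sizes.
m1, sd1, n1 = 80.0, 5.0, 51   # group 1 baseline weight (placeholder)
m2, sd2, n2 = 70.0, 5.0, 51   # group 2 baseline weight (placeholder)
t, p = stats.ttest_ind_from_stats(m1, sd1, n1, m2, sd2, n2)
print(f"baseline difference: p = {p:.2g}")

# 2) Possible SDs of within-group change scores for a range of
#    pre-post correlations r, given pre and post SDs:
#    sd_change = sqrt(sd_pre^2 + sd_post^2 - 2*r*sd_pre*sd_post)
sd_pre, sd_post = 5.0, 5.2    # placeholders
for r in (-1.0, 0.0, 0.5, 1.0):
    sd_change = np.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)
    print(f"r = {r:+.1f}: sd_change = {sd_change:.2f}")

# 3) Height implied by weight and BMI, since BMI = weight / height^2:
weight, bmi = 75.0, 25.0      # placeholders
print(f"implied height: {np.sqrt(weight / bmi):.2f} m")
```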

    ON THE BOOTSTRAP METHOD OF ESTIMATION OF RESPONSE SURFACE FUNCTION

    Nowadays, in many fields of science it is necessary to carry out miscellaneous analyses using classical statistical methods, which rest on specific assumptions. In research practice these assumptions cannot always be met, which makes it impossible to carry out the analyses and leads to incorrect conclusions and recommendations. The study of a production process largely consists in the use of statistical quality control tools, which are based on classical statistical methods. These methods yield improvements in the technological and economic results of the manufacturing process. One of the tools of statistical quality control is the design of experiments, an important element of which is the estimation of the response surface function. The aim of this paper is to present the bootstrap method of estimating the response surface function and its use with empirical data.
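    A minimal sketch of the technique under simple assumptions: fit a second-order response surface by least squares and use case resampling to obtain bootstrap confidence intervals for its coefficients. The data, model form, and resampling scheme are illustrative, not taken from the paper:

```python
# Sketch: bootstrap confidence intervals for a fitted second-order
# response surface y = b0 + b1*x + b2*x^2. Data and model form are
# illustrative placeholders.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 40)                     # design points
y = 2.0 + 1.5 * x - 0.1 * x**2 + rng.normal(0, 1.0, x.size)

def fit(x, y):
    """Least-squares coefficients of the quadratic response surface."""
    X = np.column_stack([np.ones_like(x), x, x**2])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Case resampling: refit on bootstrap samples of (x, y) pairs.
boot = np.empty((2000, 3))
for i in range(boot.shape[0]):
    idx = rng.integers(0, x.size, x.size)
    boot[i] = fit(x[idx], y[idx])

lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
for name, l, h in zip(["b0", "b1", "b2"], lo, hi):
    print(f"{name}: 95% bootstrap CI [{l:.3f}, {h:.3f}]")
```

    Case resampling avoids the distributional assumptions of the classical approach, which is the paper's stated motivation for using the bootstrap.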

    Statistical Isotropy of CMB Polarization Maps

    We formulate the statistical isotropy of CMB anisotropy maps in its most general form. We also present a fast, orientation-independent statistical method to determine deviations from statistical isotropy in CMB polarization maps. The importance of having statistical tests of departures from statistical isotropy for CMB polarization maps lies not only in interesting theoretical motivations but also in testing cleaned CMB polarization maps for observational artifacts such as residuals from polarized foreground emission. We propose a generalization of the Bipolar Power Spectrum (BiPS) to polarization maps. Application to observed CMB polarization maps will be possible soon after the release of the WMAP three-year data. As a demonstration, we show that for E-polarization this test can detect the breakdown of statistical isotropy due to polarized synchrotron foreground.

    Comment: 6 pages, 2 figures. Conclusions and results unchanged; extension to cut sky included (discussion and references added). Matches version accepted to Phys. Rev. D Rapid Communications.
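    For orientation, the temperature-only BiPS on which the proposed polarization generalization builds can be sketched as follows; the notation is the standard bipolar-harmonic definition (after Hajian & Souradeep) and is quoted here as an assumed reference point, not from this paper:

```latex
% Bipolar power spectrum (temperature case). The A^{l_1 l_2}_{\ell M}
% are bipolar spherical harmonic coefficients of the two-point
% correlation, built from the harmonic covariance and
% Clebsch-Gordan coefficients C^{\ell M}_{l_1 m_1 l_2 -m_2}.
\kappa_\ell = \sum_{l_1, l_2, M} \left| A^{l_1 l_2}_{\ell M} \right|^2,
\qquad
A^{l_1 l_2}_{\ell M} = \sum_{m_1, m_2}
  \langle a_{l_1 m_1} a^{*}_{l_2 m_2} \rangle
  (-1)^{m_2} \, \mathcal{C}^{\ell M}_{l_1 m_1 \, l_2 \, -m_2}.
```

    Under statistical isotropy the harmonic covariance is diagonal and κ_ℓ vanishes for all ℓ > 0, so a nonzero κ_ℓ at ℓ > 0 signals a breakdown such as residual foreground contamination.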

    Tiny microbes, enormous impacts: what matters in gut microbiome studies?

    Many factors affect the microbiomes of humans, mice, and other mammals, but substantial challenges remain in determining which of these factors are of practical importance. Considering the relative effect sizes of both biological and technical covariates can help improve study design and the quality of biological conclusions. Care must be taken to avoid technical bias that can lead to incorrect biological conclusions. The presentation of quantitative effect sizes in addition to P values will improve our ability to perform meta-analysis and to evaluate potentially relevant biological effects. A better consideration of effect size and statistical power will lead to more robust biological conclusions in microbiome studies.
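    As a minimal illustration of reporting an effect size alongside a P value (the simulated data and the choice of Cohen's d are assumptions of this sketch, not from the paper):

```python
# Sketch: report a standardized effect size (Cohen's d) next to the
# p-value for a two-group comparison. Data are simulated placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a = rng.normal(10.0, 2.0, 40)   # e.g. a diversity metric, condition A
b = rng.normal(11.0, 2.0, 40)   # same metric, condition B

t, p = stats.ttest_ind(a, b)
pooled_sd = np.sqrt(((a.size - 1) * a.var(ddof=1) +
                     (b.size - 1) * b.var(ddof=1)) / (a.size + b.size - 2))
d = (b.mean() - a.mean()) / pooled_sd  # Cohen's d

print(f"p = {p:.3g}, Cohen's d = {d:.2f}")
```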