73 research outputs found

    Data splitting as a countermeasure against hypothesis fishing: with a case study of predictors for low back pain

    There is growing concern in the scientific community that many published scientific findings may represent spurious patterns that are not reproducible in independent data sets. One reason for this is that significance levels or confidence intervals are often applied to secondary variables or sub-samples within the trial, in addition to the primary hypotheses (multiple hypotheses). This problem is likely to be extensive for population-based surveys, in which epidemiological hypotheses are derived after seeing the data set (hypothesis fishing). We recommend a data-splitting procedure to counteract this methodological problem, in which one part of the data set is used for identifying hypotheses, and the other is used for hypothesis testing. The procedure is similar to two-stage analysis of microarray data. We illustrate the process using a real data set related to predictors of low back pain at 14-year follow-up in a population initially free of low back pain. “Widespreadness” of pain (pain reported in several places other than the low back) was a statistically significant predictor, while smoking was not, despite its strong association with low back pain in the first half of the data set. We argue that the application of data splitting, in which an independent party handles the data set, will achieve for epidemiological surveys what pre-registration has done for clinical studies.
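
The data-splitting idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the function names, the use of a simple risk difference as the screening statistic, and the `threshold` parameter are all assumptions made for the example, and a real analysis would apply a proper significance test to the confirmation half.

```python
import random

def split_data(records, seed=0):
    """Randomly split records into an exploration half and a confirmation half."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def risk_difference(records, predictor, outcome):
    """Outcome rate among exposed records minus the rate among unexposed ones."""
    exposed = [r[outcome] for r in records if r[predictor]]
    unexposed = [r[outcome] for r in records if not r[predictor]]
    if not exposed or not unexposed:
        return 0.0  # no contrast available in this subset
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

def screen_then_confirm(records, predictors, outcome, threshold, seed=0):
    """Pick candidate predictors on one half, then re-estimate them on the other.

    Hypothetical helper: the screening rule (absolute risk difference at or
    above `threshold`) is an assumption chosen only to keep the sketch short.
    """
    explore, confirm = split_data(records, seed)
    candidates = [
        p for p in predictors
        if abs(risk_difference(explore, p, outcome)) >= threshold
    ]
    # Only the confirmation half is used to evaluate the selected hypotheses.
    return {p: risk_difference(confirm, p, outcome) for p in candidates}
```

The point of the design is that the confirmation half never influences which hypotheses are selected, so estimates computed on it are not inflated by the search.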

    Obesity and osteoarthritis in knee, hip and/or hand: An epidemiological study in the general population with 10 years follow-up

    Background: Obesity is one of the most important risk factors for osteoarthritis (OA) in the knee(s). However, the relationship between obesity and OA in the hand(s) and hip(s) remains controversial and needs further investigation. The purpose of this study was to investigate the impact of obesity on incident OA in the hip, knee, and hand in a general population followed for 10 years. Methods: A total of 1854 people aged 24–76 years in 1994 participated in a Norwegian study on musculoskeletal pain in both 1994 and 2004. Participants with OA or rheumatoid arthritis in 1994 and those above 74 years in 1994 were excluded, leaving n = 1675 for the analyses. The main outcome measure was OA diagnosis at follow-up based on self-report. Obesity was defined as a body mass index (BMI) of 30 or above. Results: At 10-year follow-up the incidence rates were 5.8% (CI 4.3–7.3) for hip OA, 7.3% (CI 5.7–9.0) for knee OA, and 5.6% (CI 4.2–7.1) for hand OA. When adjusting for age, gender, work status and leisure time activities, a high BMI (> 30) was significantly associated with knee OA (OR 2.81; 95% CI 1.32–5.96), and a dose-response relationship was found for this association. Obesity was also significantly associated with hand OA (OR 2.59; 1.08–6.19), but not with hip OA (OR 1.11; 0.41–2.97). There was no statistically significant interaction effect between BMI and gender, age or any of the other confounding variables. Conclusion: A high BMI was significantly associated with knee OA and hand OA, but not with hip OA.

    Mannose-binding lectin genotypes: lack of association with susceptibility to thoracic empyema

    Background: The role of the innate immune protein mannose-binding lectin (MBL) in host defence against severe respiratory infection remains controversial. Thoracic empyema is a suppurative lung infection that arises as a major complication of pneumonia and is associated with a significant mortality. Although the pathogenesis of thoracic empyema is poorly understood, genetic susceptibility loci for this condition have recently been identified. The possible role of MBL genotypic deficiency in susceptibility to thoracic empyema has not previously been reported. Methods: To investigate this further we compared the frequencies of the six functional MBL polymorphisms in 170 European individuals with thoracic empyema and 225 healthy control individuals. Results: No overall association was observed between MBL genotypic deficiency and susceptibility to thoracic empyema (2 × 2 Chi square = 0.02, P = 0.87). Furthermore, no association was seen between MBL deficiency and susceptibility to the Gram-positive or pneumococcal empyema subgroups. MBL genotypic deficiency did not associate with progression to death or requirement for surgery. Conclusions: Our results suggest that MBL genotypic deficiency does not associate with susceptibility to thoracic empyema in humans.

    How Online Learning Approaches Ornstein Uhlenbeck Processes

    We show that under reasonable conditions, online learning for a nonlinear function near a local minimum is similar to a multivariate Ornstein-Uhlenbeck process. This implies that the parameter state oscillates randomly around the minimum point, with a Gaussian limiting distribution.
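
A one-dimensional toy simulation can make this claim concrete. The sketch below is an illustration under stated assumptions, not the paper's setting: it uses a quadratic loss `0.5 * a * w**2`, additive Gaussian gradient noise with scale `sigma`, and a constant learning rate `eta`, all of which are choices made for the example. Under these assumptions the update is a linear recursion whose stationary variance can be written down exactly and compared with the Ornstein-Uhlenbeck value `eta * sigma**2 / (2 * a)`.

```python
import random

def simulate_sgd(a=1.0, sigma=1.0, eta=0.05, steps=200_000, burn_in=1_000, seed=0):
    """Run noisy gradient descent on 0.5 * a * w**2 and return post-burn-in iterates."""
    rng = random.Random(seed)
    w = 0.0
    samples = []
    for t in range(steps):
        # Gradient of the quadratic loss plus Gaussian noise (assumed noise model).
        noisy_grad = a * w + sigma * rng.gauss(0.0, 1.0)
        w -= eta * noisy_grad
        if t >= burn_in:
            samples.append(w)
    return samples

def variance(xs):
    """Population variance of a list of floats."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def stationary_variance(a, sigma, eta):
    """Exact stationary variance of w <- (1 - eta*a) * w - eta*sigma*xi."""
    return (eta * sigma) ** 2 / (1.0 - (1.0 - eta * a) ** 2)

def ou_variance(a, sigma, eta):
    """Stationary variance of the matching Ornstein-Uhlenbeck process (small eta)."""
    return eta * sigma ** 2 / (2.0 * a)
```

Comparing `variance(simulate_sgd())` with `stationary_variance(1.0, 1.0, 0.05)` and `ou_variance(1.0, 1.0, 0.05)` shows the iterates fluctuating around the minimum with a spread close to the Ornstein-Uhlenbeck prediction; the two formulas agree more closely as `eta` shrinks.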

    A robust conflict measure of internal inconsistencies in Bayesian hierarchical models

    O'Hagan (2003) introduces some tools for criticism of Bayesian hierarchical models that can be applied at each node of the model, with a view to diagnosing problems of model fit at any point in the model structure. His method relies on computing the posterior median of a conflict index, typically through MCMC simulations. We investigate a Gaussian model of two-way analysis of variance, and show that O'Hagan's approach gives unreliable false warning probabilities. We extend and refine the method, in particular avoiding double use of data through a data-splitting approach, accompanied by theoretical justifications from a nontrivial special case. Through extensive numerical experiments we show that our method detects model misspecification about as well as O'Hagan's method does, while retaining the desired false warning probability for data generated from the assumed model. This also holds for a Student-t version of the model.

    Representing Human Uncertainty by Subjective Likelihood Estimates

    We give a definition of human uncertainty through subjective likelihood estimates. The subject is asked for his or her estimated likelihood of a discrete variable, given a present piece of uncertain observation, under the hypothetical assumption that the variable was uniformly distributed prior to the new observation. With this interpretation of human uncertainty, we are able to perform consistent inference about our target variable, by formally treating the input as likelihood factors. The algorithm has been successfully implemented in an expert system for classification of wildwood mushrooms.
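
Treating the subject's answers as likelihood factors amounts to multiplying them into a prior and renormalizing. The sketch below illustrates that reading under assumptions of my own: the function name `combine_reports`, the dictionary representation, and the class labels in the usage note are hypothetical and are not taken from the expert system described above.

```python
def combine_reports(prior, reports):
    """Multiply a prior by subjective likelihood factors and renormalize.

    `prior` maps each class to its prior probability. Each report maps each
    class to the subject's estimate for that class, elicited as if the prior
    were uniform, so it is proportional to a likelihood.
    """
    posterior = dict(prior)
    for report in reports:
        for label in posterior:
            posterior[label] *= report[label]
    total = sum(posterior.values())
    if total == 0:
        raise ValueError("reports rule out every class")
    return {label: value / total for label, value in posterior.items()}
```

For instance, with a prior of `{"edible": 0.5, "poisonous": 0.5}` and two independent reports that each give `{"edible": 0.8, "poisonous": 0.2}`, the factors multiply to 0.64 versus 0.04 before normalization, so repeated agreeing observations sharpen the posterior. Because each estimate is elicited under a uniform prior, the actual prior is counted only once.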

    On the Conservativeness of Posterior Predictive P-values
