23 research outputs found
Estimation of Peaked Densities Over the Interval [0,1] Using Two-Sided Power Distribution: Application to Lottery Experiments
Assessing heat-related health risk in Europe via the Universal Thermal Climate Index (UTCI)
In this work the potential of the Universal Thermal Climate Index (UTCI) as a heat-related health risk indicator in Europe is demonstrated. The UTCI is a bioclimate index that uses a multi-node human heat balance model to represent the heat stress that meteorological conditions induce in the human body. Using 38 years of meteorological reanalysis data, UTCI maps were computed to assess the thermal bioclimate of Europe for the summer season. Patterns of heat stress conditions and non-thermal stress regions are identified across Europe. An increase in heat stress of up to 1°C is observed during recent decades. Correlation with mortality data from 17 European countries revealed that the relationship between the UTCI and death counts depends on the bioclimate of the country, and death counts increase in conditions of moderate and strong stress, i.e. when UTCI is above 26°C and 32°C, respectively. The UTCI’s ability to represent mortality patterns is demonstrated for the 2003 European heatwave. These findings confirm the importance of UTCI as a bioclimatic index that is able both to capture the thermal bioclimatic variability of Europe and to relate such variability to the effects it has on human health.
Examining the generalizability of research findings from archival data
This initiative systematically examined the extent to which a large set of archival research findings generalizes across contexts. We repeated the key analyses for 29 original strategic management effects in the same context (direct reproduction) as well as in 52 novel time periods and geographies; 45% of the reproductions returned results matching the original reports, together with 55% of tests in different spans of years and 40% of tests in novel geographies. Some original findings were associated with multiple new tests. Reproducibility was the best predictor of generalizability: for the findings that proved directly reproducible, 84% emerged in other available time periods and 57% emerged in other geographies. Overall, only limited empirical evidence emerged for context sensitivity. In a forecasting survey, independent scientists were able to anticipate which effects would find support in tests in new samples.
Creative destruction in science
Drawing on the concept of a gale of creative destruction in a capitalistic economy, we argue that initiatives to assess the robustness of findings in the organizational literature should aim to simultaneously test competing ideas operating in the same theoretical space. In other words, replication efforts should seek not just to support or question the original findings, but also to replace them with revised, stronger theories with greater explanatory power. Achieving this will typically require adding new measures, conditions, and subject populations to research designs, in order to carry out conceptual tests of multiple theories in addition to directly replicating the original findings. To illustrate the value of the creative destruction approach for theory pruning in organizational scholarship, we describe recent replication initiatives re-examining culture and work morality, working parents’ reasoning about day care options, and gender discrimination in hiring decisions.
Significance statement
It is becoming increasingly clear that many, if not most, published research findings across scientific fields are not readily replicable when the same method is repeated. Although extremely valuable, failed replications risk leaving a theoretical void, reducing confidence that the original theoretical prediction is true but not replacing it with positive evidence in favor of an alternative theory. We introduce the creative destruction approach to replication, which combines theory pruning methods from the field of management with emerging best practices from the open science movement, with the aim of making replications as generative as possible. In effect, we advocate for a Replication 2.0 movement in which the goal shifts from checking on the reliability of past findings to actively engaging in competitive theory testing and theory building.
Scientific transparency statement
The materials, code, and data for this article are posted publicly on the Open Science Framework, with links provided in the article
Many Labs 5: Testing pre-data collection peer review as an intervention to increase replicability
Replication studies in psychological science sometimes fail to reproduce prior findings. If these studies use methods that are unfaithful to the original study or ineffective in eliciting the phenomenon of interest, then a failure to replicate may be a failure of the protocol rather than a challenge to the original finding. Formal pre-data-collection peer review by experts may address shortcomings and increase replicability rates. We selected 10 replication studies from the Reproducibility Project: Psychology (RP:P; Open Science Collaboration, 2015) for which the original authors had expressed concerns about the replication designs before data collection; only one of these studies had yielded a statistically significant effect (p < .05). Commenters suggested that lack of adherence to expert review and low-powered tests were the reasons that most of these RP:P studies failed to replicate the original effects. We revised the replication protocols and received formal peer review prior to conducting new replication studies. We administered the RP:P and revised protocols in multiple laboratories (median number of laboratories per original study = 6.5, range = 3–9; median total sample = 1,279.5, range = 276–3,512) for high-powered tests of each original finding with both protocols. Overall, following the preregistered analysis plan, we found that the revised protocols produced effect sizes similar to those of the RP:P protocols (Δr = .002 or .014, depending on analytic approach). The median effect size for the revised protocols (r = .05) was similar to that of the RP:P protocols (r = .04) and the original RP:P replications (r = .11), and smaller than that of the original studies (r = .37). Analysis of the cumulative evidence across the original studies and the corresponding three replication attempts provided very precise estimates of the 10 tested effects and indicated that their effect sizes (median r = .07, range = .00–.15) were 78% smaller, on average, than the original effect sizes (median r = .37, range = .19–.50).
Forecast changes for heat and cold stress in Warsaw in the 21st century, and their possible influence on mortality risk
This paper presents the results of research on forecast changes in the frequency of heat and cold stress in Warsaw (Poland) in the years 2001–2100, and the possible influence these may exert on mortality risk. Heat and cold stress were assessed by reference to the Universal Thermal Climate Index (UTCI), for which values were calculated using meteorological data derived from the MPI-M-REMO regional climate model, at a spatial resolution of 25 × 25 km. The simulations used boundary conditions from the ECHAM5 Global Climate Model, for SRES scenario A1B. Predictions of mortality rate were in turn based on experimental epidemiological data from the period 1993–2002. Medical data consist of daily numbers of deaths within the age category above 64 years (TM64+). It proved possible to observe a statistically significant relationship between UTCI and mortality rates, which served as a basis for predicting possible changes in mortality in the 21st century due to changing heat and cold stress conditions.
Many Labs 5: Registered Replication of Förster, Liberman, and Kuschel’s (2008) Study 1
In a test of their global-/local-processing-style model, Förster, Liberman, and Kuschel (2008) found that people assimilate a primed concept (e.g., “aggressive”) into their social judgments after a global prime (e.g., they rate a person as being more aggressive than do people in a no-prime condition) but contrast their judgment away from the primed concept after a local prime (e.g., they rate the person as being less aggressive than do people in a no-prime condition). This effect was not replicated by Reinhard (2015) in the Reproducibility Project: Psychology. However, the authors of the original study noted that the replication could not provide a test of the moderation effect because priming did not occur. They suggested that the primes might have been insufficiently applicable and the scenarios insufficiently ambiguous to produce priming. In the current replication project, we used both Reinhard’s protocol and a revised protocol that was designed to increase the likelihood of priming, to test the original authors’ suggested explanation for why Reinhard did not observe the moderation effect. Teams from nine universities contributed to this project. We first conducted a pilot study (N = 530) and successfully selected ambiguous scenarios for each site. We then pilot-tested the aggression prime at five different sites (N = 363) and found that it did not successfully produce priming. In agreement with the first author of the original report, we replaced the prime with a task that successfully primed aggression (hostility) in a pilot study by McCarthy et al. (2018). In the final replication study (N = 1,460), we did not find moderation by protocol type, and judgment patterns in both protocols were inconsistent with the effects observed in the original study. We discuss these findings and possible explanations.