4 research outputs found

    Quantifying sources of variability in infancy research using the infant-directed-speech preference

    Get PDF
    Psychological scientists have become increasingly concerned with issues related to methodology and replicability, and infancy researchers in particular face specific challenges related to replicability: For example, high-powered studies are difficult to conduct, testing conditions vary across labs, and different labs have access to different infant populations. Addressing these concerns, we report on a large-scale, multisite study aimed at (a) assessing the overall replicability of a single theoretically important phenomenon and (b) examining methodological, cultural, and developmental moderators. We focus on infants’ preference for infant-directed speech (IDS) over adult-directed speech (ADS). Stimuli of mothers speaking to their infants and to an adult in North American English were created using seminaturalistic laboratory-based audio recordings. Infants’ relative preference for IDS and ADS was assessed across 67 laboratories in North America, Europe, Australia, and Asia using the three common methods for measuring infants’ discrimination (head-turn preference, central fixation, and eye tracking). The overall meta-analytic effect size (Cohen’s d) was 0.35, 95% confidence interval = [0.29, 0.42], which was reliably above zero but smaller than the meta-analytic mean computed from previous literature (0.67). The IDS preference was significantly stronger in older children, in those children for whom the stimuli matched their native language and dialect, and in data from labs using the head-turn preference procedure. Together, these findings replicate the IDS preference but suggest that its magnitude is modulated by development, native-language experience, and testing procedure. (This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie SkƂodowska-Curie grant agreement No 798658.

    Many Labs 5:Testing pre-data collection peer review as an intervention to increase replicability

    Get PDF
    Replication studies in psychological science sometimes fail to reproduce prior findings. If these studies use methods that are unfaithful to the original study or ineffective in eliciting the phenomenon of interest, then a failure to replicate may be a failure of the protocol rather than a challenge to the original finding. Formal pre-data-collection peer review by experts may address shortcomings and increase replicability rates. We selected 10 replication studies from the Reproducibility Project: Psychology (RP:P; Open Science Collaboration, 2015) for which the original authors had expressed concerns about the replication designs before data collection; only one of these studies had yielded a statistically significant effect (p < .05). Commenters suggested that lack of adherence to expert review and low-powered tests were the reasons that most of these RP:P studies failed to replicate the original effects. We revised the replication protocols and received formal peer review prior to conducting new replication studies. We administered the RP:P and revised protocols in multiple laboratories (median number of laboratories per original study = 6.5, range = 3?9; median total sample = 1,279.5, range = 276?3,512) for high-powered tests of each original finding with both protocols. Overall, following the preregistered analysis plan, we found that the revised protocols produced effect sizes similar to those of the RP:P protocols (?r = .002 or .014, depending on analytic approach). The median effect size for the revised protocols (r = .05) was similar to that of the RP:P protocols (r = .04) and the original RP:P replications (r = .11), and smaller than that of the original studies (r = .37). Analysis of the cumulative evidence across the original studies and the corresponding three replication attempts provided very precise estimates of the 10 tested effects and indicated that their effect sizes (median r = .07, range = .00?.15) were 78% smaller, on average, than the original effect sizes (median r = .37, range = .19?.50)

    Many Labs 5: Registered Multisite Replication of the Tempting-Fate Effects in Risen and Gilovich (2008)

    No full text
    Risen and Gilovich (2008) found that subjects believed that “tempting fate” would be punished with ironic bad outcomes (a main effect), and that this effect was magnified when subjects were under cognitive load (an interaction). A previous replication study (Frank & Mathur, 2016) that used an online implementation of the protocol on Amazon Mechanical Turk failed to replicate both the main effect and the interaction. Before this replication was run, the authors of the original study expressed concern that the cognitive-load manipulation may be less effective when implemented online than when implemented in the lab and that subjects recruited online may also respond differently to the specific experimental scenario chosen for the replication. A later, large replication project, Many Labs 2 (Klein et al. 2018), replicated the main effect (though the effect size was smaller than in the original study), but the interaction was not assessed. Attempting to replicate the interaction while addressing the original authors’ concerns regarding the protocol for the first replication study, we developed a new protocol in collaboration with the original authors. We used four university sites (N = 754) chosen for similarity to the site of the original study to conduct a high-powered, preregistered replication focused primarily on the interaction effect. Results from these sites did not support the interaction or the main effect and were comparable to results obtained at six additional universities that were less similar to the original site. Post hoc analyses did not provide strong evidence for statistical inconsistency between the original study’s estimates and our estimates; that is, the original study’s results would not have been extremely unlikely in the estimated distribution of population effects in our sites. We also collected data from a new Mechanical Turk sample under the first replication study’s protocol, and results were not meaningfully different from those obtained with the new protocol at universities similar to the original site. Secondary analyses failed to support proposed substantive mechanisms for the failure to replicate
    corecore