An Open, Large-Scale, Collaborative Effort to Estimate the Reproducibility of Psychological Science
Reproducibility is a defining feature of science. However, because of strong incentives for innovation and weak incentives for confirmation, direct replication is rarely practiced or published. The Reproducibility Project is an open, large-scale, collaborative effort to systematically examine the rate and predictors of reproducibility in psychological science. So far, 72 volunteer researchers from 41 institutions have organized to openly and transparently replicate studies published in three prominent psychological journals in 2008. Multiple methods will be used to evaluate the findings, calculate an empirical rate of replication, and investigate factors that predict reproducibility. Whatever the result, a better understanding of reproducibility will ultimately improve confidence in scientific methodology and findings.
Many Labs 5: Testing Pre-Data-Collection Peer Review as an Intervention to Increase Replicability
Replication studies in psychological science sometimes fail to reproduce prior findings. If these studies use methods that are unfaithful to the original study or ineffective in eliciting the phenomenon of interest, then a failure to replicate may be a failure of the protocol rather than a challenge to the original finding. Formal pre-data-collection peer review by experts may address shortcomings and increase replicability rates. We selected 10 replication studies from the Reproducibility Project: Psychology (RP:P; Open Science Collaboration, 2015) for which the original authors had expressed concerns about the replication designs before data collection; only one of these studies had yielded a statistically significant effect (p < .05). Commenters suggested that lack of adherence to expert review and low-powered tests were the reasons that most of these RP:P studies failed to replicate the original effects. We revised the replication protocols and received formal peer review prior to conducting new replication studies. We administered the RP:P and revised protocols in multiple laboratories (median number of laboratories per original study = 6.5, range = 3–9; median total sample = 1,279.5, range = 276–3,512) for high-powered tests of each original finding with both protocols. Overall, following the preregistered analysis plan, we found that the revised protocols produced effect sizes similar to those of the RP:P protocols (Δr = .002 or .014, depending on analytic approach). The median effect size for the revised protocols (r = .05) was similar to that of the RP:P protocols (r = .04) and the original RP:P replications (r = .11), and smaller than that of the original studies (r = .37). Analysis of the cumulative evidence across the original studies and the corresponding three replication attempts provided very precise estimates of the 10 tested effects and indicated that their effect sizes (median r = .07, range = .00–.15) were 78% smaller, on average, than the original effect sizes (median r = .37, range = .19–.50).
Additional co-authors: Ivan Ropovik, Balazs Aczel, Lena F. Aeschbach, Luca Andrighetto, Jack D. Arnal, Holly Arrow, Peter Babincak, Bence E. Bakos, Gabriel Baník, Ernest Baskin, Radomir Belopavlovic, Michael H. Bernstein, Michał Białek, Nicholas G. Bloxsom, Bojana Bodroža, Diane B. V. Bonfiglio, Leanne Boucher, Florian Brühlmann, Claudia C. Brumbaugh, Erica Casini, Yiling Chen, Carlo Chiorri, William J. Chopik, Oliver Christ, Antonia M. Ciunci, Heather M. Claypool, Sean Coary, Marija V. Čolić, W. Matthew Collins, Paul G. Curran, Chris R. Day, Anna Dreber, John E. Edlund, Filipe Falcão, Anna Fedor, Lily Feinberg, Ian R. Ferguson, Máire Ford, Michael C. Frank, Emily Fryberger, Alexander Garinther, Katarzyna Gawryluk, Kayla Ashbaugh, Mauro Giacomantonio, Steffen R. Giessner, Jon E. Grahe, Rosanna E. Guadagno, Ewa Hałasa, Rias A. Hilliard, Joachim Hüffmeier, Sean Hughes, Katarzyna Idzikowska, Michael Inzlicht, Alan Jern, William Jiménez-Leal, Magnus Johannesson, Jennifer A. Joy-Gaba, Mathias Kauff, Danielle J. Kellier, Grecia Kessinger, Mallory C. Kidwell, Amanda M. Kimbrough, Josiah P. J. King, Vanessa S. Kolb, Sabina Kołodziej, Marton Kovacs, Karolina Krasuska, Sue Kraus, Lacy E. Krueger, Katarzyna Kuchno, Caio Ambrosio Lage, Eleanor V. Langford, Carmel A. Levitan, Tiago Jessé Souza de Lima, Hause Lin, Samuel Lins, Jia E. Loy, Dylan Manfredi, Łukasz Markiewicz, Madhavi Menon, Brett Mercier, Mitchell Metzger, Venus Meyet, Jeremy K. Miller, Andres Montealegre, Don A. Moore, Rafał Muda, Gideon Nave, Austin Lee Nichols, Sarah A. Novak, Christian Nunnally, Ana Orlic, Anna Palinkas, Angelo Panno, Kimberly P. Parks, Ivana Pedovic, Emilian Pekala, Matthew R. Penner, Sebastiaan Pessers, Boban Petrovic, Thomas Pfeiffer, Damian Pienkosz, Emanuele Preti, Danka Puric, Tiago Ramos, Jonathan Ravid, Timothy S. Razza, Katrin Rentzsch, Juliette Richetin, Sean C. Rife, Anna Dalla Rosa, Kaylis Hase Rudy, Janos Salamon, Blair Saunders, Przemysław Sawicki, Kathleen Schmidt, Kurt Schuepfer, Thomas Schultze, Stefan Schulz-Hardt, Astrid Schütz, Ani N. Shabazian, Rachel L. Shubella, Adam Siegel, Rúben Silva, Barbara Sioma, Lauren Skorb, Luana Elayne Cunha de Souza, Sara Steegen, L. A. R. Stein, R. Weylin Sternglanz, Darko Stojilovic, Daniel Storage, Gavin Brent Sullivan, Barnabas Szaszi, Peter Szecsi, Orsolya Szöke, Attila Szuts, Manuela Thomae, Natasha D. Tidwell, Carly Tocco, Ann-Kathrin Torka, Francis Tuerlinckx, Wolf Vanpaemel, Leigh Ann Vaughn, Michelangelo Vianello, Domenico Viganola, Maria Vlachou, Ryan J. Walker, Sophia C. Weissgerber, Aaron L. Wichman, Bradford J. Wiggins, Daniel Wolf, Michael J. Wood, David Zealley, Iris Žeželj, Mark Zrubka, and Brian A. Nosek.
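To make the headline comparison concrete, here is a minimal sketch of the arithmetic behind the reported medians and the "smaller on average" statistic. The per-study correlations below are hypothetical stand-ins chosen only to match the reported medians and ranges; the paper's 78% figure comes from the real per-study data, not from this illustration.

```python
# Sketch of the effect-size comparison reported above (hypothetical inputs).
# The abstract reports only summary statistics, so these per-study
# correlations are illustrative stand-ins, not the actual Many Labs 5 data.
from statistics import median

original_rs    = [0.19, 0.30, 0.37, 0.42, 0.50]  # hypothetical original effects
replication_rs = [0.00, 0.04, 0.07, 0.10, 0.15]  # hypothetical cumulative replication effects

print(median(original_rs))     # 0.37, matching the reported median original r
print(median(replication_rs))  # 0.07, matching the reported median replication r

# Percent shrinkage per study, then averaged -- the form of the "78% smaller,
# on average" statistic (the exact value depends on the real per-study data).
shrinkage = [(o - r) / o for o, r in zip(original_rs, replication_rs)]
print(sum(shrinkage) / len(shrinkage))
```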
Visual Working Memory Capacity and Proactive Interference
Background: Visual working memory capacity is extremely limited and appears to be relatively immune to practice effects or the use of explicit strategies. The recent discovery that visual working memory tasks, like verbal working memory tasks, are subject to proactive interference, coupled with the fact that typical visual working memory tasks are particularly conducive to proactive interference, suggests that visual working memory capacity may be systematically underestimated. Methodology/Principal Findings: Working memory capacity was probed behaviorally in adult humans both in laboratory settings and via the Internet. Several experiments show that although the effect of proactive interference on visual working memory is significant and can last over several trials, it only changes the capacity estimate by about 15%. Conclusions/Significance: This study further confirms the sharp limitations on visual working memory capacity, both in absolute terms and relative to verbal working memory. It is suggested that future research take these limitations into account in understanding differences across a variety of tasks between human adults, prelinguistic infants and nonlinguistic animals.
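The abstract does not say which capacity estimator was used; a common choice for change-detection paradigms is Cowan's K. The sketch below uses hypothetical hit and false-alarm rates to show how a shift on the order of the reported ~15% would look in such an estimate.

```python
# Cowan's K, a standard capacity estimate for change-detection tasks:
# K = set_size * (hit_rate - false_alarm_rate).
# The rates below are hypothetical; the paper's actual estimator and
# numbers may differ.
def cowans_k(set_size: int, hit_rate: float, false_alarm_rate: float) -> float:
    return set_size * (hit_rate - false_alarm_rate)

k_with_pi    = cowans_k(set_size=8, hit_rate=0.55, false_alarm_rate=0.10)  # proactive interference present
k_without_pi = cowans_k(set_size=8, hit_rate=0.62, false_alarm_rate=0.10)  # interference minimized

print(k_with_pi, k_without_pi)                    # 3.6 vs 4.16 items
print((k_without_pi - k_with_pi) / k_without_pi)  # ~0.13, i.e. a shift of roughly 15%
```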
Evaluating unsupervised word segmentation in adults: a meta-analysis
Humans, even from infancy, are capable of unsupervised (“statistical”) learning of linguistic information. However, it remains unclear which of the myriad algorithms for unsupervised learning captures human abilities. This matters because unsupervised learning algorithms vary greatly in how much can be learned and how quickly. Thus, which algorithm(s) humans use may place a strong bound on how much of language can actually be learned in an unsupervised fashion. As a step towards more precisely characterizing human unsupervised learning capabilities, we quantitatively synthesize the literature on adult unsupervised (“statistical”) word segmentation. Unfortunately, most confidence intervals were very large, and few moderators were found to be significant. These findings are consistent with prior work suggesting low power and precision in the literature. Constraining theory will require more, higher-powered studies.
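As a rough illustration of the synthesis step, the sketch below computes an inverse-variance-weighted mean effect and its 95% confidence interval from per-study estimates. The inputs are hypothetical, and the paper's actual model (moderators, random effects) is richer than this fixed-effect summary.

```python
# Minimal fixed-effect meta-analysis: inverse-variance-weighted mean and
# 95% CI. Inputs are hypothetical effect sizes and sampling variances,
# not the actual word-segmentation data.
from math import sqrt

effects   = [0.45, 0.10, 0.30, 0.65, 0.20]  # hypothetical per-study effect sizes
variances = [0.04, 0.09, 0.02, 0.12, 0.05]  # hypothetical sampling variances

weights = [1 / v for v in variances]                              # precision weights
pooled  = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
se      = sqrt(1 / sum(weights))                                  # SE of the pooled estimate

print(f"pooled effect = {pooled:.3f}, "
      f"95% CI = [{pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f}]")
# Large per-study variances propagate into a wide CI -- the pattern this
# meta-analysis reports for the adult word-segmentation literature.
```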
Even Simultaneous Bilinguals Do Not Reach Monolingual Levels of Proficiency in Syntax
While there is no doubt that children raised bilingual can become extremely proficient in both languages, theorists are divided on whether bilingualism is effectively monolingualism twice (the “Two Monolinguals in One Brain” hypothesis) or differs in some fundamental way from monolingualism. A strong version of the “Two Monolinguals” hypothesis predicts that bilinguals can achieve monolingual-level proficiency in either (or both) of their languages. Recently, Bylund and Abrahamsson argued that evidence of lower syntactic proficiency in simultaneous bilinguals was due to confounds of language dominance; when simultaneous bilinguals are tested in their primary language, any difference disappears. We find no evidence for this hypothesis. Meta-analysis and Monte Carlo simulation show that variation in published results is fully consistent with sampling error, with no evidence that method mattered. Meta-analytic estimates strongly indicate lower syntactic performance by simultaneous bilinguals relative to monolinguals. Re-analysis of a large dataset (N = 115,020) confirms this finding, even controlling for language dominance. Interestingly, the effect is relatively small, challenging current theories.
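The Monte Carlo logic can be sketched in a few lines: simulate many batches of studies that all share one true effect and differ only by sampling error, then check whether the spread of the published estimates falls inside the simulated spread. The true effect and sample sizes below are hypothetical, not the values from the paper.

```python
# Monte Carlo check of whether between-study variation is consistent with
# sampling error alone. The shared true effect and per-study Ns are
# hypothetical stand-ins.
import math
import random
import statistics

random.seed(0)
true_r = 0.25                          # hypothetical shared true correlation
sample_sizes = [40, 60, 80, 120, 200]  # hypothetical per-study sample sizes

def simulated_between_study_sd() -> float:
    # Draw each study's estimate from the Fisher-z sampling distribution,
    # z ~ Normal(atanh(true_r), 1/sqrt(n - 3)), then back-transform to r.
    zs = [random.gauss(math.atanh(true_r), 1 / math.sqrt(n - 3)) for n in sample_sizes]
    return statistics.stdev(math.tanh(z) for z in zs)

sds = sorted(simulated_between_study_sd() for _ in range(10_000))
# If the spread actually observed across published studies lies inside this
# band, the variation is consistent with sampling error alone.
print("central 95% of simulated between-study SDs:",
      round(sds[250], 3), "to", round(sds[9750], 3))
```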
Towards Broader Adoption of Massive Online Experiments
Since the Workshop on Scaling Cognitive Science was held at CogSci 2020, the methodological landscape of the field has undergone a significant shift toward online experiments. The COVID-19 pandemic forced greater numbers of cognitive scientists to collect data over the internet and to explore online alternatives to methods which they had previously used only in the lab. For example, since this time, eye-tracking capabilities have been introduced to a number of software packages for online experiments (Bánki, de Eccher, Falschlehner, Hoehl, & Markova, 2022; Slim & Hartsuiker, 2022; Yang & Krajbich, 2021).
Tracking replicability as a method of post-publication open evaluation.
Recent reports have suggested that many published results are unreliable. To increase the reliability and accuracy of published papers, multiple changes have been proposed, such as changes in statistical methods. We support such reforms. However, we believe that the incentive structure of scientific publishing must change for such reforms to be successful. Under the current system, the quality of individual scientists is judged on the basis of their number of publications and citations, with journals similarly judged via numbers of citations. Neither of these measures takes into account the replicability of the published findings, as false or controversial results are often particularly widely cited. We propose tracking replications as a means of post-publication evaluation, both to help researchers identify reliable findings and to incentivize the publication of reliable results. Tracking replications requires a database linking published studies that replicate one another. As any such database is limited by the number of replication attempts published, we propose establishing an open-access journal dedicated to publishing replication attempts. Data quality of both the database and the affiliated journal would be ensured through a combination of crowd-sourcing and peer review. As reports in the database are aggregated, ultimately it will be possible to calculate replicability scores, which may be used alongside citation counts to evaluate the quality of work published in individual journals. In this paper, we lay out a detailed description of how this system could be implemented, including mechanisms for compiling the information, ensuring data quality, and incentivizing the research community to participate.
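The paper does not fix a formula for the replicability score; a minimal version, assuming the database records each replication attempt as a success or failure, is sketched below. The record fields, example data, and smoothing constant are all hypothetical illustration choices, not the scheme the authors propose.

```python
# Minimal replicability score over a hypothetical replication database:
# the (smoothed) proportion of successful replication attempts linked to
# each original paper. Fields and smoothing are illustrative choices only.
from collections import defaultdict

# (original_paper_id, replication_succeeded) -- hypothetical records
attempts = [
    ("paper-A", True), ("paper-A", True), ("paper-A", False),
    ("paper-B", False), ("paper-B", False),
]

tally = defaultdict(lambda: [0, 0])  # paper_id -> [successes, attempts]
for paper_id, succeeded in attempts:
    tally[paper_id][1] += 1
    tally[paper_id][0] += succeeded

for paper_id, (successes, total) in sorted(tally.items()):
    # Laplace smoothing keeps a single failed (or lone) attempt from
    # pinning a paper's score to exactly 0 or 1.
    score = (successes + 1) / (total + 2)
    print(f"{paper_id}: {successes}/{total} replications, score = {score:.2f}")
```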