
    The Crowd Classification Problem: Social Dynamics of Binary Choice Accuracy

    Decades of research suggest that information exchange in groups and organizations can reliably improve judgment accuracy in tasks such as financial forecasting, market research, and medical decision making. However, we show that improving the accuracy of numeric estimates does not necessarily improve the accuracy of decisions. For binary-choice judgments, also known as classification tasks—for example, yes/no or build/buy decisions—social influence tends to amplify the majority vote share, regardless of the accuracy of that opinion. As a result, initially inaccurate groups become increasingly inaccurate after information exchange, even as they signal stronger support. We term this dynamic the “crowd classification problem.” Using both a novel data set and a reanalysis of three previous data sets, we study this process in two types of information exchange: (1) when people share votes only, and (2) when people form and exchange numeric estimates prior to voting. Surprisingly, when people exchange numeric estimates prior to voting, the binary-choice vote can become less accurate, even as the average numeric estimate becomes more accurate. Our findings recommend against voting as a form of decision making when groups are optimizing for accuracy. For those cases where voting is required, we discuss strategies for managing communication to avoid the crowd classification problem. We close with a discussion of how our results contribute to a broader contingency theory of collective intelligence.
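    The amplification dynamic described in the abstract can be illustrated with a toy simulation (not the authors' model; the conformity probability and all other parameters here are arbitrary assumptions for illustration):

    ```python
    import random

    def simulate(n_agents=100, p_correct=0.4, rounds=3, conform=0.3, seed=0):
        """Toy model of the crowd classification problem: in each round of
        'information exchange', agents adopt the current majority vote with
        some probability, amplifying it whether or not it is correct."""
        rng = random.Random(seed)
        # start with a fixed minority of correct voters (True = correct vote)
        votes = [i < round(n_agents * p_correct) for i in range(n_agents)]
        shares = [sum(votes) / n_agents]
        for _ in range(rounds):
            majority = sum(votes) > n_agents / 2
            votes = [majority if rng.random() < conform else v for v in votes]
            shares.append(sum(votes) / n_agents)
        return shares

    # once the majority is wrong, the correct-vote share can only shrink
    print(simulate())
    ```

    With the starting share below 0.5, the majority is incorrect, so every round of conformity pulls the correct-vote share further down — the group becomes more confident and less accurate at the same time.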

    Estimating credibility of science claims: analysis of forecasting data from metascience projects: a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University, Albany, New Zealand

    The veracity of scientific claims is not always certain. In fact, enough claims have been proven incorrect that many scientists believe that science itself is facing a “replication crisis”. Large-scale replication projects have provided empirical evidence that only around 50% of published social and behavioral science findings are replicable. Multiple forecasting studies showed that the outcomes of replication projects could be predicted by crowdsourced human evaluators. The research presented in this thesis builds on previous forecasting studies, deriving new findings and exploring new scope and scale. The research is centered around the DARPA SCORE (Systematizing Confidence in Open Research and Evidence) programme, a project aimed at developing measures of credibility for social and behavioral science claims. As part of my contribution to SCORE, I, together with an international collaboration, elicited forecasts from human experts via surveys and prediction markets to predict the replicability of 3000 claims. I also present research on other forecasting studies. In chapter 2, I pool data from previous studies to analyse the performance of prediction markets and surveys with higher statistical power. I confirm that prediction markets are better at forecasting replication outcomes than surveys. This study also demonstrates the relationship between p-values of original findings and replication outcomes. These findings are used to inform the experimental and statistical design to forecast the replicability of 3000 claims as part of the SCORE programme. A full description of the design, including planned statistical analyses, is included in chapter 3. Due to COVID-19 restrictions, our forecasts could not be validated through direct replication experiments conducted by other teams within the SCORE collaboration, preventing those results from being presented in this thesis.
The completion of these replications is now scheduled for 2022, and the pre-analysis plan presented in Chapter 3 will provide the basis for the analysis of the resulting data. In chapter 4, an analysis of ‘meta’ forecasts, or forecasts regarding field-wide replication rates and year-specific replication rates, is presented. We presented and published community expectations that replication rates will differ by field and will increase over time. These forecasts serve as valuable insights into the academic community’s views of the replication crisis, including those research fields for which no large-scale replication studies have been undertaken yet. Once the full results from SCORE are available, there will be additional insights from validations of the community expectations. I also analyse forecasters’ ability to predict replications and effect sizes in Chapters 5 (Creative Destruction in Science) and 6 (A creative destruction approach to replication: Implicit work and sex morality across cultures). In these projects a ‘creative destruction’ approach to replication was used, where a claim is compared not only to the null hypothesis but to alternative contradictory claims. I conclude that forecasters can predict the size and direction of effects. Chapter 7 examines the use of forecasting for scientific outcomes beyond replication. In the COVID-19 preprint forecasting project, I find that forecasters can predict whether a preprint will be published within one year, as well as the quality of the publishing journal. Forecasters can also predict the number of citations preprints will receive. This thesis demonstrates that information about the replicability of scientific claims is dispersed within the scientific community. I have helped to develop methodologies and tools to efficiently elicit and aggregate forecasts.
Forecasts about scientific outcomes can be used as guides to credibility, to gauge community expectations, and to efficiently allocate scarce replication resources.
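A standard way to compare probabilistic forecasts (such as market prices and survey averages) against binary replication outcomes is a proper scoring rule like the Brier score; a minimal sketch, using made-up numbers purely for illustration:

```python
def brier(forecasts, outcomes):
    """Mean squared error between probabilistic forecasts and binary outcomes.
    Lower is better; an uninformative constant 0.5 forecast scores 0.25."""
    assert len(forecasts) == len(outcomes)
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# hypothetical forecasts for four claims (probability each will replicate)
market = [0.8, 0.3, 0.9, 0.4]
survey = [0.6, 0.5, 0.7, 0.5]
replicated = [1, 0, 1, 0]  # hypothetical replication outcomes

print(brier(market, replicated))  # → 0.075
print(brier(survey, replicated))  # → 0.1875
```

In this toy example the sharper market forecasts score better than the hedged survey forecasts, mirroring the kind of comparison reported in chapter 2.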

    Algorithmic Robust Forecast Aggregation

    Forecast aggregation combines the predictions of multiple forecasters to improve accuracy. However, the lack of knowledge about forecasters' information structure hinders optimal aggregation. Given a family of information structures, robust forecast aggregation aims to find the aggregator with minimal worst-case regret compared to the omniscient aggregator. Previous approaches for robust forecast aggregation rely on heuristic observations and parameter tuning. We propose an algorithmic framework for robust forecast aggregation. Our framework provides efficient approximation schemes for general information aggregation with a finite family of possible information structures. In the setting considered by Arieli et al. (2018), where two agents receive independent signals conditioned on a binary state, our framework also provides efficient approximation schemes by imposing Lipschitz conditions on the aggregator or discrete conditions on agents' reports. Numerical experiments demonstrate the effectiveness of our method by providing a nearly optimal aggregator in the setting considered by Arieli et al. (2018).
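    The minimax objective in the abstract can be made concrete with a small brute-force sketch. This is a deliberately simplified toy, not the paper's algorithm: two agents report conditionally independent binary signals about a binary state with uniform prior, the family of information structures is two assumed signal accuracies, and regret is squared error relative to the omniscient aggregator (the true posterior):

    ```python
    from itertools import product

    def posterior(s1, s2, q):
        """P(state=1 | signals) when each signal matches the state w.p. q,
        with a uniform prior over the binary state."""
        like1 = (q if s1 else 1 - q) * (q if s2 else 1 - q)
        like0 = ((1 - q) if s1 else q) * ((1 - q) if s2 else q)
        return like1 / (like1 + like0)

    def regret(f, q):
        """Expected squared-error regret of aggregator f (a dict over signal
        pairs) versus the omniscient aggregator that knows accuracy q."""
        total = 0.0
        for state in (0, 1):
            for s1, s2 in product((0, 1), repeat=2):
                p_sig = (q if s1 == state else 1 - q) * (q if s2 == state else 1 - q)
                prob = 0.5 * p_sig  # uniform prior over states
                p_star = posterior(s1, s2, q)
                total += prob * ((f[(s1, s2)] - state) ** 2 - (p_star - state) ** 2)
        return total

    qs = [0.6, 0.9]                      # assumed finite family of structures
    grid = [i / 10 for i in range(11)]   # discretized aggregator outputs
    best, best_worst = None, float("inf")
    for vals in product(grid, repeat=4):
        f = dict(zip(product((0, 1), repeat=2), vals))
        worst = max(regret(f, q) for q in qs)
        if worst < best_worst:
            best, best_worst = f, worst
    print(best, round(best_worst, 4))
    ```

    Even this tiny instance shows the trade-off the paper formalizes: the robust aggregator hedges between the posteriors implied by the different structures, and the paper's contribution is doing this efficiently at scale rather than by exhaustive search.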

    Essays on Creative Ideation and New Product Design

    Creative ideation, i.e., the generation of novel ideas, represents the terminus-a-quo in the design and development of innovative products. In my dissertation essays, I examine two approaches employed by firms for creative ideation: (1) channeled ideation, a closed approach, which involves applying replicable patterns or properties observed in historical innovations, and (2) idea crowdsourcing, an open approach, where firms invite crowds to contribute ideas to solve a specific challenge. In my studies, I clarify how firms can incorporate market-related information in the channeled ideation process and examine how the selection of ideas in crowdsourcing challenges relates to local and global novelty. In Essay 1, “Attribute Auto-dynamics and New Product Ideation,” I introduce a replicable property – attribute auto-dynamics – observed in several novel products, where a product possesses the ability to modify its attributes automatically in response to changing customer, product-system, or environmental conditions. I propose a typology of attribute auto-dynamics based on an analysis of U.S. utility patents. Based on this typology, I specify a procedural framework for new product ideation that integrates market-pull relevant knowledge and technology-push relevant knowledge. I also illustrate how managers and product designers can apply the framework to identify new product ideas for specific target markets using a channeled ideation approach. In Essay 2, “Selection in Crowdsourced Ideation: Role of Local and Global Novelty,” I examine how the selection of ideas in crowdsourced challenges depends on the form of novelty – local or global. Firms often turn to idea crowdsourcing challenges to obtain novel ideas. Yet prior research cautions that ideators and seeker firms may not select novel ideas. To reexamine the links between idea novelty and selection, I propose a bi-faceted notion of idea novelty that may be local or global.
Examining data on OpenIDEO, I find that the selection of novel ideas differs according to the selector, the form of novelty, and the challenge task structure. I also specify a predictive model that seeker firms can leverage when ideator selection metrics such as likes are unavailable.