
    CrowdIQ: A New Opinion Aggregation Model

    In this study, we investigate the problem of aggregating crowd opinions for decision making. The Wisdom of Crowds (WoC) theory explains how crowd opinions should be aggregated in order to improve decision-making performance. Crowd independence and a weighting mechanism are two important factors in crowd wisdom. However, most existing crowd opinion aggregation methods fail to build a differential weighting mechanism that identifies the expertise of individuals, and they do not appropriately account for crowd dependence when aggregating judgments. We propose a new crowd opinion aggregation model, CrowdIQ, which has a differential weighting mechanism and accounts for individual dependence. We empirically evaluate CrowdIQ against four baseline methods using real data collected from StockTwits. The results show that CrowdIQ significantly outperforms all baseline methods in terms of both a quadratic prediction scoring measure and simulated investment returns.
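
    The abstract does not describe CrowdIQ's internals, so the sketch below is only a generic illustration of the two ideas it names: differentially weighted opinion aggregation and quadratic (Brier-style) scoring. The weights, data, and function names are hypothetical, not the actual CrowdIQ algorithm.

```python
import numpy as np

def weighted_aggregate(opinions, weights):
    """Aggregate probability judgments for a binary outcome using
    differential weights. The weights here are hypothetical expertise
    scores; CrowdIQ's actual weighting and dependence adjustments are
    not specified in the abstract."""
    w = np.asarray(weights, dtype=float)
    return float(np.dot(w, np.asarray(opinions, dtype=float)) / w.sum())

def quadratic_score(forecast, outcome):
    """Brier-style quadratic prediction score (lower is better)."""
    return (forecast - outcome) ** 2

# Toy example: three contributors, the third judged more expert.
opinions = [0.6, 0.7, 0.9]
weights = [1.0, 1.0, 3.0]
f = weighted_aggregate(opinions, weights)
print(f"aggregate = {f:.3f}, score vs outcome 1 = {quadratic_score(f, 1):.4f}")
```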

    Correcting Judgment Correctives in National Security Intelligence

    Intelligence analysts, like other professionals, form norms that define standards of tradecraft excellence. These norms, however, have evolved in an idiosyncratic manner that reflects the influence of prominent insiders who had keen psychological insights but little appreciation for how to translate those insights into testable hypotheses. The net result is that the prevailing tradecraft norms of best practice are only loosely grounded in the science of judgment and decision-making. The “common sense” of prestigious opinion leaders inside the intelligence community has pre-empted systematic validity testing of the training techniques and judgment aids endorsed by those opinion leaders. Drawing on the scientific literature, we advance hypotheses about how current best practices could well be reducing rather than increasing the quality of analytic products. One set of hypotheses pertains to the failure of tradecraft training to recognize the most basic threat to accuracy: measurement error in the interpretation of the same data and in the communication of interpretations. Another set of hypotheses focuses on the insensitivity of tradecraft training to the risk that issuing broad-brush, one-directional warnings against bias (e.g., over-confidence) will be less likely to encourage self-critical, deliberative cognition than to induce simple response-threshold shifting that yields the mirror-image bias (e.g., under-confidence). Given the magnitude of the consequences when better or worse intelligence analysis flows to policy-makers, we see a compelling case for greater funding of efforts to test what actually works.
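
    To make the threshold-shifting hypothesis concrete, here is a toy simulation, not the authors' analysis: analysts, an over-confidence inflation factor (1.5), and an over-correction factor (0.25) are all made up. It shows how a blanket warning that merely shifts response thresholds can flip over-confidence into under-confidence without improving calibration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical analysts: true event probabilities, with over-confident
# reports modeled by inflating each judgment's deviation from 0.5.
p_true = rng.uniform(0.2, 0.8, size=10_000)
overconfident = np.clip(0.5 + 1.5 * (p_true - 0.5), 0.01, 0.99)

# A broad-brush "avoid over-confidence" warning modeled as pure
# response-threshold shifting: every report is pulled toward 0.5 by a
# fixed factor, with no change in underlying discrimination. The assumed
# over-correction yields the mirror-image bias (under-confidence).
underconfident = 0.5 + 0.25 * (overconfident - 0.5)

outcomes = (rng.random(p_true.size) < p_true).astype(float)

def brier(f):
    """Mean quadratic score against realized outcomes (lower is better)."""
    return float(np.mean((f - outcomes) ** 2))

for name, f in [("overconfident reports", overconfident),
                ("after blanket warning", underconfident),
                ("well-calibrated", p_true)]:
    print(f"{name:22s} Brier = {brier(f):.4f}")
```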

    Estimating credibility of science claims: analysis of forecasting data from metascience projects: a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University, Albany, New Zealand

    The veracity of scientific claims is not always certain. In fact, enough claims have been proven incorrect that many scientists believe science itself is facing a “replication crisis”. Large-scale replication projects have provided empirical evidence that only around 50% of published social and behavioral science findings are replicable. Multiple forecasting studies have shown that the outcomes of replication projects can be predicted by crowdsourced human evaluators. The research presented in this thesis builds on previous forecasting studies, deriving new findings and exploring new scope and scale. The research is centered around the DARPA SCORE (Systematizing Confidence in Open Research and Evidence) programme, a project aimed at developing measures of credibility for social and behavioral science claims. As part of my contribution to SCORE, I, together with an international collaboration, elicited forecasts from human experts via surveys and prediction markets to predict the replicability of 3000 claims. I also present research on other forecasting studies.

    In Chapter 2, I pool data from previous studies to analyse the performance of prediction markets and surveys with higher statistical power. I confirm that prediction markets are better at forecasting replication outcomes than surveys. This study also demonstrates the relationship between p-values of original findings and replication outcomes. These findings inform the experimental and statistical design used to forecast the replicability of 3000 claims as part of the SCORE programme. A full description of the design, including planned statistical analyses, is given in Chapter 3. Due to COVID-19 restrictions, our forecasts could not be validated through direct replications, which were to be conducted by other teams within the SCORE collaboration, so those results cannot be presented in this thesis. The completion of these replications is now scheduled for 2022, and the pre-analysis plan presented in Chapter 3 will provide the basis for the analysis of the resulting data.

    In Chapter 4, an analysis of ‘meta’ forecasts, i.e. forecasts of field-wide and year-specific replication rates, is presented. We presented and published community expectations that replication rates will differ by field and will increase over time. These forecasts provide valuable insights into the academic community’s views of the replication crisis, including for research fields in which no large-scale replication studies have yet been undertaken. Once the full results from SCORE are available, there will be additional insights from validating these community expectations.

    I also analyse forecasters’ ability to predict replication outcomes and effect sizes in Chapters 5 (Creative Destruction in Science) and 6 (A creative destruction approach to replication: Implicit work and sex morality across cultures). In these projects a ‘creative destruction’ approach to replication was used, in which a claim is compared not only to the null hypothesis but to alternative, contradictory claims. I conclude that forecasters can predict the size and direction of effects. Chapter 7 examines the use of forecasting for scientific outcomes beyond replication. In the COVID-19 preprint forecasting project, I find that forecasters can predict whether a preprint will be published within one year, as well as the quality of the publishing journal. Forecasters can also predict the number of citations preprints will receive.

    This thesis demonstrates that information about the replicability of scientific claims is dispersed within the scientific community. I have helped to develop methodologies and tools to efficiently elicit and aggregate forecasts. Forecasts about scientific outcomes can be used as guides to credibility, to gauge community expectations, and to efficiently allocate sparse replication resources.
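
    As a sketch of how such forecast comparisons are typically scored (the thesis's own analyses are more elaborate), the snippet below computes Brier scores for hypothetical prediction-market and survey forecasts against binary replication outcomes; all numbers are illustrative.

```python
import numpy as np

def brier(forecasts, outcomes):
    """Mean quadratic score of probabilistic replication forecasts
    (lower is better)."""
    f = np.asarray(forecasts, dtype=float)
    y = np.asarray(outcomes, dtype=float)
    return float(np.mean((f - y) ** 2))

# Hypothetical data: final market prices and mean survey beliefs for
# the same five claims, plus replication outcomes (1 = replicated).
market = [0.82, 0.35, 0.64, 0.21, 0.90]
survey = [0.70, 0.45, 0.55, 0.40, 0.75]
replicated = [1, 0, 1, 0, 1]

print(f"market Brier = {brier(market, replicated):.4f}")
print(f"survey Brier = {brier(survey, replicated):.4f}")
```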

    Collaborative decision-making for hierarchical forecasts: a multi-sector perspective

    Demand forecasting and planning is never devoid of human judgment. Managers input their opinions at various stages of the organisational decision-making process. The process is generally a combination of statistical forecasts, produced using software packages, and managerial expert judgments. Often this process is broken down into different hierarchies based on product type, location, and customer class. When forecasts are produced in such a hierarchical fashion, they need to be aggregated so that consistency is maintained across the organisation. A number of studies have explored how to achieve this forecast consistency using statistical methods, but none of these methods utilise judgmental inputs from managers at different hierarchical levels. Hence, we identify a research gap in hierarchical forecast aggregation using a judgmental reconciliation process. We address this gap by employing a case study design with six case organisations spread across industries, sectors, and geographies. Exploratory interviews are conducted with case managers to investigate the current hierarchical forecasting process. Four dominant themes of this process are identified: information sharing, time pressure, power, and social value. A strengths, weaknesses, opportunities, and threats (SWOT) analysis of these themes helps assess their impacts on the forecast decision-making process. Based on this, three theoretical propositions and a conceptual framework are suggested. A questionnaire is used to gather wider managerial opinions on the SWOT analysis. Using Multiple Attribute Decision Making (MADM) methods, the responses are analysed to confirm the impact of each theme. This validates the propositions and the suggested framework for collaborative decision-making in hierarchical forecasting. We conclude with research implications and recommendations for future research.
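
    To make the consistency requirement concrete, here is a minimal sketch of one simple reconciliation rule, proportional top-down rescaling after a judgmental adjustment. The study itself examines the organisational process rather than prescribing this formula, and all numbers are hypothetical.

```python
def reconcile_top_down(parent_forecast, child_forecasts):
    """Proportionally rescale child-level forecasts so they sum to a
    judgmentally adjusted parent-level forecast. This is one simple
    consistency rule among the statistical reconciliation methods the
    paper alludes to, not the paper's own framework."""
    total = sum(child_forecasts)
    return [parent_forecast * c / total for c in child_forecasts]

# Hypothetical: statistical forecasts for three products sum to 1150
# units, but a manager judgmentally sets the category total to 1000.
children = [500, 400, 250]
reconciled = reconcile_top_down(1000, children)
print([round(c, 1) for c in reconciled])  # children now sum to 1000
```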