49 research outputs found

    Can groups improve expert economic and financial forecasts?

    Economic and financial forecasts are important for business planning and government policy but are notoriously challenging. We take advantage of recent advances in individual and group judgement, and a data set of economic and financial forecasts compiled over 25 years, consisting of multiple individual and institutional estimates, to test the claim that nominal groups will make more accurate economic and financial forecasts than individuals. We validate the forecasts using the subsequently published (real) outcomes, explore the performance of nominal groups against institutions, identify potential superforecasters, and discuss the benefits of implementing structured judgement techniques to improve economic and financial forecasts.
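
    A minimal sketch of the comparison described above, using made-up numbers rather than the paper's 25-year data set: an equal-weighted "nominal group" forecast is scored against the individual forecasts by absolute error from the realised outcome.

```python
import numpy as np

# Illustrative only: 12 hypothetical individual forecasts of a single quantity
# (say, next-year GDP growth in %), with a realised outcome published later.
rng = np.random.default_rng(0)
truth = 2.4
forecasts = truth + rng.normal(0.0, 0.8, size=12)

individual_errors = np.abs(forecasts - truth)    # each expert's absolute error
group_error = abs(forecasts.mean() - truth)      # nominal group = simple average

print(f"mean individual |error|: {individual_errors.mean():.2f}")
print(f"nominal group   |error|: {group_error:.2f}")
# The group average usually beats the typical individual because independent
# errors partly cancel when judgements are pooled.
```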

    Improving expert forecasts in reliability. Application and evidence for structured elicitation protocols

    Quantitative expert judgements are used in reliability assessments to inform critically important decisions. Structured elicitation protocols have been advocated to improve expert judgements, yet their application in reliability is challenged by a lack of examples or evidence that they improve judgements. This paper aims to overcome these barriers. We present a case study in which two world-leading protocols, the IDEA protocol and the Classical Model, were combined and applied by the Australian Department of Defence for a reliability assessment. We assess the practicality of the methods and the extent to which they improve judgements. The average expert was extremely overconfident, with 90% credible intervals containing the true realisation 36% of the time. However, steps contained in the protocols substantially improved judgements. In particular, an equal-weighted aggregation of individual judgements, and the inclusion of a discussion phase and a revised estimate, helped to improve calibration, statistical accuracy and the Classical Model score. Further improvements in precision and information were made via performance-weighted aggregation. This paper provides useful insights into the application of structured elicitation protocols for reliability and the extent to which judgements are improved. The findings raise concerns about existing practices for utilising experts in reliability assessments and suggest that greater adoption of structured protocols is warranted. We encourage the reliability community to further develop examples and insights.
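
    A small sketch of the calibration check behind the 36% figure reported above, using hypothetical intervals rather than the Defence case-study data: count how often experts' 90% credible intervals contain the realised value.

```python
import numpy as np

# Hypothetical elicitation results: each row is one (5th percentile, 95th percentile)
# credible interval given by an expert, paired with the value realised later.
intervals = np.array([(0.20, 0.60), (0.10, 0.30), (0.40, 0.90),
                      (0.05, 0.50), (0.30, 0.70)])
realisations = np.array([0.65, 0.25, 0.55, 0.60, 0.45])

hits = (realisations >= intervals[:, 0]) & (realisations <= intervals[:, 1])
print(f"90% intervals containing the truth: {hits.mean():.0%}")
# Well-calibrated 90% intervals should contain the truth about 90% of the time;
# the case study reports roughly 36% before discussion and revision.
```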

    The value of performance weights and discussion in aggregated expert judgements

    In risky situations characterized by imminent decisions, scarce resources, and insufficient data, policymakers rely on experts to estimate model parameters and their associated uncertainties. Different elicitation and aggregation methods can vary substantially in their efficacy and robustness. While it is generally agreed that biases in expert judgments can be mitigated using structured elicitations involving groups rather than individuals, there is still some disagreement about how best to elicit and aggregate judgments. This mostly concerns the merits of using performance-based weighting schemes to combine the judgments of different individuals (rather than assigning equal weights to individual experts), and the way that interaction between experts should be handled. This article aims to contribute to, and complement, the ongoing discussion on these topics.

    Online training courses on Expert Knowledge Elicitation (EKE)

    This report summarises the training courses delivered under the contract OC/EFSA/AMU/2021/02 EKE: “Develop and conduct online training courses on Expert Knowledge Elicitation (EKE)”. The objective of the courses was to train EFSA staff and experts, as well as corresponding experts from EU Member States, in applying the methodology described in the EFSA “Guidance on Expert Knowledge Elicitation in Food and Feed Safety Risk Assessment”. In addition to the three standard EKE methods (Sheffield, Delphi and Cooke), the training included a semi-formal method of EKE. All these methods may be used when EKE is performed within an existing EFSA working group to support uncertainty analysis as outlined in “The principles and methods behind EFSA's Guidance on Uncertainty Analysis in Scientific Assessment”. In total, 12 courses were organised: two on “Steering an Expert Knowledge Elicitation”, two on “Conduct of the Sheffield protocol for an EKE”, one on “Conduct of the Cooke protocol for an EKE”, one on “Conduct of the Delphi protocol for an EKE”, two on “Conduct of a Semi-formal EKE”, two on “Reporting an Expert Knowledge Elicitation” and two on “Writing an Evidence Dossier for an Expert Knowledge Elicitation”. The courses had 149 participants in total and received very good feedback, with a mean rating of 4.2 out of a possible 5 across all numerical questions in the feedback questionnaire. Recommendations for future activities on training in EKE methodologies are provided.

    Mathematically aggregating experts' predictions of possible futures

    Structured protocols offer a transparent and systematic way to elicit and combine/aggregate probabilistic predictions from multiple experts. These judgements can be aggregated behaviourally or mathematically to derive a final group prediction. Mathematical rules (e.g., weighted linear combinations of judgements) provide an objective approach to aggregation. The quality of this aggregation can be defined in terms of accuracy, calibration and informativeness. These measures can be used to compare different aggregation approaches and help decide which aggregation produces the “best” final prediction. When experts’ performance can be scored on similar questions ahead of time, these scores can be translated into performance-based weights, and a performance-based weighted aggregation can then be used. When this is not possible, several other aggregation methods, informed by measurable proxies for good performance, can be formulated and compared. Here, we develop a suite of aggregation methods, informed by previous experience and the available literature. We differentially weight our experts’ estimates by measures of reasoning, engagement, openness to changing their mind, informativeness, prior knowledge, and the extremity, asymmetry or granularity of estimates. Next, we investigate the relative performance of these aggregation methods using three datasets. The main goal of this research is to explore how measures of the knowledge and behaviour of individuals can be leveraged to produce a better-performing combined group judgement. Although the accuracy, calibration, and informativeness of the majority of methods are very similar, a couple of the aggregation methods consistently distinguish themselves as among the best or worst. Moreover, the majority of methods outperform the usual benchmarks provided by the simple average or the median of estimates.
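
    A hedged sketch of the kind of mathematical aggregation described above: experts' probability estimates for a binary event are combined in a linear pool weighted by a measurable proxy (a made-up engagement score here) and compared against the simple-average and median benchmarks. The data and the proxy are illustrative, not the paper's.

```python
import numpy as np

# Five experts' probabilities that an event occurs, plus a hypothetical proxy score
# (e.g., engagement) used to weight them in place of direct performance scores.
probs = np.array([0.10, 0.35, 0.60, 0.80, 0.25])
engagement = np.array([3.0, 5.0, 9.0, 7.0, 4.0])

weights = engagement / engagement.sum()          # normalise the proxy into weights
proxy_weighted = float(weights @ probs)          # proxy-weighted linear combination
equal_weighted = float(probs.mean())             # simple-average benchmark
median = float(np.median(probs))                 # median benchmark

print(f"proxy-weighted: {proxy_weighted:.3f}  "
      f"equal-weighted: {equal_weighted:.3f}  median: {median:.3f}")
# Over many resolved questions, each rule can be scored for accuracy (e.g. Brier
# score), calibration and informativeness to decide which aggregation performs best.
```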

    What is a Good Calibration Question?

    Weighted aggregation of expert judgements based on their performance on calibration questions may improve mathematically aggregated judgements relative to equal weights. However, obtaining validated, relevant calibration questions can be difficult. When they cannot be obtained, should analysts settle for equal weights? Or should they use calibration questions that are easier to obtain but less relevant? In this paper, we examine what happens to the out-of-sample performance of weighted aggregations under the Classical Model, compared to equal-weighted aggregations, when the set of calibration questions includes many so-called ‘irrelevant’ questions, those that might ordinarily be considered to be outside the domain of the questions of interest. We find that performance-weighted aggregations outperform equal weights on the combined Classical Model (CM) Score, but not on Statistical Accuracy (i.e., calibration). Importantly, there was no appreciable difference in performance when weights were developed on relevant versus irrelevant questions. Experts were unable to adapt their knowledge across vastly different domains, and in-sample validation did not accurately predict out-of-sample performance on irrelevant questions. We suggest that if relevant calibration questions cannot be found, then analysts should use equal weights and draw on alternative techniques to improve judgements. Our study also indicates limits to the predictive accuracy of performance-weighted aggregation, and to the degree to which expertise can be adapted across domains. We note limitations in our study and urge further research into the effect of question type on the reliability of performance-weighted aggregations.
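
    For context, a minimal sketch of the Statistical Accuracy (calibration) component of the Classical Model score discussed above, assuming the common 5th/50th/95th percentile elicitation format; the bin counts below are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def statistical_accuracy(bin_counts):
    """Cooke-style calibration score from counts of realisations falling below the
    5th percentile, between the 5th and 50th, the 50th and 95th, and above the 95th."""
    p = np.array([0.05, 0.45, 0.45, 0.05])           # theoretical bin probabilities
    n = bin_counts.sum()                              # number of calibration questions
    s = bin_counts / n                                # empirical bin proportions
    mask = s > 0                                      # skip empty bins in the sum
    kl = np.sum(s[mask] * np.log(s[mask] / p[mask]))  # relative entropy I(s, p)
    return 1.0 - chi2.cdf(2 * n * kl, df=len(p) - 1)  # p-value-style score in (0, 1]

# Hypothetical expert: 20 calibration questions, realisations counted per bin.
print(f"{statistical_accuracy(np.array([4, 7, 6, 3])):.3f}")
# In the Classical Model, performance weights are roughly proportional to this
# score times an information score, with a cut-off for poorly calibrated experts.
```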