17 research outputs found

    Open Science Perspectives on Machine Learning for the Identification of Careless Responding: A New Hope or Phantom Menace?

    Powerful methods for identifying careless respondents in survey data are not just important to ensure the validity of subsequent data analyses, they are also instrumental for studying the psychological processes that drive humans to respond carelessly. Conversely, a deeper understanding of the phenomenon of careless responding enables the development of improved methods for the identification of careless respondents. While machine learning has gained substantial attention and popularity in many scientific fields, it is largely unexplored for the detection of careless responding. On the one hand, machine learning algorithms can be highly powerful tools due to their flexibility. On the other hand, science based on machine learning has been criticized in the literature for a lack of reproducibility. We assess the potential and the pitfalls of machine learning approaches for identifying careless respondents from an open science perspective. In particular, we discuss possible sources of reproducibility issues when applying machine learning in the context of careless responding, and we give practical guidelines on how to avoid them. Furthermore, we illustrate the high potential of an unsupervised machine learning method for the identification of careless respondents in a proof-of-concept simulation experiment. Finally, we stress the necessity of building an open data repository with labeled benchmark data sets, which would enable the evaluation of methods in a more realistic setting and make it possible to train supervised learning methods. Without such a data repository, the true potential of machine learning for the identification of careless responding may fail to be unlocked.

    Forecasting Real GDP Growth for Africa

    We propose a simple and reproducible methodology to create a single equation forecasting model (SEFM) for low-frequency macroeconomic variables. Our methodology is illustrated by forecasting annual real GDP growth rates for 52 African countries, where the data are obtained from the World Bank and start in 1960. The models include lagged growth rates of other countries, as well as a cointegration relationship to capture potential common stochastic trends. With a few selection steps, our methodology quickly arrives at a reasonably small forecasting model per country. Compared with benchmark models, the single equation forecasting models seem to perform quite well.

    Does More Expert Adjustment Associate with Less Accurate Professional Forecasts?

    Professional forecasters can rely on an econometric model to create their forecasts. It is usually unknown to what extent they adjust an econometric model-based forecast. In this paper we show, while making just two simple assumptions, that it is possible to estimate the persistence and variance of the deviation of their forecasts from the forecasts of an econometric model. A key feature of the data that facilitates our estimates is that we have forecast updates for the same forecast target. An illustration using consensus forecasters' forecasts of GDP growth, inflation and unemployment for a range of countries and years suggests that the more a forecaster deviates from the prediction of an econometric model, the less accurate the forecasts are.

    Evaluating heterogeneous forecasts for vintages of macroeconomic variables

    There are various reasons why professional forecasters may disagree in their quotes for macroeconomic variables. One reason is that they target different vintages of the data. We propose a novel method to test for forecast bias in the presence of such heterogeneity. The method is based on Symbolic Regression, where the variables of interest become interval variables. We associate the interval containing the vintages of data with the intervals of the forecasts. An illustration using 18 years of forecasts for annual USA real GDP growth, given by the Consensus Economics forecasters, shows the relevance of the method.

    Open Science Perspectives on Machine Learning for the Identification of Careless Responding: A New Hope or Phantom Menace?

    Powerful methods for identifying careless respondents in survey data are not just important to ensure the validity of subsequent data analyses, they are also instrumental for studying the psychological processes that drive humans to respond carelessly. Conversely, a deeper understanding of the phenomenon of careless responding enables the development of improved methods for the identification of careless respondents. While machine learning has gained substantial attention and popularity in many scientific fields, it is largely unexplored for the detection of careless responding. On the one hand, machine learning algorithms can be highly powerful tools due to their flexibility. On the other hand, science based on machine learning has been criticized in the literature for a lack of reproducibility. We assess the potential and the pitfalls of machine learning approaches for identifying careless respondents from an open science perspective. In particular, we discuss possible sources of reproducibility issues when applying machine learning in the context of careless responding, and we give practical guidelines on how to avoid them. Furthermore, we illustrate the high potential of an unsupervised machine learning method for the identification of careless respondents in a proof-of-concept simulation experiment. Finally, we stress the necessity of building an open data repository with accurately labeled benchmark data sets, which would enable the evaluation of methods in a more realistic setting and make it possible to train supervised learning methods. Without such a data repository, the true potential of machine learning for the identification of careless responding may fail to be unlocked.
