
    Chemical laboratories 4.0: A two-stage machine learning system for predicting the arrival of samples

    This paper presents a two-stage Machine Learning (ML) model to predict the arrival time of In-Process Control (IPC) samples at the quality testing laboratories of a chemical company. The model was developed using three iterations of the CRoss-Industry Standard Process for Data Mining (CRISP-DM) methodology, each focusing on a different regression approach. To reduce the ML analyst effort, an Automated Machine Learning (AutoML) approach was adopted during the modeling stage of CRISP-DM. The AutoML was set to select the best among six distinct state-of-the-art regression algorithms. Using recent real-world data, the three main regression approaches were compared, showing that the proposed two-stage ML model is competitive and provides useful predictions to support laboratory management decisions (e.g., preparation of testing instruments). In particular, the proposed method can accurately predict 70% of the examples under a tolerance of 4 time units. This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020. The authors also wish to thank the chemical company staff involved with this project for providing the data and the valuable domain feedback.
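
    As a rough illustration of the selection step described in the abstract, the sketch below picks among a handful of candidate regressors by cross-validation and reports the share of predictions falling within a given tolerance. The paper's actual AutoML tool and its six algorithms are not named above, so the candidate models, helper names, and the 4-unit default are illustrative assumptions.

```python
# Minimal sketch, assuming scikit-learn: cross-validated selection among a few
# stand-in regressors plus a tolerance-based accuracy measure. Not the paper's
# actual AutoML pipeline.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

def select_best_regressor(X, y, cv=5):
    """Return the fitted candidate with the lowest mean absolute CV error."""
    candidates = {
        "ridge": Ridge(),
        "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
        "gradient_boosting": GradientBoostingRegressor(random_state=0),
    }
    cv_mae = {
        name: -cross_val_score(model, X, y, cv=cv,
                               scoring="neg_mean_absolute_error").mean()
        for name, model in candidates.items()
    }
    best = min(cv_mae, key=cv_mae.get)
    return candidates[best].fit(X, y), cv_mae

def within_tolerance(y_true, y_pred, tol=4):
    """Share of predictions whose absolute error is at most `tol` time units."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)) <= tol))
```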

    Implications of return predictability for consumption dynamics and asset pricing

    Two broad classes of consumption dynamics—long-run risks and rare disasters—have proven successful in explaining the equity premium puzzle when used in conjunction with recursive preferences. We show that bounds à la Gallant, Hansen, and Tauchen that restrict the volatility of the stochastic discount factor by conditioning on a set of return predictors constitute a useful tool for discriminating between these alternative dynamics. In particular, we document that models that rely on rare disasters comfortably meet the bounds independently of the forecasting horizon and the asset returns used to construct the bounds. However, the specific nature of disasters is a relevant characteristic at the 1-year horizon: disasters that unfold over multiple years are more successful in meeting the predictor-based bounds than one-period disasters. At the 5-year horizon, instead, the sole presence of disasters—even if one-period and permanent—is sufficient for the model to satisfy the bounds. Finally, the bounds point to multiple volatility components in consumption as a promising dimension for long-run risk models.
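
    For context, one standard way to impose the conditioning information used in such predictor-based bounds is to scale returns by the predictors and apply an unconditional Hansen-Jagannathan-type bound to the resulting managed payoffs. The sketch below uses generic notation and is not taken from the paper.

```latex
% Managed payoffs: returns R_{t+1} scaled by predictors z_t, priced at E[z_t].
\[
  x_{t+1} = z_t \otimes R_{t+1}, \qquad q = \mathbb{E}[z_t] \otimes \mathbf{1}_N,
\]
% Volatility bound on any stochastic discount factor m with mean nu that
% prices the managed payoffs:
\[
  \sigma(m) \;\ge\;
  \sqrt{\bigl(q - \nu \mu_x\bigr)^{\top} \Sigma_x^{-1} \bigl(q - \nu \mu_x\bigr)},
  \qquad
  \nu = \mathbb{E}[m], \quad \mu_x = \mathbb{E}[x_{t+1}], \quad
  \Sigma_x = \operatorname{Var}(x_{t+1}).
\]
```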

    Real-Time Forecasting with a MIDAS VAR

    This paper presents a MIDAS-type mixed-frequency VAR forecasting model. First, we propose a general and compact mixed-frequency VAR framework using a stacked-vector approach. Second, we integrate the mixed-frequency VAR with a MIDAS-type Almon lag polynomial scheme, which is designed to reduce the parameter space while keeping models flexible. We show how to recast the resulting non-linear MIDAS-type mixed-frequency VAR into a linear equation system that can be easily estimated. A pseudo out-of-sample forecasting exercise with US real-time data shows that the mixed-frequency VAR substantially improves predictive accuracy over a standard VAR for different VAR specifications. Forecast errors for, e.g., GDP growth decrease by 30 to 60 percent for forecast horizons of up to six months and by around 20 percent for a forecast horizon of one year.
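
    To see why an Almon lag polynomial shrinks the parameter space while keeping estimation linear, consider the classical single-regressor case; the paper's VAR formulation is the multivariate analogue, and the notation below is illustrative rather than the paper's own.

```latex
% Classical Almon restriction: each high-frequency lag coefficient is a
% low-order polynomial in the lag index j.
\[
  y_t = \sum_{j=0}^{J} \beta_j\, x_{t-j/m} + \varepsilon_t,
  \qquad
  \beta_j = \sum_{p=0}^{P} \theta_p\, j^{\,p}, \qquad P \ll J.
\]
% Substituting gives a regression that is linear in the theta's, with P+1
% free parameters instead of J+1:
\[
  y_t = \sum_{p=0}^{P} \theta_p\, \tilde{x}_{p,t} + \varepsilon_t,
  \qquad
  \tilde{x}_{p,t} = \sum_{j=0}^{J} j^{\,p}\, x_{t-j/m}.
\]
```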

    Methods for Computing Marginal Data Densities from the Gibbs Output

    We introduce two new methods for estimating the Marginal Data Density (MDD) from the Gibbs output, which are based on exploiting the analytical tractability condition. Such a condition requires that some parameter blocks can be analytically integrated out from the conditional posterior densities. Our estimators are applicable to densely parameterized time series models such as VARs or DFMs. An empirical application to six-variate VAR models shows that the bias of a fully computational estimator is sufficiently large to distort the implied model rankings. One estimator is fast enough to make multiple computations of MDDs in densely parameterized models feasible.
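
    As background on how an analytically tractable block can enter such calculations, one familiar route combines Chib's basic marginal likelihood identity with a Rao-Blackwellized density estimate from the Gibbs draws. This is a generic sketch under that assumption, not a reproduction of the paper's two estimators.

```latex
% Split theta = (theta_1, theta_2), where p(theta_1 | y, theta_2) is known in
% closed form (the analytically tractable block), and evaluate at a
% high-posterior point theta* = (theta_1*, theta_2*):
\[
  p(y) \;=\;
  \frac{p(y \mid \theta_1^{*}, \theta_2^{*})\,
        p(\theta_1^{*} \mid \theta_2^{*})\, p(\theta_2^{*})}
       {p(\theta_1^{*} \mid y, \theta_2^{*})\,
        p(\theta_2^{*} \mid y)},
  \qquad
  \hat{p}(\theta_2^{*} \mid y)
  = \frac{1}{G} \sum_{g=1}^{G} p\bigl(\theta_2^{*} \mid y, \theta_1^{(g)}\bigr).
\]
% Only the last factor requires the Gibbs draws theta_1^(1), ..., theta_1^(G);
% every other term is evaluated analytically.
```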

    A Practical, Accurate, Information Criterion for Nth Order Markov Processes

    The recent increase in the breadth of computational methodologies has been matched by a corresponding increase in the difficulty of comparing the relative explanatory power of models from different methodological lineages. To help address this problem, a Markovian information criterion (MIC) is developed that is analogous to the Akaike information criterion (AIC) in its theoretical derivation and yet can be applied to any model able to generate simulated or predicted data, regardless of its methodology. Both the AIC and the proposed MIC rely on the Kullback–Leibler (KL) distance between model predictions and real data as a measure of prediction accuracy. Rather than using the maximum likelihood approach of the AIC, the proposed MIC relies on the literal interpretation of the KL distance as the inefficiency of compressing real data using modelled probabilities, and therefore uses the output of a universal compression algorithm to obtain an estimate of the KL distance. Several Monte Carlo tests are carried out in order to (a) confirm the performance of the algorithm and (b) evaluate the ability of the MIC to identify the true data-generating process from a set of alternative models.
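
    The compression reading of the KL distance can be made concrete with a small sketch: estimate N-th order transition probabilities from model-simulated data, then measure the average code length the real series incurs under those probabilities. This stands in for the paper's universal-compression implementation; the function names, Laplace smoothing, and discrete alphabet are assumptions.

```python
# Minimal sketch of the compression/KL idea: bits per symbol needed to encode
# the real series using N-th order Markov probabilities fitted on simulated
# data. A model that compresses the real data better is closer, in KL terms,
# to the true data-generating process.
from collections import Counter
from math import log2

def markov_code_length(real, simulated, order=2, alphabet=None, alpha=1.0):
    """Average bits per symbol of `real` under Laplace-smoothed N-th order
    transition probabilities estimated from `simulated`."""
    alphabet = sorted(set(simulated) | set(real)) if alphabet is None else alphabet
    # Count (context, symbol) transitions in the simulated series.
    counts = Counter(
        (tuple(simulated[i - order:i]), simulated[i])
        for i in range(order, len(simulated))
    )
    context_totals = Counter()
    for (ctx, _), c in counts.items():
        context_totals[ctx] += c

    def prob(ctx, sym):
        # Laplace smoothing so unseen transitions get non-zero probability.
        return (counts[(ctx, sym)] + alpha) / (context_totals[ctx] + alpha * len(alphabet))

    # Code length of the real series under the simulated-data probabilities.
    bits = sum(
        -log2(prob(tuple(real[i - order:i]), real[i]))
        for i in range(order, len(real))
    )
    return bits / (len(real) - order)
```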