
    Forecasting realized volatility models: the benefits of bagging and nonlinear specifications

    We forecast daily realized volatilities with linear and nonlinear models and evaluate the benefits of bootstrap aggregation (bagging) in producing more precise forecasts. We consider the linear autoregressive (AR) model, the Heterogeneous Autoregressive model (HAR), and a non-linear HAR model based on a neural network specification that allows for logistic transition effects (NNHAR). The models and the bagging schemes are applied to the realized volatility time series of the S&P500 index from 3-Jan-2000 through 30-Dec-2005. Our main findings are: (1) For the HAR model, bagging successfully averages over the randomness of variable selection; however, when the NN model is considered, there is no clear benefit from using bagging; (2) including past returns in the models improves the forecast precision; and (3) the NNHAR model outperforms the linear alternatives.
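The abstract does not spell out the HAR regressors. As a rough sketch, assuming the commonly used specification in which next-day realized volatility is regressed on the latest daily value and on its 5-day and 22-day averages, the design matrix could be built as follows. The function name `har_features`, the window lengths, and the toy series are illustrative assumptions, not taken from the paper; bagging and the neural network extension are not shown.

```python
from statistics import fmean


def har_features(rv, weekly=5, monthly=22):
    """Build (X, y) for a HAR-style regression (assumed standard form).

    Each row of X holds [RV_t, mean of the last `weekly` values,
    mean of the last `monthly` values]; y holds RV_{t+1}.
    """
    rows, targets = [], []
    # Start once a full monthly window is available; stop one step early
    # so that the next-day target exists.
    for t in range(monthly - 1, len(rv) - 1):
        daily = rv[t]
        week = fmean(rv[t - weekly + 1 : t + 1])
        month = fmean(rv[t - monthly + 1 : t + 1])
        rows.append([daily, week, month])
        targets.append(rv[t + 1])
    return rows, targets


# Toy series 1.0, 2.0, ..., 30.0 (hypothetical data, for illustration only).
rv = [float(i) for i in range(1, 31)]
X, y = har_features(rv)
```

The resulting rows can be passed to any least-squares routine; a bagged variant would refit the same regression on bootstrap resamples and average the forecasts.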

    Building Neural Network Models for Time Series: A Statistical Approach

    This paper is concerned with modelling time series by single hidden layer feedforward neural network models. A coherent modelling strategy based on statistical inference is presented. Variable selection is carried out using existing techniques. The problem of selecting the number of hidden units is solved by sequentially applying Lagrange multiplier type tests, with the aim of avoiding the estimation of unidentified models. Misspecification tests are derived for evaluating an estimated neural network model. A small-sample simulation experiment is carried out to show how the proposed modelling strategy works and how the misspecification tests behave in small samples. Two applications to real time series, one univariate and the other multivariate, are considered as well. Sets of one-step-ahead forecasts are constructed and forecast accuracy is compared with that of other nonlinear models applied to the same series.

    Multivariate forecast of winter monsoon rainfall in India using SST anomaly as a predictor: Neurocomputing and statistical approaches

    In this paper, the complexities in the relationship between rainfall and sea surface temperature (SST) anomalies during the winter monsoon (November-January) over India were evaluated statistically using scatter plot matrices and autocorrelation functions. Linear as well as polynomial trend equations were obtained, and it was observed that the coefficient of determination for the linear trend was very low and remained low even when a polynomial trend of degree six was used. An exponential regression equation and an artificial neural network with extensive variable selection were generated to forecast the average winter monsoon rainfall of a given year, using the rainfall amounts and the sea surface temperature anomalies in the winter monsoon months of the previous year as predictors. The regression coefficients for the multiple exponential regression equation were generated using the Levenberg-Marquardt algorithm. The artificial neural network was generated in the form of a multilayer perceptron with sigmoid non-linearity and genetic-algorithm-based variable selection. Both predictive models were judged statistically using the Willmott index, percentage error of prediction, and prediction yields. The statistical assessment revealed the potential of the artificial neural network over exponential regression. Comment: 18 pages
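The abstract names the Willmott index without giving its formula. A minimal sketch, assuming the original index of agreement d = 1 - Σ(P_i - O_i)² / Σ(|P_i - Ō| + |O_i - Ō|)², where O are observations, P are predictions, and Ō is the observed mean, is shown below. The paper may have used a different variant; the function name and the toy numbers are illustrative only.

```python
def willmott_index(observed, predicted):
    """Index of agreement d in [0, 1]; 1 means perfect agreement.

    Assumes the original (squared-difference) form of the index.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    o_bar = sum(observed) / len(observed)
    # Sum of squared prediction errors.
    num = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    # "Potential error": distances of predictions and observations
    # from the observed mean.
    den = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2
              for o, p in zip(observed, predicted))
    if den == 0:
        raise ValueError("index is undefined when all values equal the observed mean")
    return 1.0 - num / den


obs = [1.0, 2.0, 3.0]                       # hypothetical observations
d_perfect = willmott_index(obs, obs)        # predictions match exactly
d_mean = willmott_index(obs, [2.0, 2.0, 2.0])  # always predict the mean
```

Predicting the observed mean everywhere gives d = 0 in this example, which is why the index is often read alongside an error measure rather than on its own.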

    Neural Network Based Inferential Model For Ethane Steam Cracking Furnace

    The product yield distribution of ethane steam cracking is typically obtained using analysers and lab sampling. Since both methods take time to produce results, depending primarily on them to determine the main product yield hinders immediate control action on the process. To resolve this issue, an inferential sensor is required. In this study, a neural network based inferential model is developed. The ethane steam cracking process has been modelled using ASPEN Plus and validated with industrial data taken from the literature. The relative errors (RE) of the model outputs are less than 10%. The ASPEN Plus model is used for input variable selection, nonlinearity assessment, and data generation for neural network modelling. The input variable selection study found that five variables significantly influence the ethane and ethylene yields, namely reactor pressure, coil outlet temperature, steam-hydrocarbon ratio, feed composition, and fuel composition. The nonlinearity assessment shows that the process exhibits asymmetrical response and input multiplicity characteristics and can thus be classified as a nonlinear process. Data generated from the ASPEN Plus model are used for training, validation, and testing. Two methods are used to generate the data: sequential excitation and simultaneous excitation. Four variables are individually excited and combined to make a sequential excitation profile. Data from sequential excitation are divided into training and validation sets, while data from simultaneous excitation are used solely for testing. Three neural network models, namely the Feedforward Neural Network (FFNN), the Generalized Regression Neural Network (GRNN), and the Extreme Learning Machine Neural Network (ELM-NN), are developed and evaluated in terms of prediction accuracy and computational time. The evaluation results show that the prediction accuracy of the ELM-NN is higher than that of the FFNN and GRNN. Training the best ELM-NN, GRNN, and FFNN models requires 0.0068 seconds, 0.35 seconds, and 12 seconds, respectively. For a new input sample, all three models require less than 0.05 seconds to compute a prediction. However, the computation time of the trained GRNN model increases exponentially with the number of data samples in a batch, while for the trained FFNN and ELM-NN models the increase is not significant. Of the three models, the ELM-NN gives the best performance in terms of prediction accuracy and computational time. The R² of the ELM-NN model is 91.3% and 82.6% for ethane and ethylene yield, respectively. The model requires 0.0068 seconds to train and 0.0001 seconds to compute the ethane and ethylene yields from a new set of input data. This makes the model suitable for application in real-time inferential control systems.
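The short training time reported for the ELM-NN is consistent with how extreme learning machines are usually fitted: the hidden-layer weights are drawn at random and left fixed, and only the output weights are solved for in one least-squares step. The sketch below illustrates that general idea with NumPy; it is not the authors' implementation, and the function names, hidden-layer size, sigmoid activation, and toy data are all assumptions.

```python
import numpy as np


def elm_fit(X, y, n_hidden=20, seed=0):
    """Fit a single-hidden-layer ELM (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Random input weights and biases; these are never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # sigmoid hidden activations
    # Output weights from a least-squares solve via the pseudo-inverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed hidden layer, then the fitted output weights."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta


# Toy regression problem (hypothetical data, not from the paper).
X = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])
W, b, beta = elm_fit(X, y)
pred = elm_predict(X, W, b, beta)
mse = float(np.mean((pred - y) ** 2))
```

Because fitting reduces to one linear solve, there is no iterative backpropagation loop, which is the usual explanation for the speed advantage over a conventionally trained feedforward network.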

    Input variable selection for data-driven models of Coriolis flowmeters for two-phase flow measurement

    Input variable selection is an essential step in the development of data-driven models for environmental, biological and industrial applications. By eliminating irrelevant or redundant variables, input variable selection identifies a suitable subset of variables as the input of a model. It also simplifies the model structure and improves computational efficiency. This paper describes the input variable selection procedures for data-driven models that measure liquid mass flowrate and gas volume fraction under two-phase flow conditions using Coriolis flowmeters. Three advanced input variable selection methods, namely Partial Mutual Information (PMI), Genetic Algorithm - Artificial Neural Network (GA-ANN) and tree-based Iterative Input Selection (IIS), are applied in this study. Typical data-driven models incorporating Support Vector Machine (SVM) are established individually based on the input candidates resulting from the selection methods. The validity of the selection outcomes is assessed through an output performance comparison of the SVM-based data-driven models and a sensitivity analysis. The validation and analysis results suggest that the input variables selected by the PMI algorithm provide more effective information for the models to measure liquid mass flowrate, while the IIS algorithm provides fewer but more effective variables for the models to predict gas volume fraction.

    Crop Yield Prediction Using Deep Neural Networks

    Crop yield is a highly complex trait determined by multiple factors such as genotype, environment, and their interactions. Accurate yield prediction requires a fundamental understanding of the functional relationship between yield and these interactive factors, and revealing such a relationship requires both comprehensive datasets and powerful algorithms. In the 2018 Syngenta Crop Challenge, Syngenta released several large datasets that recorded the genotype and yield performances of 2,267 maize hybrids planted in 2,247 locations between 2008 and 2016 and asked participants to predict the yield performance in 2017. As one of the winning teams, we designed a deep neural network (DNN) approach that took advantage of state-of-the-art modeling and solution techniques. Our model was found to have superior prediction accuracy, with a root-mean-square error (RMSE) of 12% of the average yield and 50% of the standard deviation for the validation dataset using predicted weather data. With perfect weather data, the RMSE would be reduced to 11% of the average yield and 46% of the standard deviation. We also performed feature selection based on the trained DNN model, which successfully decreased the dimension of the input space without a significant drop in prediction accuracy. Our computational results suggested that this model significantly outperformed other popular methods such as Lasso, shallow neural networks (SNN), and regression tree (RT). The results also revealed that environmental factors had a greater effect on crop yield than genotype. Comment: 9 pages, Presented at 2018 INFORMS Conference on Business Analytics and Operations Research (Baltimore, MD, USA). One of the winning solutions to the 2018 Syngenta Crop Challenge
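The abstract reports RMSE relative to the average yield and to the standard deviation. A minimal sketch of how such ratios can be computed is given below. It assumes the population standard deviation of the observed values is the normaliser, which the abstract does not specify; the function name and the toy numbers are illustrative only.

```python
from math import sqrt
from statistics import fmean, pstdev


def relative_rmse(actual, predicted):
    """Return (rmse, rmse / mean(actual), rmse / std(actual))."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    rmse = sqrt(fmean(squared_errors))
    # pstdev is the population standard deviation (an assumption here).
    return rmse, rmse / fmean(actual), rmse / pstdev(actual)


actual = [10.0, 20.0, 30.0]     # hypothetical observed yields
predicted = [12.0, 18.0, 30.0]  # hypothetical model outputs
rmse, share_of_mean, share_of_std = relative_rmse(actual, predicted)
```

A ratio to the standard deviation below 1 indicates that the model beats the trivial predictor that always outputs the mean of the observed values.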

    Application of neural networks and sensitivity analysis to improved prediction of trauma survival
