227,453 research outputs found

    Input variable selection for forecasting models

    2002 IFAC 15th Triennial World Congress, Barcelona, Spain. The selection of input variables plays a crucial role when modelling time series. For nonlinear models there are no well-developed techniques comparable to AIC and other criteria that work with linear models. In the case of Short Term Load Forecasting (STLF), generalization is greatly influenced by this selection. In this paper two approaches are compared using real data from a Spanish utility company. The models used are neural networks, although the algorithms can be used with other nonlinear models. The experiments show that input variable selection affects the performance of forecasting models and thus should be treated as a generalization problem.
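    The abstract does not spell out the two selection approaches it compares, but the general idea of choosing inputs for a neural-network load forecaster by their effect on generalization can be sketched as follows. This is a minimal wrapper-style illustration, assuming greedy forward selection over candidate load lags and a synthetic series in place of the Spanish utility data; it does not reproduce the paper's actual algorithms.

```python
# Wrapper-style input-variable selection for a neural-network load forecaster.
# Candidate lags, the synthetic series, and the greedy forward search are all
# illustrative assumptions, not the paper's method.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Synthetic hourly "load" series with daily periodicity plus noise
# (placeholder for the utility data used in the paper).
t = np.arange(2000)
load = 100 + 20 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, t.size)

candidate_lags = [1, 2, 3, 24, 25, 48, 168]   # assumed candidate inputs

def make_dataset(lags):
    max_lag = max(lags)
    X = np.column_stack([load[max_lag - l:-l] for l in lags])
    y = load[max_lag:]
    return X, y

def validation_error(lags):
    X, y = make_dataset(lags)
    split = int(0.8 * len(y))                 # hold out the last 20% for validation
    model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=500, random_state=0)
    model.fit(X[:split], y[:split])
    return mean_squared_error(y[split:], model.predict(X[split:]))

# Greedy forward selection: add the lag that most reduces validation error,
# i.e. input selection driven directly by generalization performance.
selected, remaining = [], list(candidate_lags)
best_err = np.inf
while remaining:
    errs = {l: validation_error(selected + [l]) for l in remaining}
    l_best = min(errs, key=errs.get)
    if errs[l_best] >= best_err:
        break
    selected.append(l_best)
    remaining.remove(l_best)
    best_err = errs[l_best]

print("selected lags:", selected, "validation MSE:", round(best_err, 3))
```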

    Prediction model for coronary artery disease using neural networks and feature selection based on classification and regression tree

    Background and aims: The risk of invasive diagnostic procedures for coronary artery disease (CAD), such as angiography, is considerable. On the other hand, medical data mining approaches have been applied successfully. This study was therefore conducted to produce a model based on neural network data mining techniques that can predict coronary artery disease. Methods: In this descriptive-analytical study, the data set includes nine risk factors of 13228 participants who underwent angiography at Tehran Heart Center (4059 participants were not suffering from CAD, while 9169 were). The model for predicting coronary artery disease was built with multilayer perceptron neural networks and variable selection based on classification and regression trees (CART), using Statistica software. ROC curve analysis was used to compare the models and select the best one. Results: After seven rounds of modeling and comparing the generated models, the final model, which includes all existing risk factors, achieved an area under the ROC curve of 0.754, accuracy of 74.19%, sensitivity of 92.41% and specificity of 33.25%. Variable selection also produced a model consisting of four risk factors, with an area under the ROC curve of 0.737, accuracy of 74.19%, sensitivity of 93.34% and specificity of 31.17%. Conclusion: The obtained model, produced with neural networks, is able to identify high-risk patients as well as an acceptable number of healthy subjects. Moreover, applying feature selection in this study yields a model consisting of only four risk factors: age, sex, diabetes and high blood pressure.
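    As a rough illustration of the pipeline described above (CART-driven feature selection feeding a multilayer perceptron, compared by ROC analysis), the sketch below uses scikit-learn and synthetic data. The paper itself used Statistica and nine clinical risk factors, so every feature, threshold and number here is a placeholder.

```python
# Illustrative sketch: rank features with a CART tree, keep the top four,
# train an MLP on the full and reduced feature sets, and compare by ROC AUC,
# sensitivity and specificity. Synthetic data stands in for the clinical data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score, confusion_matrix

X, y = make_classification(n_samples=2000, n_features=9, n_informative=4,
                           random_state=0)            # stand-in for 9 risk factors
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1: rank features with a classification tree (CART) and keep the top 4,
# mirroring the reduced four-factor model reported in the abstract.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)
top4 = np.argsort(tree.feature_importances_)[::-1][:4]

def evaluate(cols):
    mlp = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=0)
    mlp.fit(X_tr[:, cols], y_tr)
    prob = mlp.predict_proba(X_te[:, cols])[:, 1]
    pred = (prob >= 0.5).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
    return roc_auc_score(y_te, prob), tp / (tp + fn), tn / (tn + fp)

for name, cols in [("all 9 features", np.arange(9)), ("CART top 4", top4)]:
    auc, sens, spec = evaluate(cols)
    print(f"{name}: AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```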

    Bayesian Deep Net GLM and GLMM

    Deep feedforward neural networks (DFNNs) are a powerful tool for functional approximation. We describe flexible versions of generalized linear and generalized linear mixed models incorporating basis functions formed by a DFNN. The consideration of neural networks with random effects is not widely used in the literature, perhaps because of the computational challenges of incorporating subject-specific parameters into already complex models. Efficient computational methods for high-dimensional Bayesian inference are developed using Gaussian variational approximation, with a parsimonious but flexible factor parametrization of the covariance matrix. We implement natural gradient methods for the optimization, exploiting the factor structure of the variational covariance matrix in computation of the natural gradient. Our flexible DFNN models and Bayesian inference approach lead to a regression and classification method that has high prediction accuracy and is able to quantify the prediction uncertainty in a principled and convenient way. We also describe how to perform variable selection in our deep learning method. The proposed methods are illustrated in a wide range of simulated and real-data examples, and the results compare favourably to a state-of-the-art flexible regression and classification method in the statistical literature, the Bayesian additive regression trees (BART) method. User-friendly software packages in Matlab, R and Python implementing the proposed methods are available at https://github.com/VBayesLab. Comment: 35 pages, 7 figures, 10 tables.
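    A heavily simplified sketch of the core idea (a deep feedforward net supplying basis functions for a generalized linear model, with a Gaussian variational approximation over the output-layer weights) is given below. It assumes a diagonal rather than factor covariance, ordinary rather than natural gradients, PyTorch rather than the authors' Matlab/R/Python packages, and synthetic data, so it illustrates the idea rather than reproducing the paper's method.

```python
# Simplified Bayesian "deep GLM" sketch: a DFNN produces basis functions, a
# Bernoulli/logit GLM sits on top, and the GLM weights get a diagonal Gaussian
# variational approximation trained by maximizing the ELBO with the
# reparameterization trick. All details are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic binary-response data (placeholder for the paper's examples).
n, p = 500, 10
X = torch.randn(n, p)
y = (torch.sin(X[:, 0]) + X[:, 1] > 0).float()

feature_net = nn.Sequential(nn.Linear(p, 16), nn.Tanh(), nn.Linear(16, 8), nn.Tanh())

# Variational parameters for the GLM weights beta (8 basis functions + intercept).
mu = torch.zeros(9, requires_grad=True)
log_sd = torch.full((9,), -1.0, requires_grad=True)
opt = torch.optim.Adam(list(feature_net.parameters()) + [mu, log_sd], lr=0.01)

for step in range(2000):
    opt.zero_grad()
    phi = torch.cat([feature_net(X), torch.ones(n, 1)], dim=1)   # basis functions
    eps = torch.randn(9)
    beta = mu + torch.exp(log_sd) * eps                          # reparameterization trick
    logits = phi @ beta
    log_lik = -nn.functional.binary_cross_entropy_with_logits(logits, y, reduction="sum")
    # KL(q || N(0, I)) for a diagonal Gaussian q; a standard-normal prior is assumed.
    kl = 0.5 * torch.sum(torch.exp(2 * log_sd) + mu**2 - 1 - 2 * log_sd)
    loss = -(log_lik - kl)                                       # negative ELBO
    loss.backward()
    opt.step()

print("final negative ELBO:", loss.item())
```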

    Performance prediction of the full-scale bardenpho process using a genetic adapted time-delay neural network (GATDNN)

    Wastewater treatment systems are characterized by large temporal variability of inflow, variable concentrations of components in the incoming wastewater to the plant, and highly variable biological reactions within the process. The behavior of observed process variables within a wastewater treatment plant (WWTP) at a certain time instant is the combined effect of various processes initiated at different moments in the past. This is called a time-delay effect in the system. Owing to their strong nonlinear mapping ability, neural networks provide advantages as a modeling and identification tool over a structure-based model. However, determining the architecture of artificial neural networks (ANNs) and selecting key input variables with a time delay is not easy. In our research, a genetic adapted time-delay neural network (GATDNN), which is a combination of a time-delay neural network (TDNN) and genetic algorithms (GAs), was developed and applied to the full-scale Bardenpho advanced sewage treatment process. In a GATDNN, a three-step modelling procedure was performed: (1) selection of significant input variables to maximise the predictive accuracy for each specific output; (2) finding a suitable network topology for the ANN-based process estimator; (3) sensitivity analysis. The results demonstrate that the modelling technique presented using a GATDNN provides a valuable tool for predicting the outputs with high levels of accuracy and identifying key operating variables. This work will permit the development of a reliable control strategy, thus reducing the burden on the process engineer.
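    The GA-plus-TDNN combination can be illustrated with a toy genetic search over which time-delayed inputs to feed a small network and how many hidden units it should have, with validation error as the fitness. The chromosome encoding, GA settings and synthetic plant series below are assumptions for illustration, not the authors' GATDNN implementation.

```python
# Toy GA over (lag mask, hidden-layer size) for a time-delayed neural predictor.
# Fitness is the negative validation MSE of the fitted network.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)

# Synthetic inflow/effluent-style series (placeholder for Bardenpho plant data).
t = np.arange(2000)
inflow = 50 + 10 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 1, t.size)
effluent = 5 + 0.05 * np.roll(inflow, 12) + rng.normal(0, 0.2, t.size)

lags = [1, 2, 4, 8, 12, 24]          # candidate time delays on the inflow signal
hidden_options = [8, 16]             # candidate hidden-layer sizes

def fitness(chrom):
    mask, h_idx = chrom[:len(lags)], chrom[-1]
    cols = [l for l, m in zip(lags, mask) if m] or [lags[0]]
    max_lag = max(cols)
    X = np.column_stack([inflow[max_lag - l:-l] for l in cols])
    y = effluent[max_lag:]
    split = int(0.8 * len(y))
    net = MLPRegressor(hidden_layer_sizes=(hidden_options[h_idx],),
                       max_iter=400, random_state=0)
    net.fit(X[:split], y[:split])
    return -mean_squared_error(y[split:], net.predict(X[split:]))  # higher is better

# Tiny GA: keep the best parents, uniform crossover, bit-flip mutation.
pop = rng.integers(0, 2, size=(10, len(lags) + 1))
for gen in range(5):
    scores = np.array([fitness(c) for c in pop])
    parents = pop[np.argsort(scores)[-4:]]                       # keep the best 4
    children = []
    while len(children) < len(pop):
        a, b = parents[rng.integers(0, 4, 2)]
        child = np.where(rng.random(a.size) < 0.5, a, b)         # uniform crossover
        flip = rng.random(child.size) < 0.1                      # mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.array(children)

best = max(pop, key=fitness)
print("best chromosome (lag mask + hidden-size index):", best)
```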