Input variable selection for forecasting models
2002 IFAC 15th Triennial World Congress, Barcelona, Spain. The selection of input variables plays a crucial role when modelling time series. For nonlinear models there are no well-developed criteria comparable to AIC and the other measures available for linear models. In Short Term Load Forecasting (STLF), generalization is strongly influenced by this selection. In this paper two approaches are compared using real data from a Spanish utility company. The models used are neural networks, although the algorithms can be applied to other nonlinear models. The experiments show that input variable selection affects the performance of forecasting models and should therefore be treated as a generalization problem
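The wrapper idea behind this kind of input selection — score candidate inputs by held-out forecasting error rather than by an in-sample criterion — can be sketched as a greedy forward search. This is a minimal illustration, not the paper's algorithm: it uses a linear least-squares forecaster in place of a neural network, synthetic data in which the target depends only on lags 1 and 3, and function names of my own choosing.

```python
import random

MAX_LAG = 5  # candidate input lags are 1..MAX_LAG

def make_series(n, seed=0):
    """Synthetic data: the target depends only on lags 1 and 3 of the input."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    y = [x[t - 1] + 0.5 * x[t - 3] for t in range(MAX_LAG, n)]
    return x, y

def design(x, lags):
    """Rows of lagged inputs, aligned with the targets."""
    return [[x[t - k] for k in lags] for t in range(MAX_LAG, len(x))]

def solve(a, b):
    """Gauss-Jordan elimination for the small normal-equation system."""
    m = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(m):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]

def fit_predict_mse(Xtr, ytr, Xva, yva):
    """Ordinary least squares fit; mean squared error on validation rows."""
    p = len(Xtr[0])
    xtx = [[sum(r[i] * r[j] for r in Xtr) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yt for r, yt in zip(Xtr, ytr)) for i in range(p)]
    beta = solve(xtx, xty)
    err = [yv - sum(b * v for b, v in zip(beta, row))
           for row, yv in zip(Xva, yva)]
    return sum(e * e for e in err) / len(err)

def forward_select(x, y, candidates, tol=1e-3):
    """Greedily add the lag that most reduces validation MSE; stop when
    no remaining candidate improves it by at least tol (generalization,
    not training fit, decides what enters the model)."""
    split = len(y) // 2
    selected, best = [], float("inf")
    while True:
        trial = None
        for k in candidates:
            if k in selected:
                continue
            X = design(x, selected + [k])
            mse = fit_predict_mse(X[:split], y[:split], X[split:], y[split:])
            if mse < best - tol:
                best, trial = mse, k
        if trial is None:
            return selected, best
        selected.append(trial)

x, y = make_series(400)
selected, mse = forward_select(x, y, candidates=[1, 2, 3, 4, 5])
```

On this synthetic series the search recovers exactly the two informative lags, and the irrelevant ones are rejected because they fail to improve validation error.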
Prediction model for coronary artery disease using neural networks and feature selection based on classification and regression tree
Background and aims: The risk of invasive diagnostic procedures for coronary artery disease (CAD), such as angiography, is considerable. On the other hand, medical data mining approaches have been applied successfully. This study was therefore carried out to produce a neural-network data mining model that can predict coronary artery disease. Methods: In this descriptive-analytical study, the data set comprised nine risk factors for 13,228 participants who underwent angiography at Tehran Heart Center (4,059 participants did not suffer from CAD and 9,169 did). The predictive model for coronary artery disease was built with multilayer perceptron neural networks, with variable selection based on classification and regression trees (CART), using Statistica software. ROC curve analysis was used to compare the models and select the best one. Results: After seven rounds of modelling and comparison of the generated models, the final model containing all the risk factors achieved an area under the ROC curve of 0.754, accuracy of 74.19%, sensitivity of 92.41% and specificity of 33.25%. Variable selection produced a model with four risk factors, with an area under the ROC curve of 0.737, accuracy of 74.19%, sensitivity of 93.34% and specificity of 31.17%. Conclusion: The resulting neural-network model is able to identify both high-risk patients and an acceptable number of healthy subjects. Feature selection also yielded a model consisting of only four risk factors: age, sex, diabetes and high blood pressure
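The evaluation criteria used to compare the models — area under the ROC curve, accuracy, sensitivity and specificity — can be computed directly from labels and classifier scores. A minimal sketch, using the Mann-Whitney formulation of AUC and a fixed decision threshold; the example labels and scores are illustrative, not data from the study:

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive scores
    above a randomly chosen negative; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity (recall on diseased cases) and specificity
    (recall on healthy cases) at a fixed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    accuracy = (tp + tn) / len(labels)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Illustrative labels and model scores (1 = CAD, 0 = no CAD).
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2]
auc = roc_auc(labels, scores)
acc, sens, spec = confusion_metrics(labels, scores)
```

The gap the abstract reports between high sensitivity and low specificity is visible in exactly these quantities: at a fixed threshold a model can recall most diseased cases while misclassifying many healthy ones.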
Bayesian Deep Net GLM and GLMM
Deep feedforward neural networks (DFNNs) are a powerful tool for functional
approximation. We describe flexible versions of generalized linear and
generalized linear mixed models incorporating basis functions formed by a DFNN.
The consideration of neural networks with random effects is not widely used in
the literature, perhaps because of the computational challenges of
incorporating subject specific parameters into already complex models.
Efficient computational methods for high-dimensional Bayesian inference are
developed using Gaussian variational approximation, with a parsimonious but
flexible factor parametrization of the covariance matrix. We implement natural
gradient methods for the optimization, exploiting the factor structure of the
variational covariance matrix in computation of the natural gradient. Our
flexible DFNN models and Bayesian inference approach lead to a regression and
classification method that has a high prediction accuracy, and is able to
quantify the prediction uncertainty in a principled and convenient way. We also
describe how to perform variable selection in our deep learning method. The
proposed methods are illustrated in a wide range of simulated and real-data
examples, and the results compare favourably to a state of the art flexible
regression and classification method in the statistical literature, the
Bayesian additive regression trees (BART) method. User-friendly software
packages in Matlab, R and Python implementing the proposed methods are
available at https://github.com/VBayesLab. Comment: 35 pages, 7 figures, 10 tables
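The parsimonious factor parametrization mentioned in the abstract takes the variational covariance to be Sigma = B B^T + D^2, with B a tall p x k loading matrix and D diagonal, so samples cost O(pk) rather than O(p^2). A minimal sketch of that structure and the corresponding reparameterized draw — function names and the tiny example are my own, not from the paper's software:

```python
import random

def factor_covariance(B, d):
    """Sigma = B B^T + diag(d^2): the full p x p covariance implied by a
    p x k loading matrix B and per-coordinate scales d."""
    p, k = len(B), len(B[0])
    return [[sum(B[i][f] * B[j][f] for f in range(k))
             + (d[i] ** 2 if i == j else 0.0)
             for j in range(p)] for i in range(p)]

def sample(mu, B, d, rng):
    """Reparameterized draw theta = mu + B eps + d * zeta with
    eps ~ N(0, I_k), zeta ~ N(0, I_p); O(p k) work, and the full
    covariance matrix is never formed."""
    p, k = len(B), len(B[0])
    eps = [rng.gauss(0.0, 1.0) for _ in range(k)]
    zeta = [rng.gauss(0.0, 1.0) for _ in range(p)]
    return [mu[i] + sum(B[i][f] * eps[f] for f in range(k)) + d[i] * zeta[i]
            for i in range(p)]

B = [[1.0], [2.0], [0.0]]   # p = 3 parameters, k = 1 factor
d = [0.5, 0.5, 0.5]
Sigma = factor_covariance(B, d)
theta = sample([0.0, 0.0, 0.0], B, d, random.Random(0))
```

Because draws are deterministic functions of (mu, B, d) given the noise, gradients of an expected objective can pass through `sample`, which is what makes this parametrization convenient for the stochastic-gradient variational inference the paper develops.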
Performance prediction of the full-scale Bardenpho process using a genetic adapted time-delay neural network (GATDNN)
Wastewater treatment systems are characterized by large temporal variability of inflow, variable concentrations of components in the incoming wastewater, and highly variable biological reactions within the process. The behavior of observed process variables within a wastewater treatment plant (WWTP) at a certain time instant is the combined effect of various processes initiated at different moments in the past. This is called a time-delay effect in the system. Owing to their strong nonlinear mapping capability, neural networks provide advantages as a modeling and identification tool over structure-based models. However, determining the architecture of artificial neural networks (ANNs) and selecting key input variables with a time delay is not easy. In our research, a genetic adapted time-delay neural network (GATDNN), a combination of a time-delay neural network (TDNN) and genetic algorithms (GAs), was developed and applied to the full-scale Bardenpho advanced sewage treatment process. In a GATDNN, a three-step modelling procedure was performed: (1) selection of significant input variables to maximise the predictive accuracy for each specific output; (2) finding a suitable network topology for the ANN-based process estimator; (3) sensitivity analysis. The results demonstrate that the modelling technique presented using a GATDNN provides a valuable tool for predicting the outputs with high levels of accuracy and identifying key operating variables. This work will permit the development of a reliable control strategy, thus reducing the burden on the process engineer
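Step (1) of the procedure — letting a genetic algorithm choose which time-delayed inputs the network sees — can be sketched with a binary chromosome over candidate delays and a validation-error fitness. This is an illustrative toy, not the paper's GATDNN: a per-input projection predictor stands in for retraining a TDNN at each evaluation, the synthetic "plant" responds only to delays 1 and 3, and a small per-input penalty supplies parsimony pressure; all names are my own.

```python
import random

LAGS = [1, 2, 3, 4, 5]          # candidate time-delay inputs
PENALTY = 0.002                 # parsimony pressure per selected input

def make_data(n, seed=1):
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # the "plant": output driven only by delays 1 and 3 of the input
    y = [x[t - 1] + 0.5 * x[t - 3] for t in range(max(LAGS), n)]
    rows = [[x[t - k] for k in LAGS] for t in range(max(LAGS), n)]
    return rows, y

def fitness(bits, rows, y):
    """Validation MSE of a cheap surrogate predictor plus an input-count
    penalty (lower is better). Independent lagged inputs make the
    per-input projection coefficients a fair stand-in here."""
    split = len(y) // 2
    tr_rows, va_rows, tr_y, va_y = rows[:split], rows[split:], y[:split], y[split:]
    beta = []
    for i, b in enumerate(bits):
        if not b:
            beta.append(0.0)
            continue
        num = sum(r[i] * t for r, t in zip(tr_rows, tr_y))
        den = sum(r[i] * r[i] for r in tr_rows)
        beta.append(num / den)
    err = [t - sum(bi * v for bi, v in zip(beta, r))
           for r, t in zip(va_rows, va_y)]
    return sum(e * e for e in err) / len(err) + PENALTY * sum(bits)

def evolve(rows, y, pop_size=12, gens=15, seed=2):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in LAGS] for _ in range(pop_size - 1)]
    pop.append([1] * len(LAGS))              # seed the full input set
    for _ in range(gens):
        scored = sorted(pop, key=lambda c: fitness(c, rows, y))
        nxt = [scored[0][:]]                 # elitism: keep the best as-is
        while len(nxt) < pop_size:
            pa, pb = rng.sample(scored[:6], 2)   # truncation selection
            cut = rng.randrange(1, len(LAGS))
            child = pa[:cut] + pb[cut:]          # one-point crossover
            for i in range(len(child)):          # bit-flip mutation
                if rng.random() < 0.1:
                    child[i] ^= 1
            nxt.append(child)
        pop = nxt
    return min(pop, key=lambda c: fitness(c, rows, y))

rows, y = make_data(400)
best = evolve(rows, y)
chosen = [k for k, b in zip(LAGS, best) if b]
```

With elitism and the penalty, any surviving chromosome that drops delay 1 or delay 3 pays a large validation-error cost, so the search settles on input sets containing both informative delays.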
- …