280 research outputs found
A novel ensemble method for the accurate prediction of the major oil prices in Tanzania
Global development relies heavily on oil to power machinery across many sectors, making it vital to world economic growth. The analysis of oil prices is therefore crucial for a country's long-term stability. However, global monopoly producers, wars, and pandemics have contributed to the volatility of crude oil prices, so an optimal prediction model for oil prices becomes essential. This study examined the performance of several ensemble strategies built on single traditional and machine learning models. We found that the weighted ensemble technique outperformed the other ensemble and single models in predicting petrol and diesel prices in Tanzania across four performance metrics. Furthermore, because spikes in global oil prices hit non-oil-producing nations hardest, global economic and political stability is needed for such nations to avoid suffering the consequences. Finally, other ensemble approaches could be applied and compared for predicting oil prices.
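The abstract does not specify how the ensemble weights were chosen, but a common weighted-ensemble scheme assigns each base model a weight inversely proportional to its validation error. A minimal sketch of that idea, with entirely illustrative RMSE values and forecasts (none of these numbers come from the paper):

```python
import numpy as np

# Hypothetical validation RMSEs for three base models
# (e.g. a statistical model, an SVR, and a random forest).
val_rmse = np.array([0.42, 0.35, 0.50])

# Weight each model inversely to its validation error, normalised to sum to 1.
weights = (1.0 / val_rmse) / np.sum(1.0 / val_rmse)

# Illustrative base-model forecasts for the next period's fuel price.
forecasts = np.array([2950.0, 3010.0, 2895.0])

# Weighted-ensemble prediction: a convex combination of the base forecasts.
ensemble_pred = float(np.dot(weights, forecasts))
print(round(ensemble_pred, 2))
```

Because the weights are non-negative and sum to one, the ensemble prediction always lies between the lowest and highest base forecast, which is one reason such combinations tend to be more stable than any single model.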
Air Quality Prediction in Smart Cities Using Machine Learning Technologies Based on Sensor Data: A Review
The influence of machine learning technologies is rapidly increasing and penetrating almost every field, and air pollution prediction is no exception. This paper reviews studies on air pollution prediction using machine learning algorithms based on sensor data in the context of smart cities. Querying the most popular databases and applying the corresponding filters, the most relevant papers were selected. After thoroughly reviewing those papers, their main features were extracted, which served as a basis for linking and comparing them. As a result, we can conclude that: (1) instead of simple machine learning techniques, authors now apply advanced and sophisticated ones; (2) China was the leading country in terms of case studies; (3) particulate matter with a diameter of 2.5 micrometres was the main prediction target; (4) in 41% of the publications the authors carried out the prediction for the next day; (5) 66% of the studies used data at an hourly rate; (6) 49% of the papers used open data, with a tendency to increase since 2016; and (7) for efficient air quality prediction it is important to consider external factors such as weather conditions, spatial characteristics, and temporal features.
Claim Models: Granular Forms and Machine Learning Forms
This collection of articles addresses the most modern forms of loss reserving methodology: granular models and machine learning models. New methodologies come with questions about their applicability, and these questions are discussed in one article, which focuses on the relative merits of granular and machine learning models. Others illustrate applications with real-world data. The examples include neural networks, which, though well known in some disciplines, have previously received only limited attention in the actuarial literature. This volume expands on that literature, with specific attention to their application to loss reserving. For example, one of the articles introduces the application of neural networks of the gated recurrent unit form to the actuarial literature, whereas another uses a penalised neural network. Neural networks are not the only form of machine learning, and two other papers outline applications of gradient boosting and regression trees respectively. Both articles construct loss reserves at the individual claim level, so that these models resemble granular models. One of these articles provides a practical application of the model to claim watching: the action of monitoring claim development and anticipating major features. Such watching can be used as an early warning system or for other administrative purposes. Overall, this volume is an extremely useful addition to the libraries of those working at the loss reserving frontier.
Development of explainable AI-based predictive models for bubbling fluidised bed gasification process
© 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). In this study, seven different types of regression-based predictive modelling techniques are used to predict the product gas composition (H2, CO, CO2, CH4) and gas yield (GY) during the gasification of biomass in a fluidised bed reactor. The performance of the different regression-based models is compared with the gradient boosting (GB) model to show the relative merits and demerits of the technique. Additionally, the SHapley Additive exPlanations (SHAP)-based explainable artificial intelligence (XAI) method was utilised to explain individual predictions. This study demonstrates that the prediction performance of the GB algorithm was the best among the regression-based models, i.e. Linear Regression (LR), Multilayer Perceptron (MLP), Ridge Regression (RR), Least-Angle Regression (LARS), Random Forest (RF) and Bagging (BAG). It was found that a learning rate (lr) of 0.01 and 1000 boosting stages (est) yielded the best result, with an average root mean squared error (RMSE) of 0.0597 across all outputs. The outcome of this study indicates that the XAI-based methodology can be used as a viable alternative modelling paradigm for predicting the performance of a fluidised bed gasifier to support informed decision-making.
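The reported hyperparameters (learning rate 0.01, 1000 boosting stages) can be reproduced in spirit with scikit-learn's gradient boosting regressor. The sketch below uses purely synthetic stand-in data, not the paper's gasification dataset, and illustrates only the GB fitting and RMSE evaluation step; the SHAP explanation step is omitted:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for gasifier operating conditions
# (e.g. temperature, equivalence ratio, moisture content), scaled to [0, 1].
X = rng.uniform(size=(500, 3))
# Illustrative nonlinear response standing in for one output, e.g. H2 fraction.
y = 0.5 * X[:, 0] + 0.3 * np.sin(3 * X[:, 1]) + 0.1 * X[:, 2] \
    + rng.normal(0.0, 0.01, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Hyperparameters mirroring those reported in the abstract:
# learning rate 0.01 and 1000 boosting stages.
gb = GradientBoostingRegressor(learning_rate=0.01, n_estimators=1000,
                               random_state=0)
gb.fit(X_tr, y_tr)

rmse = mean_squared_error(y_te, gb.predict(X_te)) ** 0.5
print(round(rmse, 4))
```

On a real dataset, a SHAP `TreeExplainer` would then be applied to the fitted `gb` model to attribute each prediction to its input features, which is the XAI step the study describes.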
Conditional Transformation Models
The ultimate goal of regression analysis is to obtain information about the
conditional distribution of a response given a set of explanatory variables.
This goal is, however, seldom achieved because most established regression
models only estimate the conditional mean as a function of the explanatory
variables and assume that higher moments are not affected by the regressors.
The underlying reason for such a restriction is the assumption of additivity of
signal and noise. We propose to relax this common assumption in the framework
of transformation models. The novel class of semiparametric regression models
proposed herein allows transformation functions to depend on explanatory
variables. These transformation functions are estimated by regularised
optimisation of scoring rules for probabilistic forecasts, e.g. the continuous
ranked probability score. The corresponding estimated conditional distribution
functions are consistent. Conditional transformation models are potentially
useful for describing possible heteroscedasticity, comparing spatially varying
distributions, identifying extreme events, deriving prediction intervals and
selecting variables beyond mean regression effects. An empirical investigation
based on a heteroscedastic varying coefficient simulation model demonstrates
that semiparametric estimation of conditional distribution functions can be
more beneficial than kernel-based non-parametric approaches or parametric
generalised additive models for location, scale and shape.
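The continuous ranked probability score mentioned above is a proper scoring rule that rewards predictive distributions which are both calibrated and sharp. As a small worked illustration (independent of the paper's estimation procedure), the closed-form CRPS for a Gaussian predictive distribution shows that a sharper forecast centred near the truth scores better than a vague one:

```python
import numpy as np
from scipy.stats import norm

def crps_normal(mu, sigma, y):
    """Closed-form CRPS for a Gaussian predictive distribution N(mu, sigma^2)
    evaluated at the observation y. Lower is better."""
    z = (y - mu) / sigma
    return sigma * (z * (2 * norm.cdf(z) - 1)
                    + 2 * norm.pdf(z)
                    - 1 / np.sqrt(np.pi))

# Both forecasts are centred at 0; the observation is 0.2.
sharp = crps_normal(0.0, 1.0, 0.2)   # sharp forecast (small sigma)
vague = crps_normal(0.0, 3.0, 0.2)   # vague forecast (large sigma)
print(round(sharp, 4), round(vague, 4))
```

Minimising an empirical average of such scores over a regularised function class is the general strategy the abstract describes for estimating the transformation functions.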
Dynamic Feature Engineering and model selection methods for temporal tabular datasets with regime changes
The application of deep learning algorithms to temporal panel datasets is
difficult due to heavy non-stationarities which can lead to over-fitted models
that under-perform under regime changes. In this work we propose a new machine
learning pipeline for ranking predictions on temporal panel datasets which is
robust under regime changes of data. Different machine-learning models,
including Gradient Boosting Decision Trees (GBDTs) and Neural Networks with and
without simple feature engineering are evaluated in the pipeline with different
settings. We find that GBDT models with dropout display high performance,
robustness and generalisability with relatively low complexity and reduced
computational cost. We then show that online learning techniques can be used in
post-prediction processing to enhance the results. In particular, dynamic
feature neutralisation, an efficient procedure that requires no retraining of
models and can be applied post-prediction to any machine learning model,
improves robustness by reducing drawdown in regime changes. Furthermore, we
demonstrate that the creation of model ensembles through dynamic model
selection based on recent model performance leads to improved performance over
baseline by improving the Sharpe and Calmar ratios of out-of-sample prediction
performances. We also evaluate the robustness of our pipeline across different
data splits and random seeds with good reproducibility of results.
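The feature-neutralisation step described above can be sketched as a post-prediction linear projection: the component of the predictions explained by the feature exposures is removed, with no retraining of the underlying model. This is a minimal illustration on random data, not the authors' exact procedure; the `proportion` parameter, which allows partial neutralisation, is an assumption of this sketch:

```python
import numpy as np

def neutralise(preds, features, proportion=1.0):
    """Remove the linear exposure of predictions to a feature matrix.
    A post-prediction step: the fitted model itself is untouched."""
    # Least-squares coefficients of preds regressed on the feature columns.
    beta, *_ = np.linalg.lstsq(features, preds, rcond=None)
    exposure = features @ beta
    return preds - proportion * exposure

rng = np.random.default_rng(1)
F = rng.normal(size=(200, 5))   # illustrative feature exposures, 200 rows
# Raw predictions with a deliberate linear dependence on the features.
p = F @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

neutral = neutralise(p, F)
# The residual is orthogonal to every feature column, so the neutralised
# predictions carry (numerically) zero linear feature exposure.
print(bool(np.abs(F.T @ neutral).max() < 1e-6))
```

Because the operation is just a projection of an existing prediction vector, it can be applied to the output of any model in the pipeline, which is what makes it usable for dynamic, regime-dependent adjustment.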