280 research outputs found

    A novel ensemble method for the accurate prediction of the major oil prices in Tanzania

    Get PDF
    Global development relies heavily on oil to power machinery and equipment, making it central to world economic growth. Analysing oil prices is therefore crucial for Tanzania's long-term economic stability. However, global monopoly producers, wars, and pandemics have made crude oil prices volatile, so an accurate prediction model for oil prices becomes essential. This study examined the performance of several ensemble strategies built on single traditional and machine learning models. Based on four performance metrics, the weighted ensemble technique outperformed the other ensemble and single models in predicting petrol and diesel prices in Tanzania. Furthermore, spikes in global oil prices require global economic and political stability if non-oil-producing nations are to avoid suffering the consequences. Finally, other ensemble approaches could be applied and compared for oil price prediction.
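    As a hedged illustration of the weighted-ensemble idea described above (not the authors' exact pipeline), the sketch below combines several base regressors with weights proportional to the inverse of their validation RMSE; the synthetic price series, lag features, model choices, and weighting rule are assumptions for demonstration only.

```python
# Minimal sketch of an inverse-RMSE weighted ensemble for fuel-price forecasting.
# The data, base models, and weighting rule are illustrative assumptions,
# not the study's actual configuration.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
t = np.arange(300)
prices = 2000 + 2.0 * t + 50 * np.sin(t / 12) + rng.normal(0, 20, t.size)  # synthetic monthly prices

# Lagged prices as features (simple autoregressive framing).
lags = 6
X = np.column_stack([prices[i:i - lags] for i in range(lags)])
y = prices[lags:]
n_train, n_val = 200, 50
X_tr, y_tr = X[:n_train], y[:n_train]
X_val, y_val = X[n_train:n_train + n_val], y[n_train:n_train + n_val]
X_te, y_te = X[n_train + n_val:], y[n_train + n_val:]

models = [LinearRegression(), RandomForestRegressor(random_state=0), KNeighborsRegressor()]
val_rmse, test_preds = [], []
for m in models:
    m.fit(X_tr, y_tr)
    val_rmse.append(mean_squared_error(y_val, m.predict(X_val)) ** 0.5)
    test_preds.append(m.predict(X_te))

# Weight each model by the inverse of its validation RMSE.
w = 1.0 / np.array(val_rmse)
w /= w.sum()
ensemble = np.average(np.vstack(test_preds), axis=0, weights=w)
print("weights:", np.round(w, 3),
      "ensemble RMSE:", round(mean_squared_error(y_te, ensemble) ** 0.5, 2))
```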

    Air Quality Prediction in Smart Cities Using Machine Learning Technologies Based on Sensor Data: A Review

    Get PDF
    The influence of machine learning technologies is rapidly increasing and penetrating almost every field, and air pollution prediction is no exception. This paper reviews studies on air pollution prediction using machine learning algorithms based on sensor data in the context of smart cities. The most relevant papers were selected by searching the most popular databases and applying the corresponding filters. After thoroughly reviewing those papers, their main features were extracted and used to link and compare them with each other. As a result, we can conclude that: (1) instead of simple machine learning techniques, authors now apply advanced and sophisticated techniques; (2) China was the leading country in terms of case studies; (3) particulate matter with a diameter of 2.5 micrometres (PM2.5) was the main prediction target; (4) in 41% of the publications the authors carried out the prediction for the next day; (5) 66% of the studies used data at an hourly resolution; (6) 49% of the papers used open data, with a tendency to increase since 2016; and (7) for efficient air quality prediction it is important to consider external factors such as weather conditions, spatial characteristics, and temporal features.
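    To make the review's common problem setting concrete, here is a hedged sketch of next-day PM2.5 prediction from hourly sensor and weather data; the column names, daily aggregation, and the choice of a gradient-boosted regressor are illustrative assumptions rather than any surveyed paper's method.

```python
# Illustrative next-day PM2.5 forecast from hourly sensor + weather data.
# Column names, resampling choices, and the model are assumptions for demonstration.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic hourly sensor data standing in for a smart-city feed.
idx = pd.date_range("2023-01-01", periods=24 * 120, freq="h")
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "pm25": 40 + 15 * np.sin(np.arange(idx.size) / 24) + rng.normal(0, 5, idx.size),
    "temperature": 10 + 8 * np.sin(np.arange(idx.size) / (24 * 30)) + rng.normal(0, 1, idx.size),
    "wind_speed": np.abs(rng.normal(3, 1, idx.size)),
    "humidity": np.clip(rng.normal(60, 10, idx.size), 0, 100),
}, index=idx)

# Aggregate hourly readings to daily means and predict the next day's PM2.5.
daily = df.resample("D").mean()
daily["pm25_next_day"] = daily["pm25"].shift(-1)
daily = daily.dropna()

features = ["pm25", "temperature", "wind_speed", "humidity"]
split = int(len(daily) * 0.8)
train, test = daily.iloc[:split], daily.iloc[split:]

model = GradientBoostingRegressor(random_state=0)
model.fit(train[features], train["pm25_next_day"])
pred = model.predict(test[features])
mae = np.mean(np.abs(pred - test["pm25_next_day"].values))
print(f"next-day PM2.5 MAE: {mae:.2f} µg/m³")
```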

    Claim Models: Granular Forms and Machine Learning Forms

    Get PDF
    This collection of articles addresses the most modern forms of loss reserving methodology: granular models and machine learning models. New methodologies come with questions about their applicability. These questions are discussed in one article, which focuses on the relative merits of granular and machine learning models. Others illustrate applications with real-world data. The examples include neural networks, which, though well known in some disciplines, have previously seen only limited use in the actuarial literature. This volume expands on that literature, with specific attention to their application to loss reserving. For example, one of the articles introduces neural networks of the gated recurrent unit form to the actuarial literature, whereas another uses a penalized neural network. Neural networks are not the only form of machine learning, and two other papers outline applications of gradient boosting and regression trees, respectively. Both articles construct loss reserves at the individual claim level, so that these models resemble granular models. One of these articles provides a practical application of the model to claim watching, that is, monitoring claim development and anticipating its major features. Such watching can be used as an early warning system or for other administrative purposes. Overall, this volume is an extremely useful addition to the libraries of those working at the loss reserving frontier.
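    As a rough illustration of individual-claim-level reserving with gradient boosting (one of the techniques mentioned above, not any of the volume's specific models), the sketch below predicts an ultimate cost per open claim and sums the excess over payments to date as a reserve; the features and synthetic data are assumptions.

```python
# Hedged sketch: gradient boosting for individual-claim-level loss reserving.
# Features, target definition, and data are illustrative assumptions only.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)
n = 2000
claims = pd.DataFrame({
    "report_delay_days": rng.integers(0, 365, n),
    "age_of_claim_months": rng.integers(1, 60, n),
    "paid_to_date": rng.gamma(2.0, 5000, n),
    "injury_severity": rng.integers(1, 5, n),
})
# Synthetic "ultimate" cost: paid to date plus severity-driven future development.
claims["ultimate"] = claims["paid_to_date"] * (1 + 0.1 * claims["injury_severity"]) \
    + rng.gamma(1.5, 2000, n)

features = ["report_delay_days", "age_of_claim_months", "paid_to_date", "injury_severity"]
closed = claims.iloc[:1500]   # pretend these claims are settled (training data)
open_ = claims.iloc[1500:]    # open claims needing a reserve

model = GradientBoostingRegressor(random_state=0)
model.fit(closed[features], closed["ultimate"])

pred_ultimate = model.predict(open_[features])
reserve = np.maximum(pred_ultimate - open_["paid_to_date"].values, 0).sum()
print(f"estimated outstanding reserve: {reserve:,.0f}")
```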

    Development of explainable AI-based predictive models for bubbling fluidised bed gasification process

    Get PDF
    In this study, seven different regression-based predictive modelling techniques are used to predict the product gas composition (H2, CO, CO2, CH4) and gas yield (GY) during the gasification of biomass in a fluidised bed reactor. The performance of the different regression-based models is compared with that of the gradient boosting (GB) model to show the relative merits and demerits of each technique. Additionally, the SHapley Additive exPlanations (SHAP) explainable artificial intelligence (XAI) method was used to explain individual predictions. This study demonstrates that the GB algorithm performed best among the regression-based models considered, i.e. Linear Regression (LR), Multilayer Perceptron (MLP), Ridge Regression (RR), Least-Angle Regression (LARS), Random Forest (RF) and Bagging (BAG). A learning rate (lr) of 0.01 and 1000 boosting stages (est) yielded the best result, with an average root mean squared error (RMSE) of 0.0597 across all outputs. The outcome of this study indicates that an XAI-based methodology can serve as a viable alternative modelling paradigm for predicting the performance of a fluidised bed gasifier and supporting informed decision-making.
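    A hedged sketch of the general approach described above, pairing a gradient boosting regressor (learning rate 0.01 and 1000 boosting stages, as reported) with SHAP explanations; the feature names and synthetic data are assumptions, not the paper's dataset.

```python
# Sketch: gradient boosting regression + SHAP explanations for one gasifier output (e.g. H2).
# Feature names and data are illustrative assumptions; only lr=0.01 and 1000 boosting
# stages follow the reported settings.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)
n = 500
X = pd.DataFrame({
    "temperature_C": rng.uniform(650, 900, n),
    "equivalence_ratio": rng.uniform(0.2, 0.4, n),
    "steam_biomass_ratio": rng.uniform(0.5, 1.5, n),
    "particle_size_mm": rng.uniform(0.5, 3.0, n),
})
# Synthetic H2 fraction, loosely increasing with temperature and steam/biomass ratio.
y = (0.02 * X["temperature_C"] + 5 * X["steam_biomass_ratio"]
     - 10 * X["equivalence_ratio"] + rng.normal(0, 1, n))

model = GradientBoostingRegressor(learning_rate=0.01, n_estimators=1000, random_state=0)
model.fit(X, y)

# SHAP values attribute each individual prediction to the input features.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:5])
print(pd.DataFrame(shap_values, columns=X.columns).round(3))
```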

    Conditional Transformation Models

    Full text link
    The ultimate goal of regression analysis is to obtain information about the conditional distribution of a response given a set of explanatory variables. This goal is, however, seldom achieved because most established regression models only estimate the conditional mean as a function of the explanatory variables and assume that higher moments are not affected by the regressors. The underlying reason for such a restriction is the assumption of additivity of signal and noise. We propose to relax this common assumption in the framework of transformation models. The novel class of semiparametric regression models proposed herein allows transformation functions to depend on explanatory variables. These transformation functions are estimated by regularised optimisation of scoring rules for probabilistic forecasts, e.g. the continuous ranked probability score. The corresponding estimated conditional distribution functions are consistent. Conditional transformation models are potentially useful for describing possible heteroscedasticity, comparing spatially varying distributions, identifying extreme events, deriving prediction intervals and selecting variables beyond mean regression effects. An empirical investigation based on a heteroscedastic varying coefficient simulation model demonstrates that semiparametric estimation of conditional distribution functions can be more beneficial than kernel-based non-parametric approaches or parametric generalised additive models for location, scale and shape.
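    For readers unfamiliar with the model class, here is a minimal sketch of a conditional transformation model and of the scoring rule mentioned above (the continuous ranked probability score), written in common notation; the symbols are a standard presentation, not the paper's exact formulation.

```latex
% Minimal sketch (common notation; not the paper's exact formulation).
% A conditional transformation model expresses the conditional distribution of the
% response Y given covariates x through a covariate-dependent transformation h:
\[
  \mathbb{P}(Y \le y \mid X = x) \;=\; F\bigl(h(y \mid x)\bigr),
\]
% where F is a fixed link distribution (e.g. the standard normal or logistic CDF) and,
% unlike in additive "signal plus noise" models, h may depend on x beyond a mean shift.
% The transformation is estimated by regularised minimisation of a proper scoring rule,
% e.g. the continuous ranked probability score of the estimated conditional
% distribution \hat F(\cdot \mid x) at the observed response y:
\[
  \mathrm{CRPS}\bigl(\hat F(\cdot \mid x),\, y\bigr)
    \;=\; \int_{-\infty}^{\infty} \bigl(\hat F(z \mid x) - \mathbb{1}\{y \le z\}\bigr)^{2}\, dz .
\]
```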

    Dynamic Feature Engineering and model selection methods for temporal tabular datasets with regime changes

    Full text link
    The application of deep learning algorithms to temporal panel datasets is difficult due to heavy non-stationarities, which can lead to over-fitted models that under-perform under regime changes. In this work, we propose a new machine learning pipeline for ranking predictions on temporal panel datasets that is robust under regime changes in the data. Different machine learning models, including Gradient Boosting Decision Trees (GBDTs) and Neural Networks, with and without simple feature engineering, are evaluated in the pipeline under different settings. We find that GBDT models with dropout display high performance, robustness and generalisability with relatively low complexity and reduced computational cost. We then show that online learning techniques can be used in post-prediction processing to enhance the results. In particular, dynamic feature neutralisation, an efficient procedure that requires no retraining of models and can be applied post-prediction to any machine learning model, improves robustness by reducing drawdown during regime changes. Furthermore, we demonstrate that creating model ensembles through dynamic model selection based on recent model performance improves performance over the baseline, raising the Sharpe and Calmar ratios of out-of-sample predictions. We also evaluate the robustness of our pipeline across different data splits and random seeds, with good reproducibility of results.
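    A hedged sketch of the feature-neutralisation idea referred to above, as it is commonly applied to temporal tabular prediction tasks: predictions are partially orthogonalised against the feature matrix by subtracting a fraction of their least-squares projection onto the features, era by era. The proportion parameter, per-era application, and synthetic panel are assumptions about the general technique, not the authors' exact procedure.

```python
# Sketch of feature neutralisation applied post-prediction, era by era.
# The 'proportion' value and the synthetic data are illustrative assumptions.
import numpy as np
import pandas as pd

def neutralize(predictions: np.ndarray, features: np.ndarray, proportion: float = 0.5) -> np.ndarray:
    """Subtract a fraction of the predictions' linear projection onto the features."""
    exposure = features @ (np.linalg.pinv(features) @ predictions)  # least-squares projection
    neutral = predictions - proportion * exposure
    return neutral / (neutral.std() + 1e-12)  # rescale for comparability across eras

# Synthetic temporal panel: several "eras", each with its own feature/target regime.
rng = np.random.default_rng(4)
rows = []
for era in range(5):
    X = rng.normal(size=(200, 10))
    beta = rng.normal(size=10)                           # regime-specific relationship
    y = X @ beta + rng.normal(scale=0.5, size=200)
    preds = X @ beta + rng.normal(scale=0.2, size=200)   # stand-in for a model's predictions
    df = pd.DataFrame(X, columns=[f"f{i}" for i in range(10)])
    df["era"], df["target"], df["prediction"] = era, y, preds
    rows.append(df)
panel = pd.concat(rows, ignore_index=True)

# Apply neutralisation within each era so feature exposures are removed per regime.
feature_cols = [f"f{i}" for i in range(10)]
panel["prediction_neutral"] = panel.groupby("era", group_keys=False).apply(
    lambda g: pd.Series(neutralize(g["prediction"].to_numpy(), g[feature_cols].to_numpy()),
                        index=g.index)
)
print(panel[["prediction", "prediction_neutral"]].describe().round(3))
```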