280 research outputs found
A novel ensemble method for the accurate prediction of the major oil prices in Tanzania
Global development relies heavily on oil to power machinery across many sectors, making it vital to world economic growth. The analysis of oil prices is therefore crucial for a country's long-term stability. However, global monopoly producers, wars, and pandemics have contributed to the volatility of crude oil prices, so an optimal prediction model for oil prices becomes essential. This study examined the performance of several ensemble strategies built on single traditional and machine learning models. We found that the weighted ensemble technique outperformed the other ensemble and single models in predicting petrol and diesel prices in Tanzania across four performance metrics. Furthermore, because spikes in global oil prices hit non-oil-producing nations hardest, global economic and political stability is needed for such nations to avoid suffering the consequences. Finally, other ensemble approaches could be applied and compared for predicting oil prices.
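The abstract does not specify how the ensemble weights were chosen, but a common weighted-ensemble scheme assigns each base model a weight inversely proportional to its validation error. A minimal sketch of that idea, with entirely illustrative RMSE values and forecasts (none of these numbers come from the paper):

```python
import numpy as np

# Hypothetical validation RMSEs for three base models
# (e.g. a statistical model, an SVR, and a random forest).
val_rmse = np.array([0.42, 0.35, 0.50])

# Weight each model inversely to its validation error, normalised to sum to 1.
weights = (1.0 / val_rmse) / np.sum(1.0 / val_rmse)

# Illustrative base-model forecasts for the next period's fuel price.
forecasts = np.array([2950.0, 3010.0, 2895.0])

# Weighted-ensemble prediction: a convex combination of the base forecasts.
ensemble_pred = float(np.dot(weights, forecasts))
print(round(ensemble_pred, 2))
```

Because the weights are non-negative and sum to one, the ensemble prediction always lies between the lowest and highest base forecast, which is one reason such combinations tend to be more stable than any single model.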
Air Quality Prediction in Smart Cities Using Machine Learning Technologies Based on Sensor Data: A Review
The influence of machine learning technologies is rapidly increasing and penetrating almost every field, and air pollution prediction is no exception. This paper reviews studies on air pollution prediction using machine learning algorithms based on sensor data in the context of smart cities. Querying the most popular databases and applying the corresponding filters, the most relevant papers were selected. After thoroughly reviewing those papers, their main features were extracted, which served as a basis for linking and comparing them. As a result, we can conclude that: (1) instead of simple machine learning techniques, authors now apply advanced and sophisticated ones; (2) China was the leading country in terms of case studies; (3) particulate matter with a diameter of 2.5 micrometres was the main prediction target; (4) in 41% of the publications the authors carried out the prediction for the next day; (5) 66% of the studies used data at an hourly rate; (6) 49% of the papers used open data, with a tendency to increase since 2016; and (7) for efficient air quality prediction it is important to consider external factors such as weather conditions, spatial characteristics, and temporal features.
Claim Models: Granular Forms and Machine Learning Forms
This collection of articles addresses the most modern forms of loss reserving methodology: granular models and machine learning models. New methodologies come with questions about their applicability, and these questions are discussed in one article, which focuses on the relative merits of granular and machine learning models. Others illustrate applications with real-world data. The examples include neural networks, which, though well known in some disciplines, have previously received only limited attention in the actuarial literature. This volume expands on that literature, with specific attention to their application to loss reserving. For example, one of the articles introduces the application of neural networks of the gated recurrent unit form to the actuarial literature, whereas another uses a penalised neural network. Neural networks are not the only form of machine learning, and two other papers outline applications of gradient boosting and regression trees respectively. Both articles construct loss reserves at the individual claim level, so that these models resemble granular models. One of these articles provides a practical application of the model to claim watching: the action of monitoring claim development and anticipating major features. Such watching can be used as an early warning system or for other administrative purposes. Overall, this volume is an extremely useful addition to the libraries of those working at the loss reserving frontier.
Development of explainable AI-based predictive models for bubbling fluidised bed gasification process
© 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). In this study, seven different types of regression-based predictive modelling techniques are used to predict the product gas composition (H2, CO, CO2, CH4) and gas yield (GY) during the gasification of biomass in a fluidised bed reactor. The performance of the different regression-based models is compared with the gradient boosting (GB) model to show the relative merits and demerits of the technique. Additionally, the SHapley Additive exPlanations (SHAP)-based explainable artificial intelligence (XAI) method was utilised to explain individual predictions. This study demonstrates that the prediction performance of the GB algorithm was the best among the regression-based models, i.e. Linear Regression (LR), Multilayer Perceptron (MLP), Ridge Regression (RR), Least-Angle Regression (LARS), Random Forest (RF) and Bagging (BAG). It was found that a learning rate (lr) of 0.01 and 1000 boosting stages (est) yielded the best result, with an average root mean squared error (RMSE) of 0.0597 across all outputs. The outcome of this study indicates that the XAI-based methodology can be used as a viable alternative modelling paradigm for predicting the performance of a fluidised bed gasifier to support informed decision-making.
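The reported hyperparameters (learning rate 0.01, 1000 boosting stages) can be reproduced in spirit with scikit-learn's gradient boosting regressor. The sketch below uses purely synthetic stand-in data, not the paper's gasification dataset, and illustrates only the GB fitting and RMSE evaluation step; the SHAP explanation step is omitted:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for gasifier operating conditions
# (e.g. temperature, equivalence ratio, moisture content), scaled to [0, 1].
X = rng.uniform(size=(500, 3))
# Illustrative nonlinear response standing in for one output, e.g. H2 fraction.
y = 0.5 * X[:, 0] + 0.3 * np.sin(3 * X[:, 1]) + 0.1 * X[:, 2] \
    + rng.normal(0.0, 0.01, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Hyperparameters mirroring those reported in the abstract:
# learning rate 0.01 and 1000 boosting stages.
gb = GradientBoostingRegressor(learning_rate=0.01, n_estimators=1000,
                               random_state=0)
gb.fit(X_tr, y_tr)

rmse = mean_squared_error(y_te, gb.predict(X_te)) ** 0.5
print(round(rmse, 4))
```

On a real dataset, a SHAP `TreeExplainer` would then be applied to the fitted `gb` model to attribute each prediction to its input features, which is the XAI step the study describes.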
Conditional Transformation Models
The ultimate goal of regression analysis is to obtain information about the
conditional distribution of a response given a set of explanatory variables.
This goal is, however, seldom achieved because most established regression
models only estimate the conditional mean as a function of the explanatory
variables and assume that higher moments are not affected by the regressors.
The underlying reason for such a restriction is the assumption of additivity of
signal and noise. We propose to relax this common assumption in the framework
of transformation models. The novel class of semiparametric regression models
proposed herein allows transformation functions to depend on explanatory
variables. These transformation functions are estimated by regularised
optimisation of scoring rules for probabilistic forecasts, e.g. the continuous
ranked probability score. The corresponding estimated conditional distribution
functions are consistent. Conditional transformation models are potentially
useful for describing possible heteroscedasticity, comparing spatially varying
distributions, identifying extreme events, deriving prediction intervals and
selecting variables beyond mean regression effects. An empirical investigation
based on a heteroscedastic varying coefficient simulation model demonstrates
that semiparametric estimation of conditional distribution functions can be
more beneficial than kernel-based non-parametric approaches or parametric
generalised additive models for location, scale and shape.
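The continuous ranked probability score mentioned above is a proper scoring rule that rewards predictive distributions which are both calibrated and sharp. As a small worked illustration (independent of the paper's estimation procedure), the closed-form CRPS for a Gaussian predictive distribution shows that a sharper forecast centred near the truth scores better than a vague one:

```python
import numpy as np
from scipy.stats import norm

def crps_normal(mu, sigma, y):
    """Closed-form CRPS for a Gaussian predictive distribution N(mu, sigma^2)
    evaluated at the observation y. Lower is better."""
    z = (y - mu) / sigma
    return sigma * (z * (2 * norm.cdf(z) - 1)
                    + 2 * norm.pdf(z)
                    - 1 / np.sqrt(np.pi))

# Both forecasts are centred at 0; the observation is 0.2.
sharp = crps_normal(0.0, 1.0, 0.2)   # sharp forecast (small sigma)
vague = crps_normal(0.0, 3.0, 0.2)   # vague forecast (large sigma)
print(round(sharp, 4), round(vague, 4))
```

Minimising an empirical average of such scores over a regularised function class is the general strategy the abstract describes for estimating the transformation functions.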
Dynamic Feature Engineering and model selection methods for temporal tabular datasets with regime changes
The application of deep learning algorithms to temporal panel datasets is
difficult due to heavy non-stationarities which can lead to over-fitted models
that under-perform under regime changes. In this work we propose a new machine
learning pipeline for ranking predictions on temporal panel datasets which is
robust under regime changes of data. Different machine-learning models,
including Gradient Boosting Decision Trees (GBDTs) and Neural Networks with and
without simple feature engineering are evaluated in the pipeline with different
settings. We find that GBDT models with dropout display high performance,
robustness and generalisability with relatively low complexity and reduced
computational cost. We then show that online learning techniques can be used in
post-prediction processing to enhance the results. In particular, dynamic
feature neutralisation, an efficient procedure that requires no retraining of
models and can be applied post-prediction to any machine learning model,
improves robustness by reducing drawdown in regime changes. Furthermore, we
demonstrate that the creation of model ensembles through dynamic model
selection based on recent model performance leads to improved performance over
baseline by improving the Sharpe and Calmar ratios of out-of-sample prediction
performances. We also evaluate the robustness of our pipeline across different
data splits and random seeds with good reproducibility of results.
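The feature-neutralisation step described above can be sketched as a post-prediction linear projection: the component of the predictions explained by the feature exposures is removed, with no retraining of the underlying model. This is a minimal illustration on random data, not the authors' exact procedure; the `proportion` parameter, which allows partial neutralisation, is an assumption of this sketch:

```python
import numpy as np

def neutralise(preds, features, proportion=1.0):
    """Remove the linear exposure of predictions to a feature matrix.
    A post-prediction step: the fitted model itself is untouched."""
    # Least-squares coefficients of preds regressed on the feature columns.
    beta, *_ = np.linalg.lstsq(features, preds, rcond=None)
    exposure = features @ beta
    return preds - proportion * exposure

rng = np.random.default_rng(1)
F = rng.normal(size=(200, 5))   # illustrative feature exposures, 200 rows
# Raw predictions with a deliberate linear dependence on the features.
p = F @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

neutral = neutralise(p, F)
# The residual is orthogonal to every feature column, so the neutralised
# predictions carry (numerically) zero linear feature exposure.
print(bool(np.abs(F.T @ neutral).max() < 1e-6))
```

Because the operation is just a projection of an existing prediction vector, it can be applied to the output of any model in the pipeline, which is what makes it usable for dynamic, regime-dependent adjustment.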