Automated machine learning for bio-oil yield prediction from lignocellulosic biomass pyrolysis

Abstract

Pyrolysis of lignocellulosic biomass for bio-oil production is a promising route to renewable energy, yet predicting bio-oil yield remains challenging because of the complex interplay between biomass properties and process conditions. Conventional Machine Learning (ML) approaches, while effective, require labor-intensive manual algorithm selection and hyperparameter tuning, which hinders their scalability and reproducibility. To address this limitation, we present a systematic comparison of four state-of-the-art Automated Machine Learning (AutoML) frameworks (AutoGluon, Auto-Sklearn, FLAML, and TPOT) for automating bio-oil yield prediction. Using a dataset of 329 experimental samples from 34 biomass types and seven input features (cellulose, hemicellulose, and lignin content, nitrogen flow, heating rate, temperature, and particle size), we demonstrate that FLAML coupled with XGBoost achieves the best predictive performance (R^2 = 0.890, RMSE = 3.158), significantly outperforming both traditional ML models and the other AutoML tools. Statistical validation via ANOVA and Tukey's post-hoc test confirms the robustness of these findings. Our study highlights AutoML's ability to generate accurate and efficient models for complex pyrolysis systems while substantially reducing reliance on expert knowledge and manual configuration. This work establishes AutoML as a scalable solution for optimizing bio-oil production, facilitating more sustainable and data-driven biomass conversion strategies.
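For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how R^2 and RMSE are conventionally computed from measured and predicted yields. It uses the standard textbook definitions and is not taken from the paper; the function names and the four sample values are hypothetical and chosen only for illustration.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical measured and predicted bio-oil yields (illustrative only,
# not data from the study).
measured = [40.0, 45.0, 50.0, 55.0]
predicted = [42.0, 44.0, 51.0, 53.0]

print(f"R^2  = {r2_score(measured, predicted):.3f}")
print(f"RMSE = {rmse(measured, predicted):.3f}")
```

A higher R^2 (closer to 1) and a lower RMSE both indicate a closer fit, which is the sense in which the abstract ranks the frameworks.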


AIR Universita degli studi di Milano

Last time updated on 28/08/2025

This paper was published in AIR Universita degli studi di Milano.
