Explicable Machine Learning for Predicting High-Efficiency Lignocellulose Pretreatment Solvents Based on Kamlet–Taft and Polarity Parameters

Authors: Bin Li (39349)
Hanwen Ge (18459213)
Huanfei Xu (2919446)
Jiahui Wei (8289561)
Rui Zhou (38109)
Shenglin Wang (6293918)
Yaoze Liu (4610215)
Yuekun Bai (18459216)
Publication date: 29/04/2024

Combining density functional theory (DFT) and machine learning (ML) methodologies, an intrinsic relationship model was developed from the Kamlet–Taft parameters and polarity values of 104 deep eutectic solvents (DES). The aim was to screen for DES with high lignocellulose pretreatment efficiency using the combination of hydrogen bond acidity (α), hydrogen bond basicity (β), dipolarity/polarizability (π*) and molecular polarity index (MPI). Partial least-squares (PLS) models and a variety of ML models were used to predict cellulose retention and delignification. The XGBoost model had the highest predictive performance, with R² of 0.97 for cellulose retention and 0.91 for delignification. Feature importance analysis and partial dependence analysis, both based on the XGBoost model, were used to explain the contribution of each variable. Feature importance analysis showed that α, β and π* of the DES and the MPI of the hydrogen bond donor determined the pretreatment efficiency. Partial dependence analysis showed that the relationship between these four parameters and the pretreatment efficiency is nonlinear, with multiple local extrema in different intervals. The model gave a parameter range corresponding to high pretreatment efficiency. Based on the ranges of the four parameters given in this study, the composition and ratio of a DES can be selected so that at least 80% of the cellulose is retained and at least 50% of the lignin is removed. Molecular simulation results showed that these highly efficient DES often contain a large number of hydrogen bonds and highly polar groups.
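The two quantitative ideas in the abstract, the R² score used to compare models and the 80% / 50% screening thresholds, can be illustrated with a minimal Python sketch. This is not the authors' code: the names `r_squared`, `Prediction` and `screen` are hypothetical, and the candidate values are made-up placeholders rather than results from the study.

```python
from dataclasses import dataclass


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


@dataclass
class Prediction:
    name: str                   # hypothetical solvent label
    cellulose_retention: float  # predicted cellulose retention, percent
    delignification: float      # predicted lignin removal, percent


def screen(predictions, min_retention=80.0, min_delignification=50.0):
    """Keep candidates that meet both thresholds quoted in the abstract."""
    return [
        p for p in predictions
        if p.cellulose_retention >= min_retention
        and p.delignification >= min_delignification
    ]


# Made-up placeholder predictions, NOT data from the study.
candidates = [
    Prediction("DES-A", 85.0, 62.0),  # passes both thresholds
    Prediction("DES-B", 78.0, 70.0),  # fails cellulose retention
    Prediction("DES-C", 91.0, 45.0),  # fails delignification
]
selected = screen(candidates)
```

In this sketch only `DES-A` survives the filter. In practice the predicted values would come from a fitted regression model such as the XGBoost model named in the abstract, with `r_squared` evaluated on held-out data.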

FigShare