23 research outputs found
Using Data Mining To Search for Perovskite Materials with Higher Specific Surface Area
The specific surface area (SSA) of ABO3-type perovskite
is one of the important properties associated with photocatalytic
ability. In this work, data mining methods were used to explore the
relationship between the SSA (in the range of 1–60 m² g⁻¹) of perovskite and its features, including
chemical compositions and technical parameters. The genetic algorithm–support
vector regression method was used to screen the main features for
modeling. The correlation coefficient (R) between
the predicted and experimental SSAs reached as high as 0.986 for the
training data set and 0.935 for leave-one-out cross-validation. ABO3-type perovskites with higher SSA can be screened out using
the Online Computation Platform for Materials Data Mining (OCPMDM)
developed in our laboratory. Further, an online web server has been
developed to share the model for the prediction of SSA of ABO3-type perovskite, which is accessible at http://118.25.4.79/material_api/csk856q0fulhhhwv
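The two R values quoted above (training set versus leave-one-out cross-validation) can be illustrated with a minimal sketch. It assumes a single-feature least-squares line as a stand-in for the genetic algorithm–support vector regression model, and the toy data and function names are hypothetical, not taken from the study.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with one feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def loocv_predictions(xs, ys):
    """Predict each sample with a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a * xs[i] + b)
    return preds

# Hypothetical toy data standing in for (feature, SSA) pairs.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

a, b = fit_line(x, y)
r_train = pearson_r([a * xi + b for xi in x], y)   # fit and score on the same data
r_loocv = pearson_r(loocv_predictions(x, y), y)    # each point held out once
print(f"R(train) = {r_train:.3f}, R(LOOCV) = {r_loocv:.3f}")
```

The training R scores a model on data it has already seen, while the LOOCV R scores predictions for held-out samples, which is why the latter is usually the more conservative figure.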
Inverse Design of Hybrid Organic–Inorganic Perovskites with Suitable Bandgaps via Proactive Searching Progress
Hybrid organic–inorganic
perovskites (HOIPs) have shown
encouraging progress in solar cells, achieving excellent
device performance. A key challenge is
finding Pb-free candidates with suitable bandgaps, which could
accelerate the commercialization of environmentally friendly HOIP-based
cells. Herein, we propose a new inverse design method, proactive searching
progress (PSP), to efficiently discover potential HOIPs from universal
chemical space by combining machine learning (ML) techniques. Compared
to the pioneering work on this topic, we carried out our ML study
based on 1201 collected HOIP samples with experimental bandgaps rather
than theoretical properties. On the basis of 25 selected features,
a weighted voting regressor ML model was constructed to predict bandgaps
of HOIPs. The model combined four submodels and achieved
coefficients of determination (R²) of 0.95 for leave-one-out cross-validation
and 0.91 for the test set. The feature analysis revealed that the tolerance
factor (tf) below 0.971 and the new tolerance
factor (τf) in 3.75–4.09 contributed to lower
bandgaps and vice versa. By applying the PSP method, the Pb-free HOIPs
with optimal bandgaps were successfully designed from a generated
chemical space comprising over 8.20 × 10¹⁸ combinations,
which included 733848 candidates (e.g., Cs0.334FA0.266MA0.400Sn0.769Ge0.003Pd0.228Br0.164I2.836) with an optimal bandgap of 1.34
eV for single junction solar cells, 1511073 large-bandgap candidates
(e.g., Cs0.392FA0.016MA0.592Cr0.383Sr0.347Sn0.270Br1.171I1.829) for top parts in tandem solar cells (TSCs), and
20242 low-bandgap ones (e.g., MA0.815FA0.185Sn0.927Ge0.073I3) for bottom cells
in TSCs. Finally, three new HOIPs were synthesized with an average
bandgap error of 0.07 eV between predictions and experiments. We are
convinced that the proposed PSP method and ML approach could facilitate
the discovery of new promising HOIPs for photovoltaic devices with
the desired properties.
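The two structural descriptors named in this abstract can be computed from ionic radii. The sketch below assumes that tf is the Goldschmidt tolerance factor and that τf is the tolerance factor proposed by Bartel et al.; the abstract does not state the formulas, so both definitions are assumptions here. The radii are Shannon values for CsPbI3, chosen purely for illustration.

```python
import math

def goldschmidt_t(r_a, r_b, r_x):
    """Goldschmidt tolerance factor: (rA + rX) / (sqrt(2) * (rB + rX))."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))

def bartel_tau(r_a, r_b, r_x, n_a):
    """Tolerance factor of Bartel et al.:
    rX/rB - nA * (nA - (rA/rB) / ln(rA/rB)), with nA the A-site oxidation state.
    Requires rA != rB so that the logarithm is nonzero."""
    ratio = r_a / r_b
    return r_x / r_b - n_a * (n_a - ratio / math.log(ratio))

# Illustrative Shannon radii in angstroms: Cs+ 1.88, Pb2+ 1.19, I- 2.20.
t_f = goldschmidt_t(1.88, 1.19, 2.20)
tau_f = bartel_tau(1.88, 1.19, 2.20, n_a=1)
print(f"tf = {t_f:.3f}, tau_f = {tau_f:.3f}")

# Ranges the abstract associates with lower bandgaps.
in_low_gap_range = (t_f < 0.971) and (3.75 <= tau_f <= 4.09)
```

For mixed compositions such as those listed above, a common convention is to use composition-weighted average radii per site, although the abstract does not say which convention was applied.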
Predicting Experimental Formability of Hybrid Organic–Inorganic Perovskites via Imbalanced Learning
Hybrid
organic–inorganic perovskites (HOIPs) have attracted
considerable attention in the photovoltaic field, but their further development
is restrained by contamination and stability issues. More potential HOIPs should
be explored for photovoltaic devices. In this work, we collected 539
HOIPs and 24 non-HOIPs, all experimentally synthesized, to explore novel
compositions of HOIPs. Imbalanced learning was carried out, and
the best classification model achieved a leave-one-out cross-validation
accuracy of 100.0% and a test accuracy of 96.1%. The A site atomic
radii (ARA), A site ionic radius (IRA), and tolerance factor (tf) were identified as the most important features. ARA IRA tf <
1.01 contributed to perovskite formability, and the formability probabilities
of the corresponding samples were over 90.0%. Potential A site organic
fragments were identified for perovskite solar cells, such as dimethylamine,
hydroxylamine, and hydrazine. Finally, three new
Sn–Ge mixed systems of HOIPs were successfully synthesized,
which was consistent with the model predictions.
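With 539 positive and 24 negative samples, plain accuracy is dominated by the majority class, which is the usual motivation for imbalanced learning. The sketch below works through that arithmetic for the full collected set. The function names are hypothetical, and the abstract does not say which resampling or weighting technique the authors used.

```python
def majority_baseline_accuracy(n_pos, n_neg):
    """Accuracy of a trivial classifier that always predicts the larger class."""
    return max(n_pos, n_neg) / (n_pos + n_neg)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so each class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Class counts quoted in the abstract: 539 HOIPs (label 1), 24 non-HOIPs (label 0).
baseline = majority_baseline_accuracy(539, 24)

# A classifier that labels everything as HOIP scores high on plain accuracy
# but only 0.5 on balanced accuracy.
y_true = [1] * 539 + [0] * 24
y_always_hoip = [1] * 563
bal = balanced_accuracy(y_true, y_always_hoip)
print(f"majority baseline = {baseline:.3f}, balanced accuracy = {bal:.2f}")
```

Reported accuracies on such data are easier to interpret next to this baseline or alongside a class-balanced metric.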
Machine Learning Combined with Weighted Voting Regression and Proactive Searching Progress to Discover ABO3‑δ Perovskites with High Oxide Ionic Conductivity
ABO3‑δ-type
perovskites are important
oxygen ion conductors because their properties can be enhanced
by adjusting the composition through elemental doping. In this
work, machine learning combined with weighted voting regression (WVR)
and proactive searching progress (PSP) was used to develop a model
with high accuracy for the prediction of the oxide ionic conductivity
of doped ABO3‑δ perovskites. After feature
selection, algorithm selection, and parameter optimization, Gradient
Boosting regression (GBR), random forest regression (RFR), and extra
trees regression (ETR) were determined to be the optimal methods for
WVR in constructing the integrated model. The R values of the integrated model,
MWVR, reached 0.812 for leave-one-out
cross-validation (LOOCV) and 0.920 for the test set. After the
PSP was conducted, a total of 179 perovskites with high oxide ionic
conductivity were discovered. The PSP search identified 8 types of
perovskites with high oxide ionic conductivity. Pattern recognition
was employed to identify the optimization area that exhibited a high
oxide ionic conductivity. Factor-effect plots were used
to show how the doping element type and ratio affect the
oxide ionic conductivity. The Shapley Additive exPlanations (SHAP)
analysis of the significant features revealed that Ra/Rb had the strongest influence on the oxide ionic conductivity,
and that this influence was negative. The developed integrated model, explored patterns,
and optimization areas in this work can serve as a valuable guide
for the discovery and design of perovskites with high oxide ionic
conductivity.
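Weighted voting regression combines the predictions of several submodels into one weighted average. The sketch below shows only that combination step. It assumes weights proportional to the inverse mean absolute error of each submodel, which is one common choice and not necessarily the weighting used in this work. The three prediction lists are hypothetical stand-ins for GBR, RFR, and ETR outputs.

```python
def weighted_vote(predictions, weights):
    """Weighted average of submodel predictions, one value per sample."""
    total = sum(weights)
    n = len(predictions[0])
    return [sum(w * p[i] for w, p in zip(weights, predictions)) / total
            for i in range(n)]

def inverse_mae_weights(predictions, y_true, eps=1e-12):
    """Assumed weighting scheme: weight each submodel by 1 / MAE."""
    weights = []
    for pred in predictions:
        mae = sum(abs(p - y) for p, y in zip(pred, y_true)) / len(y_true)
        weights.append(1.0 / (mae + eps))
    return weights

# Hypothetical validation targets and submodel predictions.
y_val = [1.0, 2.0, 3.0, 4.0]
pred_gbr = [1.1, 2.1, 2.9, 4.2]
pred_rfr = [0.8, 2.3, 3.3, 3.6]
pred_etr = [1.0, 1.9, 3.1, 4.1]

submodels = [pred_gbr, pred_rfr, pred_etr]
w = inverse_mae_weights(submodels, y_val)
ensemble = weighted_vote(submodels, w)
print([round(v, 3) for v in ensemble])
```

In practice the weights would be estimated on validation or cross-validation predictions rather than on the data used to fit the submodels. scikit-learn's `VotingRegressor` accepts a `weights` argument that performs the same averaging over fitted estimators.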
Search for ABO3-Type Ferroelectric Perovskites with Targeted Multi-Properties by Machine Learning Strategies
Ferroelectric perovskites are among
the most promising functional
materials owing to their pyroelectric and piezoelectric effects. In
practical applications of ferroelectric perovskites, it is often necessary
to meet requirements on multiple properties. In this work, a multiproperty
machine learning strategy was proposed to accelerate the discovery
and design of new ferroelectric ABO3-type perovskites.
First, a classification model was constructed with data collected
from publications to distinguish ferroelectric and nonferroelectric
perovskites. The classification accuracies of LOOCV and the test set
are 87.29% and 86.21%, respectively. Then, two machine learning strategies,
Machine-Learning Workflow and SISSO, were used to construct the regression
models to predict the specific surface area (SSA), band gap (Eg), Curie temperature (Tc), and dielectric loss (tan δ) of ABO3-type
perovskites. The correlation coefficients of LOOCV in the optimal
models for SSA, Eg, and Tc are 0.935, 0.891, and 0.971, respectively, while the
correlation coefficient of the predicted and experimental values of
the SISSO model for tan δ prediction could reach 0.913. On the
basis of the models, 20 ABO3 ferroelectric perovskites
with three different application prospects were screened out with
the required properties, which could be explained by the patterns
between the important descriptors and the properties by using SHAP.
Furthermore, the constructed models were deployed as web servers
so that researchers can accelerate the rational design and discovery
of ABO3 ferroelectric perovskites with desired multiple
properties.
Accelerated Design for High-Entropy Alloys Based on Machine Learning and Multiobjective Optimization
High-entropy
alloys (HEAs) with high hardness and high ductility
can be considered as candidates for wear-resistant applications. However,
designing novel HEAs with multiple desired properties using traditional
alloy design methods remains challenging due to the enormous composition
space. In this work, we proposed a machine-learning-based framework
to design HEAs with high Vickers hardness (H) and
high compressive fracture strain (D). Initially,
we constructed data sets containing 172 and 467 samples with 161 features
for D and H, respectively. Four-step
feature selection was performed, selecting 12 and 8 features
for the D and H prediction models,
which were based on the optimal algorithms of support vector regression (SVR)
and light gradient boosting machine (LightGBM), respectively. The R² values of the well-trained models reached 0.76 and
0.90 for the 10-fold cross validation. Nondominated sorting genetic
algorithm version II (NSGA-II) and virtual screening were employed
to search for the optimal alloying compositions, and four recommended
candidates were synthesized to validate our methods. Notably, the D values of three candidates showed significant improvements
compared to the samples with similar H in the original
data sets, with increases of 135.8, 282.4, and 194.1%, respectively.
Analyzing the candidates, we have recommended suitable atomic percentage
ranges for elements such as Al (2–14.8 at %), Nb (4–25
at %), and Mo (3–9.9 at %) in order to design HEAs with high
hardness and ductility.
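Multiobjective search for alloys that are both hard and ductile rests on the idea of nondominated (Pareto-optimal) candidates. The sketch below extracts only the first nondominated front for two objectives that are both maximized. It is the core comparison used inside NSGA-II, not the full algorithm, which also includes crowding distance, selection, crossover, and mutation. The candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (hardness H, fracture strain D) predictions for five alloys.
candidates = [(500, 10.0), (700, 5.0), (600, 8.0), (550, 7.0), (700, 4.0)]
front = pareto_front(candidates)
print(front)
```

Here (550, 7.0) is dominated by (600, 8.0) and (700, 4.0) by (700, 5.0), so three candidates remain as trade-offs between hardness and ductility.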
