3 research outputs found
Anomaly segmentation model for defects detection in electroluminescence images of heterojunction solar cells
Efficient defect detection in solar cell manufacturing is crucial for stable
green energy technology manufacturing. This paper presents a
deep-learning-based automatic detection model SeMaCNN for classification and
semantic segmentation of electroluminescent images for solar cell quality
evaluation and anomaly detection. The core of the model is an anomaly
detection algorithm based on Mahalanobis distance that can be trained in a
semi-supervised manner on imbalanced data with a small number of digital
electroluminescence images with relevant defects. This is particularly valuable
for prompt model integration into the industrial landscape. The model has been
trained on a dataset collected on-plant, consisting of 68 748
electroluminescent images of heterojunction solar cells with a busbar grid. Our
model achieves an accuracy of 92.5%, an F1 score of 95.8%, a recall of 94.8%,
and a precision of 96.9% on the validation subset of 1049 manually annotated
images. The model was also tested on the open ELPV dataset and demonstrates
stable performance, with an accuracy of 94.6% and an F1 score of 91.1%. The
SeMaCNN model demonstrates a good balance between performance and computational
cost, which makes it suitable for integration into quality control systems in
solar cell manufacturing.
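The abstract does not describe how SeMaCNN computes its anomaly score, so the following is only a minimal sketch of Mahalanobis-distance anomaly scoring in general, not the authors' implementation. It assumes each image has already been reduced to a feature vector; the function names, the random stand-in features, and the 99% quantile threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_gaussian(features, eps=1e-6):
    """Estimate the mean and inverse covariance of defect-free feature vectors."""
    mean = features.mean(axis=0)
    # A small ridge term keeps the covariance matrix invertible.
    cov = np.cov(features, rowvar=False) + eps * np.eye(features.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis(x, mean, cov_inv):
    """Mahalanobis distance of each row of x from the fitted distribution."""
    diff = np.atleast_2d(x) - mean
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

# Stand-in for features of defect-free cells (assumption: 4-dimensional).
rng = np.random.default_rng(0)
normal_features = rng.normal(size=(500, 4))
mean, cov_inv = fit_gaussian(normal_features)

# Hypothetical threshold: the 99% quantile of distances seen on normal data.
threshold = np.quantile(mahalanobis(normal_features, mean, cov_inv), 0.99)

typical = np.zeros(4)       # close to the centre of the normal data
outlier = np.full(4, 6.0)   # far from anything seen during fitting
scores = mahalanobis(np.stack([typical, outlier]), mean, cov_inv)
is_anomaly = scores > threshold
```

Because only defect-free samples are needed to fit the mean and covariance, a scheme of this kind can work with few labelled defect images, which matches the imbalanced-data setting the abstract describes.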
Boosting Heterogeneous Catalyst Discovery by Structurally Constrained Deep Learning Models
The discovery of new catalysts is one of the significant topics of
computational chemistry as it has the potential to accelerate the adoption of
renewable energy sources. Recently developed deep learning approaches such as
graph neural networks (GNNs) open new opportunities to significantly extend the
scope for modelling novel high-performance catalysts. Nevertheless, representing
a particular crystal structure as a graph is not straightforward, owing to
ambiguous connectivity schemes and numerous embeddings of nodes and
edges. Here we present an embedding improvement for a GNN that has been modified by
Voronoi tessellation and is able to predict the energy of catalytic systems
within the Open Catalyst Project dataset. The graph was enriched via Voronoi
tessellation: the corresponding contact solid angles and contact types (direct
or indirect) were used as edge features, and Voronoi volumes were used as node
characteristics. As an auxiliary approach, the node representation was enriched
with intrinsic atomic properties (electronegativity, period and group
position). The proposed modifications allowed us to improve the mean absolute
error of the original model: the final error equals 651 meV per atom on the
Open Catalyst Project dataset and 6 meV per atom on the intermetallics dataset.
Also, by considering an additional dataset, we show that a sensible choice of
data can decrease the error to values above the physically based 20 meV per
atom threshold.
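The auxiliary enrichment of node features with intrinsic atomic properties could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' code: the lookup table, the function name, and the placeholder base features (standing in for Voronoi volumes, which are not computed here) are hypothetical. The electronegativities are Pauling values.

```python
# Hypothetical lookup table: (Pauling electronegativity, period, group).
ATOMIC_PROPERTIES = {
    "H": (2.20, 1, 1),
    "C": (2.55, 2, 14),
    "N": (3.04, 2, 15),
    "O": (3.44, 2, 16),
    "Cu": (1.90, 4, 11),
    "Pt": (2.28, 6, 10),
}

def enrich_node_features(symbols, base_features):
    """Append (electronegativity, period, group) to each node's feature vector."""
    if len(symbols) != len(base_features):
        raise ValueError("one feature vector is required per atom")
    enriched = []
    for symbol, features in zip(symbols, base_features):
        electronegativity, period, group = ATOMIC_PROPERTIES[symbol]
        enriched.append(
            list(features) + [electronegativity, float(period), float(group)]
        )
    return enriched

# Toy adsorbate-on-surface fragment. The single base feature per atom stands in
# for a Voronoi volume; the numbers are placeholders, not computed values.
symbols = ["Pt", "Pt", "C", "O"]
base_features = [[15.1], [15.1], [9.8], [11.2]]
node_features = enrich_node_features(symbols, base_features)
```

Each node then carries its geometric descriptor followed by three element-level descriptors, so atoms of the same element share the appended values while still differing in their local geometry.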
New drugs and stock market: a machine learning framework for predicting pharma market reaction to clinical trial announcements
Pharmaceutical companies operate in a strictly regulated and highly risky environment in which a single slip can lead to serious financial implications. Accordingly, announcements of clinical trial results tend to determine the future course of events and are therefore closely monitored by the public. Most works focus on retrospective analysis of the impact of announcements on company stock prices and do not consider the problem in a predictive paradigm. In this work, we aim to close this gap by proposing a framework that allows predicting the numerical values of announcement-induced changes in stock prices. In essence, this is the problem of predicting the impact of a specific event on the corresponding time series. Our framework includes a BERT model for extracting the sentiment polarity of announcements, a Temporal Fusion Transformer for forecasting the expected return, a graph convolution network for capturing event relationships, and gradient boosting for predicting the price change. We operate with one of the biggest FDA (Food and Drug Administration) datasets, consisting of 5436 clinical trial announcements from 681 companies for the years 2018–2022. During the study, we obtain several significant outcomes and domain-specific insights. Firstly, we obtain statistical evidence that the announcement of clinical results influences public pharma market value. Secondly, we observe inherently different patterns of response to positive and negative announcements, reflected in a stronger and more pronounced reaction to negative clinical news. Thirdly, we discover two factors that play a crucial role in a predictive framework: (1) the drug portfolio size of the company, indicating greater susceptibility to an announcement in the case of low diversification among drug products, and (2) the announcement network effect, manifesting as an increase in predictive power when exploiting interdependencies of events belonging to the same company or nosology.
Finally, we demonstrate the viability of the forecast setting by obtaining ROC AUC scores predominantly greater than 0.7 for the classification of price change on historical data. We emphasize that the developed framework is transferable and generalizable to other datasets and domains, provided that two key entities are present: events and the associated time series.
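For readers unfamiliar with the ROC AUC metric quoted above, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function name and the toy labels and scores are hypothetical and unrelated to the paper's data.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy data (hypothetical): 1 = price rose after the announcement, 0 = price fell.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
```

In this toy example, eight of the nine positive-negative pairs are ranked correctly, giving an AUC of 8/9. A value of 0.5 corresponds to random ranking, which is the baseline against which the reported scores above 0.7 should be read.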