136 research outputs found
Inductive Meta-path Learning for Schema-complex Heterogeneous Information Networks
Heterogeneous Information Networks (HINs) are information networks with
multiple types of nodes and edges. The concept of meta-path, i.e., a sequence
of entity types and relation types connecting two entities, is proposed to
provide the meta-level explainable semantics for various HIN tasks.
Traditionally, meta-paths are primarily used for schema-simple HINs, e.g.,
bibliographic networks with only a few entity types, where meta-paths are often
enumerated with domain knowledge. However, the adoption of meta-paths for
schema-complex HINs, such as knowledge bases (KBs) with hundreds of entity and
relation types, has been limited due to the computational complexity associated
with meta-path enumeration. Additionally, effectively assessing meta-paths
requires enumerating relevant path instances, which adds further complexity to
the meta-path learning process. To address these challenges, we propose
SchemaWalk, an inductive meta-path learning framework for schema-complex HINs.
We represent meta-paths with schema-level representations to support the
learning of the scores of meta-paths for varying relations, mitigating the need
for exhaustive path instance enumeration for each relation. Further, we design a
reinforcement-learning based path-finding agent, which directly navigates the
network schema (i.e., schema graph) to learn policies for establishing
meta-paths with high coverage and confidence for multiple relations. Extensive
experiments on real data sets demonstrate the effectiveness of our proposed
paradigm.
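The enumeration cost mentioned in the abstract above can be illustrated with a minimal sketch. It is not SchemaWalk and does not reproduce the paper's method: the toy schema graph, the type and relation names, and the helper `enumerate_meta_paths` are all hypothetical, chosen only to show how a meta-path is a sequence of relation types walked over a schema graph.

```python
from collections import defaultdict

# Hypothetical toy schema graph: (source type, relation type, target type).
# These types and relations are illustrative assumptions, not from the paper.
SCHEMA = [
    ("Author", "writes", "Paper"),
    ("Paper", "written_by", "Author"),
    ("Paper", "published_in", "Venue"),
    ("Venue", "publishes", "Paper"),
]


def enumerate_meta_paths(schema, start, end, max_len):
    """Return every relation sequence of length 1..max_len from start type to end type."""
    out_edges = defaultdict(list)
    for src, rel, dst in schema:
        out_edges[src].append((rel, dst))

    results = []

    def walk(node, path):
        # Record a non-empty path that currently ends at the target type.
        if path and node == end:
            results.append(tuple(path))
        # Stop extending once the length budget is used up.
        if len(path) == max_len:
            return
        for rel, dst in out_edges[node]:
            walk(dst, path + [rel])

    walk(start, [])
    return results


paths = enumerate_meta_paths(SCHEMA, "Author", "Author", 4)
for p in paths:
    print(" -> ".join(p))
```

With only three entity types the search stays tiny, but the number of candidate sequences grows with the branching factor of the schema graph raised to the path length, which is why exhaustive enumeration becomes impractical for schemas with hundreds of types.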
Analyzing drop coalescence in microfluidic devices with a deep learning generative model
Predicting drop coalescence based on process parameters is crucial for experimental design in chemical engineering. However, predictive models can suffer from a lack of training data and, more importantly, from label imbalance. In this study, we propose the use of deep learning generative models to tackle this bottleneck by training the predictive models using generated synthetic data. A novel generative model, named double space conditional variational autoencoder (DSCVAE), is developed for labelled tabular data. By introducing label constraints in both the latent and the original space, DSCVAE is capable of generating more consistent and realistic samples than the standard conditional variational autoencoder (CVAE). Two predictive models, namely random forest and gradient boosting classifiers, are enhanced with synthetic data, and their performance is evaluated on real experimental data. Numerical results show that a considerable improvement in prediction accuracy can be achieved by using synthetic data and that the proposed DSCVAE clearly outperforms the standard CVAE. This research provides further insight into handling imbalanced data for classification problems, especially in chemical engineering.
Explainable AI models for predicting drop coalescence in microfluidics device
In the field of chemical engineering, understanding the dynamics and probability of drop coalescence is not just an academic pursuit, but a critical requirement for advancing process design by applying energy only where it is needed to build necessary interfacial structures, increasing efficiency towards Net Zero manufacture. This research applies machine learning predictive models to unravel the sophisticated relationships embedded in the experimental data on drop coalescence in a microfluidics device. Through the deployment of SHapley Additive exPlanations (SHAP) values, critical features relevant to coalescence processes are consistently identified. Comprehensive feature ablation tests further delineate the robustness and susceptibility of each model. Furthermore, the incorporation of Local Interpretable Model-agnostic Explanations (LIME) for local interpretability clarifies the decision-making mechanisms behind each model’s predictions. As a result, this research provides the relative importance of the features for the outcome of drop interactions. It also underscores the pivotal role of model interpretability in reinforcing confidence in machine learning predictions of complex physical phenomena that are central to chemical engineering applications.
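As a rough illustration of the attribution idea behind the Shapley values mentioned above, the following minimal sketch computes exact Shapley values for a tiny model by brute force. It is not the study's pipeline: `toy_model`, the input `x` and the all-zero `baseline` are hypothetical, and "removing" a feature by replacing it with a baseline value is an assumption made here for simplicity. Practical tools approximate this quantity, since the exact sum is exponential in the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    A feature outside the coalition is replaced by its baseline value
    (an assumption of this sketch, not of the source).
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                coalition = set(s)
                # Marginal contribution of feature i to this coalition.
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical model: one additive term and one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the model output at `x` and at the baseline, and the interaction term is split equally between the two features that take part in it.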
Accurate identification and measurement of the precipitate area by two-stage deep neural networks in novel chromium-based alloys
The performance of advanced materials for extreme environments is underpinned by their microstructure, such as the size and distribution of nano- to micro-sized reinforcing phase(s). Chromium-based superalloys are a recently proposed alternative to conventional face-centred-cubic superalloys for high-temperature applications, e.g., Concentrated Solar Power. Their development requires the determination of precipitate volume fraction and size distribution using Electron Microscopy (EM), as these properties are crucial for the thermal stability and mechanical properties of chromium superalloys. Traditional approaches to EM image processing utilise filtering with a fixed contrast threshold, which leads to weak robustness to background noise and poor generalisability to different materials. They also require an enormous amount of time for manual object measurements on large datasets. Efficient and accurate object detection and segmentation are therefore highly desired to accelerate the development of novel materials like chromium-based superalloys. To address these bottlenecks, this study proposes an end-to-end, two-stage deep learning scheme, DT-SegNet, based on YOLOv5 and SegFormer structures, to perform object detection and segmentation for EM images. The proposed approach can thus benefit from the training efficiency of CNNs at the detection stage (i.e., a small number of training images required) and the accuracy of the ViT at the segmentation stage. Extensive numerical experiments demonstrate that the proposed DT-SegNet significantly outperforms the state-of-the-art segmentation tools offered by Weka and ilastik on a range of metrics, including accuracy, precision, recall and F1-score. This model forms a useful tool to aid microstructure examination in alloy development, and offers significant advantages in handling the large datasets associated with high-throughput alloy development approaches.
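The fixed-threshold baseline and the pixel-wise metrics named in the abstract above can be sketched in a few lines. This is not DT-SegNet, Weka or ilastik: the 4x4 grey-level `image`, the `truth` mask and both threshold values are hypothetical, chosen only to show how a single contrast threshold turns one bright noise pixel into a false positive and shifts the measured area fraction.

```python
def threshold_mask(image, threshold):
    """Label a pixel as precipitate (1) when its intensity is >= threshold."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def area_fraction(mask):
    """Fraction of pixels labelled as precipitate."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


def precision_recall_f1(pred, truth):
    """Pixel-wise precision, recall and F1 of a predicted mask."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical grey-level image: a bright 2x2 precipitate plus one noise pixel (90).
image = [
    [10, 12, 200, 210],
    [11, 13, 205, 220],
    [10, 90, 12, 11],
    [12, 11, 10, 13],
]
truth = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

for threshold in (128, 80):
    mask = threshold_mask(image, threshold)
    print(threshold, area_fraction(mask), precision_recall_f1(mask, truth))
```

In this toy case the higher threshold recovers the ground-truth mask, while the lower one admits the noise pixel, so precision drops and the area fraction is overestimated.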
- …