12 research outputs found

    Anomaly Detection in Batch Manufacturing Processes Using Localized Reconstruction Errors From 1-D Convolutional AutoEncoders

    Multivariate batch time-series data sets within semiconductor manufacturing processes present a difficult environment for effective Anomaly Detection (AD). The challenge is amplified by the limited availability of ground-truth labelled data. In scenarios where AD is possible, black-box modelling approaches constrain model interpretability. These challenges obstruct the widespread adoption of Deep Learning solutions. The objective of this study is to demonstrate an AD approach that employs 1-Dimensional Convolutional AutoEncoders (1d-CAE) and Localised Reconstruction Error (LRE) to improve AD performance and interpretability. By using LRE to identify the sensors and data that give rise to an anomaly, the explainability of the Deep Learning solution is enhanced. The Tennessee Eastman Process (TEP) and LAM 9600 Metal Etcher datasets have been utilised to validate the proposed framework. The results show that the proposed LRE approach outperforms global reconstruction errors for similar model architectures, achieving an AUC of 1.00. The proposed unsupervised learning approach with AE and LRE improves model explainability, which is expected to be beneficial for deployment in semiconductor manufacturing, where interpretable and trustworthy results are critical for process engineering teams.
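
    A minimal sketch of the idea the abstract describes: a 1-D convolutional autoencoder whose reconstruction error is aggregated per sensor rather than globally, so that the sensor driving an anomaly can be identified. The layer sizes, window length, and sensor count below are illustrative assumptions, not values from the paper.

```python
# Sketch of a 1-D convolutional autoencoder with a localised (per-sensor)
# reconstruction error, loosely following the LRE idea in the abstract.
import torch
import torch.nn as nn

N_SENSORS, WINDOW = 8, 64  # assumed number of sensors and time steps per batch trace

class Conv1dAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(N_SENSORS, 16, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(16, 8, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(8, 16, kernel_size=5, stride=2, padding=2, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(16, N_SENSORS, kernel_size=5, stride=2, padding=2, output_padding=1),
        )

    def forward(self, x):  # x: (batch, sensors, time)
        return self.decoder(self.encoder(x))

def localised_reconstruction_error(model, x):
    """Mean squared error per sensor and per window, rather than one global scalar."""
    with torch.no_grad():
        recon = model(x)
    return ((x - recon) ** 2).mean(dim=2)  # shape: (batch, sensors)

model = Conv1dAE()
batch = torch.randn(4, N_SENSORS, WINDOW)   # stand-in for normalised batch traces
lre = localised_reconstruction_error(model, batch)
print(lre.shape)          # (4, 8): one error per window per sensor
print(lre.argmax(dim=1))  # sensor contributing most to each window's anomaly score
```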

    Intelligent Condition Monitoring of Industrial Plants: An Overview of Methodologies and Uncertainty Management Strategies

    Condition monitoring plays a significant role in the safety and reliability of modern industrial systems. Artificial intelligence (AI) approaches are gaining attention from academia and industry as a growing subject in industrial applications and as a powerful way of identifying faults. This paper provides an overview of intelligent condition monitoring and fault detection and diagnosis methods for industrial plants, with a focus on the open-source benchmark Tennessee Eastman Process (TEP). In this survey, the most popular and state-of-the-art deep learning (DL) and machine learning (ML) algorithms for industrial plant condition monitoring, fault detection, and diagnosis are summarized, and the advantages and disadvantages of each algorithm are studied. Challenges such as imbalanced data and unlabelled samples, and how deep learning models can handle them, are also covered. Finally, a comparison of the accuracies and specifications of different algorithms utilizing the Tennessee Eastman Process (TEP) is conducted. This research will be beneficial for both researchers who are new to the field and experts, as it covers the literature on condition monitoring and state-of-the-art methods, alongside the challenges and possible solutions to them.

    A Review of Kernel Methods for Feature Extraction in Nonlinear Process Monitoring

    Kernel methods are a class of learning machines for the fast recognition of nonlinear patterns in any data set. In this paper, the applications of kernel methods for feature extraction in industrial process monitoring are systematically reviewed. First, we describe the reasons for using kernel methods and contextualize them among other machine learning tools. Second, by reviewing a total of 230 papers, this work identifies 12 major issues surrounding the use of kernel methods for nonlinear feature extraction. Each issue is discussed in terms of why it is important and how it has been addressed over the years by many researchers. We also present a breakdown of the commonly used kernel functions, parameter selection routes, and case studies. Lastly, this review provides an outlook into the future of kernel-based process monitoring, which can hopefully instigate more advanced yet practical solutions in the process industries.
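
    As a concrete illustration of the kind of kernel-based feature extraction the review covers, the sketch below applies kernel PCA with an RBF kernel and computes a simple Hotelling-T2-style statistic on the extracted features. The synthetic data, kernel width, component count, and control limit are illustrative assumptions, not the review's recommendations.

```python
# Kernel PCA feature extraction with a simple T2-style monitoring statistic.
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 10))          # stand-in for normal operating data
X_test = rng.normal(size=(100, 10)) + 0.5     # stand-in for possibly faulty data

scaler = StandardScaler().fit(X_train)
kpca = KernelPCA(n_components=5, kernel="rbf", gamma=0.1).fit(scaler.transform(X_train))

# Nonlinear features (scores) in the kernel feature space
T_train = kpca.transform(scaler.transform(X_train))
T_test = kpca.transform(scaler.transform(X_test))

# Hotelling-T2-style statistic on the extracted features
var = T_train.var(axis=0)
t2_train = ((T_train ** 2) / var).sum(axis=1)
t2_test = ((T_test ** 2) / var).sum(axis=1)
limit = np.percentile(t2_train, 99)           # empirical 99% control limit
print(f"{(t2_test > limit).mean():.1%} of test samples flagged")
```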

    Deep CNN-Based Automated Optical Inspection for Aerospace Components

    The defect detection problem is of utmost importance in high-tech industries such as aerospace manufacturing and is widely addressed using automated industrial quality control systems. In the aerospace manufacturing industry, composite materials are extensively applied as structural components in civilian and military aircraft. To ensure product quality and high reliability, manual inspection and traditional automatic optical inspection have been employed to identify defects throughout production and maintenance. These inspection techniques have several limitations: they are tedious, time-consuming, inconsistent, subjective, labor-intensive, and expensive. To make the operation effective and efficient, modern automated optical inspection is preferred. In this dissertation work, automatic defect detection techniques are tested on three levels using a novel aerospace composite materials image dataset (ACMID). First, classical machine learning models, namely Support Vector Machine and Random Forest, are employed for both datasets. Second, deep CNN-based models, such as improved ResNet50 and MobileNetV2 architectures, are trained on the ACMID datasets. Third, an efficient defect detection technique that combines the features of deep learning and classical machine learning models is proposed for the ACMID dataset. To assess the aerospace composite components, all the models are trained and tested on ACMID datasets of distinct sizes. In addition, this work investigates the scenario where defective and non-defective samples are scarce and imbalanced. To overcome the problems of imbalanced and scarce datasets, oversampling techniques and data augmentation using improved deep convolutional generative adversarial networks (DCGAN) are considered. Furthermore, the proposed models are also validated using a benchmark steel surface defects (SSD) dataset.
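
    A minimal sketch of the transfer-learning setup the abstract refers to: a pretrained ResNet50 backbone (recent torchvision, weights downloaded on first use) with its classification head replaced for a binary defective/non-defective decision. The class count, input size, optimizer, and dummy batch are illustrative assumptions, not the dissertation's training configuration.

```python
# ResNet50 transfer learning for binary defect classification (sketch).
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for p in model.parameters():          # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)   # new head: defect vs. no-defect

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# One illustrative training step on a dummy batch of 224x224 RGB patches
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```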

    Credit Scoring Using Machine Learning

    For financial institutions and the economy at large, the role of credit scoring in lending decisions cannot be overemphasised. An accurate and well-performing credit scorecard allows lenders to control their risk exposure through the selective allocation of credit based on the statistical analysis of historical customer data. This thesis identifies and investigates a number of specific challenges that occur during the development of credit scorecards. Four main contributions are made in this thesis. First, we examine the performance of a number of supervised classification techniques on a collection of imbalanced credit scoring datasets. Class imbalance occurs when there are significantly fewer examples in one or more classes in a dataset compared to the remaining classes. We demonstrate that oversampling the minority class leads to no overall improvement to the best-performing classifiers. We find that, in contrast, adjusting the threshold on classifier output yields, in many cases, an improvement in classification performance. Our second contribution investigates a particularly severe form of class imbalance, which, in credit scoring, is referred to as the low-default portfolio problem. To address this issue, we compare the performance of a number of semi-supervised classification algorithms with that of logistic regression. Based on the detailed comparison of classifier performance, we conclude that both approaches merit consideration when dealing with low-default portfolios. Third, we quantify the differences in classifier performance arising from various implementations of a real-world behavioural scoring dataset. Due to commercial sensitivities surrounding the use of behavioural scoring data, very few empirical studies which directly address this topic are published. This thesis describes the quantitative comparison of a range of dataset parameters impacting classification performance, including: (i) varying durations of historical customer behaviour for model training; (ii) different lengths of time from which a borrower's class label is defined; and (iii) using alternative approaches to define a customer's default status in behavioural scoring. Finally, this thesis demonstrates how artificial data may be used to overcome the difficulties associated with obtaining and using real-world data. The limitations of artificial data, in terms of its usefulness in evaluating classification performance, are also highlighted. In this work, we are interested in generating artificial data, for credit scoring, in the absence of any available real-world data.
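
    The thesis's first contribution, that adjusting the classification threshold can be more effective than oversampling on imbalanced credit data, can be illustrated with a small sketch. The synthetic data, classifier, and choice of threshold below are illustrative assumptions rather than the thesis's experimental setup.

```python
# Comparing the default 0.5 threshold with a prior-based threshold on imbalanced data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]

default = (proba >= 0.5).astype(int)            # standard 0.5 decision threshold
adjusted = (proba >= y_tr.mean()).astype(int)   # threshold lowered to the minority-class prior
print("F1 @ 0.5      :", round(f1_score(y_te, default), 3))
print("F1 @ adjusted :", round(f1_score(y_te, adjusted), 3))
```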

    Enhanced non-parametric sequence learning scheme for internet of things sensory data in cloud infrastructure

    The Internet of Things (IoT) Cloud is an emerging technology that enables machine-to-machine, human-to-machine and human-to-human interaction through the Internet. IoT sensor devices tend to generate sensory data known for their dynamic and heterogeneous nature. This makes the data difficult for the sensor devices to manage, owing to their limited computation power and storage space. However, Cloud Infrastructure as a Service (IaaS) compensates for the limitations of IoT devices by making its computation power and storage resources available for executing IoT sensory data tasks. In IoT-Cloud IaaS, resource allocation is the process of distributing optimal resources to execute data request tasks that comprise data filtering operations. Recently, machine learning, non-heuristic, multi-objective and hybrid algorithms have been applied for efficient resource allocation to execute IoT sensory data filtering request tasks in IoT-enabled Cloud IaaS. However, the filtering task is still prone to some challenges. These challenges include global search entrapment of event and error outlier detection as the dimension of the dataset increases in size, the inability to recover missing data for effective redundant data elimination, and local search entrapment that leads to unbalanced workloads on the available resources required for task execution. In this thesis, enhancements of the Non-Parametric Sequence Learning (NPSL), Perceptually Important Point (PIP) and Efficient Energy Resource Ranking-Virtual Machine Selection (ERVS) algorithms are proposed. The Non-Parametric Sequence-based Agglomerative Gaussian Mixture Model (NPSAGMM) technique was initially utilized to improve the detection of event and error outliers in the global space as the dimension of the dataset increases in size. Then, the Perceptually Important Points K-means-enabled Cosine and Manhattan (PIP-KCM) technique was employed to recover missing data and improve the elimination of duplicate sensed data records. Finally, an Efficient Resource Balance Ranking-based Glow-worm Swarm Optimization (ERBV-GSO) technique was used to resolve the local search entrapment for near-optimal solutions and to reduce workload imbalance on the available resources for task execution in the IoT-Cloud IaaS platform. Experiments were carried out using the NetworkX simulator, and the results of the NPSAGMM, PIP-KCM and ERBV-GSO techniques were compared with the NPSL, PIP, ERVS and Resource Fragmentation Aware (RF-Aware) algorithms. The experimental results showed that the proposed NPSAGMM, PIP-KCM, and ERBV-GSO techniques achieved performance improvements of 3.602%/6.74% in Precision, 9.724%/8.77% in Recall, and 5.350%/4.42% in Area Under Curve for the detection of event and error outliers. Furthermore, the results indicated an improvement of 94.273% in F1-score, a 0.143 Reduction Ratio, and a minimum 0.149% Root Mean Squared Error for redundant data elimination, as well as a minimum of 608 Virtual Machine migrations, 47.62% Resource Utilization and a 41.13% load-balancing degree for the allocation of the resources deployed to execute sensory data filtering tasks. Therefore, the proposed techniques have proven effective for improving load balancing in allocating the desired resources to execute efficient outlier (event and error) detection and to eliminate redundant data records in the IoT-based Cloud IaaS infrastructure.
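
    One of the building blocks named above, Perceptually Important Point (PIP) extraction, can be sketched in a few lines. The sketch below is a generic PIP implementation using vertical distance; the thesis's combination with K-means and Cosine/Manhattan measures is not reproduced, and the input series and point count are illustrative assumptions.

```python
# Generic Perceptually Important Point (PIP) extraction for a 1-D series.
import numpy as np

def extract_pips(series, n_points):
    """Iteratively keep the samples that deviate most from the straight line
    joining their nearest already-selected neighbours."""
    x = np.arange(len(series), dtype=float)
    selected = [0, len(series) - 1]                  # start with the two end points
    while len(selected) < n_points:
        best_idx, best_dist = None, -1.0
        pts = sorted(selected)
        for left, right in zip(pts[:-1], pts[1:]):
            for i in range(left + 1, right):
                # vertical distance from the line connecting the segment's end points
                slope = (series[right] - series[left]) / (x[right] - x[left])
                line_y = series[left] + slope * (x[i] - x[left])
                d = abs(series[i] - line_y)
                if d > best_dist:
                    best_idx, best_dist = i, d
        selected.append(best_idx)
    return sorted(selected)

signal = np.sin(np.linspace(0, 6 * np.pi, 200)) + 0.05 * np.random.default_rng(1).normal(size=200)
print(extract_pips(signal, 10))   # indices of the 10 most "important" samples
```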

    Process Modeling in Pyrometallurgical Engineering

    The Special Issue presents almost 40 papers on recent research in the modeling of pyrometallurgical systems, including physical models, first-principles models, detailed CFD and DEM models, as well as statistical models and models based on machine learning. The models cover the whole production chain, from raw materials processing through the reduction and conversion unit processes to ladle treatment, casting, and rolling. The papers illustrate how models can be used to shed light on complex and inaccessible processes characterized by high temperatures and hostile environments, in order to improve process performance, product quality, or yield, to reduce the requirement for virgin raw materials, and to suppress harmful emissions.

    Application of data analytics for predictive maintenance in aerospace: an approach to imbalanced learning.

    The use of aircraft operational logs to predict potential failures that may lead to disruption poses many challenges and has yet to be fully explored. These logs are captured during each flight and contain streamed data from various aircraft subsystems relating to status and warning indicators. They may, therefore, be regarded as complex multivariate time-series data. Given that aircraft are high-integrity assets, failures are extremely rare, and hence the distribution of relevant data containing prior indicators will be highly skewed to the normal (healthy) case. This presents a significant challenge in using data-driven techniques to 'learn' relationships/patterns that depict fault scenarios, since the model will be biased towards the heavily weighted no-fault outcomes. This thesis aims to develop a predictive model for aircraft component failure utilising data from the aircraft central maintenance system (ACMS). The initial objective is to determine the suitability of the ACMS data for predictive maintenance modelling. An exploratory analysis of the data revealed several inherent irregularities, including an extreme data imbalance problem, irregular patterns and trends, class overlapping, and small class disjuncts, all of which are significant drawbacks for traditional machine learning algorithms, resulting in low-performance models. Four novel advanced imbalanced classification techniques are developed to handle the identified data irregularities. The first algorithm focuses on pattern extraction and uses bootstrapping to oversample the minority class; the second algorithm employs a balanced calibrated hybrid ensemble technique to overcome class overlapping and small class disjuncts; the third algorithm uses a derived loss function and a new network architecture to handle extremely imbalanced ratios in deep neural networks; and finally, a deep reinforcement learning approach for imbalanced classification problems in log-based datasets is developed. An ACMS dataset and its accompanying maintenance records were used to validate the proposed algorithms. The research's overall finding indicates that an advanced method for handling extremely imbalanced problems using the log-based ACMS datasets is viable for developing robust data-driven predictive maintenance models for aircraft component failure. When the four implementations were compared, deep reinforcement learning (DRL) strategies, specifically the proposed double deep State-action-reward-state-action with prioritised experience replay memory (DDSARSA+PER), outperformed the other methods in terms of false-positive and false-negative rates for all the components considered. The validation result further suggests that the DDSARSA+PER model is capable of predicting around 90% of aircraft component replacements with a 0.005 false-negative rate in both the A330 and A320 aircraft families studied in this research.
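
    The abstract mentions a derived loss function for extremely imbalanced ratios in deep neural networks but does not specify it. The sketch below shows a generic class-weighted, focal-style loss as a stand-in; the weighting and focusing parameters are assumptions, not the thesis's formulation.

```python
# Class-weighted focal-style loss for extreme class imbalance (stand-in sketch).
import torch
import torch.nn.functional as F

def weighted_focal_loss(logits, targets, pos_weight=50.0, gamma=2.0):
    """Binary focal loss with a positive-class weight: rare (fault) examples
    contribute more, and easy examples are down-weighted by (1 - p_t)^gamma."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = targets * p + (1 - targets) * (1 - p)           # probability of the true class
    alpha_t = targets * pos_weight + (1 - targets) * 1.0  # up-weight the rare (fault) class
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

logits = torch.randn(16)
targets = (torch.rand(16) < 0.1).float()   # roughly 10% positives, stand-in for fault labels
print(weighted_focal_loss(logits, targets))
```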