Weakly Supervised-Based Oversampling for High Imbalance and High Dimensionality Data Classification
With the abundance of industrial datasets, imbalanced classification has
become a common problem in several application domains. Oversampling is an
effective way to address imbalanced classification. One of the main challenges
for existing oversampling methods is labeling the new synthetic samples
accurately. Inaccurate labels on the synthetic samples distort the
distribution of the dataset and can worsen classification performance.
This paper introduces the idea of weakly supervised learning to handle the
inaccurate labeling of synthetic samples caused by traditional oversampling
methods. Graph semi-supervised SMOTE is developed to improve the credibility of
the synthetic samples' labels. In addition, we propose cost-sensitive
neighborhood components analysis for high-dimensional datasets and a
bootstrap-based ensemble framework for highly imbalanced datasets. The proposed
method achieves good classification performance on 8 synthetic datasets and 3
real-world datasets, especially for high-imbalance and high-dimensionality
problems. Its average performance and robustness are better than those of the
benchmark methods.
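For readers unfamiliar with the oversampling step this abstract builds on, the following is a minimal sketch of plain SMOTE-style interpolation, not the graph semi-supervised SMOTE the paper proposes. The function name `smote_like_oversample` and its parameters are hypothetical, chosen only for illustration. Every synthetic point is implicitly assigned the minority label, which is exactly the labeling assumption the abstract calls into question.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours. Hypothetical helper; requires at least two samples."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)

    # Pairwise Euclidean distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest neighbours of every sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)   # which sample to start from
    pick = rng.integers(0, k, size=n_new)   # which neighbour to move toward
    nb = neighbours[base, pick]
    gap = rng.random((n_new, 1))            # interpolation factor in [0, 1)

    # Each synthetic point lies on the segment between sample and neighbour.
    return X_min[base] + gap * (X_min[nb] - X_min[base])

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_oversample(X_min, n_new=10, k=2)
```

Because interpolation ignores the majority class, a synthetic point can fall in a region dominated by majority samples while still carrying the minority label. That is the kind of inaccurate label a weakly supervised relabeling step is meant to correct.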
Intelligent Condition Monitoring of Industrial Plants: An Overview of Methodologies and Uncertainty Management Strategies
Condition monitoring plays a significant role in the safety and reliability
of modern industrial systems. Artificial intelligence (AI) approaches are
gaining attention from academia and industry as a growing subject in industrial
applications and as a powerful way of identifying faults. This paper provides
an overview of intelligent condition monitoring and fault detection and
diagnosis methods for industrial plants with a focus on the open-source
benchmark Tennessee Eastman Process (TEP). In this survey, the most popular and
state-of-the-art deep learning (DL) and machine learning (ML) algorithms for
industrial plant condition monitoring, fault detection, and diagnosis are
summarized and the advantages and disadvantages of each algorithm are studied.
Challenges such as imbalanced data and unlabelled samples, and how deep learning
models can handle them, are also covered. Finally, the accuracies and
specifications of different algorithms are compared on the TEP benchmark.
This survey will benefit both researchers who are new to the field and experts,
as it covers the literature on condition monitoring and state-of-the-art
methods alongside the challenges and possible solutions to them.