163 research outputs found
Exploiting GAN as an oversampling method for imbalanced data augmentation with application to the fault diagnosis of an industrial robot
Machine learning based intelligent fault diagnosis often requires a balanced data set for
yielding an acceptable performance. However, obtaining faulty data from industrial
equipment is challenging, often resulting in an imbalance between data acquired in
normal conditions and data acquired in the presence of faults. Data augmentation
techniques are among the most promising approaches to mitigate such issue.
Generative adversarial networks (GAN) are a type of generative model consisting
of a generator module and a discriminator. Through adversarial learning between
these modules, the optimised generator can produce synthetic patterns that can be
used for data augmentation.
We investigate whether GAN can be used as an oversampling tool to compensate
for an imbalanced data set in an industrial robot fault diagnosis task. A series of experiments
are performed to validate the feasibility of this approach. The approach is
compared with six scenarios, including the classical oversampling method (SMOTE).
Results show that GAN outperforms all the compared scenarios.
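The SMOTE baseline compared against here is easy to sketch. The following is a minimal illustration of the core SMOTE idea (interpolating between a minority-class sample and one of its nearest minority neighbours); the data, parameter values, and function name are ours, not the paper's.

```python
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by interpolating between
    each seed sample and one of its k nearest minority-class neighbours
    (the core SMOTE idea)."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]  # k nearest neighbours per sample
    out = []
    for _ in range(n_new):
        i = rng.integers(n)                     # random seed sample
        j = nn[i, rng.integers(min(k, n - 1))]  # one of its neighbours
        lam = rng.random()                      # interpolation factor
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_syn = smote(X_min, n_new=6, k=2, rng=0)
```

Because each synthetic point is a convex combination of two minority samples, it always lies on a segment between existing minority points.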
To mitigate two recognised issues in GAN training, namely training instability
and mode collapse, the following approaches are proposed.
We propose a generalization of both the mean square error GAN (MSE GAN) and the
Wasserstein GAN with gradient penalty (WGAN-GP), referred to as VGAN (the
V-matrix based GAN), to mitigate training instability. Also, a novel criterion
is proposed to keep track of the most suitable model during training.
Experiments on both the MNIST and the industrial robot data sets show that the
proposed VGAN outperforms other competitive models.
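The gradient-penalty term that WGAN-GP adds to the critic loss can be sketched as follows. This is a simplified illustration, not the proposed VGAN: we assume a linear critic f(x) = w·x, whose input gradient is w everywhere, so the penalty can be computed without automatic differentiation.

```python
import numpy as np

def gradient_penalty(w, x_real, x_fake, rng=None):
    """WGAN-GP penalty sketch for a linear critic f(x) = w @ x: sample
    points on lines between real and fake data and penalise input
    gradient norms that deviate from 1."""
    rng = np.random.default_rng(rng)
    eps = rng.random((len(x_real), 1))
    x_hat = eps * x_real + (1 - eps) * x_fake  # random interpolates
    # for a linear critic the input gradient is w at every interpolate
    grads = np.tile(w, (len(x_hat), 1))
    norms = np.linalg.norm(grads, axis=1)
    return np.mean((norms - 1.0) ** 2)

x_real = np.random.default_rng(0).normal(size=(8, 2))
x_fake = np.random.default_rng(1).normal(2.0, 1.0, size=(8, 2))
gp = gradient_penalty(np.array([3.0, 4.0]), x_real, x_fake, rng=2)  # (5-1)^2 = 16
```

For a nonlinear critic the gradient at each interpolate would be obtained by backpropagation; the penalty structure is otherwise identical.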
Cycle-consistency generative adversarial network (CycleGAN) aims to deal
with mode collapse, a condition where the generator yields little to no variability.
We investigate the sliced Wasserstein distance (SWD) for CycleGAN. SWD is evaluated
in both the unconditional CycleGAN and the conditional CycleGAN with and
without squeeze-and-excitation mechanisms. Again, two data sets are evaluated, i.e.,
the MNIST and the industrial robot data set. Results show that SWD has a lower computational
cost and outperforms the conventional CycleGAN.
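The sliced Wasserstein distance investigated here has a simple Monte-Carlo form: project both samples onto random one-dimensional directions, where optimal transport reduces to matching sorted values. A minimal sketch, assuming equal-sized samples (the data and parameter values are ours):

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=100, rng=None):
    """Approximate the sliced Wasserstein-2 distance between two point
    clouds: project onto random unit directions, where the 1-D optimal
    transport cost reduces to matching sorted projections."""
    rng = np.random.default_rng(rng)
    theta = rng.normal(size=(n_proj, X.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # unit directions
    px = np.sort(X @ theta.T, axis=0)  # sorted 1-D projections
    py = np.sort(Y @ theta.T, axis=0)
    return np.sqrt(np.mean((px - py) ** 2))

rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0, size=(256, 2))
B = rng.normal(0.0, 1.0, size=(256, 2))   # same distribution as A
C = rng.normal(5.0, 1.0, size=(256, 2))   # shifted distribution
d_close = sliced_wasserstein(A, B, rng=1)
d_far = sliced_wasserstein(A, C, rng=1)
```

Each projection costs only a sort, which is why SWD is cheap compared with higher-dimensional optimal-transport estimates.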
A Review of Graph Neural Networks and Their Applications in Power Systems
Deep neural networks have revolutionized many machine learning tasks in power
systems, ranging from pattern recognition to signal processing. The data in
these tasks is typically represented in Euclidean domains. Nevertheless, there
is an increasing number of applications in power systems, where data are
collected from non-Euclidean domains and represented as graph-structured data
with high dimensional features and interdependency among nodes. The complexity
of graph-structured data has brought significant challenges to the existing
deep neural networks defined in Euclidean domains. Recently, many publications
generalizing deep neural networks for graph-structured data in power systems
have emerged. In this paper, a comprehensive overview of graph neural networks
(GNNs) in power systems is proposed. Specifically, several classical paradigms
of GNNs structures (e.g., graph convolutional networks) are summarized, and key
applications in power systems, such as fault scenario application, time series
prediction, power flow calculation, and data generation are reviewed in detail.
Furthermore, the main issues and research trends in the application of
GNNs in power systems are discussed.
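As an illustration of the graph convolutional networks summarised in the review, a single GCN layer in the common Kipf-and-Welling form can be sketched as follows; the toy adjacency matrix and dimensions are our own assumptions.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolutional layer:
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric normalisation
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return np.maximum(A_norm @ H @ W, 0.0)  # ReLU

# a hypothetical 4-bus grid as an adjacency matrix, 3 features per node
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
H = np.random.default_rng(0).normal(size=(4, 3))
W = np.random.default_rng(1).normal(size=(3, 2))
H1 = gcn_layer(A, H, W)
```

Each output row mixes a node's features with those of its neighbours, which is what lets GNNs exploit the interdependency among nodes that Euclidean architectures ignore.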
Domain knowledge-informed Synthetic fault sample generation with Health Data Map for cross-domain Planetary Gearbox Fault Diagnosis
Extensive research has been conducted on fault diagnosis of planetary
gearboxes using vibration signals and deep learning (DL) approaches. However,
DL-based methods are susceptible to the domain shift problem caused by varying
operating conditions of the gearbox. Although domain adaptation and data
synthesis methods have been proposed to overcome such domain shifts, they are
often not directly applicable in real-world situations where only healthy data
is available in the target domain. To tackle the challenge of extreme domain
shift scenarios where only healthy data is available in the target domain, this
paper proposes two novel domain knowledge-informed data synthesis methods
utilizing the health data map (HDMap). The two proposed approaches are referred
to as scaled CutPaste and FaultPaste. The HDMap is used to physically represent
the vibration signal of the planetary gearbox as an image-like matrix, allowing
for visualization of fault-related features. CutPaste and FaultPaste are then
applied to generate faulty samples based on the healthy data in the target
domain, using domain knowledge and fault signatures extracted from the source
domain, respectively. In addition to generating realistic faults, the proposed
methods introduce scaling of fault signatures for controlled synthesis of
faults with various severity levels. A case study is conducted on a planetary
gearbox testbed to evaluate the proposed approaches. The results show that the
proposed methods are capable of accurately diagnosing faults, even in cases of
extreme domain shift, and can estimate the severity of faults that have not
been previously observed in the target domain.
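The cut-and-paste idea behind the proposed scaled CutPaste / FaultPaste methods can be illustrated very roughly: insert a scaled fault-signature patch into a healthy image-like matrix. This is a much-simplified sketch under our own assumptions, not the paper's HDMap pipeline.

```python
import numpy as np

def paste_fault(healthy, signature, row, col, scale=1.0):
    """Paste a (scaled) fault-signature patch into a healthy
    image-like matrix -- a much-simplified sketch of the
    cut-and-paste idea behind scaled CutPaste / FaultPaste."""
    out = healthy.copy()
    h, w = signature.shape
    out[row:row + h, col:col + w] += scale * signature  # additive fault
    return out

healthy = np.zeros((8, 8))                 # toy stand-in for an HDMap
signature = np.ones((2, 3))                # hypothetical fault signature
mild = paste_fault(healthy, signature, 2, 2, scale=0.5)
severe = paste_fault(healthy, signature, 2, 2, scale=2.0)
```

Varying `scale` mimics the controlled synthesis of faults with different severity levels described in the abstract.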
How to Do Machine Learning with Small Data? -- A Review from an Industrial Perspective
Artificial intelligence has experienced a technological breakthrough in science,
industry, and everyday life in the last few decades. The advancements can be
credited to the ever-increasing availability and miniaturization of
computational resources that resulted in exponential data growth. However,
because of the insufficient amount of data in some cases, employing machine
learning in solving complex tasks is not straightforward or even possible. As a
result, machine learning with small data experiences rising importance in data
science and application in several fields. The authors focus on interpreting
the general term "small data" and its role in engineering and industrial
applications. They give a brief overview of the most important industrial
applications of machine learning and small data. Small data is defined in terms
of various characteristics compared to big data, and a machine learning
formalism is introduced. Five critical challenges of machine learning with
small data in industrial applications are presented: unlabeled data, imbalanced
data, missing data, insufficient data, and rare events. Based on those
definitions, an overview of the considerations in domain representation and
data acquisition is given along with a taxonomy of machine learning approaches
in the context of small data.
Adversarial Attacks on Machine Learning Cybersecurity Defences in Industrial Control Systems
The proliferation and application of machine learning based Intrusion
Detection Systems (IDS) have allowed for more flexibility and efficiency in the
automated detection of cyber attacks in Industrial Control Systems (ICS).
However, the introduction of such IDSs has also created an additional attack
vector; the learning models may also be subject to cyber attacks, otherwise
referred to as Adversarial Machine Learning (AML). Such attacks may have severe
consequences in ICS, as adversaries could potentially bypass the IDS.
This could lead to delayed attack detection which may result in infrastructure
damages, financial loss, and even loss of life. This paper explores how
adversarial learning can be used to target supervised models by generating
adversarial samples using the Jacobian-based Saliency Map attack and exploring
classification behaviours. The analysis also includes the exploration of how
such samples can support the robustness of supervised models using adversarial
training. An authentic power system dataset was used to support the experiments
presented herein. Overall, the classification performance of two widely used
classifiers, Random Forest and J48, decreased by 16 and 20 percentage points
when adversarial samples were present. Their performances improved following
adversarial training, demonstrating their robustness towards such attacks.
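The Jacobian-based saliency idea can be illustrated on a toy victim model. The sketch below is a greedy saliency-style perturbation of a logistic-regression classifier, a much-simplified stand-in for the full JSMA used in the paper; the weights and sample are our own assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def greedy_saliency_attack(x, w, b, n_features=2, eps=1.0):
    """Greedy saliency-style attack on a logistic-regression victim
    (a much-simplified stand-in for JSMA): perturb the features whose
    gradient pushes the score hardest towards the opposite class."""
    x_adv = x.copy()
    # gradient direction that moves the score toward the wrong class
    grad = w * (1 if sigmoid(w @ x + b) < 0.5 else -1)
    # pick the most salient features (largest gradient magnitude)
    idx = np.argsort(-np.abs(grad))[:n_features]
    x_adv[idx] += eps * np.sign(grad[idx])
    return x_adv

w = np.array([2.0, -0.5, 0.1]); b = -1.0   # hypothetical trained weights
x = np.array([1.0, 0.0, 0.0])              # originally classified positive
x_adv = greedy_saliency_attack(x, w, b, n_features=1, eps=1.0)
```

Only the single most salient feature is modified, yet the prediction flips, which is the economy-of-perturbation property that makes saliency-map attacks effective.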
Artificial Intelligence-Based Methods for Power System Security Assessment with Limited Dataset
This thesis concerns the relationship between the load, load model, and power system stability. It investigates the possibility of developing a dynamic load model to represent the power system load characteristic during system faults when the power system operates at a high percentage of the power generation from wind farms, solar power, and vehicle-to-grid technology. Additionally, with artificial intelligence supporting the seamless integration of an increasingly distributed and multi-directional power system to unlock the vast potential of renewables, new approaches are proposed to improve the training performance for the applications of artificial neural networks in non-intrusive load monitoring and dynamic security assessment.
An improved hybrid load model is proposed to represent the load characteristics in the above power system operation. Genetic algorithms and the multi-curve identification method are applied to determine the parameters of the load model, aiming to minimize the error between the estimated and measured values. The results indicate that the proposed hybrid load model has a reasonably low fitting error to represent the load dynamics.
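A genetic algorithm for parameter identification of this kind can be sketched generically: evolve candidate parameter vectors to minimise the squared error between model output and measurements. The load model, bounds, and operator choices below are our own illustrative assumptions, not the thesis's configuration.

```python
import numpy as np

def genetic_fit(model, t, measured, bounds, pop=40, gens=60, rng=None):
    """Minimal genetic-algorithm sketch: tournament selection, blend
    crossover, Gaussian mutation, and elitism, minimising the squared
    error between model output and measurements."""
    rng = np.random.default_rng(rng)
    lo, hi = bounds
    P = rng.uniform(lo, hi, size=(pop, len(lo)))
    def err(p):
        return np.sum((model(t, p) - measured) ** 2)
    for _ in range(gens):
        f = np.array([err(p) for p in P])
        # tournament selection: keep the better of random pairs
        i, j = rng.integers(pop, size=(2, pop))
        parents = np.where((f[i] < f[j])[:, None], P[i], P[j])
        # blend crossover + Gaussian mutation
        lam = rng.random((pop, 1))
        children = lam * parents + (1 - lam) * parents[rng.permutation(pop)]
        children += rng.normal(0.0, 0.05, children.shape)
        P = np.clip(children, lo, hi)
        # elitism: carry the best parent forward unmutated
        P[0] = parents[np.argmin([err(p) for p in parents])]
    return min(P, key=err)

# hypothetical two-parameter load response y = a * exp(-b * t)
t = np.linspace(0.0, 1.0, 50)
measured = 2.0 * np.exp(-3.0 * t)
best = genetic_fit(lambda t, p: p[0] * np.exp(-p[1] * t), t, measured,
                   bounds=(np.array([0.0, 0.0]), np.array([5.0, 5.0])), rng=0)
```

In a real load-model study the fitness would compare simulated and measured dynamic responses across multiple disturbance curves, as in the multi-curve identification mentioned above.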
In addition, new approaches are proposed to tackle the challenges posed by limited data when training artificial neural networks (ANNs) for their application in power systems. The knowledge transfer approach is utilized to support the ANN training to generate synthetic data for non-intrusive load monitoring. The results indicate that this approach improves the issue of mode collapse and reduces the need for lengthy training iterations, making the ANN effective for generating synthetic data from limited data. Moreover, the knowledge transfer approach also supports ANN training with limited data for dynamic security assessment. Kernel principal component analysis is employed to eliminate the dimensionality reduction step. The results indicate an improvement in the training performance
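Kernel principal component analysis, as mentioned above, can be sketched in a few lines: build a kernel matrix, centre it in feature space, and project onto the leading eigenvectors. The RBF kernel and parameter values are our own illustrative choices, not the thesis's settings.

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    """Kernel PCA with an RBF kernel: build the kernel matrix, centre it
    in feature space, and project onto the top eigenvectors."""
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                           # double centring
    vals, vecs = np.linalg.eigh(Kc)          # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas                       # projections of the training set

X = np.random.default_rng(0).normal(size=(30, 5))
Z = kernel_pca(X, n_components=2, gamma=0.1)
```

Because the nonlinear mapping is implicit in the kernel, no separate explicit feature transformation is needed before the projection step.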
Realistic adversarial machine learning to improve network intrusion detection
Modern organizations can significantly benefit from the use of Artificial Intelligence (AI), and more specifically Machine Learning (ML), to tackle the growing number and increasing sophistication of cyber-attacks targeting their business processes. However, there are several technological and ethical challenges that undermine the trustworthiness of AI. One of the main challenges is the lack of robustness, which is an essential property to ensure that ML is used in a secure way. Improving robustness is no easy task because ML is inherently susceptible to adversarial examples: data samples with subtle perturbations that cause unexpected behaviors in ML models. ML engineers and security practitioners still lack the knowledge and tools to prevent such disruptions, so adversarial examples pose a major threat to ML and to the intelligent Network Intrusion Detection (NID) systems that rely on it. This thesis presents a methodology for a trustworthy adversarial robustness analysis of multiple ML models, and an intelligent method for the generation of realistic adversarial examples in complex tabular data domains like the NID domain: Adaptative Perturbation Pattern Method (A2PM). It is demonstrated that a successful adversarial attack is not guaranteed to be a successful cyber-attack, and that adversarial data perturbations can only be realistic if they are simultaneously valid and coherent, complying with the domain constraints of a real communication network and the class-specific constraints of a certain cyber-attack class. A2PM can be used for adversarial attacks, to iteratively cause misclassifications, and adversarial training, to perform data augmentation with slightly perturbed data samples. Two case studies were conducted to evaluate its suitability for the NID domain. The first verified that the generated perturbations preserved both validity and coherence in Enterprise and Internet-of-Things (IoT) network scenarios, achieving realism.
The second verified that adversarial training with simple perturbations enables the models to retain a good generalization to regular IoT network traffic flows, in addition to being more robust to adversarial examples. The key takeaway of this thesis is: ML models can be incredibly valuable to improve a cybersecurity system, but their own vulnerabilities must not be disregarded. It is essential to continue the research efforts to improve the security and trustworthiness of ML and of the intelligent systems that rely on it.
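The validity-and-coherence requirement for realistic adversarial examples can be illustrated with a toy constrained perturbation: perturb a sample, then enforce domain constraints such as feature ranges and integrality. This is only a sketch of the general idea under our own assumptions, not the actual A2PM algorithm; the feature names and bounds are hypothetical.

```python
import numpy as np

def constrained_perturb(x, bounds, int_mask, eps=0.1, rng=None):
    """Illustrative sketch of the validity idea behind realistic
    adversarial examples (not the actual A2PM algorithm): apply a small
    random perturbation, then enforce domain constraints so the sample
    stays a plausible network-traffic record."""
    rng = np.random.default_rng(rng)
    x_adv = x + rng.uniform(-eps, eps, size=x.shape)
    lo, hi = bounds
    x_adv = np.clip(x_adv, lo, hi)               # valid feature ranges
    x_adv[int_mask] = np.round(x_adv[int_mask])  # integer features stay integral
    return x_adv

# hypothetical flow features: [duration, packet_count, mean_pkt_size]
x = np.array([1.5, 10.0, 512.0])
bounds = (np.array([0.0, 1.0, 0.0]), np.array([60.0, 1e6, 1500.0]))
int_mask = np.array([False, True, False])  # packet_count must be an integer
x_adv = constrained_perturb(x, bounds, int_mask, eps=0.5, rng=0)
```

A perturbation that violated these constraints might still fool a classifier, but it could not correspond to real network traffic, which is the distinction between an adversarial example and a realistic one.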
- …