5 research outputs found
Anomaly Detection in Time Series: Current Focus and Future Challenges
Anomaly detection in time series has become an increasingly vital task, with applications such as fraud detection and intrusion monitoring. Tackling this problem draws on an array of approaches, including statistical analysis, machine learning, and deep learning, and various techniques have been proposed to handle its complexity. However, numerous challenges remain, particularly in processing high-dimensional and complex data streams in real time. This chapter offers insight into cutting-edge models for anomaly detection in time series. Several models are discussed, and their advantages and disadvantages are explored. We also look at the research areas that are the current focus of the field and at how these models and techniques are being applied there to address the problems posed by complex data, high-volume data streams, and the need for real-time processing. These research areas provide concrete examples of applications of the models discussed. Lastly, we identify some of the current issues and suggest future directions for research on anomaly detection systems. We aim to give readers a comprehensive picture of the existing work so that they can better understand the space and are prepared for further development within this growing field.
An active learning framework with a class balancing strategy for time series classification
Training machine learning models for classification tasks often requires labeling numerous
samples, which is costly and time-consuming, especially in time series analysis.
This research investigates Active Learning (AL) strategies to reduce the amount of
labeled data needed for effective time series classification. Traditional AL techniques
cannot control the selection of instances per class for labeling, leading to potential bias
in classification performance and instance selection, particularly in imbalanced time
series datasets. To address this, we propose a novel class-balancing instance selection
algorithm integrated with standard AL strategies. Our approach aims to select more
instances from classes with fewer labeled examples, thereby addressing imbalance in
time series datasets. We demonstrate the effectiveness of our AL framework in selecting
informative data samples for two distinct domains of tactile texture recognition
and industrial fault detection. In robotics, our method achieves high-performance
texture categorization while significantly reducing labeled training data requirements
to 70%. We also evaluate the impact of different sliding window time intervals on
robotic texture classification using AL strategies. In synthetic fiber manufacturing,
we adapt AL techniques to address the challenge of fault classification, aiming to
minimize data annotation cost and time for industries. We also address real-life class
imbalances in the multiclass industrial anomalous dataset using our class-balancing
instance algorithm integrated with AL strategies. Overall, this thesis highlights the
potential of our AL framework across these two distinct domains.
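The abstract does not spell out the class-balancing selection algorithm, so the following is only a minimal sketch of the general idea it describes: uncertainty-based querying in which classes with fewer labeled examples receive a larger share of each query batch. The function name `class_balanced_query`, the entropy score, the inverse-count quotas, and the use of the predicted class as a stand-in for the unknown true class are all assumptions made for illustration, not the author's method.

```python
import numpy as np

def class_balanced_query(probs, labeled_counts, batch_size):
    """Pick `batch_size` unlabeled indices, favouring under-represented classes.

    probs: (n_unlabeled, n_classes) predicted class probabilities.
    labeled_counts: (n_classes,) number of labeled examples per class so far.
    """
    probs = np.asarray(probs, dtype=float)
    counts = np.asarray(labeled_counts, dtype=float)

    # Uncertainty score: predictive entropy (higher = more informative).
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    # True labels are unknown, so the predicted class is used as a proxy
    # (an assumption of this sketch).
    predicted = probs.argmax(axis=1)

    # Per-class quotas, inversely proportional to the current labeled count.
    weights = 1.0 / (counts + 1.0)
    quotas = np.floor(batch_size * weights / weights.sum()).astype(int)

    chosen = []
    for c in np.argsort(counts):  # rarest classes first
        idx = np.flatnonzero(predicted == c)
        idx = idx[np.argsort(-entropy[idx])]  # most uncertain first
        chosen.extend(int(i) for i in idx[:quotas[c]])

    # Fill any remaining slots with the most uncertain leftover samples.
    chosen_set = set(chosen)
    leftovers = [int(i) for i in np.argsort(-entropy) if int(i) not in chosen_set]
    chosen.extend(leftovers[:batch_size - len(chosen)])
    return chosen[:batch_size]
```

In an active learning loop, the returned indices would be sent for annotation, `labeled_counts` updated with the new labels, and the classifier retrained before the next query.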
Exploiting GAN as an oversampling method for imbalanced data augmentation with application to the fault diagnosis of an industrial robot
Machine learning based intelligent fault diagnosis often requires a balanced data set to
yield acceptable performance. However, obtaining faulty data from industrial
equipment is challenging, often resulting in an imbalance between data acquired in
normal conditions and data acquired in the presence of faults. Data augmentation
techniques are among the most promising approaches to mitigate this issue.
Generative adversarial networks (GAN) are a type of generative model consisting
of a generator module and a discriminator. Through adversarial learning between
these modules, the optimised generator can produce synthetic patterns that can be
used for data augmentation.
We investigate whether GAN can be used as an oversampling tool to compensate
for an imbalanced data set in an industrial robot fault diagnosis task. A series of experiments
are performed to validate the feasibility of this approach. The approach is
compared with six scenarios, including the classical oversampling method (SMOTE).
Results show that GAN outperforms all the compared scenarios.
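For context, the SMOTE baseline named above creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that interpolation step only; the function name `smote_like_oversample` and its parameters are hypothetical, and it is not the implementation used in the thesis.

```python
import numpy as np

def smote_like_oversample(minority, n_new, k=5, seed=0):
    """Generate `n_new` synthetic samples by SMOTE-style interpolation.

    minority: (n_samples, n_features) array holding the minority class only.
    """
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)
    rng = np.random.default_rng(seed)

    # Pairwise Euclidean distances; a sample is never its own neighbour.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a base sample and one of its k nearest neighbours at random.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # The new point lies on the segment between the base and its neighbour.
    gap = rng.random((n_new, 1))
    return minority[base] + gap * (minority[picked] - minority[base])
```

Because every synthetic point is a convex combination of two existing samples, this kind of oversampling cannot leave the region spanned by the observed minority data, which is one common motivation for considering a generative model instead.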
To mitigate two recognised issues in GAN training, i.e., instability and mode collapse,
the following is proposed.
We propose a generalization of both the mean square error GAN (MSE GAN) and the Wasserstein
GAN with gradient penalty (WGAN-GP), referred to as VGAN (the V-matrix
based GAN), to mitigate training instability. Also, a novel criterion is proposed to keep
track of the most suitable model during training. Experiments on both the MNIST and the industrial robot data set show that the proposed VGAN outperforms other
competitive models.
The cycle-consistency generative adversarial network (CycleGAN) aims to deal
with mode collapse, a condition in which the generator yields little to no variability.
We investigate the sliced Wasserstein distance (SWD) for CycleGAN. SWD is evaluated
in both the unconditional CycleGAN and the conditional CycleGAN with and
without squeeze-and-excitation mechanisms. Again, two data sets are evaluated, i.e.,
the MNIST and the industrial robot data set. Results show that SWD has a lower computational
cost and outperforms the conventional CycleGAN.
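The abstract does not describe how SWD is computed inside the CycleGAN. As a rough illustration of the quantity itself, the sliced Wasserstein distance is commonly estimated by projecting both sample sets onto random one-dimensional directions and comparing the sorted projections. The function name, the number of projections, and the equal-sample-size restriction below are assumptions of this sketch.

```python
import numpy as np

def sliced_wasserstein_distance(x, y, n_projections=128, p=2, seed=0):
    """Monte Carlo estimate of the sliced p-Wasserstein distance.

    x, y: arrays of shape (n_samples, n_features) with the same shape.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equally sized sample sets")

    rng = np.random.default_rng(seed)
    # Random directions, normalised to lie on the unit sphere.
    directions = rng.normal(size=(x.shape[1], n_projections))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)

    # Project to 1-D and sort: in one dimension the optimal transport plan
    # pairs sorted samples, so no optimisation problem has to be solved.
    x_proj = np.sort(x @ directions, axis=0)
    y_proj = np.sort(y @ directions, axis=0)

    return float(np.mean(np.abs(x_proj - y_proj) ** p) ** (1.0 / p))
```

Each projection costs only a matrix product and a sort, which is consistent with the lower computational cost reported above, although the thesis may obtain it differently.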