403 research outputs found

    A Survey of Graph-based Deep Learning for Anomaly Detection in Distributed Systems

    Anomaly detection is a crucial task in complex distributed systems. A thorough understanding of the requirements and challenges of anomaly detection is pivotal to the security of such systems, especially for real-world deployment. While many works and application domains deal with this problem, few have attempted to provide an in-depth look at such systems. In this survey, we explore the potential of graph-based algorithms to identify anomalies in distributed systems. These systems can be heterogeneous or homogeneous, which can result in distinct requirements. One of our objectives is to provide an in-depth look at graph-based approaches and to conceptually analyze their capability to handle real-world challenges such as heterogeneity and dynamic structure. This study gives an overview of state-of-the-art (SotA) research articles in the field and compares and contrasts their characteristics. To facilitate a more comprehensive understanding, we present three systems with varying abstractions as use cases. We examine the specific challenges involved in anomaly detection within such systems. Subsequently, we elucidate the efficacy of graphs in such systems and explicate their advantages. We then delve into the SotA methods and highlight their strengths and weaknesses, pointing out areas for possible improvement and future work. Comment: The first two authors (A. Danesh Pazho and G. Alinezhad Noghre) have equal contribution. The article is accepted by IEEE Transactions on Knowledge and Data Engineering.

    Proceedings of the Deduktionstreffen 2019

    The annual meeting Deduktionstreffen is the prime activity of the Special Interest Group on Deduction Systems (FG DedSys) of the AI Section of the German Society for Informatics (GI-FBKI). It is a meeting with a familiar, friendly atmosphere, where everyone interested in deduction can report on their work in an informal setting

    Rejection-oriented learning without complete class information

    Machine learning is commonly used to support decision-making in numerous, diverse contexts. Its usefulness in this regard is unquestionable: there are complex systems built on top of machine learning techniques whose descriptive and predictive capabilities go far beyond those of human beings. However, these systems still have limitations, and analyzing those limitations makes it possible to estimate their applicability and confidence in various cases. This matters because abstaining from a response is preferable to giving a mistaken one. In the context of classification-like tasks, the indication of such an inconclusive output is called rejection. The research that culminated in this thesis led to the conception, implementation and evaluation of rejection-oriented learning systems for two distinct tasks: open set recognition and data stream clustering. These systems were derived from the WiSARD artificial neural network, which had rejection modelling incorporated into its functioning. This text details and discusses these realizations. It also presents experimental results that allow the scientific and practical importance of the proposed state-of-the-art methodology to be assessed.

    Exploiting GAN as an oversampling method for imbalanced data augmentation with application to the fault diagnosis of an industrial robot

    Machine learning based intelligent fault diagnosis often requires a balanced data set to yield acceptable performance. However, obtaining faulty data from industrial equipment is challenging, often resulting in an imbalance between data acquired in normal conditions and data acquired in the presence of faults. Data augmentation techniques are among the most promising approaches to mitigate this issue. Generative adversarial networks (GAN) are a type of generative model consisting of a generator module and a discriminator. Through adversarial learning between these modules, the optimised generator can produce synthetic patterns that can be used for data augmentation. We investigate whether GAN can be used as an oversampling tool to compensate for an imbalanced data set in an industrial robot fault diagnosis task. A series of experiments is performed to validate the feasibility of this approach. The approach is compared with six scenarios, including the classical oversampling method SMOTE. Results show that GAN outperforms all the compared scenarios. To mitigate two recognised issues in GAN training, i.e., instability and mode collapse, the following is proposed. We propose a generalization of both the mean square error GAN (MSE GAN) and the Wasserstein GAN with gradient penalty (WGAN-GP), referred to as VGAN (the V-matrix based GAN), to mitigate training instability. Also, a novel criterion is proposed to keep track of the most suitable model during training. Experiments on both the MNIST and the industrial robot data sets show that the proposed VGAN outperforms other competitive models. The cycle consistency generative adversarial network (CycleGAN) aims to deal with mode collapse, a condition where the generator yields little to no variability. We investigate the sliced Wasserstein distance (SWD) for CycleGAN. SWD is evaluated in both the unconditional CycleGAN and the conditional CycleGAN, with and without squeeze-and-excitation mechanisms. Again, two data sets are evaluated, i.e., the MNIST and the industrial robot data sets. Results show that SWD has a lower computational cost and outperforms conventional CycleGAN.

    Machine Learning-based Predictive Maintenance for Optical Networks

    Optical networks provide the backbone of modern telecommunications by connecting the world faster than ever before. However, such networks are susceptible to several failures (e.g., optical fiber cuts, malfunctioning optical devices), which might result in degradation in the network operation, massive data loss, and network disruption. It is challenging to accurately and quickly detect and localize such failures due to the complexity of such networks, the time required to identify the fault and pinpoint it using conventional approaches, and the lack of proactive efficient fault management mechanisms. Therefore, it is highly beneficial to perform fault management in optical communication systems in order to reduce the mean time to repair, to meet service level agreements more easily, and to enhance the network reliability. In this thesis, the aforementioned challenges and needs are tackled by investigating the use of machine learning (ML) techniques for implementing efficient proactive fault detection, diagnosis, and localization schemes for optical communication systems. In particular, the adoption of ML methods is explored for the following problems: degradation prediction of semiconductor lasers; lifetime (mean time to failure) prediction of semiconductor lasers; remaining useful life (the length of time a machine is likely to operate before it requires repair or replacement) prediction of semiconductor lasers; optical fiber fault detection, localization, characterization, and identification for different optical network architectures; and anomaly detection in optical fiber monitoring. Such ML approaches outperform the conventionally employed methods for all the investigated use cases by achieving better prediction accuracy and earlier prediction or detection capability.

    Machine Learning in Tribology

    Tribology has been and continues to be one of the most relevant fields, being present in almost all aspects of our lives. The understanding of tribology provides us with solutions for future technical challenges. At the root of all advances made so far are multitudes of precise experiments and an increasing number of advanced computer simulations across different scales and multiple physical disciplines. Based upon this sound and data-rich foundation, advanced data handling, analysis and learning methods can be developed and employed to expand existing knowledge. Therefore, modern machine learning (ML) or artificial intelligence (AI) methods provide opportunities to explore the complex processes in tribological systems and to classify or quantify their behavior in an efficient or even real-time way. Thus, their potential also goes beyond purely academic aspects into actual industrial applications. To help pave the way, this article collection aimed to present the latest research on ML or AI approaches for solving tribology-related issues generating true added value beyond just buzzwords. In this sense, this Special Issue can support researchers in identifying initial selections and best practice solutions for ML in tribology

    BugDoc: Algorithms to Debug Computational Processes

    Data analysis for scientific experiments and enterprises, large-scale simulations, and machine learning tasks all entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous outputs, the pipeline may fail to execute or produce incorrect results. Inferring the root cause(s) of such failures is challenging, usually requiring time and much human thought, while still being error-prone. We propose a new approach that makes use of iteration and provenance to automatically infer the root causes and derive succinct explanations of failures. Through a detailed experimental evaluation, we assess the cost, precision, and recall of our approach compared to the state of the art. Our experimental data and processing software are available for use, reproducibility, and enhancement. Comment: To appear in SIGMOD 2020. arXiv admin note: text overlap with arXiv:2002.0464