The New Abnormal: Network Anomalies in the AI Era
Anomaly detection aims at finding unexpected patterns in data. It has been applied to several problems in computer networks, from the detection of port scans and DDoS attacks to the monitoring of time series collected by Internet measurement systems. Data-driven approaches and machine learning have also seen widespread application in anomaly detection, a trend accelerated by recent developments in Artificial Intelligence research. This chapter summarizes recent progress in anomaly detection research. In particular, we evaluate how advances in AI algorithms open new possibilities for anomaly detection. We cover new representation learning techniques such as Generative Adversarial Networks and autoencoders, as well as techniques that can improve models learned with machine learning algorithms, such as reinforcement learning. We survey both research works and tools implementing AI algorithms for anomaly detection. We find that these novel algorithms, while successful in other fields, have hardly been applied to networking problems. We conclude the chapter with a case study that illustrates a possible research direction.
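As a concrete illustration of the autoencoder approach this abstract mentions, a minimal sketch could train on normal-only traffic features and flag samples with high reconstruction error. This is a generic instance of the technique, not the chapter's implementation; the feature matrix and dimensions below are hypothetical stand-ins.

```python
# Minimal autoencoder anomaly-detection sketch (PyTorch), assuming a
# hypothetical matrix of normal network-flow features for training.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_features: int, latent_dim: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 16), nn.ReLU(),
            nn.Linear(16, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 16), nn.ReLU(),
            nn.Linear(16, n_features),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def fit(model, X_train, epochs=50, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X_train), X_train)
        loss.backward()
        opt.step()

def anomaly_scores(model, X):
    # Per-sample reconstruction error: high error = likely anomaly,
    # since the model only learned to reconstruct normal traffic.
    with torch.no_grad():
        return ((model(X) - X) ** 2).mean(dim=1)

# Usage with stand-in data (e.g., packet/byte counts, flow duration).
X_train = torch.randn(1000, 8)
model = Autoencoder(n_features=8)
fit(model, X_train)
threshold = anomaly_scores(model, X_train).quantile(0.99)
scores = anomaly_scores(model, torch.randn(20, 8))
print((scores > threshold).nonzero().flatten())  # indices flagged anomalous
```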
Probabilistic Approach to Structural Change Prediction in Evolving Social Networks
We propose a predictive model of structural changes in elementary subgraphs of a social network based on a Mixture of Markov Chains. The model is trained and verified on a dataset from a large corporate social network analyzed in short, one-day-long time windows, and reveals distinctive patterns of evolution of connections at the level of local network topology. We argue that the network investigated at such short timescales is highly dynamic and therefore immune to classic methods of link prediction and structural analysis, and show that, in the case of complex networks, dynamic subgraph mining may lead to better prediction accuracy. The experiments were carried out on the logs of the Wroclaw University of Technology mail server.
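To make the modeling idea concrete, a rough sketch of a Mixture of Markov Chains over subgraph states might look as follows. This is written in the spirit of the abstract, not the authors' code: the state encoding (integer codes for a subgraph's configuration per daily window), the number of mixture components, and the EM details are all our assumptions.

```python
# Sketch: fit K first-order Markov chains as a mixture via EM, where
# each sequence is one subgraph's state trajectory across daily windows.
import numpy as np

def fit_mixture(sequences, n_states, K=3, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)                             # mixture weights
    T = rng.dirichlet(np.ones(n_states), (K, n_states))  # K transition matrices
    for _ in range(iters):
        # E-step: responsibility of each chain for each sequence.
        logw = np.zeros((len(sequences), K))
        for i, seq in enumerate(sequences):
            for k in range(K):
                logw[i, k] = np.log(pi[k]) + sum(
                    np.log(T[k, a, b]) for a, b in zip(seq, seq[1:]))
        w = np.exp(logw - logw.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights and smoothed transition counts.
        pi = w.mean(axis=0)
        counts = np.full((K, n_states, n_states), 1e-3)
        for i, seq in enumerate(sequences):
            for a, b in zip(seq, seq[1:]):
                counts[:, a, b] += w[i]
        T = counts / counts.sum(axis=2, keepdims=True)
    return pi, T, w

def predict_next(seq, w_i, T):
    # Next-state distribution: responsibility-weighted mixture of each
    # chain's transition row for the subgraph's current state.
    return sum(w_i[k] * T[k, seq[-1]] for k in range(len(w_i)))
```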
Review and Analysis of Failure Detection and Prevention Techniques in IT Infrastructure Monitoring
Maintaining the health of IT infrastructure components for improved reliability and availability has been a research and innovation topic for many years. Identifying and handling failures is crucial and challenging due to the complexity of IT infrastructure. System logs are the primary source of information for diagnosing and fixing failures.
In this work, we address three essential research dimensions of failures: the need for failure handling in IT infrastructure, the contribution of system-generated logs to failure detection, and the reactive and proactive approaches used to deal with failure situations.
This study performs a comprehensive analysis of the existing literature along three prominent aspects: log preprocessing, anomaly and failure detection, and failure prevention.
With this coherent review, we (1) establish the need for IT infrastructure monitoring to avoid downtime, (2) examine the three types of approaches to anomaly and failure detection, namely rule-based, correlation-based, and classification-based methods, and (3) formulate recommendations and further research guidelines for researchers.
To the best of the authors' knowledge, this is the first comprehensive literature review on IT infrastructure monitoring techniques. The review has been conducted with the help of a meta-analysis and a comparative study of machine learning and deep learning techniques. This work aims to outline significant research gaps in the area of IT infrastructure failure detection. It will help future researchers understand the advantages and limitations of current methods and select an adequate approach to their problem.
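To illustrate the log-preprocessing-plus-classification pipeline the review surveys, a simplified sketch could mask variable tokens into event templates, count events per window, and train a classifier. The log format, labels, and masking rules below are hypothetical examples, not taken from any surveyed system.

```python
# Sketch of log preprocessing + classification for failure detection,
# assuming hypothetical raw log lines and per-window failure labels.
import re
from collections import Counter
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction import DictVectorizer

def to_template(line: str) -> str:
    # Crude parsing: mask IPs and numbers so similar log lines
    # collapse into one event template.
    line = re.sub(r"\b\d{1,3}(\.\d{1,3}){3}\b", "<IP>", line)
    line = re.sub(r"0x[0-9a-fA-F]+|\b\d+\b", "<NUM>", line)
    return line.strip()

def window_features(lines):
    # Feature vector: event-template counts within one time window.
    return Counter(to_template(l) for l in lines)

# Hypothetical training windows (1 = failure window, 0 = healthy).
windows = [["disk 3 read error at 0xFF10", "retry 1"],
           ["heartbeat ok from 10.0.0.2", "heartbeat ok from 10.0.0.3"]]
labels = [1, 0]

vec = DictVectorizer()
X = vec.fit_transform(window_features(w) for w in windows)
clf = RandomForestClassifier(n_estimators=50).fit(X, labels)
print(clf.predict(vec.transform(window_features(["disk 7 read error at 0xAB"]))))
```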
Leveraging Distributed Tracing and Container Cloning for Replay Debugging of Microservices
Microservice architectures have gained prominence in recent years for building large-scale industrial distributed systems. However, microservice architectures make the use of replay debugging, a powerful technique for finding the root causes of faults, very challenging because of polyglot (written in several languages) services, the large accumulated state of services, and the tight latency limits imposed by long hop chains. This work attempts to provide a framework for enabling replay debugging in production microservice applications. We study 25 real-world faults in microservice systems collected from diverse sources, categorize these faults by fault symptoms, and create 15 application-agnostic mutation operators for microservices. We then propose a language-agnostic replay debugging framework for microservice applications that uses a distributed tracing system to record network requests and enables replay of those requests on cloned service containers running in a debug environment. A key component of this framework is an anomaly detector that uses span-level and container-level monitoring to detect the fault symptoms found in our study and localizes faults to the trace level so that faulty traces can be easily replayed to find the root cause. An open-source microservices application injected successively with the mutation operators is used for an evaluation that shows that our framework is up to an order of magnitude lighter-weight than language-specific recording tools such as Chrome DevTools or VisualVM and can help find the root causes of 9 out of 15 mutations at the line or function level.
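A minimal sketch of the span-level anomaly detection idea described above: build per-operation latency baselines from recorded spans, then flag any trace containing a span beyond mean plus three standard deviations as a replay candidate. The span fields below are hypothetical and do not follow any specific tracing system's schema.

```python
# Sketch: per-operation latency baselines over spans, flagging traces
# with outlier spans as candidates for replay in the debug environment.
import statistics
from collections import defaultdict

def build_baselines(spans):
    # spans: iterable of dicts like
    # {"trace_id": ..., "operation": ..., "duration_ms": ...}
    by_op = defaultdict(list)
    for s in spans:
        by_op[s["operation"]].append(s["duration_ms"])
    return {op: (statistics.mean(d), statistics.pstdev(d))
            for op, d in by_op.items()}

def anomalous_traces(spans, baselines, k=3.0):
    flagged = set()
    for s in spans:
        mean, std = baselines[s["operation"]]
        if s["duration_ms"] > mean + k * std:
            flagged.add(s["trace_id"])  # replay this trace to find the root cause
    return flagged
```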
PerfCE: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis
Debugging performance anomalies in real-world databases is challenging. Causal inference techniques enable qualitative and quantitative root cause analysis of performance degradation. Nevertheless, causality analysis is practically challenging, particularly due to limited observability. Recently, chaos engineering has been applied to test complex real-world software systems. Chaos frameworks like Chaos Mesh mutate a set of chaos variables to inject catastrophic events (e.g., network slowdowns) that "stress" software systems. The systems under chaos stress are then tested using methods like differential testing to check whether they retain their normal functionality (e.g., that SQL query output is always correct under stress). Despite its ubiquity in industry, chaos engineering is currently employed mostly to aid software testing rather than performance debugging.
This paper identifies a novel use of chaos engineering to help developers diagnose performance anomalies in databases. Our framework, PERFCE, comprises an offline phase and an online phase. The offline phase learns statistical models of the target database system, whilst the online phase diagnoses the root cause of monitored performance anomalies on the fly. During the offline phase, PERFCE leverages both passive observations and proactive chaos experiments to construct accurate causal graphs and structural equation models (SEMs). When performance anomalies are observed during the online phase, causal graphs enable qualitative root cause identification (e.g., high CPU usage) and SEMs enable quantitative counterfactual analysis (e.g., determining that "when CPU usage is reduced to 45%, performance returns to normal"). PERFCE notably outperforms prior works on common synthetic datasets, and our evaluation on real-world databases, MySQL and TiDB, shows that PERFCE is highly accurate and moderately expensive.
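To make the SEM-based counterfactual step tangible, an illustrative linear SEM in the style the abstract describes (not PERFCE's actual models) could fit latency as a function of its causal parents, abduct the unit's noise term, and then intervene on CPU usage. The variable names, coefficients, and linearity assumption below are ours.

```python
# Sketch: linear SEM counterfactual, "if CPU usage were reduced to 45%,
# would latency return to normal?" All data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
cpu = rng.uniform(20, 95, 500)             # CPU usage (%) across chaos experiments
io = rng.uniform(0, 100, 500)              # disk I/O pressure
latency = 2.0 * cpu + 0.5 * io + rng.normal(0, 5, 500)  # query latency (ms)

# Fit the structural equation latency = w1*cpu + w2*io + c by least squares.
X = np.column_stack([cpu, io, np.ones_like(cpu)])
w, *_ = np.linalg.lstsq(X, latency, rcond=None)

def counterfactual_latency(cpu_now, io_now, latency_now, cpu_do):
    # Abduction: recover this unit's noise term from the observed anomaly.
    noise = latency_now - (w[0] * cpu_now + w[1] * io_now + w[2])
    # Intervention: do(cpu = cpu_do), holding the noise and other parents fixed.
    return w[0] * cpu_do + w[1] * io_now + w[2] + noise

print(counterfactual_latency(cpu_now=90, io_now=60, latency_now=215, cpu_do=45))
```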