Proactive Anomaly Detection in Large-Scale Cloud-Native Databases
This disclosure describes techniques to identify anomalous patterns in customer workloads from database logs and to enable timely, corrective action that ensures uninterrupted operation of the database. Examples of anomalies include sudden increases (bursts) in the number of error messages written to a log file. An adaptive behavior norm is defined for each message type. Time instances or periods when the gap between messages of a given type in the database log deviates from the expected behavior norm are detected. A deviation from the behavior norm is a potential indicator of database problems. An anomaly detection tool outputs a ranked list of log statements exhibiting spikes of activity, along with their time intervals, that a database administrator (DBA) can examine to take corrective action. By automating anomaly detection, the valuable time of DBAs can be spent acting on issues rather than finding them.
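The gap-based idea in this abstract can be illustrated with a minimal sketch. It is not the disclosure's actual method. The function name `find_bursts`, the running-median norm, and the `window` and `ratio` parameters are all assumptions chosen for illustration. The sketch groups log events by message type, treats the median of recent inter-message gaps as the adaptive norm, and flags gaps that are much shorter than that norm.

```python
from collections import defaultdict
from statistics import median


def find_bursts(events, window=5, ratio=0.2):
    """Flag bursts per message type (illustrative sketch, not the source's tool).

    events: iterable of (timestamp_seconds, message_type), sorted by time.
    window: number of preceding gaps used as the adaptive norm (assumed).
    ratio:  a gap shorter than ratio * norm counts as a burst (assumed).
    Returns (message_type, start, end, score) tuples, highest score first.
    """
    by_type = defaultdict(list)
    for ts, mtype in events:
        by_type[mtype].append(ts)

    findings = []
    for mtype, stamps in by_type.items():
        # gaps[i] is the time between message i and message i + 1
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        for i in range(window, len(gaps)):
            norm = median(gaps[i - window:i])  # norm adapts to recent gaps
            if norm > 0 and gaps[i] < ratio * norm:
                score = norm / max(gaps[i], 1e-9)  # larger = sharper burst
                findings.append((mtype, stamps[i], stamps[i + 1], score))

    # Ranked list: the sharpest deviations from the norm come first
    findings.sort(key=lambda f: f[3], reverse=True)
    return findings


if __name__ == "__main__":
    log = [(t, "INFO heartbeat") for t in range(0, 600, 60)]
    log += [(t, "ERR timeout") for t in (0, 60, 120, 180, 240, 300, 301, 302)]
    for row in find_bursts(sorted(log)):
        print(row)
```

Because the norm is recomputed from the most recent gaps, a long burst gradually becomes the new norm and stops being flagged. Whether that is desirable depends on how the behavior norm is meant to adapt, which the abstract does not specify.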
Next stop 'NoOps': enabling cross-system diagnostics through graph-based composition of logs and metrics
Performing diagnostics in IT systems is an increasingly complicated task, and it is not doable in satisfactory time by even the most skillful operators. Systems and their architecture change very rapidly in response to business and user demand. Many organizations see value in the maintenance and management model of NoOps that stands for No Operations. One of the implementations of this model is a system that is maintained automatically without any human intervention. The path to NoOps involves not only precise and fast diagnostics but also reusing as much knowledge as possible after the system is reconfigured or changed. The biggest challenge is to leverage knowledge on one IT system and reuse this knowledge for diagnostics of another, different system. We propose a framework of weighted graphs which can transfer knowledge, and perform high-quality diagnostics of IT systems. We encode all possible data in a graph representation of a system state and automatically calculate weights of these graphs. Then, thanks to the evaluation of similarity between graphs, we transfer knowledge about failures from one system to another and use it for diagnostics. We successfully evaluate the proposed approach on Spark, Hadoop, Kafka and Cassandra systems. Peer reviewed. Postprint (author's final draft).
Probabilistic Approach to Structural Change Prediction in Evolving Social Networks
We propose a predictive model of structural
changes in elementary subgraphs of a social network based on a
Mixture of Markov Chains. The model is trained and verified
on a dataset from a large corporate social network analyzed
in short, one-day-long time windows, and reveals distinctive
patterns of evolution of connections on the level of local
network topology. We argue that the network investigated in
such short timescales is highly dynamic and therefore immune
to classic methods of link prediction and structural analysis,
and show that in the case of complex networks, the dynamic
subgraph mining may lead to better prediction accuracy. The
experiments were carried out on the logs from the Wroclaw
University of Technology mail server.
Try with Simpler -- An Evaluation of Improved Principal Component Analysis in Log-based Anomaly Detection
The rapid growth of deep learning (DL) has spurred interest in enhancing
log-based anomaly detection. This approach aims to extract meaning from log
events (log message templates) and develop advanced DL models for anomaly
detection. However, these DL methods face challenges like heavy reliance on
training data, labels, and computational resources due to model complexity. In
contrast, traditional machine learning and data mining techniques are less
data-dependent and more efficient but less effective than DL. To make log-based
anomaly detection more practical, the goal is to enhance traditional techniques
to match DL's effectiveness. Previous research in a different domain (linking
questions on Stack Overflow) suggests that optimized traditional techniques can
rival state-of-the-art DL methods. Drawing inspiration from this concept, we
conducted an empirical study. We optimized the unsupervised PCA (Principal
Component Analysis), a traditional technique, by incorporating lightweight
semantic-based log representation. This addresses the issue of unseen log
events in training data, enhancing log representation. Our study compared seven
log-based anomaly detection methods, including four DL-based, two traditional,
and the optimized PCA technique, using public and industrial datasets. Results
indicate that the optimized unsupervised PCA technique achieves similar
effectiveness to advanced supervised/semi-supervised DL methods while being
more stable with limited training data and resource-efficient. This
demonstrates the adaptability and strength of traditional techniques through
small yet impactful adaptations.
Telemetry Fault-Detection Algorithms: Applications for Spacecraft Monitoring and Space Environment Sensing
Algorithms have been developed that identify unusual behavior in satellite health telemetry. Telemetry from solid-state power amplifiers and amplifier thermistors from 32 geostationary Earth orbit communications satellites from 1991 to 2015 is examined. Transient event detection and change-point event detection techniques that use a sliding window-based median are used, statistically evaluating the telemetry stream compared to the local norm. This approach allows application of the algorithms to any spacecraft platform because there is no reliance in the algorithms on satellite- or component-specific parameters, and it does not require a priori knowledge about the data distribution. Individual telemetry data streams are analyzed with the event detection algorithms, resulting in a compiled list of unusual events for each satellite. This approach identifies up to six events that affect 51 of 53 telemetry streams at once, indicative of a spacecraft system-level event. In two satellites, the same top event date (4 December 2008) occurs over more than 10 years of telemetry from both satellites. Of the five spacecraft with known maneuvers, the algorithms identify the maneuvers in all cases. Event dates are compared to known operational activities, space weather events, and available anomaly lists to assess the use of event detection algorithms for spacecraft monitoring and sensing of the space environment. The authors would like to acknowledge the U.S. Air Force Office of Sponsored Research grant FA9550-13-1-0099 and NASA for funding this work through NASA Space Technology and Research Fellowship grant NNX16AM74H.
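A sliding-window median detector of the kind this abstract mentions can be sketched as follows. This is an illustrative assumption, not the authors' algorithm: the function name `transient_events`, the use of the median absolute deviation (MAD) as the local spread, and the `window`, `threshold`, and `min_scale` parameters are all hypothetical. The sketch compares each sample with the median of its surrounding window and flags samples that sit far from that local norm, using no component-specific parameters.

```python
from statistics import median


def transient_events(values, window=11, threshold=5.0, min_scale=1e-9):
    """Return indices of samples far from their local median (sketch only).

    values:    telemetry samples in time order.
    window:    odd number of samples in the centered window (assumed).
    threshold: flag when |value - local median| exceeds threshold * MAD (assumed).
    min_scale: floor for the spread so a flat window does not divide by zero.
    """
    half = window // 2
    flagged = []
    for i, value in enumerate(values):
        # Centered window, truncated at the ends of the stream
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        local = values[lo:hi]

        local_median = median(local)
        # Median absolute deviation: a spread estimate robust to outliers
        mad = median([abs(v - local_median) for v in local])
        scale = max(mad, min_scale)

        if abs(value - local_median) / scale > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    stream = [10.0] * 20
    stream[12] = 25.0  # a single transient spike
    print(transient_events(stream))
```

A median is used rather than a mean so that the spike itself barely shifts the local norm it is compared against. Change-point detection, also named in the abstract, would need a separate comparison of medians before and after each sample and is not covered by this sketch.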
Anomaly detection and event mining in cold forming manufacturing processes
Predictive maintenance is one of the main goals within the Industry 4.0 trend. Advances in data-driven techniques offer new opportunities in terms of cost reduction, improved quality control, and increased work safety. This work applies data-driven techniques to two predictive maintenance tasks, anomaly detection and event prediction, in the real-world use case of a cold forming manufacturing line for consumer lifestyle products, using acoustic emission sensors in proximity to the dies of the press module. The proposed models are robust and able to cope with problems such as noise, missing values, and irregular sampling. The detected anomalies are investigated by experts and confirmed to correspond to deviations in the normal operation of the machine. Moreover, we are able to find patterns which are related to the events of interest.