Data-driven root-cause analysis for distributed system anomalies
Modern distributed cyber-physical systems encounter a wide variety of anomalies and, in many cases, are vulnerable to catastrophic fault-propagation scenarios because of the strong connectivity among their sub-systems. Root-cause analysis therefore becomes highly intractable due to complex fault-propagation mechanisms in combination with diverse operating modes. This paper
presents a new data-driven framework for root-cause analysis for addressing
such issues. The framework is based on a spatiotemporal feature extraction
scheme for distributed cyber-physical systems built on the concept of symbolic
dynamics for discovering and representing causal interactions among subsystems
of a complex system. We present two approaches for root-cause analysis: sequential state switching, based on the free-energy concept of a restricted Boltzmann machine (RBM), and artificial anomaly association, a multi-class classification framework using deep neural networks (DNNs).
Synthetic data from cases with failed pattern(s) and anomalous node(s) are simulated to validate the proposed approaches, and their performance is compared with that of vector autoregressive (VAR) model-based root-cause analysis. A real dataset based on the Tennessee Eastman process (TEP) is also used for validation. The results show that: (1) both approaches obtain high accuracy in root-cause analysis and successfully handle multiple nominal operating modes, and (2) the proposed tool-chain is scalable while maintaining high accuracy. Comment: 6 pages, 3 figures
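The free-energy scoring behind the sequential-state-switching idea can be sketched in a few lines. This is a hedged illustration, not the paper's algorithm: the RBM parameters (`W`, `b_vis`, `b_hid`) and a binary encoding of sub-system states are assumed, and the root-cause ranking simply flips one node's symbol at a time to see which flip most lowers the free energy of the observed pattern.

```python
import math

def free_energy(v, W, b_vis, b_hid):
    """Free energy of a binary RBM:
    F(v) = -sum_i a_i*v_i - sum_j log(1 + exp(b_j + sum_i W[i][j]*v_i)).
    Low free energy means high probability under the learned nominal model."""
    visible = -sum(a * x for a, x in zip(b_vis, v))
    hidden = 0.0
    for j, bj in enumerate(b_hid):
        pre = bj + sum(W[i][j] * v[i] for i in range(len(v)))
        hidden -= math.log1p(math.exp(pre))
    return visible + hidden

def rank_root_causes(v, W, b_vis, b_hid):
    """Rank nodes by how much switching their state lowers the free energy:
    a large drop suggests that node's state is what makes v anomalous."""
    base = free_energy(v, W, b_vis, b_hid)
    drop = {}
    for i in range(len(v)):
        flipped = list(v)
        flipped[i] = 1 - flipped[i]
        drop[i] = base - free_energy(flipped, W, b_vis, b_hid)
    return sorted(drop, key=drop.get, reverse=True)
```

In this toy form, the first entry of the returned list is the node whose single-state switch best restores the pattern toward nominal behavior.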
Root-cause Analysis for Time-series Anomalies via Spatiotemporal Graphical Modeling in Distributed Complex Systems
Performance monitoring, anomaly detection, and root-cause analysis in complex
cyber-physical systems (CPSs) are often highly intractable due to widely
diverse operational modes, disparate data types, and complex fault propagation
mechanisms. This paper presents a new data-driven framework for root-cause
analysis, based on a spatiotemporal graphical modeling approach built on the
concept of symbolic dynamics for discovering and representing causal
interactions among sub-systems of complex CPSs. We formulate the root-cause
analysis problem as a minimization problem via the proposed inference based
metric, and present two approximate approaches for root-cause analysis: sequential state switching, based on the free-energy concept of a restricted Boltzmann machine (RBM), and artificial anomaly association, a classification framework using deep neural networks (DNNs). Synthetic data from
cases with failed pattern(s) and anomalous node(s) are simulated to validate the proposed approaches. A real dataset based on the Tennessee Eastman process (TEP) is also used for comparison with other approaches. The results show that: (1) both approaches obtain high accuracy in root-cause analysis under both pattern-based and node-based fault scenarios, in addition to successfully handling multiple nominal operating modes, (2) the proposed tool-chain is scalable while maintaining high accuracy, and (3) the proposed framework is robust and adaptive under different fault conditions and performs better than state-of-the-art methods. Comment: 42 pages, 5 figures. arXiv admin note: text overlap with
arXiv:1605.0642
Finding Likely Errors with Bayesian Specifications
We present a Bayesian framework for learning probabilistic specifications
from large, unstructured code corpora, and a method to use this framework to
statically detect anomalous, hence likely buggy, program behavior. The
distinctive insight here is to build a statistical model that correlates all
specifications hidden inside a corpus with the syntax and observed behavior of
programs that implement these specifications. During the analysis of a
particular program, this model is conditioned into a posterior distribution
that prioritizes specifications that are relevant to this program. This allows
accurate program analysis even if the corpus is highly heterogeneous. The
problem of finding anomalies is now framed quantitatively, as a problem of
computing a distance between a "reference distribution" over program behaviors
that our model expects from the program, and the distribution over behaviors
that the program actually produces.
We present a concrete embodiment of our framework that combines a topic model
and a neural network model to learn specifications, and queries the learned
models to compute anomaly scores. We evaluate this implementation on the task
of detecting anomalous usage of Android APIs. Our encouraging experimental
results show that the method can automatically discover subtle errors in
Android applications in the wild, and has high precision and recall compared to
competing probabilistic approaches.
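The anomaly score described above reduces to a distance between two distributions over program behaviors. As a hedged illustration only (the paper's exact distance and behavior encoding are not specified here), one standard choice of such a distance over a shared discrete set of behaviors is the KL divergence:

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions on the same support.
    Here p would be the observed behavior distribution of the program
    under analysis and q the model's reference distribution; a large
    value flags behavior the model did not expect."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)
```

The divergence is zero when the program behaves exactly as the conditioned model predicts and grows as the observed behavior drifts from the reference.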
Towards Malware Detection via CPU Power Consumption: Data Collection Design and Analytics (Extended Version)
This paper presents an experimental design and data analytics approach aimed
at power-based malware detection on general-purpose computers. Leveraging the
fact that malware executions must consume power, we explore the postulate that
malware can be accurately detected via power data analytics. Our experimental
design and implementation allow for programmatic collection of CPU power
profiles for fixed tasks during uninfected and infected states using five
different rootkits. To characterize the power consumption profiles, we use both
simple statistical and novel, sophisticated features. We test a one-class
anomaly detection ensemble (that baselines non-infected power profiles) and
several kernel-based SVM classifiers (that train on both uninfected and
infected profiles) in detecting previously unseen malware and clean profiles.
The anomaly detection system exhibits perfect detection when using all features
and tasks, with smaller false detection rate than the supervised classifiers.
The primary contribution is a proof of concept that baselining the power consumption of fixed tasks can provide accurate detection of rootkits. Moreover, our treatment details the engineering hurdles involved in experimentation and allows analysis of each statistical feature individually. This work appears to be a first step toward a viable power-based detection capability for general-purpose
computers, and presents next steps toward this goal. Comment: Published version appearing in IEEE TrustCom-18. This version contains more details on mathematics and data collection.
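A minimal sketch of the baselining idea, assuming each power profile has already been reduced to a fixed-length feature vector (the paper's actual features and one-class ensemble are richer than this illustrative z-score model):

```python
import statistics

def fit_baseline(clean_profiles):
    """Per-feature mean and standard deviation estimated from power
    profiles recorded in the uninfected state."""
    features = list(zip(*clean_profiles))
    return [(statistics.mean(f), statistics.stdev(f)) for f in features]

def anomaly_score(profile, baseline):
    """Largest absolute z-score across features; a new profile whose
    power statistics sit far from the clean baseline scores high."""
    return max(abs(x - m) / (s or 1.0)
               for x, (m, s) in zip(profile, baseline))
```

A profile collected during an infected run would then be flagged whenever its score exceeds a threshold tuned on clean data only, which is the one-class setting the abstract describes.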
MCODE: Multivariate Conditional Outlier Detection
Outlier detection aims to identify unusual data instances that deviate from
expected patterns. The task is particularly challenging when outliers are context-dependent and defined by unusual combinations of multiple outcome-variable values. In this paper, we develop and
study a new conditional outlier detection approach for multivariate outcome
spaces that works by (1) transforming the conditional detection to the outlier
detection problem in a new (unconditional) space and (2) defining outlier
scores by analyzing the data in the new space. Our approach relies on the
classifier chain decomposition of the multi-dimensional classification problem
that lets us transform the output space into a probability vector, one
probability for each dimension of the output space. Outlier scores applied to
these transformed vectors are then used to detect the outliers. Experiments on
multiple multi-dimensional classification problems with different outlier-injection rates show that our methodology is robust and successfully identifies outliers whether they are sparse (manifested in one or very few dimensions) or dense (affecting multiple dimensions).
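The chain transformation can be sketched as follows; the per-dimension models (`chain_models[d]`) are placeholder callables standing in for the trained classifiers in the chain, and the negative log-likelihood score is one simple choice of outlier score in the transformed space:

```python
import math

def chain_probability_vector(x, y, chain_models):
    """Transform a multivariate outcome y into a probability vector:
    entry d is P(y[d] = observed value | x, y[0..d-1]) as estimated
    by the d-th classifier in the chain."""
    return [chain_models[d](x, y[:d], y[d]) for d in range(len(y))]

def outlier_score(prob_vector, eps=1e-12):
    """Score in the new (unconditional) space: the negative
    log-likelihood of the observed outcome combination."""
    return -sum(math.log(max(p, eps)) for p in prob_vector)
```

An outcome vector made of individually plausible but jointly unusual values receives low probabilities somewhere along the chain and hence a high score.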
Variational Inference for On-line Anomaly Detection in High-Dimensional Time Series
Approximate variational inference has been shown to be a powerful tool for
modeling unknown complex probability distributions. Recent advances in the
field allow us to learn probabilistic models of sequences that actively exploit
spatial and temporal structure. We apply a Stochastic Recurrent Network (STORN)
to learn robot time series data. Our evaluation demonstrates that we can
robustly detect anomalies both off- and on-line. Comment: Accepted as a workshop paper at ICLR 2016; accepted as a workshop paper for the anomaly detection workshop at ICML 201
Extended Report: Fine-grained Recognition of Abnormal Behaviors for Early Detection of Mild Cognitive Impairment
According to the World Health Organization, the proportion of people aged 60 or older is growing faster than any other age group in almost every country, and this trend is not going to change in the near future. Since senior citizens are at high risk of non-communicable diseases requiring long-term care, this trend will challenge the sustainability of the entire health system. Pervasive computing can provide innovative methods and tools for early detection of the onset of health issues. In this paper we propose a novel method to detect
abnormal behaviors of elderly people living at home. The method relies on
medical models, provided by cognitive neuroscience researchers, describing
abnormal activity routines that may indicate the onset of early symptoms of
mild cognitive impairment. A non-intrusive sensor-based infrastructure acquires
low-level data about the interaction of the individual with home appliances and
furniture, as well as data from environmental sensors. Based on those data, a
novel hybrid statistical-symbolic technique is used to detect the abnormal
behaviors of the patient, which are communicated to the medical center.
Unlike related works, our method can detect abnormal behaviors at a fine-grained level, thus providing an important tool to support medical diagnosis. To evaluate our method, we developed a prototype of the system and acquired a large dataset of abnormal behaviors carried out in an instrumented smart home. Experimental results show that our technique is able to detect most anomalies while generating a small number of false positives.
Setting the threshold for high throughput detectors: A mathematical approach for ensembles of dynamic, heterogeneous, probabilistic anomaly detectors
Anomaly detection (AD) has garnered ample attention in security research, as
such algorithms complement existing signature-based methods but promise
detection of never-before-seen attacks. Cyber operations manage a high volume
of heterogeneous log data; hence, AD in such operations involves multiple
(e.g., per IP, per data type) ensembles of detectors modeling heterogeneous
characteristics (e.g., rate, size, type) often with adaptive online models
producing alerts in near real time. Because of the high data volume, setting the threshold for each detector in such a system is an essential yet underdeveloped configuration issue; even a slight mistuning can render the system useless, either flooding downstream systems with a myriad of alerts or producing none at all. In this work, we build on the foundations of Ferragut et al. to provide a
set of rigorous results for understanding the relationship between threshold
values and alert quantities, and we propose an algorithm for setting the
threshold in practice. Specifically, we give an algorithm for setting the
threshold of multiple, heterogeneous, possibly dynamic detectors completely a
priori, in principle. Indeed, if the underlying distribution of the incoming
data is known (closely estimated), the algorithm provides provably manageable
thresholds. If the distribution is unknown (e.g., has changed over time) our
analysis reveals how the model distribution differs from the actual
distribution, indicating a period of model refitting is necessary. We provide
empirical experiments showing the efficacy of the capability by regulating the
alert rate of a system with 2,500 adaptive detectors scoring over 1.5M
events in 5 hours. Further, we demonstrate the alternative case on real network data using the detection framework of Harshaw et al., showing how an inability to regulate alerts indicates that the detection model is a bad fit to the data. Comment: 11 pages, 5 figures. Proceedings of IEEE Big Data Conference, 201
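If the score distribution is known or closely estimated, the threshold for a target alert rate is just a quantile of that distribution. A hedged sketch using an empirical quantile (the paper's treatment of dynamic, heterogeneous detectors is considerably more careful than this):

```python
def threshold_for_alert_rate(scores, target_rate):
    """Choose a threshold so that roughly target_rate of scores drawn
    from the (estimated) nominal distribution exceed it, bounding the
    expected number of alerts a priori."""
    ranked = sorted(scores)
    k = int((1.0 - target_rate) * len(ranked))
    k = min(max(k, 0), len(ranked) - 1)
    return ranked[k]
```

If live data then produces far more alerts than `target_rate` predicts, the model distribution no longer matches the actual one, which is exactly the refit signal the abstract describes.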
Review of Smart Meter Data Analytics: Applications, Methodologies, and Challenges
The widespread popularity of smart meters enables an immense amount of
fine-grained electricity consumption data to be collected. Meanwhile, the
deregulation of the power industry, particularly on the delivery side, has
continuously been moving forward worldwide. How to employ massive smart meter
data to promote and enhance the efficiency and sustainability of the power grid
is a pressing issue. To date, substantial works have been conducted on smart
meter data analytics. To provide a comprehensive overview of the current
research and to identify challenges for future research, this paper conducts an
application-oriented review of smart meter data analytics. Following the three
stages of analytics, namely, descriptive, predictive and prescriptive
analytics, we identify the key application areas as load analysis, load
forecasting, and load management. We also review the techniques and
methodologies adopted or developed to address each application. In addition, we discuss several research trends, such as big data issues, novel machine learning technologies, new business models, the transition of energy systems, and data privacy and security. Comment: IEEE Transactions on Smart Grid, 201
A Survey of Stealth Malware: Attacks, Mitigation Measures, and Steps Toward Autonomous Open World Solutions
As our professional, social, and financial existences become increasingly
digitized and as our government, healthcare, and military infrastructures rely
more on computer technologies, they present larger and more lucrative targets
for malware. Stealth malware in particular poses an increased threat because it
is specifically designed to evade detection mechanisms, spreading dormant in the wild for extended periods of time, gathering sensitive information, or positioning itself for a high-impact zero-day attack. Policing the growing
attack surface requires the development of efficient anti-malware solutions
with improved generalization to detect novel types of malware and resolve these
occurrences with as little burden on human experts as possible. In this paper,
we survey malicious stealth technologies as well as existing solutions for
detecting and categorizing these countermeasures autonomously. While machine
learning offers promising potential for increasingly autonomous solutions with
improved generalization to new malware types, both at the network level and at
the host level, our findings suggest that several flawed assumptions inherent
to most recognition algorithms prevent a direct mapping between the stealth
malware recognition problem and a machine learning solution. The most notable
of these flawed assumptions is the closed world assumption: that no sample
belonging to a class outside of a static training set will appear at query
time. We present a formalized adaptive open world framework for stealth malware
recognition and relate it mathematically to research from other machine
learning domains. Comment: Preprint of a manuscript accepted to IEEE Communications Surveys and Tutorials (COMST) on December 1, 201
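The closed-world assumption the survey criticizes can be made concrete with a toy recognizer. This is an illustrative sketch, not the survey's formal open-world framework: a closed-set classifier would always return the argmax class, whereas an open-world recognizer rejects low-confidence queries as a potential novel class.

```python
def open_world_predict(scores, threshold):
    """scores: dict mapping known class -> confidence from a closed-set
    recognizer. Under the closed-world assumption we would always return
    the argmax; rejecting below a confidence threshold instead leaves
    room for samples (e.g., novel stealth malware) outside the
    static training set."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"
```

Choosing the rejection threshold is itself a modeling decision; too low and novel malware is forced into a known class, too high and familiar samples are needlessly escalated to human experts.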