20,650 research outputs found
A Grammatical Inference Approach to Language-Based Anomaly Detection in XML
False-positives are a problem in anomaly-based intrusion detection systems.
To counter this issue, we discuss anomaly detection for the eXtensible Markup
Language (XML) in a language-theoretic view. We argue that many XML-based
attacks target the syntactic level, i.e. the tree structure or element content,
and syntax validation of XML documents reduces the attack surface. XML offers
so-called schemas for validation, but in real world, schemas are often
unavailable, ignored or too general. In this work-in-progress paper we describe
a grammatical inference approach to learn an automaton from example XML
documents for detecting documents with anomalous syntax.
We discuss properties and expressiveness of XML to understand limits of
learnability. Our contributions are an XML Schema compatible lexical datatype
system to abstract content in XML and an algorithm to learn visibly pushdown
automata (VPA) directly from a set of examples. The proposed algorithm does not
require the tree representation of XML, so it can process large documents or
streams. The resulting deterministic VPA then allows stream validation of
documents to recognize deviations in the underlying tree structure or
datatypes.Comment: Paper accepted at First Int. Workshop on Emerging Cyberthreats and
Countermeasures ECTCM 201
Revisiting Actor Programming in C++
The actor model of computation has gained significant popularity over the
last decade. Its high level of abstraction makes it appealing for concurrent
applications in parallel and distributed systems. However, designing a
real-world actor framework that subsumes full scalability, strong reliability,
and high resource efficiency requires many conceptual and algorithmic additives
to the original model.
In this paper, we report on designing and building CAF, the "C++ Actor
Framework". CAF targets at providing a concurrent and distributed native
environment for scaling up to very large, high-performance applications, and
equally well down to small constrained systems. We present the key
specifications and design concepts---in particular a message-transparent
architecture, type-safe message interfaces, and pattern matching
facilities---that make native actors a viable approach for many robust,
elastic, and highly distributed developments. We demonstrate the feasibility of
CAF in three scenarios: first for elastic, upscaling environments, second for
including heterogeneous hardware like GPGPUs, and third for distributed runtime
systems. Extensive performance evaluations indicate ideal runtime behaviour for
up to 64 cores at very low memory footprint, or in the presence of GPUs. In
these tests, CAF continuously outperforms the competing actor environments
Erlang, Charm++, SalsaLite, Scala, ActorFoundry, and even the OpenMPI.Comment: 33 page
An Autonomous Engine for Services Configuration and Deployment.
The runtime management of the infrastructure providing service-based systems is a complex task, up to the point where manual operation struggles to be cost effective. As the functionality is provided by a set of dynamically composed distributed services, in order to achieve a management objective multiple operations have to be applied over the distributed elements of the managed infrastructure. Moreover, the manager must cope with the highly heterogeneous characteristics and management interfaces of the runtime resources. With this in mind, this paper proposes to support the configuration and deployment of services with an automated closed control loop. The automation is enabled by the definition of a generic information model, which captures all the information relevant to the management of the services with the same abstractions, describing the runtime elements, service dependencies, and business objectives. On top of that, a technique based on satisfiability is described which automatically diagnoses the state of the managed environment and obtains the required changes for correcting it (e.g., installation, service binding, update, or configuration). The results from a set of case studies extracted from the banking domain are provided to validate the feasibility of this propos
Recommended from our members
Parallel data compression
Data compression schemes remove data redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported and areas for future research are suggested
A Protocol for the Atomic Capture of Multiple Molecules at Large Scale
With the rise of service-oriented computing, applications are more and more
based on coordination of autonomous services. Envisioned over largely
distributed and highly dynamic platforms, expressing this coordination calls
for alternative programming models. The chemical programming paradigm, which
models applications as chemical solutions where molecules representing digital
entities involved in the computation, react together to produce a result, has
been recently shown to provide the needed abstractions for autonomic
coordination of services. However, the execution of such programs over large
scale platforms raises several problems hindering this paradigm to be actually
leveraged. Among them, the atomic capture of molecules participating in concur-
rent reactions is one of the most significant. In this paper, we propose a
protocol for the atomic capture of these molecules distributed and evolving
over a large scale platform. As the density of possible reactions is crucial
for the liveness and efficiency of such a capture, the protocol proposed is
made up of two sub-protocols, each of them aimed at addressing different levels
of densities of potential reactions in the solution. While the decision to
choose one or the other is local to each node participating in a program's
execution, a global coherent behaviour is obtained. Proof of liveness, as well
as intensive simulation results showing the efficiency and limited overhead of
the protocol are given.Comment: 13th International Conference on Distributed Computing and Networking
(2012
Survey of Distributed Decision
We survey the recent distributed computing literature on checking whether a
given distributed system configuration satisfies a given boolean predicate,
i.e., whether the configuration is legal or illegal w.r.t. that predicate. We
consider classical distributed computing environments, including mostly
synchronous fault-free network computing (LOCAL and CONGEST models), but also
asynchronous crash-prone shared-memory computing (WAIT-FREE model), and mobile
computing (FSYNC model)
AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments
This report considers the application of Articial Intelligence (AI) techniques to
the problem of misuse detection and misuse localisation within telecommunications
environments. A broad survey of techniques is provided, that covers inter alia
rule based systems, model-based systems, case based reasoning, pattern matching,
clustering and feature extraction, articial neural networks, genetic algorithms, arti
cial immune systems, agent based systems, data mining and a variety of hybrid
approaches. The report then considers the central issue of event correlation, that
is at the heart of many misuse detection and localisation systems. The notion of
being able to infer misuse by the correlation of individual temporally distributed
events within a multiple data stream environment is explored, and a range of techniques,
covering model based approaches, `programmed' AI and machine learning
paradigms. It is found that, in general, correlation is best achieved via rule based approaches,
but that these suffer from a number of drawbacks, such as the difculty of
developing and maintaining an appropriate knowledge base, and the lack of ability
to generalise from known misuses to new unseen misuses. Two distinct approaches
are evident. One attempts to encode knowledge of known misuses, typically within
rules, and use this to screen events. This approach cannot generally detect misuses
for which it has not been programmed, i.e. it is prone to issuing false negatives.
The other attempts to `learn' the features of event patterns that constitute normal
behaviour, and, by observing patterns that do not match expected behaviour, detect
when a misuse has occurred. This approach is prone to issuing false positives,
i.e. inferring misuse from innocent patterns of behaviour that the system was not
trained to recognise. Contemporary approaches are seen to favour hybridisation,
often combining detection or localisation mechanisms for both abnormal and normal
behaviour, the former to capture known cases of misuse, the latter to capture
unknown cases. In some systems, these mechanisms even work together to update
each other to increase detection rates and lower false positive rates. It is concluded
that hybridisation offers the most promising future direction, but that a rule or state
based component is likely to remain, being the most natural approach to the correlation
of complex events. The challenge, then, is to mitigate the weaknesses of
canonical programmed systems such that learning, generalisation and adaptation
are more readily facilitated
- …