15,681 research outputs found
A Grammatical Inference Approach to Language-Based Anomaly Detection in XML
False-positives are a problem in anomaly-based intrusion detection systems.
To counter this issue, we discuss anomaly detection for the eXtensible Markup
Language (XML) in a language-theoretic view. We argue that many XML-based
attacks target the syntactic level, i.e. the tree structure or element content,
and syntax validation of XML documents reduces the attack surface. XML offers
so-called schemas for validation, but in real world, schemas are often
unavailable, ignored or too general. In this work-in-progress paper we describe
a grammatical inference approach to learn an automaton from example XML
documents for detecting documents with anomalous syntax.
We discuss properties and expressiveness of XML to understand limits of
learnability. Our contributions are an XML Schema compatible lexical datatype
system to abstract content in XML and an algorithm to learn visibly pushdown
automata (VPA) directly from a set of examples. The proposed algorithm does not
require the tree representation of XML, so it can process large documents or
streams. The resulting deterministic VPA then allows stream validation of
documents to recognize deviations in the underlying tree structure or
datatypes.Comment: Paper accepted at First Int. Workshop on Emerging Cyberthreats and
Countermeasures ECTCM 201
Rewriting Logic Semantics of a Plan Execution Language
The Plan Execution Interchange Language (PLEXIL) is a synchronous language
developed by NASA to support autonomous spacecraft operations. In this paper,
we propose a rewriting logic semantics of PLEXIL in Maude, a high-performance
logical engine. The rewriting logic semantics is by itself a formal interpreter
of the language and can be used as a semantic benchmark for the implementation
of PLEXIL executives. The implementation in Maude has the additional benefit of
making available to PLEXIL designers and developers all the formal analysis and
verification tools provided by Maude. The formalization of the PLEXIL semantics
in rewriting logic poses an interesting challenge due to the synchronous nature
of the language and the prioritized rules defining its semantics. To overcome
this difficulty, we propose a general procedure for simulating synchronous set
relations in rewriting logic that is sound and, for deterministic relations,
complete. We also report on two issues at the design level of the original
PLEXIL semantics that were identified with the help of the executable
specification in Maude
Probabilistic Program Abstractions
Abstraction is a fundamental tool for reasoning about complex systems.
Program abstraction has been utilized to great effect for analyzing
deterministic programs. At the heart of program abstraction is the relationship
between a concrete program, which is difficult to analyze, and an abstract
program, which is more tractable. Program abstractions, however, are typically
not probabilistic. We generalize non-deterministic program abstractions to
probabilistic program abstractions by explicitly quantifying the
non-deterministic choices. Our framework upgrades key definitions and
properties of abstractions to the probabilistic context. We also discuss
preliminary ideas for performing inference on probabilistic abstractions and
general probabilistic programs
Mechanized semantics
The goal of this lecture is to show how modern theorem provers---in this
case, the Coq proof assistant---can be used to mechanize the specification of
programming languages and their semantics, and to reason over individual
programs and over generic program transformations, as typically found in
compilers. The topics covered include: operational semantics (small-step,
big-step, definitional interpreters); a simple form of denotational semantics;
axiomatic semantics and Hoare logic; generation of verification conditions,
with application to program proof; compilation to virtual machine code and its
proof of correctness; an example of an optimizing program transformation (dead
code elimination) and its proof of correctness
Higher-Order Operator Precedence Languages
Floyd's Operator Precedence (OP) languages are a deterministic context-free
family having many desirable properties. They are locally and parallely
parsable, and languages having a compatible structure are closed under Boolean
operations, concatenation and star; they properly include the family of Visibly
Pushdown (or Input Driven) languages. OP languages are based on three relations
between any two consecutive terminal symbols, which assign syntax structure to
words. We extend such relations to k-tuples of consecutive terminal symbols, by
using the model of strictly locally testable regular languages of order k at
least 3. The new corresponding class of Higher-order Operator Precedence
languages (HOP) properly includes the OP languages, and it is still included in
the deterministic (also in reverse) context free family. We prove Boolean
closure for each subfamily of structurally compatible HOP languages. In each
subfamily, the top language is called max-language. We show that such languages
are defined by a simple cancellation rule and we prove several properties, in
particular that max-languages make an infinite hierarchy ordered by parameter
k. HOP languages are a candidate for replacing OP languages in the various
applications where they have have been successful though sometimes too
restrictive.Comment: In Proceedings AFL 2017, arXiv:1708.0622
- …