3,901 research outputs found

A Grammatical Inference Approach to Language-Based Anomaly Detection in XML

Author: Lampesberger Harald
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

False positives are a problem in anomaly-based intrusion detection systems. To counter this issue, we discuss anomaly detection for the eXtensible Markup Language (XML) from a language-theoretic view. We argue that many XML-based attacks target the syntactic level, i.e. the tree structure or element content, and that syntax validation of XML documents reduces the attack surface. XML offers so-called schemas for validation, but in the real world, schemas are often unavailable, ignored or too general. In this work-in-progress paper, we describe a grammatical inference approach to learn an automaton from example XML documents for detecting documents with anomalous syntax. We discuss properties and expressiveness of XML to understand limits of learnability. Our contributions are an XML Schema-compatible lexical datatype system to abstract content in XML and an algorithm to learn visibly pushdown automata (VPA) directly from a set of examples. The proposed algorithm does not require the tree representation of XML, so it can process large documents or streams. The resulting deterministic VPA then allows stream validation of documents to recognize deviations in the underlying tree structure or datatypes. Comment: Paper accepted at First Int. Workshop on Emerging Cyberthreats and Countermeasures ECTCM 201
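
To make the idea of stream validation with a deterministic VPA concrete, the following is a minimal sketch and not the paper's learning algorithm. It assumes the document has already been turned into a stream of events, where an opening tag is a call symbol (push), a closing tag is a return symbol (pop), and text content has been abstracted to a datatype label. The function name `validate`, the event encoding, the state names, and the tiny hand-written automaton are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# An event is (kind, symbol); kind is "open", "close", or "text".
# For "text" events the symbol is an assumed datatype label such as "int".
Event = Tuple[str, str]

# Hand-written toy automaton (an assumption, not learned from examples).
# It accepts <a> containing zero or more <b> elements with integer content.
CALL: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("q0", "a"): ("q1", "A"),   # (state, tag) -> (next state, pushed symbol)
    ("q1", "b"): ("q2", "B"),
}
INTERNAL: Dict[Tuple[str, str], str] = {
    ("q2", "int"): "q3",        # (state, datatype) -> next state
}
RET: Dict[Tuple[str, str, str], str] = {
    ("q3", "b", "B"): "q1",     # (state, tag, popped symbol) -> next state
    ("q1", "a", "A"): "q4",
}


def validate(events: Iterable[Event],
             call: Dict[Tuple[str, str], Tuple[str, str]],
             ret: Dict[Tuple[str, str, str], str],
             internal: Dict[Tuple[str, str], str],
             start: str,
             accepting: Set[str]) -> bool:
    """Run a deterministic VPA over an event stream in a single pass."""
    state = start
    stack = []
    for kind, sym in events:
        if kind == "open":
            step = call.get((state, sym))
            if step is None:
                return False            # unexpected opening tag
            state, pushed = step
            stack.append(pushed)
        elif kind == "close":
            if not stack:
                return False            # closing tag without an open element
            nxt = ret.get((state, sym, stack.pop()))
            if nxt is None:
                return False            # mismatched or unexpected closing tag
            state = nxt
        else:
            nxt = internal.get((state, sym))
            if nxt is None:
                return False            # unexpected content datatype
            state = nxt
    # Accept only if every element was closed and the state is accepting.
    return not stack and state in accepting


if __name__ == "__main__":
    doc = [("open", "a"), ("open", "b"), ("text", "int"),
           ("close", "b"), ("close", "a")]
    print(validate(doc, CALL, RET, INTERNAL, "q0", {"q4"}))
```

Because the stack only ever holds one symbol per open element, memory use is bounded by the nesting depth rather than the document size, which is what makes this style of check suitable for streams.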

arXiv.org e-Print Archive

CiteSeerX

Crossref

Some Combinatorial Operators in Language Theory

Author: Luque Jean-Gabriel
Mignot Ludovic
Nicart Florent
Publication date: 01/01/2012

Multitildes are regular operators that were introduced by Caron et al. in order to increase the number of Glushkov automata. In this paper, we study the family of the multitilde operators from an algebraic point of view using the notion of operad. This leads to a combinatorial description of already known results as well as new results on compositions, actions and enumerations. Comment: 21 page

arXiv.org e-Print Archive

HAL - Normandie Université

Symbolic Algorithms for Language Equivalence and Kleene Algebra with Tests

Author: Bouajjani A.
Goel A.
Henriksen J. G.
Hopcroft J. E.
Huet G.
Kozen D.
Moore E. F.
Pottier F.
Pous D.
Rémy D.
Publication date: 09/07/2014

We first propose algorithms for checking language equivalence of finite automata over a large alphabet. We use symbolic automata, where the transition function is compactly represented using (multi-terminal) binary decision diagrams (BDDs). The key idea consists in computing a bisimulation by exploring reachable pairs symbolically, so as to avoid redundancies. This idea can be combined with already existing optimisations, and we show in particular a nice integration with the disjoint sets forest data-structure from Hopcroft and Karp's standard algorithm. Then we consider Kleene algebra with tests (KAT), an algebraic theory that can be used for verification in various domains ranging from compiler optimisation to network programming analysis. This theory is decidable by reduction to language equivalence of automata on guarded strings, a particular kind of automata that have exponentially large alphabets. We propose several methods for constructing symbolic automata out of KAT expressions, based either on Brzozowski's derivatives or standard automata constructions. All in all, this results in efficient algorithms for deciding equivalence of KAT expressions.
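
As a rough illustration of the Hopcroft and Karp style equivalence check mentioned above, here is a minimal sketch on explicit, complete deterministic automata. It does not use BDDs or symbolic exploration, so it shows only the disjoint-sets part of the idea. The function name `equivalent` and the dictionary encoding of transitions are assumptions made for this example.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

State = Hashable
Delta = Dict[Tuple[State, str], State]  # total transition function


def equivalent(delta1: Delta, start1: State, final1: Set[State],
               delta2: Delta, start2: State, final2: Set[State],
               alphabet: Iterable[str]) -> bool:
    """Decide language equivalence of two complete DFAs.

    Pairs of states are merged with a disjoint-sets forest, so a pair
    whose members are already in the same class is never explored again.
    """
    alphabet = list(alphabet)
    parent: Dict[Tuple[int, State], Tuple[int, State]] = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # States are tagged with 1 or 2 so the two automata stay disjoint.
    def is_final(s):
        tag, q = s
        return q in (final1 if tag == 1 else final2)

    def step(s, a):
        tag, q = s
        return (tag, (delta1 if tag == 1 else delta2)[(q, a)])

    first = ((1, start1), (2, start2))
    union(*first)
    todo = [first]
    while todo:
        p, q = todo.pop()
        if is_final(p) != is_final(q):
            return False                    # a distinguishing word exists
        for a in alphabet:
            p2, q2 = step(p, a), step(q, a)
            if find(p2) != find(q2):
                union(p2, q2)
                todo.append((p2, q2))
    return True


if __name__ == "__main__":
    # "Even number of a's", once with 2 states and once with 4 states.
    even2 = {(0, "a"): 1, (1, "a"): 0}
    even4 = {(i, "a"): (i + 1) % 4 for i in range(4)}
    print(equivalent(even2, 0, {0}, even4, 0, {0, 2}, ["a"]))
```

The loop over `alphabet` is the step that becomes expensive when the alphabet is exponentially large, which is the cost the symbolic representation in the abstract is meant to avoid.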

arXiv.org e-Print Archive

HAL-ENS-LYON

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

Deterministic Automata for Unordered Trees

Author: Boiret Adrien
Hugot Vincent
Niehren Joachim
Treinen Ralf
Publication venue: 'Open Publishing Association'
Publication date: 01/08/2014

Automata for unordered unranked trees are relevant for defining schemas and queries for data trees in JSON or XML format. While the existing notions are well-investigated concerning expressiveness, they all lack a proper notion of determinism, which makes it difficult to distinguish subclasses of automata for which problems such as inclusion, equivalence, and minimization can be solved efficiently. In this paper, we propose and investigate different notions of "horizontal determinism", starting from automata for unranked trees in which the horizontal evaluation is performed by finite state automata. We show that a restriction to confluent horizontal evaluation leads to polynomial-time emptiness and universality, but still suffers from coNP-completeness of the emptiness of binary intersections. Finally, efficient algorithms can be obtained by imposing an order of horizontal evaluation globally for all automata in the class. Depending on the choice of the order, we obtain different classes of automata, each of which has the same expressiveness as CMSO. Comment: In Proceedings GandALF 2014, arXiv:1408.556

arXiv.org e-Print Archive

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

Directory of Open Access Journals

HAL-Rennes 1

Translation from Classical Two-Way Automata to Pebble Two-Way Automata

Author: A. Szepietowski
Bianca Truthe
C. Mereghetti
Ch. A. Kapoutsis
Giovanni Pighizzini
J. Berman
J. H. Chang
J. Hopcroft
J. Hromkovič
Jürgen Dassow
M. Chrobak
M. Sipser
N. Immerman
R. Chang
R. Szelepcsényi
V. Geffert
Viliam Geffert
W. Ellison
W. Sakoda
Ľubomíra Ištoňová
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2009

We study the relation between the standard two-way automata and more powerful devices, namely, two-way finite automata with an additional "pebble" movable along the input tape. As in the case of the classical two-way machines, it is not known whether there exists a polynomial trade-off, in the number of states, between the nondeterministic and deterministic pebble two-way automata. However, we show that these two machine models are not independent: if there exists a polynomial trade-off for the classical two-way automata, then there must also exist a polynomial trade-off for the pebble two-way automata. Thus, we have an upward collapse (or a downward separation) from the classical two-way automata to more powerful pebble automata, still staying within the class of regular languages. The same upward collapse holds for complementation of nondeterministic two-way machines. These results are obtained by showing that each pebble machine can be, by using suitable inputs, simulated by a classical two-way automaton with a linear number of states (and vice versa), despite the existing exponential blow-up between the classical and pebble two-way machines.

arXiv.org e-Print Archive

CiteSeerX

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Directory of Open Access Journals

Numérisation de Documents Anciens Mathématiques