Type-Based Detection of XML Query-Update Independence
This paper presents a novel static analysis technique to detect XML
query-update independence, in the presence of a schema. Rather than types, our
system infers chains of types. Each chain represents a path that can be
traversed on a valid document during query/update evaluation. The resulting
independence analysis is precise, although it raises a challenging issue:
recursive schemas may lead to inferring infinitely many chains. A sound and
complete approximation technique ensuring a finite analysis in any case is
presented, together with an efficient implementation performing the chain-based
analysis in polynomial space and time.
Comment: VLDB201
A Grammatical Inference Approach to Language-Based Anomaly Detection in XML
False positives are a problem in anomaly-based intrusion detection systems.
To counter this issue, we discuss anomaly detection for the eXtensible Markup
Language (XML) in a language-theoretic view. We argue that many XML-based
attacks target the syntactic level, i.e. the tree structure or element content,
and syntax validation of XML documents reduces the attack surface. XML offers
so-called schemas for validation, but in the real world, schemas are often
unavailable, ignored or too general. In this work-in-progress paper we describe
a grammatical inference approach to learn an automaton from example XML
documents for detecting documents with anomalous syntax.
We discuss properties and expressiveness of XML to understand limits of
learnability. Our contributions are an XML Schema compatible lexical datatype
system to abstract content in XML and an algorithm to learn visibly pushdown
automata (VPA) directly from a set of examples. The proposed algorithm does not
require the tree representation of XML, so it can process large documents or
streams. The resulting deterministic VPA then allows stream validation of
documents to recognize deviations in the underlying tree structure or
datatypes.
Comment: Paper accepted at First Int. Workshop on Emerging Cyberthreats and
Countermeasures ECTCM 201
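The stream-validation idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's learned automaton: the event format, the `ALLOWED_CHILDREN` table, and the `validate_stream` function are hypothetical and hand-written for illustration. It only shows how a stack driven by open and close events can check tree structure without building the tree.

```python
from typing import Iterable, Tuple

Event = Tuple[str, str]  # ("open" | "close", element name)

# Hypothetical toy "schema": which child elements each element may contain.
# In the paper this knowledge would be learned from examples, not hand-written.
ALLOWED_CHILDREN = {
    None: {"order"},            # None stands for the top-level context
    "order": {"item", "note"},
    "item": set(),
    "note": set(),
}


def validate_stream(events: Iterable[Event]) -> bool:
    """Check a stream of open/close events using only a stack."""
    stack = []
    for kind, name in events:
        if kind == "open":
            # An opening tag pushes; it must be allowed under the current parent.
            parent = stack[-1] if stack else None
            if name not in ALLOWED_CHILDREN.get(parent, set()):
                return False
            stack.append(name)
        elif kind == "close":
            # A closing tag pops; it must match the innermost open element.
            if not stack or stack.pop() != name:
                return False
        else:
            return False  # unknown event kind
    # Every opened element must have been closed.
    return not stack
```

Because the stack depth is bounded by the nesting depth of the document rather than its size, a check of this shape can run over large documents or streams. This toy version does not enforce a single root element or any datatype constraints.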
Discovering Restricted Regular Expressions with Interleaving
Discovering a concise schema from given XML documents is an important problem
in XML applications. In this paper, we focus on the problem of learning an
unordered schema from a given set of XML examples, which is actually a problem
of learning a restricted regular expression with interleaving using positive
example strings. Schemas with interleaving could present meaningful knowledge
that cannot be disclosed by previous inference techniques. Moreover, inference
of the minimal schema with interleaving is challenging. The problem of finding
a minimal schema with interleaving is shown to be NP-hard. Therefore, we
develop an approximation algorithm and a heuristic solution to tackle the
problem using techniques different from known inference algorithms. We conduct
experiments on real-world data sets to demonstrate the effectiveness of our
approaches. Our heuristic algorithm is shown to produce results that are very
close to optimal.
Comment: 12 page
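As a reminder of what the interleaving operator means, the sketch below tests whether a word is an interleaving (shuffle) of two given words. It illustrates only the operator itself, not the inference algorithms of the paper; the function name `in_interleaving` is an assumption made for this example.

```python
from functools import lru_cache


def in_interleaving(w: str, u: str, v: str) -> bool:
    """Return True if w can be formed by merging u and v while keeping
    the relative order of symbols inside u and inside v."""
    if len(w) != len(u) + len(v):
        return False

    @lru_cache(maxsize=None)
    def go(i: int, j: int) -> bool:
        # i symbols of u and j symbols of v have been consumed so far.
        k = i + j
        if k == len(w):
            return True
        # Try to take the next symbol of w from u, then from v.
        if i < len(u) and u[i] == w[k] and go(i + 1, j):
            return True
        return j < len(v) and v[j] == w[k] and go(i, j + 1)

    return go(0, 0)
```

For instance, `in_interleaving("abcd", "ac", "bd")` holds because `a` and `c` come from the first word and `b` and `d` from the second, each in their original order.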
Survey over Existing Query and Transformation Languages
A widely acknowledged obstacle for realizing the vision of the Semantic Web is the inability
of many current Semantic Web approaches to cope with data available in such diverging
representation formalisms as XML, RDF, or Topic Maps. A common query language is the first
step to allow transparent access to data in any of these formats. To further the understanding
of the requirements and approaches proposed for query languages in the conventional as well
as the Semantic Web, this report surveys a large number of query languages for accessing
XML, RDF, or Topic Maps. This is the first systematic survey to consider query languages from
all these areas. From the detailed survey of these query languages, a common classification
scheme is derived that is useful for understanding and differentiating languages within and
among all three areas.
Web and Semantic Web Query Languages
A number of techniques have been developed to facilitate
powerful data retrieval on the Web and Semantic Web. Three categories
of Web query languages can be distinguished, according to the format
of the data they can retrieve: XML, RDF and Topic Maps. This article
introduces the spectrum of languages falling into these categories
and summarises their salient aspects. The languages are introduced using
common sample data and query types. Key aspects of the query
languages considered are highlighted in the conclusion.
Semantic web technologies for video surveillance metadata
Video surveillance systems are growing in size and complexity. Such systems typically consist of integrated modules from different vendors to cope with the increasing demands on network and storage capacity, intelligent video analytics, picture quality, and enhanced visual interfaces. Within a surveillance system, relevant information (like technical details on the video sequences, or analysis results of the monitored environment) is described using metadata standards. However, different modules typically use different standards, resulting in metadata interoperability problems. In this paper, we introduce the application of Semantic Web Technologies to overcome such problems. We present a semantic, layered metadata model and integrate it within a video surveillance system. Besides dealing with the metadata interoperability problem, the advantages of using Semantic Web Technologies and the inherent rule support are shown. A practical use case scenario is presented to illustrate the benefits of our novel approach.
A Vernacular for Coherent Logic
We propose a simple, yet expressive proof representation from which proofs
for different proof assistants can easily be generated. The representation uses
only a few inference rules and is based on a fragment of first-order logic
called coherent logic. Coherent logic has been recognized by a number of
researchers as a suitable logic for many everyday mathematical developments.
The proposed proof representation is accompanied by a corresponding XML format
and by a suite of XSL transformations for generating formal proofs for
Isabelle/Isar and Coq, as well as proofs expressed in a natural language form
(formatted in LaTeX or in HTML). Also, our automated theorem prover for
coherent logic exports proofs in the proposed XML format. All tools are
publicly available, along with a set of sample theorems.
Comment: CICM 2014 - Conferences on Intelligent Computer Mathematics (2014)