Intuitionistic fuzzy XML query matching and rewriting

Author: Alzebdi M.
Publication date: 01/01/2013

With the emergence of XML as a standard for data representation, particularly on the web, the need for intelligent query languages that can operate on structurally heterogeneous XML documents has grown considerably. Traditional Information Retrieval and Database approaches have limitations when dealing with such scenarios, so fuzzy (flexible) approaches have become predominant. In this thesis, we propose a new approach for approximate XML query matching and rewriting which aims at achieving soft matching of XML queries with XML data sources following different schemas. Unlike traditional querying approaches, which require exact matching, the proposed approach makes use of Intuitionistic Fuzzy Trees to achieve approximate (soft) query matching. Through this new approach, not only the exact answer to a query but also approximate answers are retrieved. Furthermore, partial results can be obtained from multiple data sources and merged to produce a single answer to a query. The proposed approach introduces a new tree similarity measure that considers the minimum and maximum degrees of similarity/inclusion of trees, based on arc matching. New techniques for soft node and arc matching are presented for matching queries against data sources with highly varied structures. A prototype was developed to test the proposed ideas; it demonstrated the ability to achieve approximate matching of pattern queries against a number of XML schemas and to rewrite the original query so that it obtains results from the underlying data sources. This was achieved through several novel algorithms, which were tested and shown to be efficient, with low CPU and memory cost even for a large number of data sources.

WestminsterResearch

Bounded repairability for regular tree languages

Author: Bourhis P.
Puppis G.
Riveros C.
Staworko S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

We study the problem of bounded repairability of a given restriction tree language R into a target tree language T. More precisely, we say that R is bounded repairable w.r.t. T if there exists a bound on the number of standard tree editing operations necessary to apply to any tree in R in order to obtain a tree in T. We consider a number of possible specifications for tree languages: bottom-up tree automata (on curry encoding of unranked trees) that capture the class of XML Schemas and DTDs. We also consider a special case when the restriction language R is universal, i.e., contains all trees over a given alphabet. We give an effective characterization of bounded repairability between pairs of tree languages represented with automata. This characterization introduces two tools, synopsis trees and a coverage relation between them, allowing one to reason about tree languages that undergo a bounded number of editing operations. We then employ this characterization to provide upper bounds to the complexity of deciding bounded repairability and we show that these bounds are tight. In particular, when the input tree languages are specified with arbitrary bottom-up automata, the problem is coNEXPTIME-complete. The problem remains coNEXPTIME-complete even if we use deterministic non-recursive DTDs to specify the input languages. The complexity of the problem can be reduced if we assume that the alphabet, the set of node labels, is fixed: the problem becomes PSPACE-complete for non-recursive DTDs and coNP-complete for deterministic non-recursive DTDs. Finally, when the restriction tree language R is universal, we show that the bounded repairability problem becomes EXPTIME-complete if the target language is specified by an arbitrary bottom-up tree automaton and becomes tractable (PTIME-complete, in fact) when a deterministic bottom-up automaton is used

Archivio istituzionale della ricerca - Università degli Studi di Udine

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

XML Schema Clustering with Semantic and Hierarchical Similarity Measures

Author: Iryadi Wina
Nayak Richi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic and contextual aspects of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.

Crossref

Queensland University of Technology ePrints Archive

Bounded repairability for regular tree languages

Author: Puppis Gabriele
Riveros Cristian
Staworko Slawek
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012

We consider the problem of repairing unranked trees (e.g., XML documents) satisfying a given restriction specification R (e.g., a DTD) into unranked trees satisfying a given target specification T. Specifically, we focus on the question of whether one can get from any tree in a regular language R to some tree in another regular language T with a finite, uniformly bounded, number of edit operations (i.e., deletions and insertions of nodes). We give effective characterizations of the pairs of specifications R and T for which such a uniform bound exists, and we study the complexity of the problem under different representations of the regular tree languages (e.g., non-deterministic stepwise automata, deterministic stepwise automata, DTDs). Finally, we point out some connections with the analogous problem for regular languages of words.

HAL - Lille 3

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Udine

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

Similarity of XML Data

Author: Wojnar Aleš
Publication venue: Univerzita Karlova, Matematicko-fyzikální fakulta
Publication date: 01/01/2009

XML is an increasingly important format for storing and exchanging data. Evaluating the similarity of XML data plays a crucial role in the efficient storage, processing and manipulation of data. This work deals with the possibility of evaluating the similarity of DTDs. First, a suitable DTD tree representation is designed. Next, an algorithm based on the tree edit distance technique is proposed. Finally, we focus on various aspects of similarity, such as structural and linguistic information, and integrate them into our method.

Department of Software Engineering, Faculty of Mathematics and Physics

CU Digital Repository

Bounded Repairability for Regular Tree Languages

Author: Boobna Utsav
Carme Julien
Chen Shan
Cristian Riveros
Emde Boas Peter Van
Gabriele Puppis
Grahne Gösta
Pierre Bourhis
Staworko Slawomir
Sławek Staworko
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Exploring and visualizing the “Alma” of XML documents

Author: Cruz Daniela
Henriques Pedro Rangel
Pereira Maria João
Publication venue: 'Universidade de Evora'
Publication date: 01/01/2008

In this paper we introduce eXVisXML, a visual tool for exploring documents annotated with the mark-up language XML, so that tasks such as knowledge extraction or document engineering can be performed on them easily. eXVisXML was designed mainly for two kinds of users. The first are those who want to analyze an annotated document to explore the information it contains; for them, a visual inspection tool can be of great help, and a slicing functionality can be an effective complement. The other target group is composed of document engineers who might be interested in assessing the quality of the annotation created. This can be achieved by measuring some parameters that allow the elements and attributes of the DTD/Schema to be compared against those actually used in the document instances. Both functionalities, and the way they were designed and implemented, are discussed in the paper.

Biblioteca Digital do IPB

An Algorithm for Detecting and Correcting XSLT Rules Affected by Schema Updates

Author: Wu Yang
呉揚
Publication date: 01/01/2017

Thesis (Master of Science in Informatics)--University of Tsukuba, no. 37776, 2017.3.2

Tsukuba Repository

Visual exploration and retrieval of XML document collections with the generic system X2

Author: Felix Weigel
François Bry
H Meuss
Holger Meuss
Klaus U. Schulz
S Ceri
S Mizzaro
Simone Leonardi
T Catarci
T Schlieder
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2005

This article reports on the XML retrieval system X2 which has been developed at the University of Munich over the last five years. In a typical session with X2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semiautomatically. After query evaluation, the full set of answers is presented in a visual and structured way. X2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR and database-oriented querying, thus bridging the gap between these three views on the data to be retrieved. Another salient characteristic of X2 which distinguishes it from other visual query systems for XML is that it supports various degrees of detailedness in the presentation of answers, as well as techniques for dynamically reordering and grouping retrieved elements once the complete answer set has been computed

Crossref

Open Access LMU