Search CORE

103 research outputs found

TypEx : a type based approach to XML stream querying

Author: Connor Richard
Neumüller Mathias
Russell George
Publication venue: WebDB
Publication date: 01/01/2003
Field of study

We consider the topic of query evaluation over semistructured information streams, and XML data streams in particular. Streaming evaluation methods are necessarily eventdriven, which is in tension with high-level query models; in general, the more expressive the query language, the harder it is to translate queries into an event-based implementation with finite resource bounds

CiteSeerX

University of Strathclyde Institutional Repository

A Progressive Clustering Algorithm to Group the XML Data by Structural and Semantic Similarity

Author: Nayak Richi
Tran Tien
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2007
Field of study

Since the emergence in the popularity of XML for data representation and exchange over the Web, the distribution of XML documents has rapidly increased. It has become a challenge for researchers to turn these documents into a more useful information utility. In this paper, we introduce a novel clustering algorithm PCXSS that keeps the heterogeneous XML documents into various groups according to their similar structural and semantic representations. We develop a global criterion function CPSim that progressively measures the similarity between a XML document and existing clusters, ignoring the need to compute the similarity between two individual documents. The experimental analysis shows the method to be fast and accurate

CiteSeerX

Queensland University of Technology ePrints Archive

Citation analysis of database publications

Author: Rahm Erhard
Thor Andreas
Publication venue
Publication date: 19/10/2018
Field of study

We analyze citation frequencies for two main database conferences (SIGMOD, VLDB) and three database journals (TODS, VLDB Journal, Sigmod Record) over 10 years. The citation data is obtained by integrating and cleaning data from DBLP and Google Scholar. Our analysis considers different comparative metrics per publication venue, in particular the total and average number of citations as well as the impact factor which has so far only been considered for journals. We also determine the most cited papers, authors, author institutions and their countries

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Qucosa - Publikationsserver der Universität Leipzig

Schema Management for Data Integration: A Short Survey

Author: Almarimi A.
Pokorný J.
Publication venue: 'Czech Technical University in Prague - Central Library'
Publication date: 01/01/2005
Field of study

Schema management is a basic problem in many database application domains such as data integration systems. Users need to access and manipulate data from several databases. In this context, in order to integrate data from distributed heterogeneous database sources, data integration systems demand the resolution of several issues that arise in managing schemas. In this paper, we present a brief survey of the problem of schema matching which is used for solving problems of schema integration processing. Moreover, we propose a technique for integrating and querying distributed heterogeneous XML schemas.

Directory of Open Access Journals

CTU Open Journal Systems (Czech Technical University, Prague / České vysoké učení technické v Praze)

Indexing Temporal XML documents

Author: A MENDELZON
A VAISMAN
F RIZZOLO
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007
Field of study

Crossref

XML Matchers: approaches and challenges

Author: Agreste Santa
De Meo Pasquale
Ferrara Emilio
Ursino Domenico
Publication venue: 'Elsevier BV'
Publication date: 10/07/2014
Field of study

Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research areas for many years. In the past, it was largely investigated especially for classical database models (e.g., E/R schemas, relational databases, etc.). However, in the latest years, the widespread adoption of XML in the most disparate application fields pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them on DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs impact on the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear as unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.Comment: 34 pages, 8 tables, 7 figure

arXiv.org e-Print Archive

IRIS UniversitÃ Politecnica delle Marche

Processing Queries in a Large Peer-to-Peer System

Author: David J. Dewitt
Leonidas Galanis
Shawn R. Jeffery
Yuan Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2003
Field of study

Abstract. While current search engines seem to easily handle the size of the data available on the Internet, they cannot provide fresh results. The most up-to-date data always resides on the data sources. Efficiently interconnecting data providers, however, is not an easy problem. Peer-to-peer computing is the latest technology to address this problem. However, efficient query processing in peer-to-peer networks remains an open research area. In this paper, we present a performance study of a system that facilitates efficient searches of large numbers of independent data providers on the Internet. In our scenario, each data provider becomes an autonomous node in a large peer-to-peer system. Using small indices on each node, we can efficiently direct queries submitted on any node to the relevant sources. Experiments with a large peer-to-peer network demonstrate the feasibility of our approach.

CiteSeerX

Crossref