An introduction to Graph Data Management
A graph database is a database where the data structures for the schema
and/or instances are modeled as a (labeled)(directed) graph or generalizations
of it, and where querying is expressed by graph-oriented operations and type
constructors. In this article we present the basic notions of graph databases,
give a historical overview of their main development, and study the main current
systems that implement them.
XML Reconstruction View Selection in XML Databases: Complexity Analysis and Approximation Scheme
Query evaluation in an XML database requires reconstructing XML subtrees
rooted at nodes found by an XML query. Since XML subtree reconstruction can be
expensive, one approach to improve query response time is to use reconstruction
views - materialized XML subtrees of an XML document, whose nodes are
frequently accessed by XML queries. For this approach to be efficient, the
principal requirement is a framework for view selection. In this work, we are
the first to formalize and study the problem of XML reconstruction view
selection. The input is a tree $T$, in which every node $i$ has a size $s_i$
and profit $p_i$, and the size limitation $C$. The target is to find a subset
of subtrees rooted at nodes $i_1, \dots, i_k$ respectively such that
$s_{i_1} + \dots + s_{i_k} \le C$, and $p_{i_1} + \dots + p_{i_k}$ is maximal.
Furthermore, there is no overlap between any two subtrees selected in the
solution. We prove that this problem is NP-hard and present a fully
polynomial-time approximation scheme (FPTAS) as a solution.
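The selection problem stated in this abstract can be illustrated with a small exact solver. This is only a sketch under stated assumptions, not the FPTAS the authors present: it assumes sizes are non-negative integers, that `size[v]` is the cost of materializing the whole subtree rooted at `v`, and that the tree is small enough for plain recursion. The names `max_view_profit`, `children`, `size`, and `profit` are hypothetical and chosen for illustration.

```python
from typing import Dict, Hashable, List


def max_view_profit(
    children: Dict[Hashable, List[Hashable]],
    size: Dict[Hashable, int],
    profit: Dict[Hashable, float],
    root: Hashable,
    capacity: int,
) -> float:
    """Best total profit of non-overlapping subtrees whose sizes sum to <= capacity.

    Pseudo-polynomial dynamic program (assumes integer sizes); an illustrative
    exact method, not the approximation scheme from the paper.
    """

    def best(v: Hashable) -> List[float]:
        # table[c] = best profit obtainable inside the subtree of v with budget c
        table = [0.0] * (capacity + 1)
        for child in children.get(v, []):
            child_table = best(child)
            # Knapsack-style merge: split budget c between earlier children and this child.
            table = [
                max(table[c - k] + child_table[k] for k in range(c + 1))
                for c in range(capacity + 1)
            ]
        # Alternative: select the subtree rooted at v itself. That choice rules out
        # every descendant, so it competes with the merged children table.
        for c in range(size[v], capacity + 1):
            table[c] = max(table[c], profit[v])
        return table

    return best(root)[capacity]
```

Selecting a node excludes all of its descendants, which is how the no-overlap requirement is enforced here: at each node the solver takes the better of "this subtree alone" and "the best combination drawn from the children's subtrees".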
A Join Index for XML Data Warehouses
XML data warehouses form an interesting basis for decision-support
applications that exploit complex data. However, native-XML database management
systems (DBMSs) currently offer limited performance, and it is necessary to
research ways to optimize them. In this paper, we propose a new join index
that is specifically adapted to the multidimensional architecture of XML
warehouses. It eliminates join operations while preserving the information
contained in the original warehouse. A theoretical study and experimental
results demonstrate the efficiency of our join index. They also show that
native XML DBMSs can compete with XML-compatible, relational DBMSs when
warehousing and analyzing XML data.
Comment: 2008 International Conference on Information Resources Management
(Conf-IRM 08), Niagara Falls, Canada (2008)
Efficient XML Keyword Search based on DAG-Compression
In contrast to XML query languages such as XPath, which require knowledge of
the query language as well as of the document structure, keyword search is open
to anybody. As the size of XML sources grows rapidly, the need for efficient
search indices on XML data that support keyword search increases. In this
paper, we present an approach to XML keyword search that is based on the DAG
of the XML data, where repeated substructures are considered only once and
therefore have to be searched only once. As our performance evaluation shows,
this DAG-based extension of the set intersection search algorithm [1], [2] can
make searches on large documents more than twice as fast as the XML-based
approach. Additionally, we utilize a smaller index, i.e., we consume less main
memory to compute the results.
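The idea of considering repeated substructures only once can be sketched by sharing identical subtrees. The following is a minimal illustration, not the index or search algorithm from the paper: it assumes an XML document is modeled as nested `(label, children)` tuples and ignores text content and attributes. The names `compress_to_dag` and `count_tree_nodes` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed toy model of an XML element: (label, [child elements]).
DagNode = Tuple[str, Tuple[int, ...]]


def compress_to_dag(tree) -> Tuple[int, List[DagNode]]:
    """Return (root_id, nodes), where nodes[i] = (label, ids of its children).

    Structurally identical subtrees receive the same id, so each distinct
    substructure is stored exactly once.
    """
    ids: Dict[DagNode, int] = {}
    nodes: List[DagNode] = []

    def visit(node) -> int:
        label, children = node
        # Children are compressed first, so equal subtrees yield equal keys.
        key = (label, tuple(visit(child) for child in children))
        if key not in ids:
            ids[key] = len(nodes)
            nodes.append(key)
        return ids[key]

    return visit(tree), nodes


def count_tree_nodes(tree) -> int:
    """Number of nodes in the uncompressed tree, for comparison."""
    _, children = tree
    return 1 + sum(count_tree_nodes(child) for child in children)
```

For a document with two identical `book` elements, `len(nodes)` is smaller than `count_tree_nodes(tree)`, since both occurrences map to one shared DAG node; a search that visits each DAG node once would then avoid repeating work on the duplicate.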
Translation of Heterogeneous Databases into RDF, and Application to the Construction of a SKOS Taxonomical Reference
While the data deluge accelerates, most of the data produced remains locked in deep Web databases. For the linked open data to benefit from the potential represented by this huge amount of data, it is crucial to come up with solutions to expose heterogeneous databases as linked data. The xR2RML mapping language is an endeavor towards this goal: it is designed to map various types of databases to RDF, by flexibly adapting to heterogeneous query languages and data models while remaining free from any specific language. It extends R2RML, the W3C recommendation for the mapping of relational databases to RDF, and relies on RML for the handling of various data formats. In this paper we present xR2RML, we analyse data models of several modern databases as well as the format in which query results are returned, and we show how xR2RML translates any result data element into RDF, relying on existing languages such as XPath and JSONPath when necessary. We illustrate some features of xR2RML such as the generation of RDF collections and containers, and the ability to deal with mixed data formats. We also describe a real-world use case in which we applied xR2RML to build a SKOS thesaurus aimed at supporting studies on History of Zoology, Archaeozoology and Conservation Biology.
Efficient Incremental Breadth-Depth XML Event Mining
Many applications log a large amount of events continuously. Extracting
interesting knowledge from logged events is an emerging active research area in
data mining. In this context, we propose an approach for mining frequent events
and association rules from logged events in XML format. This approach is
composed of two main phases: (I) constructing a novel tree structure called
Frequency XML-based Tree (FXT), which contains the frequency of events to be
mined; (II) querying the constructed FXT using XQuery to discover frequent
itemsets and association rules. The FXT is constructed with a single pass over
logged data. We implement the proposed algorithm and study various performance
issues. The performance study shows that the algorithm is efficient, for both
constructing the FXT and discovering association rules.