95 research outputs found

    Framework for Live Synchronization of RDF Views of Relational Data

    This demo presents a framework for the live synchronization of an RDF view defined on top of a relational database. In the proposed framework, rules are responsible for computing and publishing the changeset required for the RDB-RDF view to stay synchronized with the relational database. The computed changesets are then used for the incremental maintenance of the RDB-RDF views as well as application views. The demo is based on the LinkedBrainz Live tool, developed to validate the proposed framework.
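
As a rough illustration of the changeset idea described in this abstract, the sketch below treats an RDF view as a set of triples and a changeset as the pair of triples added and removed between two states. The names `Changeset`, `compute_changeset`, and `apply_changeset` are hypothetical and are not taken from the LinkedBrainz Live tool. The framework itself derives changesets from rules over relational updates, whereas this sketch simply diffs two snapshots.

```python
from typing import FrozenSet, NamedTuple, Set, Tuple

# A triple is (subject, predicate, object); plain strings are assumed for brevity.
Triple = Tuple[str, str, str]


class Changeset(NamedTuple):
    """Hypothetical changeset: triples to add and triples to remove."""
    added: FrozenSet[Triple]
    removed: FrozenSet[Triple]


def compute_changeset(before: Set[Triple], after: Set[Triple]) -> Changeset:
    """Diff two states of a view into added and removed triples."""
    return Changeset(frozenset(after - before), frozenset(before - after))


def apply_changeset(view: Set[Triple], cs: Changeset) -> Set[Triple]:
    """Incrementally maintain a copy of the view without recomputing it."""
    return (view - cs.removed) | cs.added


# Example: an artist is renamed in the underlying database.
before = {("artist:1", "name", "Old Name"), ("artist:1", "type", "Artist")}
after = {("artist:1", "name", "New Name"), ("artist:1", "type", "Artist")}

changeset = compute_changeset(before, after)
replica = apply_changeset(set(before), changeset)
```

Only the renamed triple travels in the changeset, so a replica of the view can be brought up to date without reloading the unchanged triples.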

    Incremental View Maintenance for Property Graph Queries

    This paper discusses the challenges of incremental view maintenance for property graph queries. We select a subset of property graph queries and present an approach that uses nested relational algebra to allow incremental evaluation.

    Graph Summarization

    The continuous and rapid growth of highly interconnected datasets, which are both voluminous and complex, calls for the development of adequate processing and analytical techniques. One method for condensing and simplifying such datasets is graph summarization. It denotes a series of application-specific algorithms designed to transform graphs into more compact representations while preserving structural patterns, query answers, or specific property distributions. As this problem is common to several areas studying graph topologies, different approaches, such as clustering, compression, sampling, or influence detection, have been proposed, primarily based on statistical and optimization methods. The focus of our chapter is to pinpoint the main graph summarization methods, with particular attention to the most recent approaches and novel research trends on this topic not yet covered by previous surveys. Comment: To appear in the Encyclopedia of Big Data Technologies.

    Social Network Data Management

    With the increasing usage of online social networks and the semantic web's graph-structured RDF framework, and the rising adoption of networks in various fields from biology to social science, there is a rapidly growing need for indexing, querying, and analyzing massive graph-structured data. Facebook has amassed over 500 million users, creating huge volumes of highly connected data. Governments have made RDF datasets containing billions of triples available to the public. In the life sciences, researchers have started to connect disparate data sets of research results into one giant network of valuable information. Clearly, networks are becoming increasingly popular and growing rapidly in size, requiring scalable solutions for network data management. This thesis focuses on the following aspects of network data management. We present a hierarchical index structure for external memory storage of network data that aims to maximize data locality. We propose efficient algorithms to answer subgraph matching queries against network databases and discuss effective pruning strategies to improve performance. We show how adaptive cost models can speed up subgraph matching query answering by assigning budgets to index retrieval operations and adjusting the query plan while executing. We develop a cloud-oriented social network database, COSI, which handles massive network datasets too large for a single computer by partitioning the data across multiple machines and achieving high-performance query answering through asynchronous parallelization and cluster-aware heuristics. Tracking multiple standing queries against a social network database is much faster with our novel multi-view maintenance algorithm, which exploits common substructures between queries. To capture uncertainty inherent in social network querying, we define probabilistic subgraph matching queries over deterministic graph data and propose algorithms to answer them efficiently. Finally, we introduce a general relational machine learning framework and rule-based language, Probabilistic Soft Logic, to learn from and probabilistically reason about social network data, and describe applications to information integration and information fusion.

    Self Maintenance of Materialized XQuery Views via Query Containment and Re-Writing

    In recent years, XML, the eXtensible Markup Language, has become the de facto standard for publishing and exchanging information on the web and in enterprise data integration systems. Materialized views are often used in information integration systems to present a unified schema for efficient querying of distributed and possibly heterogeneous data sources. Along similar lines, ACE-XQ, an XQuery-based semantic caching system, shows the significant performance gains achieved by caching query results (as materialized views) and using these materialized views along with query containment techniques for answering future queries over distributed XML data sources. To keep data in these materialized views of ACE-XQ up to date, the view must be maintained, i.e., whenever the base data changes, the corresponding cached data in the materialized view must also be updated. This thesis builds on the query containment ideas of ACE-XQ and proposes an efficient approach for self-maintenance of materialized views. Our experimental results illustrate the significant performance improvement achieved by this strategy over view re-computation for a variety of situations.

    Archaeological site monitoring: UAV photogrammetry can be an answer

    During archaeological excavations it is important to monitor the newly excavated areas and findings day by day in order to be able to plan future excavation activities. At present, this daily activity is usually performed by using total stations, which survey the changes of the archaeological site: the surveyors are asked to produce day-by-day draft plans and sections which allow archaeologists to plan their future activities. The survey is carried out during the excavations or just at the end of every working day, and drawings have to be produced as soon as possible in order to allow the comprehension of the work done and to plan the activities for the following day. By using this technique, all the measurements, even those not necessary for the day after, have to be acquired in order to avoid a 'loss of memory'. A possible alternative to this traditional approach is aerial photogrammetry, if the images can be acquired quickly and from a distance that guarantees the necessary accuracy of a few centimeters. Today the use of UAVs (Unmanned Aerial Vehicles) can be considered a proven technology able to acquire images at distances ranging from 4 m up to 20 m, and therefore a possible monitoring system to provide the necessary information to the archaeologists day by day. The control network, usually present at each archaeological site, can give the stable control points useful for orienting a photogrammetric block acquired by using a UAV equipped with a calibrated digital camera and a navigation control system able to drive the aircraft following a pre-planned flight scheme. Modern digital photogrammetric software can solve for the block orientation and generate a DSM automatically, allowing rapid orthophoto generation and the possibility of producing sections and plans. The present paper describes a low-cost UAV system built by the research group of the Politecnico di Torino and tested on a Roman villa archaeological site located in Aquileia (Italy), a well-known UNESCO WHL site. The results of automatic orientation and orthophoto production are described in terms of their accuracy and the completeness of information guaranteed for archaeological site excavation management.

    K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources

    The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, on-the-fly integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS, which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear winner. Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application.