    Towards Analytics Aware Ontology Based Access to Static and Streaming Data (Extended Version)

    Real-time analytics that requires integration and aggregation of heterogeneous and distributed streaming and static data is a typical task in many industrial scenarios, such as turbine diagnostics at Siemens. The OBDA approach has great potential to facilitate such tasks; however, it has a number of limitations in dealing with analytics that restrict its use in important industrial applications. Based on our experience with Siemens, we argue that, in order to overcome those limitations, OBDA should be extended to become analytics, source, and cost aware. In this work we propose such an extension. In particular, we propose an ontology, mapping, and query language for OBDA in which aggregate and other analytical functions are first-class citizens. Moreover, we develop query optimisation techniques that allow analytical tasks over static and streaming data to be processed efficiently. We implement our approach in a system and evaluate it with Siemens turbine data.

    Accessing RDF(S) data resources in service-based Grid infrastructures

    We describe the results of the RDF(S) activity within the Open Grid Forum (OGF, http://www.ogf.org) Database Access and Integration Services (DAIS) Working Group (http://forge.gridforum.org/projects/dais-wg), whose objective is to develop standard service-based grid access mechanisms for data expressed in RDF and RDF Schema. We produce two specifications: one focused on the provision of SPARQL querying capabilities for accessing RDF data, and the other on a set of RDF Schema ontology handling primitives for creating, retrieving, updating, and deleting RDF data. In this paper we present a set of use cases that justify this work and an overview of these specifications, which will enter the editorial process at OGF25. We conclude by outlining the future work to be carried out in the context of this standardization process.

    Bench-Ranking: A Prescriptive Analysis Method for Queries over Large Knowledge Graphs

    Leveraging relational Big Data (BD) processing frameworks to process large knowledge graphs has generated great interest in optimizing query performance. Modern BD systems are, however, complicated data systems whose configurations notably affect performance. Benchmarking different frameworks and configurations provides the community with best practices for better performance. However, most of these benchmarking efforts can be classified only as descriptive and diagnostic analytics. Moreover, there is no standard for comparing these benchmarks based on quantitative ranking techniques. Furthermore, designing mature pipelines for processing big graphs entails additional design decisions that emerge with the non-native (relational) graph processing paradigm. Those decisions, e.g., the choice of relational schema, partitioning technique, and storage format, cannot be made automatically. In this thesis, we discuss how our work fills this research gap. In particular, we first show the impact of those design decisions' trade-offs on the replicability of BD systems' performance when querying large knowledge graphs. We also show the limitations of descriptive and diagnostic analyses of BD frameworks' performance for querying large graphs. We then investigate how to enable prescriptive analytics via ranking functions and multi-dimensional optimization techniques (called "Bench-Ranking"). This approach abstracts away the complexity of descriptive performance analysis, guiding the practitioner directly to actionable, informed decisions. https://www.ester.ee/record=b553332

    Linked Data - the story so far

    The term “Linked Data” refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions: the Web of Data. In this article, the authors present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. They describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.

    RDF Querying

    Reactive Web systems, Web services, and Web-based publish/subscribe systems communicate events as XML messages and in many cases require composite event detection: it is not sufficient to react to single event messages; events have to be considered in relation to other events that are received over time. Emphasizing language design and formal semantics, we describe the rule-based query language XChangeEQ for detecting composite events. XChangeEQ is designed to completely cover and integrate the four complementary querying dimensions: event data, event composition, temporal relationships, and event accumulation. Semantics are provided as model and fixpoint theories; while this is an established approach for rule languages, it has not been applied to event queries before.
