2,506 research outputs found

    Knowledge-infused and Consistent Complex Event Processing over Real-time and Persistent Streams

    Full text link
    Emerging applications in Internet of Things (IoT) and Cyber-Physical Systems (CPS) present novel challenges to Big Data platforms for performing online analytics. Ubiquitous sensors from IoT deployments are able to generate data streams at high velocity, that include information from a variety of domains, and accumulate to large volumes on disk. Complex Event Processing (CEP) is recognized as an important real-time computing paradigm for analyzing continuous data streams. However, existing work on CEP is largely limited to relational query processing, exposing two distinctive gaps for query specification and execution: (1) infusing the relational query model with higher level knowledge semantics, and (2) seamless query evaluation across temporal spaces that span past, present and future events. These allow accessible analytics over data streams having properties from different disciplines, and help span the velocity (real-time) and volume (persistent) dimensions. In this article, we introduce a Knowledge-infused CEP (X-CEP) framework that provides domain-aware knowledge query constructs along with temporal operators that allow end-to-end queries to span across real-time and persistent streams. We translate this query model to efficient query execution over online and offline data streams, proposing several optimizations to mitigate the overheads introduced by evaluating semantic predicates and in accessing high-volume historic data streams. The proposed X-CEP query model and execution approaches are implemented in our prototype semantic CEP engine, SCEPter. We validate our query model using domain-aware CEP queries from a real-world Smart Power Grid application, and experimentally analyze the benefits of our optimizations for executing these queries, using event streams from a campus-microgrid IoT deployment.Comment: 34 pages, 16 figures, accepted in Future Generation Computer Systems, October 27, 201

    Reasoning & Querying – State of the Art

    Get PDF
    Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet where keyword search is used in many applications, e.g. search engines, has familiarized casual users with using keyword queries to retrieve information on the internet. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming at enabling simple querying of semi-structured data, which is relevant e.g. in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF

    When Things Matter: A Data-Centric View of the Internet of Things

    Full text link
    With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, typically produced in dynamic and volatile environments, which is not only extremely large in scale and volume, but also noisy, and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed

    Mapping Large Scale Research Metadata to Linked Data: A Performance Comparison of HBase, CSV and XML

    Full text link
    OpenAIRE, the Open Access Infrastructure for Research in Europe, comprises a database of all EC FP7 and H2020 funded research projects, including metadata of their results (publications and datasets). These data are stored in an HBase NoSQL database, post-processed, and exposed as HTML for human consumption, and as XML through a web service interface. As an intermediate format to facilitate statistical computations, CSV is generated internally. To interlink the OpenAIRE data with related data on the Web, we aim at exporting them as Linked Open Data (LOD). The LOD export is required to integrate into the overall data processing workflow, where derived data are regenerated from the base data every day. We thus faced the challenge of identifying the best-performing conversion approach.We evaluated the performances of creating LOD by a MapReduce job on top of HBase, by mapping the intermediate CSV files, and by mapping the XML output.Comment: Accepted in 0th Metadata and Semantics Research Conferenc

    A Nine Month Progress Report on an Investigation into Mechanisms for Improving Triple Store Performance

    No full text
    This report considers the requirement for fast, efficient, and scalable triple stores as part of the effort to produce the Semantic Web. It summarises relevant information in the major background field of Database Management Systems (DBMS), and provides an overview of the techniques currently in use amongst the triple store community. The report concludes that for individuals and organisations to be willing to provide large amounts of information as openly-accessible nodes on the Semantic Web, storage and querying of the data must be cheaper and faster than it is currently. Experiences from the DBMS field can be used to maximise triple store performance, and suggestions are provided for lines of investigation in areas of storage, indexing, and query optimisation. Finally, work packages are provided describing expected timetables for further study of these topics
    corecore