
    GeneaLog: Fine-Grained Data Streaming Provenance at the Edge

    Fine-grained data provenance in data streaming allows linking each result tuple back to the source data that contributed to it, which is beneficial for many applications (e.g., to find the conditions triggering a security- or safety-related alert). Further, when data transmission or storage has to be minimized, as in edge computing and cyber-physical systems, it can help in identifying the source data to be prioritized. The memory and processing costs of fine-grained data provenance, possibly affordable on high-end servers, can be prohibitive for the resource-constrained devices deployed in edge computing and cyber-physical systems. Motivated by this challenge, we present GeneaLog, a novel fine-grained data provenance technique for data streaming applications. Leveraging the logical dependencies of the data, GeneaLog takes advantage of cross-layer properties of the software stack and incurs a minimal, constant-size per-tuple overhead. Furthermore, it allows for a modular and efficient algorithmic implementation using only standard data streaming operators. This is particularly useful for distributed streaming applications, since the provenance processing can be executed at separate nodes, orthogonal to the data processing. We evaluate an implementation of GeneaLog using vehicular and smart grid applications, confirming that it efficiently captures fine-grained provenance data with minimal overhead.
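
    The general idea of attaching a small provenance annotation to each tuple and propagating it through standard streaming operators can be illustrated with a minimal sketch. This is not GeneaLog's actual implementation: the ProvTuple class, the operator functions, and the simple backward-pointer scheme below are illustrative assumptions only.

    # Minimal sketch of fine-grained per-tuple provenance propagation.
    # NOT GeneaLog's implementation: names, fields, and the backward-pointer
    # scheme here are illustrative assumptions.
    from dataclasses import dataclass
    from typing import Any, Tuple

    @dataclass(frozen=True)
    class ProvTuple:
        """A stream tuple carrying a small provenance annotation:
        references to the immediate input tuples that produced it."""
        value: Any
        sources: Tuple["ProvTuple", ...] = ()

    def source(value) -> ProvTuple:
        # Source tuples have no predecessors; they are the provenance roots.
        return ProvTuple(value)

    def op_map(t: ProvTuple, fn) -> ProvTuple:
        # A map/transform operator keeps a pointer back to its single input.
        return ProvTuple(fn(t.value), (t,))

    def op_join(left: ProvTuple, right: ProvTuple, fn) -> ProvTuple:
        # A binary operator (e.g., a join) records both contributing inputs.
        return ProvTuple(fn(left.value, right.value), (left, right))

    def lineage(t: ProvTuple):
        """Walk the backward pointers to recover the source tuples
        that contributed to a result (its fine-grained provenance)."""
        if not t.sources:
            return [t.value]
        out = []
        for s in t.sources:
            out.extend(lineage(s))
        return out

    if __name__ == "__main__":
        a = source({"sensor": "A", "reading": 41})
        b = source({"sensor": "B", "reading": 97})
        alert = op_join(op_map(a, lambda v: v["reading"]),
                        op_map(b, lambda v: v["reading"]),
                        lambda x, y: {"alert": x + y > 120})
        print(alert.value)     # {'alert': True}
        print(lineage(alert))  # the two source readings behind the alert

    In this sketch the provenance logic is just another layer of operators over the data-processing ones, which is the kind of separation the abstract refers to when it notes that provenance processing can run at separate nodes, orthogonal to the data processing.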

    A Survey of Scholarly Data: From Big Data Perspective

    Recently, organizations and governments have shifted their focus towards the digitization of academic and technical documents, adding a new facet to the concept of digital libraries. The volume, variety, and velocity of the generated data satisfy the big data definition, as a result of which this scholarly reserve is popularly referred to as big scholarly data. To facilitate data analytics for big scholarly data, suitable architectures and services need to be developed. The evolving nature of research problems has made them essentially interdisciplinary, so there is a growing demand for scholarly applications such as collaborator discovery, expert finding, and research recommendation systems, among others. This paper investigates current trends and identifies existing challenges in the development of a big scholarly data platform, with specific focus on directions for future research, and maps them to the different phases of the big data lifecycle.

    Provenance Research Issues and Challenges in the Big Data Era

    Provenance of big data is a hot topic in the database and data mining research communities. Essentially, provenance is the process of tracing the lineage and derivation of data and data objects, and it plays a major role in database management systems as well as in workflow management systems and distributed systems. Despite this, research on big data provenance is still in its embryonic phase, and much work remains to be done in this area. Inspired by these considerations, in this paper we provide an overview of relevant issues and challenges in the context of big data provenance research, while also highlighting possible future efforts within these research directions.