3,545 research outputs found
Decentralized provenance-aware publishing with nanopublications
Publication and archival of scientific results is still commonly considered the responsability of classical publishing companies. Classical forms of publishing, however, which center around printed narrative articles, no longer seem well-suited in the digital age. In particular, there exist currently no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. In this article, we propose to design scientific data publishing as a web-based bottom-up process, without top-down control of central authorities such as publishing companies. Based on a novel combination of existing concepts and technologies, we present a server network to decentrally store and archive data in the form of nanopublications, an RDF-based format to represent scientific data. We show how this approach allows researchers to publish, retrieve, verify, and recombine datasets of nanopublications in a reliable and trustworthy manner, and we argue that this architecture could be used as a low-level data publication layer to serve the Semantic Web in general. Our evaluation of the current network shows that this system is efficient and reliable
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies for optimized
solution to a specific real world problem, big data system are not an exception
to any such rule. As far as the storage aspect of any big data system is
concerned, the primary facet in this regard is a storage infrastructure and
NoSQL seems to be the right technology that fulfills its requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents
feature and use case analysis and comparison of the four main data models
namely document oriented, key value, graph and wide column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings forth second facet of big data storage, big data file
formats, into picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage and its challenges and future prospects have
also been discussed
Toward Self-Organising Service Communities
This paper discusses a framework in which catalog service communities are built, linked for interaction, and constantly monitored and adapted over time. A catalog service community (represented as a peer node in a peer-to-peer network) in our system can be viewed as domain specific data integration mediators representing the domain knowledge and the registry information. The query routing among communities is performed to identify a set of data sources that are relevant to answering a given query. The system monitors the interactions between the communities to discover patterns that may lead to restructuring of the network (e.g., irrelevant peers removed, new relationships created, etc.)
Webbox+Page Blossom: exploring design for AKTive data interaction
We give away our data to multiple data services without, for the most part, being able to get that data back to reuse in any other way, leaving us, at best, to re-find, re-cover, retype, remember and re-manage this material. In this work in progress, we hypothesize that if we facilitate easy interaction to store, access and reuse our personal, social and public data, we will not only decrease time spent to recreate it for multiple walled data contexts, but in particular, we will develop novel interactions for new kinds of knowledge building. To facilitate exploration of this hypothesis, we propose Page Blossom an exemplar of such dynamic data interaction that is based on data reuse via our open data platform Webbox + Active (active knowledge technology) lenses
Secure data sharing and processing in heterogeneous clouds
The extensive cloud adoption among the European Public Sector Players empowered them to own and operate a range of cloud infrastructures. These deployments vary both in the size and capabilities, as well as in the range of employed technologies and processes. The public sector, however, lacks the necessary technology to enable effective, interoperable and secure integration of a multitude of its computing clouds and services. In this work we focus on the federation of private clouds and the approaches that enable secure data sharing and processing among the collaborating infrastructures and services of public entities. We investigate the aspects of access control, data and security policy languages, as well as cryptographic approaches that enable fine-grained security and data processing in semi-trusted environments. We identify the main challenges and frame the future work that serve as an enabler of interoperability among heterogeneous infrastructures and services. Our goal is to enable both security and legal conformance as well as to facilitate transparency, privacy and effectivity of private cloud federations for the public sector needs. © 2015 The Authors
- …