26,598 research outputs found
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies for optimized
solution to a specific real world problem, big data system are not an exception
to any such rule. As far as the storage aspect of any big data system is
concerned, the primary facet in this regard is a storage infrastructure and
NoSQL seems to be the right technology that fulfills its requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents
feature and use case analysis and comparison of the four main data models
namely document oriented, key value, graph and wide column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings forth second facet of big data storage, big data file
formats, into picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage and its challenges and future prospects have
also been discussed
Towards a Cloud-Based Service for Maintaining and Analyzing Data About Scientific Events
We propose the new cloud-based service OpenResearch for managing and
analyzing data about scientific events such as conferences and workshops in a
persistent and reliable way. This includes data about scientific articles,
participants, acceptance rates, submission numbers, impact values as well as
organizational details such as program committees, chairs, fees and sponsors.
OpenResearch is a centralized repository for scientific events and supports
researchers in collecting, organizing, sharing and disseminating information
about scientific events in a structured way. An additional feature currently
under development is the possibility to archive web pages along with the
extracted semantic data in order to lift the burden of maintaining new and old
conference web sites from public research institutions. However, the main
advantage is that this cloud-based repository enables a comprehensive analysis
of conference data. Based on extracted semantic data, it is possible to
determine quality estimations, scientific communities, research trends as well
the development of acceptance rates, fees, and number of participants in a
continuous way complemented by projections into the future. Furthermore, data
about research articles can be systematically explored using a content-based
analysis as well as citation linkage. All data maintained in this
crowd-sourcing platform is made freely available through an open SPARQL
endpoint, which allows for analytical queries in a flexible and user-defined
way.Comment: A completed version of this paper had been accepted in SAVE-SD
workshop 2017 at WWW conferenc
State of The Art and Hot Aspects in Cloud Data Storage Security
Along with the evolution of cloud computing and cloud storage towards matu-
rity, researchers have analyzed an increasing range of cloud computing security
aspects, data security being an important topic in this area. In this paper, we
examine the state of the art in cloud storage security through an overview of
selected peer reviewed publications. We address the question of defining cloud
storage security and its different aspects, as well as enumerate the main vec-
tors of attack on cloud storage. The reviewed papers present techniques for key
management and controlled disclosure of encrypted data in cloud storage, while
novel ideas regarding secure operations on encrypted data and methods for pro-
tection of data in fully virtualized environments provide a glimpse of the toolbox
available for securing cloud storage. Finally, new challenges such as emergent
government regulation call for solutions to problems that did not receive enough
attention in earlier stages of cloud computing, such as for example geographical
location of data. The methods presented in the papers selected for this review
represent only a small fraction of the wide research effort within cloud storage
security. Nevertheless, they serve as an indication of the diversity of problems
that are being addressed
SWISH: SWI-Prolog for Sharing
Recently, we see a new type of interfaces for programmers based on web
technology. For example, JSFiddle, IPython Notebook and R-studio. Web
technology enables cloud-based solutions, embedding in tutorial web pages,
atractive rendering of results, web-scale cooperative development, etc. This
article describes SWISH, a web front-end for Prolog. A public website exposes
SWI-Prolog using SWISH, which is used to run small Prolog programs for
demonstration, experimentation and education. We connected SWISH to the
ClioPatria semantic web toolkit, where it allows for collaborative development
of programs and queries related to a dataset as well as performing maintenance
tasks on the running server and we embedded SWISH in the Learn Prolog Now!
online Prolog book.Comment: International Workshop on User-Oriented Logic Programming (IULP
2015), co-located with the 31st International Conference on Logic Programming
(ICLP 2015), Proceedings of the International Workshop on User-Oriented Logic
Programming (IULP 2015), Editors: Stefan Ellmauthaler and Claudia Schulz,
pages 99-113, August 201
- …