S-Store: Streaming Meets Transaction Processing
Stream processing addresses the needs of real-time applications. Transaction
processing addresses the coordination and safety of short atomic computations.
Heretofore, these two modes of operation existed in separate, stove-piped
systems. In this work, we attempt to fuse the two computational paradigms in a
single system called S-Store. In this way, S-Store can simultaneously
accommodate OLTP and streaming applications. We present a simple transaction
model for streams that integrates seamlessly with a traditional OLTP system. We
chose to build S-Store as an extension of H-Store, an open-source, in-memory,
distributed OLTP database system. By implementing S-Store in this way, we can
make use of the transaction processing facilities that H-Store already
supports, and we can concentrate on the additional implementation features that
are needed to support streaming. Similar implementations could be done using
other main-memory OLTP platforms. We show that we can actually achieve higher
throughput for streaming workloads in S-Store than an equivalent deployment in
H-Store alone. We also show how this can be achieved within H-Store with the
addition of a modest amount of new functionality. Furthermore, we compare
S-Store to two state-of-the-art streaming systems, Spark Streaming and Storm,
and show how S-Store matches and sometimes exceeds their performance while
providing stronger transactional guarantees.
Distributed data mining in grid computing environments
Computing-intensive data mining over inherently Internet-wide distributed data, referred to as Distributed Data Mining (DDM), calls for the support of a powerful Grid with an effective scheduling framework. DDM often shares the computing paradigm of local processing and global synthesizing. Because it involves every phase of the Data Mining (DM) process, the DDM workflow is very complex and can be modelled only by a Directed Acyclic Graph (DAG) with multiple data entries. Motivated by the need for a practical solution to the Grid scheduling problem for the DDM workflow, this paper proposes a novel two-phase scheduling framework, comprising External Scheduling and Internal Scheduling, on a two-level Grid architecture (InterGrid, IntraGrid). Currently a DM IntraGrid, named DMGCE (Data Mining Grid Computing Environment), has been developed with a dynamic scheduling framework for competitive DAGs in a heterogeneous computing environment. This system is implemented in an established Multi-Agent System (MAS) environment, in which the reuse of existing DM algorithms is achieved by encapsulating them into agents. Practical classification problems from oil well logging analysis are used to measure the system performance. The detailed experiment procedure and result analysis are also discussed in this paper.
CYCLONE Unified Deployment and Management of Federated, Multi-Cloud Applications
Various Cloud layers have to work in concert in order to manage and deploy
complex multi-cloud applications, executing sophisticated workflows for Cloud
resource deployment, activation, adjustment, interaction, and monitoring. While
there are ample solutions for managing individual Cloud aspects (e.g. network
controllers, deployment tools, and application security software), there are no
well-integrated suites for managing an entire multi-cloud environment with
multiple providers and deployment models. This paper presents the CYCLONE
architecture that integrates a number of existing solutions to create an open,
unified, holistic Cloud management platform for multi-cloud applications,
tailored to the needs of research organizations and SMEs. It discusses major
challenges in providing a network and security infrastructure for the
Intercloud and concludes with a demonstration of how the architecture is
implemented in a real-life bioinformatics use case.
Designing Traceability into Big Data Systems
Providing an appropriate level of accessibility and traceability to data or
process elements (so-called Items) in large volumes of data, often
Cloud-resident, is an essential requirement in the Big Data era.
Enterprise-wide data systems need to be designed from the outset to support
usage of such Items across the spectrum of business use rather than from any
specific application view. The design philosophy advocated in this paper is to
drive the design process using a so-called description-driven approach which
enriches models with meta-data and description and focuses the design process
on Item re-use, thereby promoting traceability. Details are given of the
description-driven design of big data systems at CERN, in health informatics
and in business process management. Evidence is presented that the approach
leads to design simplicity and consequent ease of management thanks to loose
typing and the adoption of a unified approach to Item management and usage.
Comment: 10 pages; 6 figures in Proceedings of the 5th Annual International
Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015), Singapore
July 2015. arXiv admin note: text overlap with arXiv:1402.5764,
arXiv:1402.575
Personalizing Situated Workflows for Pervasive Healthcare Applications
In this paper, we present an approach where a workflow system is combined with a policy-based framework for the specification and enforcement of policies for healthcare applications. In our approach, workflows are used to capture entities' responsibilities and to assist entities in fulfilling them. The policy-based framework allows us to express authorisation policies that define the rights that entities have in the system, and event-condition-action (ECA) policies that are used to adapt the system to the actual situation. Authorisations will often depend on the context in which patients' care takes place, and our policies support predicates that reflect the environment. ECA policies capture events that reflect the current state of the environment and can perform actions to adapt the workflow execution accordingly. We show how the approach can be used for Edema treatment and how fine-grained authorisation and ECA policies are expressed and used.
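As a rough illustration of the event-condition-action idea described in this abstract, the sketch below shows one way such a policy might be represented and applied to a workflow. It is not the paper's policy language or implementation: the names `EcaPolicy` and `dispatch`, the event name, the context field, and the threshold are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Every name here (EcaPolicy, dispatch, "weight_reading", "weight_gain_kg",
# the 2.0 threshold) is a hypothetical placeholder, not taken from the paper.


@dataclass
class EcaPolicy:
    event: str                             # event type that triggers the policy
    condition: Callable[[Dict], bool]      # predicate over the current context
    action: Callable[[List[str]], None]    # adapts the workflow in place


def dispatch(event: str, context: Dict, workflow: List[str],
             policies: List[EcaPolicy]) -> None:
    """Run the action of each policy whose event matches and whose condition holds."""
    for policy in policies:
        if policy.event == event and policy.condition(context):
            policy.action(workflow)


# Hypothetical policy: when a reading exceeds a made-up threshold,
# append a review task to the workflow.
policies = [
    EcaPolicy(
        event="weight_reading",
        condition=lambda ctx: ctx["weight_gain_kg"] > 2.0,
        action=lambda wf: wf.append("clinician_review"),
    )
]

workflow = ["record_weight", "take_medication"]
dispatch("weight_reading", {"weight_gain_kg": 2.5}, workflow, policies)
print(workflow)
```

Keeping the condition as a predicate over a context dictionary mirrors the abstract's point that policies depend on the environment; a real system would presumably separate authorisation policies from ECA policies, which this sketch does not attempt.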