Search CORE

6,293 research outputs found

S-Store: Streaming Meets Transaction Processing

Author: Aslantas Cansu
Cetintemel Ugur
Du Jiang
Kraska Tim
Madden Samuel
Maier David
Meehan John
Pavlo Andrew
Stonebraker Michael
Tatbul Nesime
Tufte Kristin
Wang Hao
Zdonik Stan
Publication venue
Publication date: 01/01/2015
Field of study

Stream processing addresses the needs of real-time applications. Transaction processing addresses the coordination and safety of short atomic computations. Heretofore, these two modes of operation existed in separate, stove-piped systems. In this work, we attempt to fuse the two computational paradigms in a single system called S-Store. In this way, S-Store can simultaneously accommodate OLTP and streaming applications. We present a simple transaction model for streams that integrates seamlessly with a traditional OLTP system. We chose to build S-Store as an extension of H-Store, an open-source, in-memory, distributed OLTP database system. By implementing S-Store in this way, we can make use of the transaction processing facilities that H-Store already supports, and we can concentrate on the additional implementation features that are needed to support streaming. Similar implementations could be done using other main-memory OLTP platforms. We show that we can actually achieve higher throughput for streaming workloads in S-Store than an equivalent deployment in H-Store alone. We also show how this can be achieved within H-Store with the addition of a modest amount of new functionality. Furthermore, we compare S-Store to two state-of-the-art streaming systems, Spark Streaming and Storm, and show how S-Store matches and sometimes exceeds their performance while providing stronger transactional guarantees

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

PDXScholar (Portland State University)

Distributed data mining in grid computing environments

Author: Cannataro
Cannataro
Fernandez-Baca
Kevin Lü
Kwok
Ping Luo
Qing He
Zhongzhi Shi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007
Field of study

The official published version of this article can be found at the link below.The computing-intensive data mining for inherently Internet-wide distributed data, referred to as Distributed Data Mining (DDM), calls for the support of a powerful Grid with an effective scheduling framework. DDM often shares the computing paradigm of local processing and global synthesizing. It involves every phase of Data Mining (DM) processes, which makes the workflow of DDM very complex and can be modelled only by a Directed Acyclic Graph (DAG) with multiple data entries. Motivated by the need for a practical solution of the Grid scheduling problem for the DDM workflow, this paper proposes a novel two-phase scheduling framework, including External Scheduling and Internal Scheduling, on a two-level Grid architecture (InterGrid, IntraGrid). Currently a DM IntraGrid, named DMGCE (Data Mining Grid Computing Environment), has been developed with a dynamic scheduling framework for competitive DAGs in a heterogeneous computing environment. This system is implemented in an established Multi-Agent System (MAS) environment, in which the reuse of existing DM algorithms is achieved by encapsulating them into agents. Practical classification problems from oil well logging analysis are used to measure the system performance. The detailed experiment procedure and result analysis are also discussed in this paper

Crossref

Brunel University Research Archive

CYCLONE Unified Deployment and Management of Federated, Multi-Cloud Applications

Author: Baranda José Ignacio Aznar
Blanchet Christophe
Branchat Robert
Demchenko Yuri
Lodygensky Oleg
Loomis Charles
Slawik Mathias
Zilci Begüm İlke
Publication venue
Publication date: 22/07/2016
Field of study

Various Cloud layers have to work in concert in order to manage and deploy complex multi-cloud applications, executing sophisticated workflows for Cloud resource deployment, activation, adjustment, interaction, and monitoring. While there are ample solutions for managing individual Cloud aspects (e.g. network controllers, deployment tools, and application security software), there are no well-integrated suites for managing an entire multi cloud environment with multiple providers and deployment models. This paper presents the CYCLONE architecture that integrates a number of existing solutions to create an open, unified, holistic Cloud management platform for multi-cloud applications, tailored to the needs of research organizations and SMEs. It discusses major challenges in providing a network and security infrastructure for the Intercloud and concludes with the demonstration how the architecture is implemented in a real life bioinformatics use case

arXiv.org e-Print Archive

Crossref

Designing Traceability into Big Data Systems

Author: Branson Andrew
Consortium the CRISTAL-ISE
Kovacs Zsolt
McClatchey Richard
Shamdasani Jetendr
Publication venue
Publication date: 01/01/2015
Field of study

Providing an appropriate level of accessibility and traceability to data or process elements (so-called Items) in large volumes of data, often Cloud-resident, is an essential requirement in the Big Data era. Enterprise-wide data systems need to be designed from the outset to support usage of such Items across the spectrum of business use rather than from any specific application view. The design philosophy advocated in this paper is to drive the design process using a so-called description-driven approach which enriches models with meta-data and description and focuses the design process on Item re-use, thereby promoting traceability. Details are given of the description-driven design of big data systems at CERN, in health informatics and in business process management. Evidence is presented that the approach leads to design simplicity and consequent ease of management thanks to loose typing and the adoption of a unified approach to Item management and usage.Comment: 10 pages; 6 figures in Proceedings of the 5th Annual International Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015), Singapore July 2015. arXiv admin note: text overlap with arXiv:1402.5764, arXiv:1402.575

arXiv.org e-Print Archive

CERN Document Server

Personalizing Situated Workflows for Pervasive Healthcare Applications

Author: Dong Changyu
Dulay Naranker
Russello Giovanni
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008
Field of study

In this paper, we present an approach where a workflow system is combined with a policy-based framework for the specification and enforcement of policies for healthcare applications. In our approach, workflows are used to capture entitiespsila responsibilities and to assist entities in fulfilling them. The policy-based framework allows us to express authorisation policies to define the rights that entities have in the system, and event-condition-action (ECA) policies that are used to adapt the system to the actual situation. Authorisations will often depend on the context in which patientspsila care takes place, and our policies support predicates that reflect the environment. ECA policies capture events that reflect the current state of the environment and can perform actions to accordingly adapt the workflow execution. We show how the approach can be used for the Edema treatment and how fine-grained authorisation and ECA policies are expressed and used

CiteSeerX

Crossref

University of Strathclyde Institutional Repository

EGI user forum 2011 : book of abstracts

Author
Publication venue
Publication date: 01/01/2011
Field of study

Hochschulschriftenserver - Universität Frankfurt am Main