Search CORE

99,481 research outputs found

Data-Intensive architecture for scientific knowledge discovery

Author: Atkinson Malcolm
Corcho Oscar
Galea Michelle
Krause Amrey
Liew Chee Sun
Martin Paul
Mouat Adrian
Snelling D.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012
Field of study

This paper presents a data-intensive architecture that demonstrates the ability to support applications from a wide range of application domains, and support the different types of users involved in defining, designing and executing data-intensive processing tasks. The prototype architecture is introduced, and the pivotal role of DISPEL as a canonical language is explained. The architecture promotes the exploration and exploitation of distributed and heterogeneous data and spans the complete knowledge discovery process, from data preparation, to analysis, to evaluation and reiteration. The architecture evaluation included large-scale applications from astronomy, cosmology, hydrology, functional genetics, imaging processing and seismology

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Computing environments for reproducibility: Capturing the 'Whole Tale'

Author: Brinckman Adam
Chard Kyle
Gaffney Niall
Hategan Mihael
Jones Matthew B.
Kowalik Kacper
Kulasekaran Sivakumar
Ludäscher Bertram
Mecum Bryce D.
Nabrzyski Jarek
Stodden Victoria
Taylor Ian J.
Turk Matthew J.
Turner Kandace
Publication venue: 'Elsevier BV'
Publication date: 01/02/2018
Field of study

The act of sharing scientific knowledge is rapidly evolving away from traditional articles and presentations to the delivery of executable objects that integrate the data and computational details (e.g., scripts and workflows) upon which the findings rely. This envisioned coupling of data and process is essential to advancing science but faces technical and institutional barriers. The Whole Tale project aims to address these barriers by connecting computational, data-intensive research efforts with the larger research process—transforming the knowledge discovery and dissemination process into one where data products are united with research articles to create “living publications” or tales. The Whole Tale focuses on the full spectrum of science, empowering users in the long tail of science, and power users with demands for access to big data and compute resources. We report here on the design, architecture, and implementation of the Whole Tale environment

arXiv.org e-Print Archive

Online Research @ Cardiff

Collaborative e-science architecture for Reaction Kinetics research community

Author: Dew P.M.
Lau L.M.S.
Pham T.V.
Pilling M.J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005
Field of study

This paper presents a novel collaborative e-science architecture (CeSA) to address two challenging issues in e-science that arise from the management of heterogeneous distributed environments: (i) how to provide individual scientists an integrated environment to collaborate with each other in distributed, loosely coupled research communities where each member might be using a disparate range of tools; and (ii) how to provide easy access to a range of computationally intensive resources from a desktop. The Reaction Kinetics research community was used to capture the requirements and in the evaluation of the proposed architecture. The result demonstrated the feasibility of the approach and the potential benefits of the CeSA

Crossref

White Rose Research Online

Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure

Author: Zhuge Hai
Publication venue
Publication date: 18/07/2015
Field of study

Big data research has attracted great attention in science, technology, industry and society. It is developing with the evolving scientific paradigm, the fourth industrial revolution, and the transformational innovation of technologies. However, its nature and fundamental challenge have not been recognized, and its own methodology has not been formed. This paper explores and answers the following questions: What is big data? What are the basic methods for representing, managing and analyzing big data? What is the relationship between big data and knowledge? Can we find a mapping from big data into knowledge space? What kind of infrastructure is required to support not only big data management and analysis but also knowledge discovery, sharing and management? What is the relationship between big data and science paradigm? What is the nature and fundamental challenge of big data computing? A multi-dimensional perspective is presented toward a methodology of big data computing.Comment: 59 page

arXiv.org e-Print Archive

CiteSeerX

Enabling e-Research in combustion research community

Author: Dew P.M.
Lau L.M.S.
Pham T.V.
Pilling M.J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2006
Field of study

Abstract This paper proposes an application of the Collaborative e-Science Architecture (CeSA) to enable e-Research in combustion research community. A major problem of the community is that data required for constructing modelling might already exist but scattered and improperly evaluated. That makes the collection of data for constructing models difficult and time-consuming. The decentralised P2P collaborative environment of the CeSA is well suited to solve this distributed problem. It opens up access to scattered data and turns them to valuable resources. Other issues of the community addressed here are the needs for computational resources, storages and interoperability amongst different data formats can also be addressed by the use of Grid environment in the CeSA

White Rose Research Online

Data as a Service (DaaS) for sharing and processing of large data collections in the cloud

Author: Bucci Enrico
Ruiu Pietro
Terzo Olivier
Xhafa Xhafa Fatos
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

Data as a Service (DaaS) is among the latest kind of services being investigated in the Cloud computing community. The main aim of DaaS is to overcome limitations of state-of-the-art approaches in data technologies, according to which data is stored and accessed from repositories whose location is known and is relevant for sharing and processing. Besides limitations for the data sharing, current approaches also do not achieve to fully separate/decouple software services from data and thus impose limitations in inter-operability. In this paper we propose a DaaS approach for intelligent sharing and processing of large data collections with the aim of abstracting the data location (by making it relevant to the needs of sharing and accessing) and to fully decouple the data and its processing. The aim of our approach is to build a Cloud computing platform, offering DaaS to support large communities of users that need to share, access, and process the data for collectively building knowledge from data. We exemplify the approach from large data collections from health and biology domains.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

Data Driven Discovery in Astrophysics

Author: Brescia M.
Cavuoti S.
Djorgovski S. G.
Donalek C.
Longo G.
Publication venue
Publication date: 01/01/2014
Field of study

We review some aspects of the current state of data-intensive astronomy, its methods, and some outstanding data analysis challenges. Astronomy is at the forefront of "big data" science, with exponentially growing data volumes and data rates, and an ever-increasing complexity, now entering the Petascale regime. Telescopes and observatories from both ground and space, covering a full range of wavelengths, feed the data via processing pipelines into dedicated archives, where they can be accessed for scientific analysis. Most of the large archives are connected through the Virtual Observatory framework, that provides interoperability standards and services, and effectively constitutes a global data grid of astronomy. Making discoveries in this overabundance of data requires applications of novel, machine learning tools. We describe some of the recent examples of such applications.Comment: Keynote talk in the proceedings of ESA-ESRIN Conference: Big Data from Space 2014, Frascati, Italy, November 12-14, 2014, 8 pages, 2 figure

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

A Taxonomy of Workflow Management Systems for Grid Computing

Author: Buyya Rajkumar
Yu Jia
Publication venue
Publication date: 01/01/2005
Field of study

With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art in Grid workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure

arXiv.org e-Print Archive

CiteSeerX