
    The Evolution of myExperiment

    The myExperiment social website for sharing scientific workflows, designed according to Web 2.0 principles, has grown to be the largest public repository of its kind. It is distinctive for its focus on sharing methods, its researcher-centric design, and its facility to aggregate content into sharable 'research objects'. This evolution of myExperiment has occurred hand in hand with its users. myExperiment now supports Linked Data as a step toward our vision of the future research environment, which we categorise here as '3rd generation e-Research'.
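
    myExperiment content is also exposed programmatically; a minimal sketch of listing public workflows over its REST interface is given below. The /workflows.xml endpoint and the XML attribute names are assumptions for illustration and are not taken from the paper.

        # Minimal sketch: listing public workflows via the myExperiment REST interface.
        # The endpoint path and XML element/attribute names are assumptions.
        import requests
        import xml.etree.ElementTree as ET

        BASE = "https://www.myexperiment.org"

        resp = requests.get(f"{BASE}/workflows.xml", timeout=30)
        resp.raise_for_status()

        root = ET.fromstring(resp.content)
        for wf in root.findall("workflow"):
            # Each <workflow> entry is assumed to carry a resource URI and a title.
            print(wf.get("resource"), (wf.text or "").strip())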

    Couplers for linking environmental models: scoping study and potential next steps

    This report scopes out the couplers available in the hydrology and atmospheric modelling fields. The work reported here examines both dynamic runtime coupling and one-way, file-based coupling. Based on a review of the peer-reviewed literature and other open sources, there is a plethora of coupling technologies and standards relating to file formats. The available approaches have been evaluated against criteria developed as part of the DREAM project. Based on these investigations, the following recommendations are made:
    • The most promising dynamic coupling technologies for use within BGS are OpenMI 2.0 and CSDMS (either 1.0 or 2.0).
    • Investigate the use of workflow engines: Trident and Pyxis, the latter as part of the TSB/AHRC project "Confluence".
    • Include the database standards CSW and GDAL, and use the climate community's data formats, NetCDF and the CF standards.
    • Develop a "standard" composition consisting of two process models and a 3D geological model, all linked to data stored in the BGS corporate database and in flat-file format. Web Feature Services should be included in these compositions.
    There is also a need to investigate approaches from other disciplines: the Loss Modelling Framework, OASIS-LMF, is the best candidate.
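
    As a concrete illustration of the one-way, file-based coupling route mentioned above, the sketch below writes and reads a gridded field through a NetCDF file with CF-style metadata using the netCDF4 Python library. The file name, variable name, and grid are hypothetical; they are not taken from the report.

        # Sketch of one-way, file-based coupling via NetCDF with CF-style metadata:
        # one model writes a gridded field, a second model reads it back together
        # with its units. File name, variable name, and grid size are hypothetical.
        import numpy as np
        from netCDF4 import Dataset

        # Writer side: e.g. a hydrological model exporting recharge on a lat/lon grid.
        with Dataset("recharge.nc", "w") as nc:
            nc.createDimension("lat", 180)
            nc.createDimension("lon", 360)
            var = nc.createVariable("recharge", "f4", ("lat", "lon"))
            var.units = "mm day-1"                      # CF-style units attribute
            var.long_name = "groundwater recharge"      # free-text description
            var[:] = np.random.rand(180, 360).astype("f4")

        # Reader side: the receiving model picks up the field plus its metadata.
        with Dataset("recharge.nc") as nc:
            field = nc.variables["recharge"][:]
            print(nc.variables["recharge"].units, field.shape)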

    Data-Intensive architecture for scientific knowledge discovery

    This paper presents a data-intensive architecture that demonstrates the ability to support applications from a wide range of domains, and to support the different types of users involved in defining, designing and executing data-intensive processing tasks. The prototype architecture is introduced, and the pivotal role of DISPEL as a canonical language is explained. The architecture promotes the exploration and exploitation of distributed and heterogeneous data and spans the complete knowledge discovery process, from data preparation, to analysis, to evaluation and reiteration. The architecture evaluation included large-scale applications from astronomy, cosmology, hydrology, functional genetics, image processing and seismology.
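
    The paper's pipelines are written in DISPEL, which composes "processing elements" connected by data streams; since no DISPEL listing is reproduced here, the Python generator chain below is only a conceptual stand-in for that streaming style. The element names and records are hypothetical.

        # Conceptual stand-in for a DISPEL-style streaming composition (not DISPEL code):
        # processing elements connected by data streams, covering preparation and analysis.
        def read_catalogue(rows):
            """Source element: stream raw records."""
            for row in rows:
                yield row

        def clean(stream):
            """Preparation element: drop records with a missing magnitude."""
            for rec in stream:
                if rec.get("magnitude") is not None:
                    yield rec

        def detect_events(stream, threshold=5.0):
            """Analysis element: keep records at or above a threshold."""
            for rec in stream:
                if rec["magnitude"] >= threshold:
                    yield rec

        raw = [{"id": 1, "magnitude": 4.2}, {"id": 2, "magnitude": None}, {"id": 3, "magnitude": 6.1}]
        print(list(detect_events(clean(read_catalogue(raw)))))   # -> [{'id': 3, 'magnitude': 6.1}]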

    Executing Large Scale Scientific Workflows in Public Clouds

    Scientists in fields such as high-energy physics, earth science, and astronomy are developing large-scale workflow applications. In many use cases, scientists need to run a set of interrelated but independent workflows (i.e., workflow ensembles) for a complete scientific analysis. Because a workflow ensemble usually contains many sub-workflows, each with hundreds or thousands of jobs under precedence constraints, executing such an ensemble raises significant cost concerns even with elastic, pay-as-you-go cloud resources. In this thesis, we develop a set of methods to optimize the execution of large-scale scientific workflows in public clouds under both cost and deadline constraints, using a two-step approach. First, we present methods to optimize the execution of a single scientific workflow in public clouds, using the Montage astronomical mosaic engine running on Amazon EC2 as an example. Second, we address three main challenges in realizing the benefits of public clouds when executing large-scale workflow ensembles: (1) execution coordination, (2) resource provisioning, and (3) data staging. To this end, we develop a new pulling-based workflow execution system with a profiling-based resource provisioning strategy. Our results show that our system achieves an 80% speed-up, by removing scheduling overhead, compared to the well-known Pegasus workflow management system when running scientific workflow ensembles. In addition, our evaluation using Montage workflow ensembles on Amazon EC2 clusters of around 1000 cores demonstrates the cost-effectiveness of our resource provisioning strategy within the given deadlines.
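
    The pulling-based execution model can be pictured with a small sketch: workers pull jobs whose predecessors have finished from a shared queue, rather than having a central scheduler push tasks to them. The job names and dependency graph below are hypothetical, and the sketch omits the thesis's provisioning and data-staging machinery.

        # Minimal sketch of pulling-based execution: workers pull ready jobs from a
        # queue; a job becomes ready once all of its predecessors have completed.
        # Job names and the dependency graph are hypothetical.
        import queue
        import threading

        deps = {                       # job -> set of jobs it depends on
            "stage_in": set(),
            "project": {"stage_in"},
            "mosaic": {"project"},
            "stage_out": {"mosaic"},
        }
        done, scheduled = set(), set()
        lock = threading.Lock()
        ready = queue.Queue()

        def refill():
            """Enqueue every unscheduled job whose predecessors have all completed."""
            with lock:
                for job, preds in deps.items():
                    if job not in scheduled and preds <= done:
                        scheduled.add(job)
                        ready.put(job)

        def worker():
            """Worker loop: pull a ready job, 'run' it, mark it done, refill the queue."""
            while True:
                try:
                    job = ready.get(timeout=1)
                except queue.Empty:
                    return
                print("running", job)
                with lock:
                    done.add(job)
                refill()

        refill()
        workers = [threading.Thread(target=worker) for _ in range(2)]
        for t in workers:
            t.start()
        for t in workers:
            t.join()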

    End-to-end eScience: integrating workflow, query, visualization, and provenance at an ocean observatory

    Data analysis tasks at an ocean observatory require integrative and domain-specialized use of database, workflow, and visualization systems. We describe a platform to support these tasks, developed as part of the cyberinfrastructure at the NSF Science and Technology Center for Coastal Margin Observation and Prediction, integrating a provenance-aware workflow system, 3D visualization, and a remote query engine for large-scale ocean circulation models. We show how these disparate tools complement each other and give examples of real scientific insights delivered by the integrated system. We conclude that data management solutions for eScience require this kind of holistic, integrative approach, explain how our approach may be generalized, and recommend a broader, application-oriented research agenda to explore relevant architectures.
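
    One way to picture the provenance-aware side of such an integration is a thin wrapper that records what was queried, when, and with which parameters alongside the result. The decorator below is purely illustrative; the query function, provenance fields, and in-memory store are hypothetical, not the observatory's actual interfaces.

        # Illustrative sketch: recording provenance alongside a remote model query.
        # The query function and provenance fields are hypothetical.
        import json
        import time
        import uuid

        provenance_log = []   # stand-in for a persistent provenance store

        def record_provenance(run):
            def wrapper(*args, **kwargs):
                entry = {"id": str(uuid.uuid4()), "activity": run.__name__,
                         "inputs": {"args": args, "kwargs": kwargs},
                         "started": time.time()}
                result = run(*args, **kwargs)
                entry["ended"] = time.time()
                provenance_log.append(entry)
                return result
            return wrapper

        @record_provenance
        def query_circulation_model(variable, depth_m):
            """Hypothetical remote query against ocean circulation model output."""
            return {"variable": variable, "depth_m": depth_m, "mean": 12.7}

        print(query_circulation_model("salinity", depth_m=5))
        print(json.dumps(provenance_log, indent=2, default=str))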

    EdiFlow: data-intensive interactive workflows for visual analytics

    Visual analytics aims at combining interactive data visualization with data analysis tasks. Given the explosion in the volume and complexity of scientific data, e.g., data associated with biological or physical processes or social networks, visual analytics is called to play an important role in scientific data management. Most visual analytics platforms, however, are memory-based and are therefore limited in the volume of data they can handle. Moreover, each new algorithm (e.g., for clustering) has to be integrated into the platform by hand. Finally, they lack the capability to define and deploy well-structured processes in which users with different roles interact in a coordinated way, sharing the same data and possibly the same visualizations. We have designed and implemented EdiFlow, a workflow platform for visual analytics applications. EdiFlow uses a simple structured process model and is backed by a persistent database storing both process information and process instance data. EdiFlow processes provide the usual process features (roles, structured control) and may integrate visual analytics tasks as activities. We present its architecture, its deployment on a sample application, and the main technical challenges involved.
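
    The persistent database behind EdiFlow stores both the process definition and the instance data that activities share; the SQLite sketch below shows the kind of schema this implies. The table and column names are hypothetical and are not the schema given in the paper.

        # Hypothetical sketch of a persistent backing store for a structured process
        # model: process definitions, role-assigned activities, and shared data.
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
        CREATE TABLE process (
            id    INTEGER PRIMARY KEY,
            name  TEXT NOT NULL
        );
        CREATE TABLE activity (
            id          INTEGER PRIMARY KEY,
            process_id  INTEGER REFERENCES process(id),
            name        TEXT NOT NULL,          -- e.g. 'cluster', 'visualize'
            role        TEXT NOT NULL,          -- user role that performs it
            state       TEXT DEFAULT 'pending'
        );
        CREATE TABLE shared_data (
            id           INTEGER PRIMARY KEY,
            activity_id  INTEGER REFERENCES activity(id),
            payload      TEXT                   -- data shared between analysis and visualization
        );
        """)

        conn.execute("INSERT INTO process(name) VALUES ('usage-analysis')")
        conn.execute("INSERT INTO activity(process_id, name, role) VALUES (1, 'cluster', 'analyst')")
        print(conn.execute("SELECT name, role, state FROM activity").fetchall())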

    Distributed storage and querying techniques for a semantic web of scientific workflow provenance

    In scientific workflow environments, scientists depend on provenance, which records the history of an experiment. The Resource Description Framework (RDF) is frequently used to represent provenance, based on vocabularies such as the Open Provenance Model. For complex scientific workflows that generate large amounts of RDF triples, single-machine provenance management becomes inadequate over time. In this thesis, we investigate how HBase capabilities can be leveraged for distributed storage and querying of provenance data represented in RDF. We architect the ProvBase system, which incorporates an HBase/Hadoop backend, propose a storage schema to hold provenance triples, and design querying algorithms to evaluate SPARQL queries in the system. We conduct an experimental study to show the feasibility of our approach.
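
    A common way to make SPARQL triple patterns cheap over a key-ordered store such as HBase is to index each triple under several row-key orderings (SPO, POS, OSP) so that each pattern becomes a prefix scan; the happybase sketch below illustrates that general technique. It assumes a local HBase Thrift server and a pre-created 'triples' table with column family 'f', and it is not necessarily the storage schema ProvBase itself uses.

        # Sketch: index each RDF triple under SPO, POS, and OSP row-key orderings so
        # that triple patterns become prefix scans. Not necessarily ProvBase's schema.
        import happybase

        conn = happybase.Connection("localhost")   # assumes a local HBase Thrift server
        table = conn.table("triples")              # table with column family 'f' assumed to exist

        def put_triple(s, p, o):
            """Store one triple under three orderings, keyed as '<order>|a|b|c'."""
            for order, key in (("spo", (s, p, o)), ("pos", (p, o, s)), ("osp", (o, s, p))):
                row = "|".join((order,) + key).encode("utf-8")
                table.put(row, {b"f:present": b"1"})

        def objects_for(s, p):
            """Pattern (s, p, ?o): a prefix scan over the SPO ordering."""
            prefix = f"spo|{s}|{p}|".encode("utf-8")
            for row, _ in table.scan(row_prefix=prefix):
                yield row.decode("utf-8").split("|")[3]

        put_triple("run:42", "opm:wasGeneratedBy", "process:workflow-7")
        print(list(objects_for("run:42", "opm:wasGeneratedBy")))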