Enabling FAIR Research in Earth Science through Research Objects
Data-intensive science communities are progressively adopting FAIR practices
that enhance the visibility of scientific breakthroughs and enable reuse. At
the core of this movement, research objects contain and describe scientific
information and resources in a way compliant with the FAIR principles and
sustain the development of key infrastructure and tools. This paper provides an
account of the challenges, experiences and solutions involved in the adoption
of FAIR around research objects over several Earth Science disciplines. During
this journey, our work has been comprehensive, with outcomes including: an
extended research object model adapted to the needs of earth scientists; the
provisioning of digital object identifiers (DOI) to enable persistent
identification and to give due credit to authors; the generation of
content-based, semantically rich, research object metadata through natural
language processing, enhancing visibility and reuse through recommendation
systems and third-party search engines; and various types of checklists that
provide a compact representation of research object quality as a key enabler of
scientific reuse. All these results have been integrated in ROHub, a platform
that provides research object management functionality to a wealth of
applications and interfaces across different scientific communities. To monitor
and quantify the community uptake of research objects, we have defined
indicators and obtained measures via ROHub that are also discussed herein.
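As a rough illustration of what such a research object aggregates, the sketch below models a minimal metadata record with a DOI, a set of aggregated resources, and content-derived keywords. All field names and values are hypothetical and do not reflect the actual extended research object model or the ROHub API.

```python
# Hypothetical sketch of research-object metadata of the kind described above:
# aggregated resources, a persistent identifier, and NLP-derived keywords.
# Field names are illustrative only; they are not the extended RO model or ROHub API.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ResearchObject:
    title: str
    creators: List[str]
    doi: str                       # persistent identifier for citation and credit
    resources: List[str]           # datasets, workflows, papers, etc.
    keywords: List[str] = field(default_factory=list)  # e.g. extracted by NLP

ro = ResearchObject(
    title="Sea surface temperature anomaly study",
    creators=["A. Researcher"],
    doi="10.0000/example-doi",     # placeholder DOI
    resources=["data/sst_2020.nc", "workflow/analysis.kar"],
    keywords=["oceanography", "sea surface temperature"],
)
print(ro.doi, len(ro.resources))
```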
Workflows and extensions to the Kepler scientific workflow system to support environmental sensor data access and analysis
Scientific Workflow Tools
Although an increasing number of cyberinfrastructure technologies have emerged in the last few years to achieve remote data access, distributed job execution, and data management, orchestrating these components with minimal overhead remains a difficult task for scientists. Scientific workflow systems improve this situation by creating interfaces to a variety of technologies and automating the execution and monitoring of the workflows.
A scientific workflow is the process of combining data and processes into a structured set of steps that implement semi-automated computational solutions to a scientific problem. Kepler is a cross-project collaboration whose purpose is to develop a domain-independent scientific workflow system. It provides an environment in which scientists can design and execute scientific workflows by specifying the desired sequence of computational actions and the appropriate dataflow. Currently deployed workflows range from local analytical pipelines to distributed, high-performance applications that can run in cluster, grid, or cloud computing environments.
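The dataflow idea can be illustrated with a minimal sketch: a workflow is an ordered set of steps in which the output of each step flows into the next. The Python below is purely illustrative and is not the Kepler API (Kepler actors are composed graphically and executed by directors); the step names and data are made up.

```python
# Illustrative sketch (not the Kepler API): a workflow as an ordered set of
# steps where the output of each step flows into the next step's input.
from typing import Callable, Iterable, List

def run_workflow(steps: List[Callable], data):
    """Execute steps in sequence, passing each result to the next step."""
    for step in steps:
        data = step(data)
    return data

# Hypothetical steps for a small sensor-data pipeline.
def load(readings: Iterable[float]) -> List[float]:
    return list(readings)

def clean(readings: List[float]) -> List[float]:
    return [r for r in readings if r >= 0]       # drop invalid sensor values

def summarize(readings: List[float]) -> float:
    return sum(readings) / len(readings)

print(run_workflow([load, clean, summarize], [12.1, -999.0, 11.8, 12.4]))
```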
The scientific workflow approach offers a number of advantages over traditional scripting-based approaches, including simplified configuration; improved reusability, maintenance and sharing; automated provenance management to capture and browse the lineage of data products; and support for fault-tolerance.
This talk presents an overview of common scientific workflow requirements and illustrates these features using the Kepler scientific workflow system. We highlight the features of Kepler in several scientific applications, as well as describe upcoming extensions and improvements.
A Framework for Distributed Data-Parallel Execution in the Kepler Scientific Workflow System
Distributed Data-Parallel (DDP) patterns such as MapReduce have become increasingly popular as solutions to facilitate data-intensive applications, resulting in a number of systems supporting DDP workflows. Yet, applications or workflows built using these patterns are usually tightly coupled with the underlying DDP execution engine they select. We present a framework for distributed data-parallel execution in the Kepler scientific workflow system that enables users to easily switch between different DDP execution engines. We describe a set of DDP actors based on DDP patterns and directors for DDP workflow executions within the presented framework. We demonstrate how DDP workflows can be easily composed in the Kepler graphical user interface through the reuse of these DDP actors and directors and how the generated DDP workflows can be executed in different distributed environments. Via a bioinformatics use case, we discuss the usability of the proposed framework and validate its execution scalability.
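The core idea, writing the map/reduce logic once and choosing the execution engine separately, can be sketched as follows. This is not Kepler's DDP actor/director API; the `run_ddp` function and engine names are hypothetical, with a multiprocessing pool standing in for a distributed engine.

```python
# Sketch of the decoupling idea only (not Kepler's DDP actors/directors):
# the map and reduce logic is written once, and an interchangeable "engine"
# decides how it is executed.
from collections import defaultdict
from multiprocessing import Pool
from typing import List, Tuple

def word_count_map(line: str) -> List[Tuple[str, int]]:
    return [(w, 1) for w in line.split()]

def word_count_reduce(key: str, values: List[int]) -> Tuple[str, int]:
    return key, sum(values)

def run_ddp(map_fn, reduce_fn, records, engine="local"):
    """Run the same map/reduce logic on a chosen execution engine."""
    if engine == "local":
        mapped = [pair for rec in records for pair in map_fn(rec)]
    elif engine == "pool":                       # stand-in for a distributed engine
        with Pool() as pool:
            mapped = [pair for pairs in pool.map(map_fn, records) for pair in pairs]
    else:
        raise ValueError(f"unknown engine: {engine}")
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

if __name__ == "__main__":
    lines = ["kepler workflow", "workflow engine"]
    print(run_ddp(word_count_map, word_count_reduce, lines, engine="local"))
```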
Kepler WebView: A Lightweight, Portable Framework for Constructing Real-time Web Interfaces of Scientific Workflows
Modern web technologies facilitate the creation of high-quality data visualizations and rich, interactive components across a wide variety of devices. Scientific workflow systems can greatly benefit from these technologies by giving scientists a better understanding of their data or model, leading to new insights. While several projects have enabled web access to scientific workflow systems, they are primarily organized as a large portal server encapsulating the workflow engine. In this vision paper, we propose the design for Kepler WebView, a lightweight framework that integrates web technologies with the Kepler Scientific Workflow System. By embedding a web server in the Kepler process, Kepler WebView enables a wide variety of usage scenarios that would be difficult or impossible using the portal model.
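The embedded-server idea can be sketched outside of Kepler as follows: a lightweight HTTP server runs in a background thread of the same process and exposes the state of a running computation. Kepler WebView itself is a Java framework inside the Kepler process; the Python below, including the state fields and port, is purely illustrative.

```python
# Sketch of the "embedded web server" idea only: a background HTTP server in
# the same process exposes the state of a running computation as JSON.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

state = {"step": 0, "status": "running"}         # hypothetical workflow state

class StateHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(state).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

def serve_in_background(port=8765):
    """Start the state server in a daemon thread and return it."""
    server = HTTPServer(("localhost", port), StateHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

if __name__ == "__main__":
    serve_in_background()
    for step in range(1, 4):                     # stand-in for workflow execution
        state["step"] = step
    state["status"] = "done"
```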
Provenance for MapReduce-based data-intensive workflows
SC '11: International Conference for High Performance Computing, Networking, Storage and Analysis, Seattle, Washington, USA, 14 November 2011.
MapReduce has been widely adopted by many business and scientific applications for data-intensive processing of large datasets. There are increasing efforts for workflows and systems to work with the MapReduce programming model and the Hadoop environment, including our work on a higher-level programming model for MapReduce within the Kepler Scientific Workflow System. However, to date, provenance of MapReduce-based workflows and its effects on workflow execution performance have not been studied in depth. In this paper, we present an extension to our earlier work on MapReduce in Kepler to record the provenance of MapReduce workflows created using the Kepler+Hadoop framework. In particular, we present: (i) a data model that is able to capture provenance inside a MapReduce job as well as the provenance for the workflow that submitted it; (ii) an extension to the Kepler+Hadoop architecture to record provenance using this data model on MySQL Cluster; (iii) a programming interface to query the collected information; and (iv) an evaluation of the scalability of collecting and querying this provenance information using two scenarios with different characteristics.
The authors would like to thank the rest of the Kepler team for their collaboration. This work was supported by NSF SDCI Award OCI-0722079 for Kepler/CORE and ABI Award DBI-1062565 for bioKepler, DOE SciDAC Award DE-FC02-07ER25811 for SDM Center, the UCGRID Project, and an SDSC Triton Research Opportunities grant.
https://dl.acm.org/doi/10.1145/2110497.211050
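As a rough sketch of the kind of lineage such a data model has to capture, the example below links workflow-level tasks to the map and reduce executions they spawn. The record fields and query are hypothetical and do not reflect the paper's actual data model or the Kepler+Hadoop/MySQL Cluster schema.

```python
# Hypothetical sketch of provenance records linking a workflow-level task to
# the map/reduce executions inside it; illustrative only, not the paper's
# data model or the Kepler+Hadoop/MySQL Cluster schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskProvenance:
    task_id: str
    inputs: List[str]
    outputs: List[str]
    parent_id: str = ""            # workflow-level task that spawned this one

@dataclass
class WorkflowProvenance:
    workflow_id: str
    tasks: List[TaskProvenance] = field(default_factory=list)

    def lineage(self, output: str) -> List[str]:
        """Return the ids of tasks that produced a given output."""
        return [t.task_id for t in self.tasks if output in t.outputs]

prov = WorkflowProvenance("wf-1", [
    TaskProvenance("map-0", ["chunk0"], ["kv0"], parent_id="mapreduce-1"),
    TaskProvenance("reduce-0", ["kv0"], ["result"], parent_id="mapreduce-1"),
])
print(prov.lineage("result"))      # -> ['reduce-0']
```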
