8,319 research outputs found
Enabling quantitative data analysis through e-infrastructures
This paper discusses how quantitative data analysis in the social sciences can engage with and exploit an e-Infrastructure. We highlight how a number of activities which are central to quantitative data analysis, referred to as ‘data management’, can benefit from e-infrastructure support. We conclude by discussing how these issues are relevant to the DAMES (Data Management through e-Social Science) research Node, an ongoing project that aims to develop e-Infrastructural resources for quantitative data analysis in the social sciences
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art in Grid
workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure
Towards a service-oriented e-infrastructure for multidisciplinary environmental research
Research e-infrastructures are considered to have generic and thematic parts. The generic part provids high-speed networks, grid (large-scale distributed computing) and database systems (digital repositories and data transfer systems) applicable to all research commnities irrespective of discipline. Thematic parts are specific deployments of e-infrastructures to support diverse virtual research communities. The needs of a virtual community of multidisciplinary envronmental researchers are yet to be investigated. We envisage and argue for an e-infrastructure that will enable environmental researchers to develop environmental models and software entirely out of existing components through loose coupling of diverse digital resources based on the service-oriented achitecture. We discuss four specific aspects for consideration for a future e-infrastructure: 1) provision of digital resources (data, models & tools) as web services, 2) dealing with stateless and non-transactional nature of web services using workflow management systems, 3) enabling web servce discovery, composition and orchestration through semantic registries, and 4) creating synergy with existing grid infrastructures
PaPaS: A Portable, Lightweight, and Generic Framework for Parallel Parameter Studies
The current landscape of scientific research is widely based on modeling and
simulation, typically with complexity in the simulation's flow of execution and
parameterization properties. Execution flows are not necessarily
straightforward since they may need multiple processing tasks and iterations.
Furthermore, parameter and performance studies are common approaches used to
characterize a simulation, often requiring traversal of a large parameter
space. High-performance computers offer practical resources at the expense of
users handling the setup, submission, and management of jobs. This work
presents the design of PaPaS, a portable, lightweight, and generic workflow
framework for conducting parallel parameter and performance studies. Workflows
are defined using parameter files based on keyword-value pairs syntax, thus
removing from the user the overhead of creating complex scripts to manage the
workflow. A parameter set consists of any combination of environment variables,
files, partial file contents, and command line arguments. PaPaS is being
developed in Python 3 with support for distributed parallelization using SSH,
batch systems, and C++ MPI. The PaPaS framework will run as user processes, and
can be used in single/multi-node and multi-tenant computing systems. An example
simulation using the BehaviorSpace tool from NetLogo and a matrix multiply
using OpenMP are presented as parameter and performance studies, respectively.
The results demonstrate that the PaPaS framework offers a simple method for
defining and managing parameter studies, while increasing resource utilization.Comment: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced
Research Computing, July 22--26, 2018, Pittsburgh, PA, US
Toward a first-principles integrated simulation of tokamak edge plasmas
Performance of the ITER is anticipated to be highly sensitive to the edge plasma condition. The edge pedestal in ITER needs to be predicted from an integrated simulation of the necessary first-principles, multi-scale physics codes. The mission of the SciDAC Fusion Simulation Project (FSP) Prototype Center for Plasma Edge Simulation (CPES) is to deliver such a code integration framework by (1) building new kinetic codes XGC0 and XGC1, which can simulate the edge pedestal buildup; (2) using and improving the existing MHD codes ELITE, M3D-OMP, M3D-MPP and NIMROD, for study of large-scale edge instabilities called Edge Localized Modes (ELMs); and (3) integrating the codes into a framework using cutting-edge computer science technology. Collaborative effort among physics, computer science, and applied mathematics within CPES has created the first working version of the End-to-end Framework for Fusion Integrated Simulation (EFFIS), which can be used to study the pedestal-ELM cycles
Enhancing Workflow with a Semantic Description of Scientific Intent
Peer reviewedPreprin
- …