SciTokens: Capability-Based Secure Access to Remote Scientific Data
The management of security credentials (e.g., passwords, secret keys) for
computational science workflows is a burden for scientists and information
security officers. Problems with credentials (e.g., expiration, privilege
mismatch) cause workflows to fail to fetch needed input data or store valuable
scientific results, distracting scientists from their research by requiring
them to diagnose the problems, re-run their computations, and wait longer for
their results. In this paper, we introduce SciTokens, open source software to
help scientists manage their security credentials more reliably and securely.
We describe the SciTokens system architecture, design, and implementation
addressing use cases from the Laser Interferometer Gravitational-Wave
Observatory (LIGO) Scientific Collaboration and the Large Synoptic Survey
Telescope (LSST) projects. We also present our integration with widely-used
software that supports distributed scientific computing, including HTCondor,
CVMFS, and XrootD. SciTokens uses IETF-standard OAuth tokens for
capability-based secure access to remote scientific data. The access tokens
convey the specific authorizations needed by the workflows, rather than
general-purpose authentication impersonation credentials, to address the risks
of scientific workflows running on distributed infrastructure including NSF
resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds
(e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the
interoperability and security of scientific workflows, SciTokens 1) enables use
of distributed computing for scientific domains that require greater data
protection and 2) enables use of more widely distributed computing resources by
reducing the risk of credential abuse on remote systems.
Comment: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22-26, 2018, Pittsburgh, PA, US
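The capability model above is easiest to see in the scope check itself. As a minimal sketch (not the SciTokens library), assume the access token is a JWT whose scope claim carries space-separated operation:path capabilities such as "read:/ligo/frames"; after signature verification, a storage service might authorize a request roughly like this (the function name and example paths are illustrative):

    import posixpath

    def scope_permits(scopes: str, operation: str, path: str) -> bool:
        """True if any capability in the token's space-separated scope
        claim authorizes the requested operation on the requested path."""
        requested = posixpath.normpath(path)
        for capability in scopes.split():
            op, _, prefix = capability.partition(":")
            if op != operation:
                continue
            authorized = posixpath.normpath(prefix or "/")
            # A capability covers its own path and everything beneath it.
            if requested == authorized or \
               requested.startswith(authorized.rstrip("/") + "/"):
                return True
        return False

    # e.g. a token scoped to read LIGO frames cannot write results:
    assert scope_permits("read:/ligo/frames write:/results", "read",
                         "/ligo/frames/H1.gwf")
    assert not scope_permits("read:/ligo/frames", "write", "/results/out.dat")

Because the token names only these narrow capabilities, a compromised worker node leaks far less than a stolen general-purpose credential would.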
Scientific Workflow Applications on Amazon EC2
The proliferation of commercial cloud computing providers has generated
significant interest in the scientific computing community. Much recent
research has attempted to determine the benefits and drawbacks of cloud
computing for scientific applications. Although clouds have many attractive
features, such as virtualization, on-demand provisioning, and "pay as you go"
usage-based pricing, it is not clear whether they are able to deliver the
performance required for scientific applications at a reasonable price. In this
paper we examine the performance and cost of clouds from the perspective of
scientific workflow applications. We use three characteristic workflows to
compare the performance of a commercial cloud with that of a typical HPC
system, and we analyze the various costs associated with running those
workflows in the cloud. We find that the performance of clouds is not
unreasonable given the hardware resources provided, and that performance
comparable to HPC systems can be achieved given similar resources. We also find
that the cost of running workflows on a commercial cloud can be reduced by
storing data in the cloud rather than transferring it from outside.
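The storage-versus-transfer finding comes down to simple arithmetic: a dataset read by many workflow runs incurs a per-run transfer fee if it lives outside the cloud, but only a flat monthly storage fee if it stays resident. A back-of-the-envelope sketch, using hypothetical unit prices rather than the rates studied in the paper:

    # Hypothetical unit prices, not the actual EC2/S3 rates from the paper.
    TRANSFER_IN_PER_GB = 0.10      # $ to move input data in, per workflow run
    STORAGE_PER_GB_MONTH = 0.03    # $ to keep the dataset resident in the cloud

    def monthly_input_cost(dataset_gb, runs_per_month, keep_in_cloud):
        """Cost of provisioning workflow input data for one month."""
        if keep_in_cloud:
            # Pay storage once; every run reads the resident copy.
            return dataset_gb * STORAGE_PER_GB_MONTH
        # Otherwise pay the transfer price on every run.
        return dataset_gb * TRANSFER_IN_PER_GB * runs_per_month

    print(monthly_input_cost(100, 10, keep_in_cloud=False))  # 100.0
    print(monthly_input_cost(100, 10, keep_in_cloud=True))   # 3.0

Under these assumed prices, ten monthly runs over a 100 GB input set cost $100 in repeated transfers but only $3 in resident storage.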
Performance optimization of big data computing workflows for batch and stream data processing in multi-clouds
Workflow techniques have been widely used as a major computing solution in many science domains. With the rapid deployment of cloud infrastructures around the globe and the economic benefits of cloud-based computing and storage services, an increasing number of scientific workflows have migrated or are in active transition to clouds. As the scale of scientific applications continues to grow, it is now common to deploy various data- and network-intensive computing workflows such as serial computing workflows, MapReduce/Spark-based workflows, and Storm-based stream data processing workflows in multi-cloud environments, where inter-cloud data transfer oftentimes plays a significant role in both workflow performance and financial cost. Rigorous mathematical models are constructed to analyze the intra- and inter-cloud execution process of scientific workflows, and a class of budget-constrained workflow mapping problems is formulated to optimize the network performance of big data workflows in multi-cloud environments. Research shows that these problems are all NP-complete, and a heuristic solution is designed for each that takes into consideration module execution, data transfer, and I/O operations. The performance superiority of the proposed solutions over existing methods is illustrated through extensive simulations and further verified by real-life workflow experiments deployed in public clouds.
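To make the heuristic's shape concrete, here is a generic greedy sketch in the spirit of the described approach, not the paper's actual algorithm: modules are visited in topological order, each is placed on the affordable cloud site with the smallest execution-plus-transfer time, and a running total enforces the budget (all names and numbers are illustrative):

    # Illustrative budget-constrained greedy mapper; not the paper's heuristic.
    def map_workflow(modules, sites, budget):
        """modules: topologically ordered dicts with per-site 'exec_time',
        'transfer_time' (incl. I/O), and 'cost'; returns {module: site}."""
        placement, spent = {}, 0.0
        for m in modules:
            affordable = [s for s in sites if spent + m["cost"][s] <= budget]
            if not affordable:
                return None  # budget exhausted: no feasible mapping
            best = min(affordable, key=lambda s: m["exec_time"][s] +
                                                 m["transfer_time"][s])
            placement[m["name"]] = best
            spent += m["cost"][best]
        return placement

    modules = [{"name": "ingest", "exec_time": {"A": 5, "B": 3},
                "transfer_time": {"A": 1, "B": 4}, "cost": {"A": 2, "B": 1}},
               {"name": "reduce", "exec_time": {"A": 8, "B": 6},
                "transfer_time": {"A": 0, "B": 2}, "cost": {"A": 3, "B": 2}}]
    print(map_workflow(modules, ["A", "B"], budget=5))

Real variants of this problem are NP-complete, as the abstract notes, so practical heuristics also weigh downstream transfers rather than deciding strictly module by module.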
Scientific Workflows on Clouds with Heterogeneous and Preemptible Instances
QoS-aware Scientific Application Scheduling Algorithm in Cloud Environment
Many complex scientific applications are modeled as workflows to carry out large-scale experiments. Because of the complexity of scientific processes, scientific workflows have intensive computation and data requirements. Clouds create an opportunity for scientists who need high-performance computing infrastructure, letting them run their applications on the cloud with their desired QoS. We propose an algorithm that enables scientists to select an execution plan based on their preferred QoS attributes, such as time and cost. The proposed algorithm ranks the tasks in the workflow and then uses the UPFF function to select the most suitable resource based on the user's QoS. We compared our proposed algorithm with related work across several scenarios, and the results show that the proposed algorithm is more efficient.
Keywords: Scientific application, Workflow scheduling, Cloud computing
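The rank-then-select structure can be sketched generically. The UPFF function is not defined in the abstract, so the sketch below substitutes a simple user-weighted blend of normalized time and cost; the task ordering uses the classic upward rank. This is an illustration, not the paper's method:

    # Generic rank-then-select scheduler sketch; UPFF itself is not
    # defined in the abstract, so a weighted time/cost utility stands in.
    def upward_rank(name, tasks):
        """Task's mean runtime plus the largest upward rank among its
        successors (0 for exit tasks); higher rank schedules earlier."""
        t = tasks[name]
        succ = [upward_rank(s, tasks) for s in t["successors"]]
        return t["mean_runtime"] + (max(succ) if succ else 0.0)

    def select_resource(name, resources, w_time=0.5, w_cost=0.5):
        """Pick the resource minimizing the user's weighted QoS utility."""
        t_max = max(r["time"][name] for r in resources)
        c_max = max(r["cost"][name] for r in resources)
        return min(resources, key=lambda r: w_time * r["time"][name] / t_max
                                          + w_cost * r["cost"][name] / c_max)

    # Schedule in decreasing rank order, choosing a resource per task:
    # order = sorted(tasks, key=lambda n: upward_rank(n, tasks), reverse=True)

Setting w_time high expresses a deadline-sensitive user; setting w_cost high expresses a budget-sensitive one.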
IMP Science Gateway: from the Portal to the Hub of Virtual Experimental Labs in Materials Science
"Science gateway" (SG) ideology means a user-friendly intuitive interface
between scientists (or scientific communities) and different software
components + various distributed computing infrastructures (DCIs) (like grids,
clouds, clusters), where researchers can focus on their scientific goals and
less on peculiarities of software/DCI. "IMP Science Gateway Portal"
(http://scigate.imp.kiev.ua) for complex workflow management and integration of
distributed computing resources (like clusters, service grids, desktop grids,
clouds) is presented. It is created on the basis of WS-PGRADE and gUSE
technologies, where WS-PGRADE is designed for science workflow operation and
gUSE - for smooth integration of available resources for parallel and
distributed computing in various heterogeneous distributed computing
infrastructures (DCI). The typical scientific workflows with possible scenarios
of its preparation and usage are presented. Several typical use cases for these
science applications (scientific workflows) are considered for molecular
dynamics (MD) simulations of complex behavior of various nanostructures
(nanoindentation of graphene layers, defect system relaxation in metal
nanocrystals, thermal stability of boron nitride nanotubes, etc.). The user experience is analyzed in the context of practical applications for MD simulations in materials science, physics, and nanotechnologies with the available heterogeneous DCIs. In conclusion, the "science gateway" approach, which combines a workflow manager (such as WS-PGRADE) with a DCI resource manager (such as gUSE), makes it possible to use an SG portal (such as the "IMP Science Gateway Portal") in a very promising way: as a hub of various virtual experimental labs (different software components with various resource requirements) for practical MD applications in materials science, physics, chemistry, biology, and nanotechnologies.
Comment: 6 pages, 5 figures, 3 tables; 6th International Workshop on Science Gateways, IWSG-2014 (Dublin, Ireland, 3-5 June, 2014). arXiv admin note: substantial text overlap with arXiv:1404.545
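The division of labor described above (WS-PGRADE managing the workflow DAG, gUSE brokering nodes to concrete DCIs) can be mimicked in a toy sketch; the step names, DCI labels, and functions below are invented for illustration and are not the portal's actual API:

    # Toy workflow-manager/resource-broker split; names are illustrative,
    # not WS-PGRADE or gUSE APIs.
    WORKFLOW = {  # MD study DAG: step -> (required DCI type, dependencies)
        "prepare_structure": ("cluster", []),
        "md_production": ("service_grid", ["prepare_structure"]),
        "param_sweep": ("desktop_grid", ["prepare_structure"]),
        "analysis": ("cloud", ["md_production", "param_sweep"]),
    }
    DCIS = {"cluster": "local batch system", "service_grid": "EGI-style grid",
            "desktop_grid": "volunteer pool", "cloud": "IaaS VMs"}

    def submit_in_order(workflow):
        """Run the DAG in dependency order, 'brokering' each step."""
        done, pending = [], dict(workflow)
        while pending:
            ready = [s for s, (_, deps) in pending.items()
                     if all(d in done for d in deps)]
            for step in ready:
                dci_type, _ = pending.pop(step)
                print(f"{step} -> {DCIS[dci_type]}")
                done.append(step)

    submit_in_order(WORKFLOW)

Each "virtual experimental lab" in the hub pairs one such DAG with its own resource requirements, while the same brokering layer serves them all.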