7,541 research outputs found

    SciTokens: Capability-Based Secure Access to Remote Scientific Data

    The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. In this paper, we introduce SciTokens, open source software to help scientists manage their security credentials more reliably and securely. We describe the SciTokens system architecture, design, and implementation addressing use cases from the Laser Interferometer Gravitational-Wave Observatory (LIGO) Scientific Collaboration and the Large Synoptic Survey Telescope (LSST) projects. We also present our integration with widely-used software that supports distributed scientific computing, including HTCondor, CVMFS, and XrootD. SciTokens uses IETF-standard OAuth tokens for capability-based secure access to remote scientific data. The access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens 1) enables use of distributed computing for scientific domains that require greater data protection and 2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems. Comment: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22-26, 2018, Pittsburgh, PA, US

    Scientific Workflow Applications on Amazon EC2

    The proliferation of commercial cloud computing providers has generated significant interest in the scientific computing community. Much recent research has attempted to determine the benefits and drawbacks of cloud computing for scientific applications. Although clouds have many attractive features, such as virtualization, on-demand provisioning, and "pay as you go" usage-based pricing, it is not clear whether they are able to deliver the performance required for scientific applications at a reasonable price. In this paper we examine the performance and cost of clouds from the perspective of scientific workflow applications. We use three characteristic workflows to compare the performance of a commercial cloud with that of a typical HPC system, and we analyze the various costs associated with running those workflows in the cloud. We find that the performance of clouds is not unreasonable given the hardware resources provided, and that performance comparable to HPC systems can be achieved given similar resources. We also find that the cost of running workflows on a commercial cloud can be reduced by storing data in the cloud rather than transferring it from outside.

    Performance optimization of big data computing workflows for batch and stream data processing in multi-clouds

    Workflow techniques have been widely used as a major computing solution in many science domains. With the rapid deployment of cloud infrastructures around the globe and the economic benefits of cloud-based computing and storage services, an increasing number of scientific workflows have migrated or are in active transition to clouds. As the scale of scientific applications continues to grow, it is now common to deploy various data- and network-intensive computing workflows such as serial computing workflows, MapReduce/Spark-based workflows, and Storm-based stream data processing workflows in multi-cloud environments, where inter-cloud data transfer often plays a significant role in both workflow performance and financial cost. Rigorous mathematical models are constructed to analyze the intra- and inter-cloud execution process of scientific workflows, and a class of budget-constrained workflow mapping problems is formulated to optimize the network performance of big data workflows in multi-cloud environments. Research shows that these problems are all NP-complete, and a heuristic solution is designed for each that takes into consideration module execution, data transfer, and I/O operations. The performance superiority of the proposed solutions over existing methods is illustrated through extensive simulations and further verified by real-life workflow experiments deployed in public clouds.

    QoS-aware Scientific Application Scheduling Algorithm in Cloud Environment

    Many complex scientific applications are modeled as workflows to carry out large-scale experiments. Because scientific processes are complex, scientific workflows have intensive computation and data requirements. Clouds offer an opportunity for scientists who need high-performance computing infrastructure, allowing them to run their applications in the cloud with their desired QoS. We propose an algorithm that enables scientists to select an execution plan based on their preferred QoS criteria, such as time and cost. The proposed algorithm ranks the tasks in the workflow and then uses the UPFF function to select an appropriate resource based on the user's QoS. We compared the proposed algorithm with comparable work across several scenarios, and the results show that the proposed algorithm is more efficient. Keywords: Scientific application, Workflow scheduling, Cloud computing

    IMP Science Gateway: from the Portal to the Hub of Virtual Experimental Labs in Materials Science

    The "science gateway" (SG) ideology means a user-friendly, intuitive interface between scientists (or scientific communities) and different software components plus various distributed computing infrastructures (DCIs) (like grids, clouds, clusters), where researchers can focus on their scientific goals and less on the peculiarities of the software/DCI. The "IMP Science Gateway Portal" (http://scigate.imp.kiev.ua) for complex workflow management and integration of distributed computing resources (like clusters, service grids, desktop grids, clouds) is presented. It is built on WS-PGRADE and gUSE technologies, where WS-PGRADE is designed for scientific workflow operation and gUSE for smooth integration of available resources for parallel and distributed computing in various heterogeneous DCIs. Typical scientific workflows, with possible scenarios of their preparation and usage, are presented. Several typical use cases for these science applications (scientific workflows) are considered for molecular dynamics (MD) simulations of the complex behavior of various nanostructures (nanoindentation of graphene layers, defect system relaxation in metal nanocrystals, thermal stability of boron nitride nanotubes, etc.). The user experience is analyzed in the context of its practical applications for MD simulations in materials science, physics and nanotechnologies with available heterogeneous DCIs.
    In conclusion, the "science gateway" approach, which combines a workflow manager (like WS-PGRADE) with a DCI resources manager (like gUSE), offers the opportunity to use the SG portal (like the "IMP Science Gateway Portal") in a very promising way, namely, as a hub of various virtual experimental labs (different software components plus various resource requirements) in the context of its practical MD applications in materials science, physics, chemistry, biology, and nanotechnologies. Comment: 6 pages, 5 figures, 3 tables; 6th International Workshop on Science Gateways, IWSG-2014 (Dublin, Ireland, 3-5 June, 2014). arXiv admin note: substantial text overlap with arXiv:1404.545