
    Implementing and Running a Workflow Application on Cloud Resources

    Scientists need to run applications that are time- and resource-consuming, but not all of them have the required knowledge to run these applications in parallel using grid, cluster, or cloud resources. In the past few years, many workflow-building frameworks have been developed to help scientists take better advantage of computing resources by designing workflows based on their applications and executing them on heterogeneous resources. This paper presents a case study of implementing and running a workflow for an eBay data-retrieval application. The workflow was designed using the Askalon framework and executed on cloud resources. The purpose of this paper is to demonstrate how workflows and cloud resources can be used by scientists to achieve speedup for their applications without spending large amounts of money on computational resources.

    Keywords: Workflow, Cloud Resources
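
    A minimal Python sketch of the fan-out pattern such a workflow exploits follows; it is not from the paper. Independent data-retrieval activities are mapped onto a worker pool, analogous to how a workflow engine like Askalon maps activities onto cloud nodes. fetch_listing, run_workflow, and the item IDs are hypothetical placeholders.

        # Hypothetical sketch: parallel fan-out of independent data-retrieval
        # tasks, illustrating the speedup a workflow system such as Askalon
        # can provide when activities run concurrently on cloud resources.
        # fetch_listing and the item IDs are placeholders, not from the paper.
        from concurrent.futures import ThreadPoolExecutor


        def fetch_listing(item_id: int) -> dict:
            """Placeholder for one data-retrieval activity in the workflow."""
            # A real activity would call the remote API and parse the response.
            return {"item": item_id, "status": "fetched"}


        def run_workflow(item_ids: list[int], workers: int = 8) -> list[dict]:
            # Independent activities can execute concurrently; a workflow
            # engine would map each onto an available cloud resource.
            with ThreadPoolExecutor(max_workers=workers) as pool:
                return list(pool.map(fetch_listing, item_ids))


        if __name__ == "__main__":
            results = run_workflow(list(range(100)))
            print(f"retrieved {len(results)} listings")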

    Macaw: The Machine Learning Magnetometer Calibration Workflow

    In Earth Systems Science, many complex data pipelines combine different data sources and apply data filtering and analysis steps. Typically, such data analysis processes have grown historically and are implemented as many sequentially executed scripts. Scientific workflow management systems (SWMS) allow scientists to reuse their existing scripts while providing support for parallelization, reusability, monitoring, and failure handling. However, many scientists still rely on their sequentially called scripts and do not profit from the out-of-the-box advantages an SWMS can provide. In this work, we transform the data analysis process of a Machine Learning-based approach that uses neural networks to calibrate the platform magnetometers of non-dedicated satellites into a workflow called Macaw (MAgnetometer CAlibration Workflow). We provide details on the workflow and the steps needed to port these scripts to a scientific workflow. Our experimental evaluation compares the original sequential script executions on the original HPC cluster with our workflow implementation on a commodity cluster. Our results show that through porting, our implementation decreased the allocated CPU hours by 50.2% and the memory hours by 59.5%, leading to significantly less resource wastage. Further, through parallelizing single tasks, we reduced the runtime by 17.5%.

    Comment: Paper accepted at the 2022 IEEE International Conference on Data Mining Workshops (ICDMW).
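
    The core of such a port is making the dependency structure of the scripts explicit, so that independent steps can run in parallel while dependent steps wait for their inputs. The following Python sketch shows this dependency-driven execution pattern; the task names are entirely hypothetical, and the actual Macaw tasks and SWMS are not reproduced here.

        # Hypothetical sketch: sequentially called scripts recast as a small
        # task graph. Independent tasks (the two "load" steps) run in
        # parallel; dependent tasks wait for their inputs, as an SWMS would
        # arrange. Task names are illustrative, not from the Macaw paper.
        from concurrent.futures import Future, ThreadPoolExecutor

        # Each task name maps to the list of tasks it depends on.
        TASKS = {
            "load_telemetry": [],
            "load_reference_field": [],
            "preprocess": ["load_telemetry"],
            "train_calibration_net": ["preprocess", "load_reference_field"],
            "evaluate": ["train_calibration_net"],
        }


        def run_task(name: str, inputs: list) -> str:
            # Placeholder for invoking the original script for this step.
            return f"{name}: done ({len(inputs)} inputs consumed)"


        def run_dag(tasks: dict[str, list[str]]) -> dict[str, str]:
            futures: dict[str, Future] = {}
            with ThreadPoolExecutor() as pool:

                def submit(name: str) -> Future:
                    # Memoize so each task is submitted exactly once.
                    if name not in futures:
                        deps = [submit(d).result() for d in tasks[name]]
                        futures[name] = pool.submit(run_task, name, deps)
                    return futures[name]

                for name in tasks:
                    submit(name)
                return {name: f.result() for name, f in futures.items()}


        if __name__ == "__main__":
            for line in run_dag(TASKS).values():
                print(line)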

    EPiK-a Workflow for Electron Tomography in Kepler.

    Scientific workflows integrate data and computing interfaces as configurable, semi-automatic graphs to solve a scientific problem. Kepler is one such software system for designing, executing, reusing, evolving, archiving, and sharing scientific workflows. Electron tomography (ET) enables high-resolution views of complex cellular structures, such as cytoskeletons, organelles, viruses, and chromosomes. Imaging investigations produce large datasets. For instance, in electron tomography, a 16-fold image tilt series is about 65 gigabytes, with each projection image comprising 4096 by 4096 pixels. When serial sections or montage techniques are used for large-field ET, the dataset is even larger, and for higher-resolution images with multiple tilt series, the data size may reach the terabyte range. The demands of mass data processing and complex algorithms require the integration of diverse codes into flexible software structures. This paper describes a workflow for Electron Tomography Programs in Kepler (EPiK). The EPiK workflow embeds the tracking process of IMOD and realizes the main algorithms, including filtered backprojection (FBP) from TxBR and iterative reconstruction methods. We have tested the three-dimensional (3D) reconstruction process using EPiK on ET data. EPiK can be a potential toolkit for biology researchers, with the advantages of logical viewing, easy handling, convenient sharing, and future extensibility.
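
    As a concrete illustration of the FBP step that EPiK wraps (EPiK delegates FBP to TxBR; this is not that code), the following NumPy sketch reconstructs a 2D slice from a parallel-beam tilt series: ramp-filter each projection in the Fourier domain, then back-project over the image grid.

        # Hypothetical sketch of 2D parallel-beam filtered backprojection.
        # EPiK calls TxBR's implementation; this NumPy version only
        # illustrates the algorithm. sinogram: shape (n_angles, n_detectors).
        import numpy as np


        def fbp_reconstruct(sinogram, angles_deg):
            n_angles, n_det = sinogram.shape

            # 1. Ramp-filter each projection in the Fourier domain.
            ramp = np.abs(np.fft.fftfreq(n_det))
            filtered = np.real(
                np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1)
            )

            # 2. Back-project each filtered projection over the image grid.
            recon = np.zeros((n_det, n_det))
            center = n_det // 2
            ys, xs = np.mgrid[:n_det, :n_det] - center
            for proj, theta in zip(filtered, np.deg2rad(angles_deg)):
                # Detector coordinate hit by each pixel at this tilt angle.
                t = np.round(xs * np.cos(theta) + ys * np.sin(theta) + center)
                t = t.astype(int)
                mask = (t >= 0) & (t < n_det)
                recon[mask] += proj[t[mask]]

            return recon * np.pi / (2 * n_angles)


        if __name__ == "__main__":
            # Toy tilt series of a single point at the detector center.
            angles = np.linspace(-60, 60, 61)  # degrees, an ET-like tilt range
            sino = np.zeros((angles.size, 128))
            sino[:, 64] = 1.0
            image = fbp_reconstruct(sino, angles)
            print("peak at:", np.unravel_index(image.argmax(), image.shape))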