
    Dynamic Resource Allocation for Parallel Data Processing in Cloud Computing Environment

    Dynamic resource allocation is one of the most challenging problems in resource management, and dynamic resource allocation in cloud computing has attracted the attention of the research community in recent years. Many researchers around the world have proposed new ways of facing this challenge. Ad-hoc parallel data processing has emerged as one of the killer applications for Infrastructure-as-a-Service (IaaS) clouds. A number of cloud providers have started to include frameworks for parallel data processing in their products, making it easy for customers to access these services and deploy their programs. However, the processing frameworks currently in use were designed for static, homogeneous cluster setups, so the allocated resources may be inadequate for large parts of the submitted tasks and may unnecessarily increase processing cost and time. Moreover, owing to the opaque nature of the cloud, resources can be allocated statically but cannot easily be adapted to dynamic situations. The proposed generic data processing framework is intended to explicitly exploit dynamic resource allocation in the cloud for task scheduling and execution.
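    The abstract does not detail the framework's scheduling policy. As a minimal, hypothetical sketch of the core idea of dynamic allocation, sizing the resource pool to the queued workload rather than to a fixed cluster, one might write (all names and the 4-vCPU VM size are assumptions, not the paper's design):

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    cpu_demand: int  # vCPUs this task needs

def required_vms(queue: list[Task], vcpus_per_vm: int = 4) -> int:
    """Size the pool to the queued demand instead of a static cluster."""
    total = sum(t.cpu_demand for t in queue)
    return -(-total // vcpus_per_vm)  # ceiling division

def rescale(current_vms: int, queue: list[Task]) -> int:
    """Grow or shrink the allocation as the workload changes."""
    target = max(required_vms(queue), 1)  # keep at least one VM alive
    if target != current_vms:
        print(f"rescaling {current_vms} -> {target} VMs")
    return target

# A statically provisioned 2-VM cluster would be undersized here.
queue = [Task("t1", 6), Task("t2", 3)]
print(rescale(current_vms=2, queue=queue))  # -> 3 VMs for 9 vCPUs
```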

    Plasma Edge Kinetic-MHD Modeling in Tokamaks Using Kepler Workflow for Code Coupling, Data Management and Visualization

    A new predictive computer simulation tool targeting the development of the H-mode pedestal at the plasma edge in tokamaks and the triggering and dynamics of edge localized modes (ELMs) is presented in this report. This tool brings together, in a coordinated and effective manner, several first-principles physics simulation codes, stability analysis packages, and data processing and visualization tools. A Kepler workflow is used to carry out an edge plasma simulation that loosely couples the kinetic code, XGC0, with an ideal MHD linear stability analysis code, ELITE, and an extended MHD initial-value code such as M3D or NIMROD. XGC0 includes the neoclassical ion-electron-neutral dynamics needed to simulate pedestal growth near the separatrix. The Kepler workflow processes the XGC0 simulation results into simple images that can be selected and displayed via the Dashboard, a monitoring tool implemented in AJAX that allows the scientist to track computational resources, examine running and archived jobs, and view key physics data, all within a standard Web browser. The XGC0 simulation is monitored for the conditions needed to trigger an ELM crash by periodically assessing the edge plasma pressure and current density profiles using the ELITE code. If an ELM crash is triggered, the Kepler workflow launches the M3D code on a moderate-size Opteron cluster to simulate the nonlinear ELM crash and to compute the relaxation of plasma profiles after the crash. This process is monitored through periodic outputs of plasma fluid quantities that are automatically visualized with AVS/Express and may be displayed on the Dashboard. Finally, the Kepler workflow archives all data outputs and processed images using HPSS, together with provenance information about the software and hardware used to create the simulation. The complete process of preparing, executing, and monitoring a coupled-code simulation of the edge pressure pedestal buildup and the ELM cycle using the Kepler scientific workflow system is described in this paper.
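    The coupling logic described above is orchestrated by Kepler rather than hand-written scripts, but its control flow can be sketched in Python. The command names, flags, file names, and output conventions below are hypothetical placeholders; the abstract only states that ELITE periodically assesses the XGC0 edge profiles and that M3D is launched when an ELM crash is triggered:

```python
import subprocess
import time

def elite_unstable(profile_file: str) -> bool:
    """Check the latest edge pressure/current profiles for an ELM
    trigger. The 'elite' command and its output convention are
    hypothetical stand-ins for the real ELITE invocation."""
    result = subprocess.run(["elite", profile_file], capture_output=True)
    return b"UNSTABLE" in result.stdout

def pedestal_cycle(poll_seconds: int = 60) -> None:
    """Loosely coupled loop in the spirit of the Kepler workflow:
    XGC0 builds the pedestal; when ELITE flags instability, hand off
    to M3D for the nonlinear crash phase. Names are hypothetical."""
    while True:
        if elite_unstable("xgc0_edge_profiles.dat"):
            subprocess.run(["m3d", "--restart", "xgc0_edge_profiles.dat"])
            break
        time.sleep(poll_seconds)
```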

    Constructing distributed time-critical applications using cognitive enabled services

    Time-critical analytics applications are increasingly making use of distributed service interfaces (e.g., micro-services) that support the rapid construction of new applications by dynamically linking the services into different workflow configurations. Traditional service-based applications, in fixed networks, are typically constructed and managed centrally and assume stable service endpoints and adequate network connectivity. Constructing and maintaining such applications in dynamic, heterogeneous wireless networked environments, where limited bandwidth and transient connectivity are commonplace, presents significant challenges and makes centralized application construction and management impossible. In this paper we present an architecture that provides an adaptable and resilient method for on-demand, decentralized construction and management of complex time-critical applications in such environments. The approach uses a Vector Symbolic Architecture (VSA) to compactly represent an application as a single semantic vector that encodes the service interfaces, the workflow, and the time-critical constraints required. By extending existing service interfaces with a simple cognitive layer that can interpret and exchange the vectors, we show how the required services can be dynamically discovered and interconnected in a completely decentralized manner. We demonstrate the viability of this approach by using a VSA to encode various time-critical data analytics workflows. We show that these vectors can be used to dynamically construct and run applications using services that are distributed across an emulated Mobile Ad-Hoc Wireless Network (MANET). Scalability is demonstrated via an empirical evaluation.
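    The abstract does not give the paper's exact encoding, but the core VSA operations are standard: binding role and filler hypervectors with elementwise multiplication, and bundling the bound pairs with addition. A minimal sketch, using an illustrative codebook rather than the paper's, shows how a single vector can encode a two-step workflow and still be probed for its parts:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # hypervector dimensionality

def hv() -> np.ndarray:
    """Random bipolar hypervector (+1/-1 entries)."""
    return rng.choice([-1, 1], size=D)

# Illustrative codebook: role vectors for workflow positions and
# filler vectors for services (not the paper's actual symbols).
STEP1, STEP2 = hv(), hv()
ingest, analyze = hv(), hv()

# Bind each role to its service, bundle the pairs, and renormalize:
# the whole two-step workflow is now one semantic vector.
workflow = np.sign(STEP1 * ingest + STEP2 * analyze)

# Unbinding with a role vector recovers (noisily) which service
# fills that step; dot-product similarity separates signal from noise.
probe = workflow * STEP1
print(probe @ ingest / D)   # high (about 0.5, ties zeroed by sign)
print(probe @ analyze / D)  # near 0
```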

    The Workflow Trace Archive: Open-access data from public and private computing infrastructures

    Realistic, relevant, and reproducible experiments often need input traces collected from real-world environments. In this work, we focus on traces of workflows, which are common in datacenters, clouds, and HPC infrastructures. We show that the state of the art in using workflow traces raises important issues: (1) the use of realistic traces is infrequent, and (2) the use of realistic, open-access traces even more so. To alleviate these issues, we introduce the Workflow Trace Archive (WTA), an open-access archive of workflow traces from diverse computing infrastructures, together with tooling to parse, validate, and analyze traces. The WTA includes more than 48 million workflows captured from more than 10 computing infrastructures, representing a broad diversity of trace domains and characteristics. To emphasize the importance of trace diversity, we characterize the WTA contents and analyze in simulation the impact of trace diversity on experiment results. Our results indicate significant differences in characteristics, properties, and workflow structures between workload sources, domains, and fields.
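    The WTA's on-disk trace format is not specified in the abstract. As a sketch of the kind of parsing and characterization tooling described, assuming a hypothetical JSON layout (a "workflows" list whose items each carry a "tasks" list), per-trace statistics could be computed like this:

```python
import json

def trace_stats(path: str) -> dict:
    """Summarize a workflow trace. The JSON layout assumed here is a
    hypothetical stand-in for the WTA's actual schema."""
    with open(path) as f:
        trace = json.load(f)
    workflows = trace["workflows"]
    tasks_per_wf = [len(w["tasks"]) for w in workflows]
    return {
        "workflows": len(workflows),
        "total_tasks": sum(tasks_per_wf),
        "mean_tasks_per_workflow": sum(tasks_per_wf) / len(workflows),
    }
```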

    Towards a Reference Architecture with Modular Design for Large-scale Genotyping and Phenotyping Data Analysis: A Case Study with Image Data

    With the rapid advancement of computing technologies, various scientific research communities have been extensively using cloud-based software tools and applications. Cloud-based applications allow users to access software from a web browser, relieving them of installing any software in their desktop environment. For example, Galaxy, GenAP, and iPlant Collaborative are popular cloud-based systems for scientific workflow analysis in the domain of plant Genotyping and Phenotyping. These systems are used for conducting research, devising new techniques, and sharing computer-assisted analysis results among collaborators. Researchers need to integrate their new workflows/pipelines, tools, or techniques with the base system over time. Moreover, large-scale data must be processed on schedule for effective analysis. Recently, Big Data technologies have emerged to facilitate large-scale data processing on commodity hardware. Among the above-mentioned systems, only GenAP utilizes Big Data technologies, and only for specific cases. The structure of such a cloud-based system is highly variable and complex in nature. Software architects and developers need to consider entirely different properties and challenges during the development and maintenance phases compared to traditional business/service-oriented systems. Recent studies report that software engineers and data engineers confront challenges in developing analytic tools that support large-scale and heterogeneous data analysis. Unfortunately, software researchers have given little focus to devising a well-defined methodology and frameworks for the flexible design of cloud systems in the Genotyping and Phenotyping domain. More effective design methodologies and frameworks are therefore urgently needed for developing cloud-based Genotyping and Phenotyping analysis systems that also support large-scale data processing. In this thesis, we conduct several studies in order to devise a stable reference architecture and modularity model for software developers and data engineers in the Genotyping and Phenotyping domain. In the first study, we analyze the architectural changes of existing candidate systems to identify stability issues. We then extract architectural patterns from the candidate systems and propose a conceptual reference architectural model. Finally, we present a case study on the modularity of computation-intensive tasks as an extension of data-centric development. We show that the data-centric modularity model is at the core of the flexible development of a Genotyping and Phenotyping analysis system. Our proposed model and a case study with thousands of images provide a useful knowledge base for software researchers, developers, and data engineers building cloud-based Genotyping and Phenotyping analysis systems.
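    The thesis's modularity model is not spelled out in the abstract. One plausible reading of data-centric modularity, sketched below purely as an illustration with hypothetical names, is that each module is identified by the datasets it consumes and produces, so a computation-intensive step can be swapped without touching the rest of the pipeline:

```python
from typing import Callable, Dict, List, Set

Dataset = Dict[str, object]  # e.g., image paths, phenotype tables

class Module:
    """A pipeline step identified by the datasets it consumes and
    produces, not by how it computes (a hypothetical reading of the
    data-centric modularity model)."""
    def __init__(self, name: str, consumes: Set[str], produces: Set[str],
                 run: Callable[[Dataset], Dataset]):
        self.name = name
        self.consumes = consumes
        self.produces = produces
        self.run = run

def execute(pipeline: List[Module], store: Dataset) -> Dataset:
    """Run each module whose inputs are already in the data store;
    swapping a module is safe as long as its datasets match."""
    for m in pipeline:
        if m.consumes <= store.keys():
            store.update(m.run({k: store[k] for k in m.consumes}))
    return store

# Example: the segmentation step can be replaced wholesale, since only
# its consumed/produced datasets matter to the pipeline.
segment = Module("segment", {"images"}, {"masks"},
                 lambda d: {"masks": ["mask_" + i for i in d["images"]]})
measure = Module("measure", {"masks"}, {"phenotypes"},
                 lambda d: {"phenotypes": len(d["masks"])})
print(execute([segment, measure], {"images": ["leaf1", "leaf2"]}))
```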

    Workflow repository for providing configurable workflow in ERP

    Workflows in an ERP system covering a large functional domain are prone to duplication. This research builds a workflow repository that stores workflow variants from ERP business processes and can be used to compose new workflows as required by new tenants. The proposed method consists of two stages: preprocessing and processing. The preprocessing stage aims to find the common and sub-variant workflows among the existing workflow variants; the variants stored by users are Procure-to-Pay workflows. The variants are selected by their similarity using similarity filtering and then merged to identify the common and sub-variant workflows, which are stored as metadata mapped onto a relational database. Detection of common and sub-variant workflows achieves 92% accuracy; the result comprises 3 common workflows derived from 8 workflow variants, and the common workflows have 10% lower complexity than their predecessors. The processing stage provides the configurable workflows: a user submits a query model to find the desired workflow, similarity filtering retrieves the possible common and/or sub-variant workflows, and the user can recompose a common workflow through the workflow designer. Provision of configurable workflows by the ERP reaches 100%, meaning that whatever the user needs can be supplied by the ERP, either directly as a workflow or as a basis for composing another. The evaluation shows that the workflow repository can be built with the proposed architecture, can store and provide workflows, can detect common and sub-variant workflows, and can provide configurable workflows that users can employ as the basis for composing the workflows they need.
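    The abstract names similarity filtering but not the metric. One common choice for comparing workflow structure, used here purely as an illustration, is Jaccard similarity over the workflows' edge sets (the Procure-to-Pay step names below are hypothetical):

```python
def jaccard(edges_a: set, edges_b: set) -> float:
    """Structural similarity of two workflows as edge-set overlap."""
    if not edges_a and not edges_b:
        return 1.0
    return len(edges_a & edges_b) / len(edges_a | edges_b)

def similarity_filter(variants: dict, query: set, threshold: float = 0.7) -> list:
    """Keep stored variants close enough to the user's query model."""
    return [name for name, edges in variants.items()
            if jaccard(edges, query) >= threshold]

# Hypothetical Procure-to-Pay variants: requisition -> order -> receipt -> invoice
variants = {
    "p2p_v1": {("PR", "PO"), ("PO", "GR"), ("GR", "Invoice")},
    "p2p_v2": {("PR", "PO"), ("PO", "Invoice")},
}
query = {("PR", "PO"), ("PO", "GR"), ("GR", "Invoice")}
print(similarity_filter(variants, query))  # -> ['p2p_v1']
```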