
    Lotaru: Locally Predicting Workflow Task Runtimes for Resource Management on Heterogeneous Infrastructures

    Many resource management techniques for task scheduling, energy and carbon efficiency, and cost optimization in workflows rely on a priori task runtime knowledge. Building runtime prediction models on historical data is often not feasible in practice, as workflows, their input data, and the cluster infrastructure change. Online methods, on the other hand, which estimate task runtimes on specific machines while the workflow is running, have to cope with a lack of measurements during start-up. Frequently, scientific workflows are executed on heterogeneous infrastructures consisting of machines with different CPU, I/O, and memory configurations, which further complicates runtime prediction because task runtimes differ across machine types. This paper presents Lotaru, a method for locally predicting the runtimes of scientific workflow tasks before they are executed on heterogeneous compute clusters. Crucially, our approach does not rely on historical data and copes with a lack of training data during start-up. To this end, we use microbenchmarks, reduce the input data to quickly profile the workflow locally, and predict a task's runtime with a Bayesian linear regression based on the data points gathered from the local workflow execution and the microbenchmarks. Due to its Bayesian approach, Lotaru provides uncertainty estimates that can be used for advanced scheduling methods on distributed cluster infrastructures. In our evaluation with five real-world scientific workflows, our method outperforms two state-of-the-art runtime prediction baselines and decreases the absolute prediction error by more than 12.5%. In a second set of experiments, using our method's predicted runtimes for state-of-the-art scheduling, carbon reduction, and cost prediction enables results close to those achieved with perfect prior knowledge of runtimes.
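The core prediction step described above, a Bayesian linear regression fitted to locally gathered data points, can be sketched as follows. This is a minimal illustration of Bayesian linear regression with a Gaussian prior and closed-form posterior, not Lotaru's actual implementation; the toy runtimes and the precision hyperparameters `alpha` and `beta` are assumptions.

```python
import numpy as np

def bayesian_linear_regression(x, y, alpha=1.0, beta=25.0):
    """Posterior over weights for y ~ N(X w, 1/beta), prior w ~ N(0, I/alpha)."""
    X = np.column_stack([np.ones(len(x)), x])          # add a bias column
    S_inv = alpha * np.eye(X.shape[1]) + beta * X.T @ X
    S = np.linalg.inv(S_inv)                           # posterior covariance
    m = beta * S @ X.T @ y                             # posterior mean
    return m, S

def predict(x_new, m, S, beta=25.0):
    """Predictive mean and standard deviation for a new scalar input."""
    phi = np.array([1.0, x_new])
    mean = phi @ m
    var = 1.0 / beta + phi @ S @ phi                   # noise + parameter uncertainty
    return mean, np.sqrt(var)

# toy profiling data: task runtime roughly linear in input size
sizes = np.array([1.0, 2.0, 4.0, 8.0])
runtimes = np.array([10.2, 19.8, 40.5, 79.9])
m, S = bayesian_linear_regression(sizes, runtimes)
mean, std = predict(16.0, m, S)   # extrapolate to a larger, unseen input
```

The predictive standard deviation is exactly the kind of uncertainty estimate the abstract says a scheduler could exploit: it grows as the query point moves away from the profiled inputs.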

    On-the-fly tracing for data-centric computing: parallelization, workflow and applications

    As data-centric computing becomes the trend in science and engineering, more and more hardware systems, as well as middleware frameworks, are emerging to handle the intensive computations associated with big data. At the programming level, it is crucial to have corresponding programming paradigms for dealing with big data. Although MapReduce is now a well-known programming model for data-centric computing, in which parallelization is achieved entirely by partitioning the computing task through data, not all programs, particularly those using statistical computing and data mining algorithms with interdependence, can be refactored in such a fashion. On the other hand, many traditional automatic parallelization methods emphasize formalism and may not achieve optimal performance with the given limited computing resources. In this work we propose a cross-platform programming paradigm, called on-the-fly data tracing, to provide source-to-source transformation, where the same framework also provides workflow optimization for larger applications. Using a big-data approximation, computations related to large-scale data input are identified in the code and workflow, and a simplified core dependence graph is built based on the computational load, taking big data into account. The code can then be partitioned into sections for efficient parallelization; at the workflow level, optimization can be performed by adjusting the scheduling for big-data considerations, including the I/O performance of the machine. Regarding each unit in both source code and workflow as a model, this framework enables model-based parallel programming that matches the available computing resources. The dissertation presents the techniques used in model-based parallel programming, the design of the software framework for both parallelization and workflow optimization, and its implementations with multiple programming languages.
Two sets of experiments are then performed to validate the framework: i) benchmarking of parallelization speed-up using typical examples in data analysis and machine learning (e.g. naive Bayes, k-means), and ii) three real-world applications in data-centric computing that illustrate its efficiency: pattern detection from hurricane and storm surge simulations, road traffic flow prediction, and text mining from social media data. These applications illustrate how to build scalable workflows with the framework along with performance enhancements.
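The workflow-level idea above, scheduling tasks from a load-weighted dependence graph onto limited workers, might be sketched as follows. The task names, load weights, and the greedy list-scheduling strategy are illustrative assumptions, not the framework's actual algorithm.

```python
import heapq

# hypothetical core dependence graph: task -> its prerequisites, plus load weights
deps = {"load": [], "clean": ["load"], "stats": ["clean"],
        "mine": ["clean"], "report": ["stats", "mine"]}
load = {"load": 4, "clean": 2, "stats": 6, "mine": 6, "report": 1}

def schedule(deps, load, workers=2):
    """Greedy list scheduling: run each ready task on the earliest-free worker."""
    finish = {}                     # task -> finish time
    free = [0.0] * workers          # heap of worker free times
    heapq.heapify(free)
    done = set()
    while len(done) < len(deps):
        # a task is ready once all of its prerequisites have finished
        ready = [t for t in deps if t not in done and all(d in done for d in deps[t])]
        for t in sorted(ready):
            start = max(heapq.heappop(free),
                        max((finish[d] for d in deps[t]), default=0.0))
            finish[t] = start + load[t]
            heapq.heappush(free, finish[t])
            done.add(t)
    return finish

makespan = max(schedule(deps, load).values())
```

With two workers, the independent `stats` and `mine` tasks run in parallel after `clean`, so the makespan (13 time units here) is shorter than the serial total of 19.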

    Development of an Integrated Tokamak Simulation Code and Its Application to Multiple Devices

    Doctoral dissertation -- Seoul National University Graduate School: College of Engineering, Department of Energy Systems Engineering, August 2022. Advisor: Yong-Su Na. The in-depth design and implementation of a newly developed integrated suite of codes, TRIASSIC (tokamak reactor integrated automated suite for simulation and computation), are reported. The suite comprises existing plasma simulation codes, including equilibrium solvers, 1.5D and 2D plasma transport solvers, neoclassical and anomalous transport models, current drive and heating (cooling) models, and 2D grid generators. The components in TRIASSIC could be fully modularized by adopting a generic data structure as its internal data. Due to a unique interfacing method that does not depend on the generic data itself, legacy codes that are no longer maintained by their original authors were easily interfaced. The graphical user interface and the parallel computing of the framework and its components are also addressed. The verification of TRIASSIC in terms of equilibrium, transport, and heating is also shown. Following the data model and definition of the data structure, a declarative programming method was adopted in the core part of the framework. The method keeps the internal data consistent by enforcing the reciprocal relations between data nodes, contributing extra flexibility and explicitness to the simulations. TRIASSIC was applied to various devices, including KSTAR, VEST, and KDEMO, owing to its flexibility in composing a workflow. TRIASSIC was validated against KSTAR plasmas in terms of interpretive and predictive modeling. The prediction and validation on the VEST device using TRIASSIC are also shown.
For application to the upcoming KDEMO device, the machine design parameters were optimized, targeting an economical fusion demonstration reactor.
๊ฒฝ์ œ์ ์ธ ํ•ต์œตํ•ฉ ์‹ค์ฆ๋กœ ๊ฑด์„ค์„ ๋ชฉํ‘œ๋กœ KDEMO ์žฅ์น˜์— ๋Œ€ํ•œ ์ ์šฉ ๋ฐ ์žฅ์น˜ ์„ค๊ณ„ ์ตœ์ ํ™” ์—ฐ๊ตฌ๋„ ์†Œ๊ฐœ๋ฉ๋‹ˆ๋‹ค.Abstract ๏ผ‘ Table of Contents ๏ผ’ List of Figures ๏ผ” List of Tables ๏ผ‘๏ผ Chapter 1. Introduction ๏ผ‘๏ผ‘ 1.1. Background ๏ผ‘๏ผ‘ 1.1.1. Fusion Reactor and Modeling ๏ผ‘๏ผ‘ 1.1.2. Interpretive Analysis and Predictive Modeling ๏ผ‘๏ผ— 1.1.3. Modular Approach ๏ผ’๏ผ‘ 1.1.4. The Standard Data Structure ๏ผ’๏ผ” 1.1.5. The Internal Data Consistency in a Generic Data ๏ผ’๏ผ˜ 1.1.6. Integration of Physics Codes into IDS ๏ผ’๏ผ™ 1.2. Overview of the Research ๏ผ“๏ผ‘ Chapter 2. Development of Integrated Suite of Codes ๏ผ“๏ผ“ 2.1. Development of TRIASSIC ๏ผ“๏ผ“ 2.1.1. Design Requirements ๏ผ“๏ผ“ 2.1.2. Overview of TRIASSIC ๏ผ“๏ผ• 2.1.3. Comparison of Integrated Simulation Codes ๏ผ”๏ผ 2.2. Components in the Framework ๏ผ”๏ผ“ 2.2.1. Physics Codes Interfaced with the Framework ๏ผ”๏ผ“ 2.2.2. Physics Code Interfacings ๏ผ”๏ผ– 2.2.3. Graphical User Interface ๏ผ•๏ผ’ 2.2.4. Jobs Scheduler and MPI ๏ผ•๏ผ• 2.3. Verifications ๏ผ•๏ผ— 2.3.1. The Coordinate Conventions ๏ผ•๏ผ— 2.3.2. Coupling of Equilibrium-Transport ๏ผ•๏ผ™ 2.3.3. Neoclassical Transport and Bootstrap Current ๏ผ–๏ผ“ 2.3.4. Heating and Current Drive ๏ผ–๏ผ• Chapter 3. Improvements in Keeping the Internal Data Consistency ๏ผ–๏ผ˜ 3.1. Background ๏ผ–๏ผ˜ 3.2. Possible Implementations of a Component ๏ผ—๏ผ‘ 3.3. A Method Adopted in the Framework ๏ผ—๏ผ“ 3.3.1. Prerequisites and Relation Definitions ๏ผ—๏ผ“ 3.3.2. Adding Relations in the Framework ๏ผ—๏ผ˜ 3.3.3. Applying Relations ๏ผ˜๏ผ 3.4. Performance and Flexibility of the Framework ๏ผ˜๏ผ“ 3.4.1. Performance Enhancement ๏ผ˜๏ผ“ 3.4.2. Flexibility and Maintenance of the Framework ๏ผ˜๏ผ• Chapter 4. Applications to Various Devices ๏ผ™๏ผ‘ 4.1. Applications to KSTAR ๏ผ™๏ผ‘ 4.1.1. Kinetic equilibrium workflow and its validation ๏ผ™๏ผ‘ 4.1.2. Stationary-state predictive modeling workflow ๏ผ™๏ผ• 4.2. Application to VEST ๏ผ‘๏ผ๏ผ’ 4.2.1. 
Time-dependent predictive modeling workflow ๏ผ‘๏ผ๏ผ“ 4.3. Application to KDEMO ๏ผ‘๏ผ๏ผ– 4.3.1. Predictive simulation workflow for optimization ๏ผ‘๏ผ๏ผ– Chapter 5. Summary and Conclusion ๏ผ‘๏ผ‘๏ผ’ 5.1. Summary and Conclusion ๏ผ‘๏ผ‘๏ผ’ Appendix ๏ผ‘๏ผ‘๏ผ– A. Code Snippet of the Relation Definition ๏ผ‘๏ผ‘๏ผ– Bibliography ๏ผ‘๏ผ‘๏ผ˜ Abstract in Korean ๏ผ‘๏ผ’๏ผ–๋ฐ•
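The declarative-programming idea in the abstract, enforcing reciprocal relations between data nodes so the internal data stays consistent, could be sketched roughly as below. The `DataTree` class, the node names, and the pressure = density × temperature relation are hypothetical illustrations, not TRIASSIC's actual code.

```python
# minimal sketch of declarative relations over a generic data tree (assumed design)
class DataTree:
    def __init__(self):
        self.nodes = {}
        self.relations = []   # list of (target, sources, fn)

    def relate(self, target, sources, fn):
        """Declare target = fn(*sources); re-applied whenever a source changes."""
        self.relations.append((target, sources, fn))

    def set(self, name, value):
        self.nodes[name] = value
        changed = True
        while changed:        # propagate until the tree is self-consistent
            changed = False
            for target, sources, fn in self.relations:
                if all(s in self.nodes for s in sources):
                    new = fn(*(self.nodes[s] for s in sources))
                    if self.nodes.get(target) != new:
                        self.nodes[target] = new
                        changed = True

tree = DataTree()
# reciprocal relation between nodes (in consistent units): pressure = n * T
tree.relate("pressure", ["density", "temperature"], lambda n, t: n * t)
tree.set("density", 2.0)
tree.set("temperature", 3.0)   # derived "pressure" node updates automatically
```

The point of the declarative style is that a physics component writes only the quantities it computes; the framework, not the component, is responsible for propagating the declared relations.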

    Parallel surrogate detection in large-scale simulations

    Simulation has become a useful approach in scientific computing and engineering for its ability to model real natural or human systems. In particular, for complex systems such as hurricanes, wildfire disasters, and real-time road traffic, simulation methods are able to provide researchers, engineers, and decision makers with predicted values in order to help them take appropriate actions. For large-scale problems, the simulations usually take a long time on supercomputers, making real-time predictions more difficult. Approximation models that mimic the behavior of simulation models but are computationally cheaper, namely surrogate models, are desired in such scenarios. In this thesis, a framework for scalable surrogate detection in large-scale simulations is presented, with the basic idea of "using functions to represent functions". The following issues are discussed: i) data mining approaches to detecting and optimizing the surrogate models; ii) a scalable and automated workflow for constructing surrogate models from large-scale simulations; and iii) the system design and implementation, with an application to storm surge simulations during hurricanes.
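The idea of "using functions to represent functions" can be illustrated with a toy surrogate: a cheap polynomial fitted to samples of an expensive simulator. The `expensive_simulation` stand-in and the polynomial degree are assumptions for illustration only, not the thesis's actual detection method.

```python
import numpy as np

def expensive_simulation(x):
    """Stand-in for a costly solver whose response a surrogate will learn."""
    return np.sin(x) + 0.1 * x**2

# sample the simulator at a modest number of design points
xs = np.linspace(0.0, 3.0, 20)
ys = expensive_simulation(xs)

# cheap surrogate: a degree-4 polynomial, i.e. a function representing a function
coeffs = np.polyfit(xs, ys, deg=4)
surrogate = np.poly1d(coeffs)

# the surrogate should track the simulator closely on held-out points
test_x = np.linspace(0.1, 2.9, 7)
max_err = np.max(np.abs(surrogate(test_x) - expensive_simulation(test_x)))
```

Once fitted, evaluating the polynomial costs a handful of multiplications, so real-time prediction becomes feasible even when each true simulation run takes hours.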

    Automated Machine Learning implementation framework in the banking sector

    Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. Automated Machine Learning (AutoML) is a subfield of Machine Learning designed to make machine learning usable by non-expert users; it arose from the shortage of subject matter experts and aims to remove humans from these implementations. Its advantages lie in reducing manual implementation effort and accelerating machine learning deployment, and organizations benefit from effective benchmarking and validation of solutions. An automated machine learning implementation framework can deeply transform an organization, adding business value by freeing subject matter experts from low-level machine learning projects and letting them focus on high-level ones. It also helps the organization reach new levels of competence, customization, and decision-making at a higher analytical maturity. This work first investigates the impact and benefits of automated machine learning implementation in the banking sector, and then develops an implementation framework that banking institutions can use as a guideline for adopting automated machine learning across their departments. The advantages and benefits of AutoML are evaluated in terms of business value and competitive advantage, and an implementation in a fictitious institution is presented, considering all the necessary steps and the possible setbacks that could arise. Banking institutions run many different business processes, and since most are long-established, their main concerns relate to automating those processes, improving their analytical maturity, and making their workforce aware of the benefits of new ways of working.
A successful implementation plan requires knowing the institution's particularities and adapting to them, and ensuring that workforce and management understand the investments that need to be made and the changes, at all levels of organizational work, that will follow; these changes will ultimately ease everyone's daily work.

    Survey of scientific programming techniques for the management of data-intensive engineering environments

    The present paper introduces and reviews existing technology and research in the field of scientific programming methods and techniques in data-intensive engineering environments. More specifically, this survey aims to collect the relevant approaches that have faced the challenge of delivering more advanced and intelligent methods that take advantage of the existing large datasets. Although existing tools and techniques have demonstrated their ability to manage complex engineering processes for the development and operation of safety-critical systems, there is an emerging need to know how existing computational science methods will behave when managing large amounts of data. The authors therefore review existing open issues in the context of engineering, with special focus on scientific programming techniques and hybrid approaches. A total of 1193 journal papers were identified as representative of these areas; 935 were screened, and 122 were selected for full review. Afterwards, a comprehensive mapping between techniques and engineering and non-engineering domains was conducted to classify and perform a meta-analysis of the current state of the art. As the main result of this work, a set of 10 challenges for future data-intensive engineering environments is outlined. The current work has been partially supported by the Research Agreement between the RTVE (the Spanish Radio and Television Corporation) and the UC3M to boost research in the field of Big Data, Linked Data, Complex Network Analysis, and Natural Language. It has also received the support of the Tecnologico Nacional de Mexico (TECNM), the National Council of Science and Technology (CONACYT), and the Public Education Secretary (SEP) through PRODEP.

    The Common Workflow Scheduler Interface: Status Quo and Future Plans

    Nowadays, many scientific workflows from different domains, such as Remote Sensing, Astronomy, and Bioinformatics, are executed on large computing infrastructures managed by resource managers. Scientific workflow management systems (SWMS) support the workflow execution and communicate with the infrastructures' resource managers. However, the communication between SWMS and resource managers is complicated by a) inconsistent interfaces between SWMS and resource managers and b) the lack of support for workflow dependencies and workflow-specific properties. To tackle these issues, we developed the Common Workflow Scheduler Interface (CWSI), a simple yet powerful interface for exchanging workflow-related information between a SWMS and a resource manager, making the resource manager workflow-aware. First prototype implementations show that the CWSI can reduce the makespan by up to 25% even with simple, workflow-aware strategies. In this paper, we show how existing workflow resource management research can be integrated into the CWSI.
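A workflow-aware submission interface of the kind the CWSI provides might, in rough outline, look like the sketch below. The class and field names and the "most successors first" strategy are hypothetical illustrations; the actual CWSI API is not reproduced here.

```python
from dataclasses import dataclass, field

@dataclass
class TaskSubmission:
    """What an SWMS could pass to a resource manager beyond a plain job script."""
    task_id: str
    command: str
    cpus: int = 1
    memory_mb: int = 1024
    depends_on: list = field(default_factory=list)  # workflow dependencies
    successors_hint: int = 0                        # how many tasks this one unblocks

class WorkflowAwareScheduler:
    """Resource-manager side: prefer runnable tasks that unblock the most work."""
    def __init__(self):
        self.queue = []

    def submit(self, task: TaskSubmission):
        self.queue.append(task)

    def next_task(self):
        runnable = [t for t in self.queue if not t.depends_on]
        if not runnable:
            return None
        best = max(runnable, key=lambda t: t.successors_hint)
        self.queue.remove(best)
        return best

sched = WorkflowAwareScheduler()
sched.submit(TaskSubmission("align", "run_align.sh", successors_hint=5))
sched.submit(TaskSubmission("qc", "run_qc.sh", successors_hint=1))
first = sched.next_task().task_id   # picks the task that unblocks more successors
```

The point of such an interface is exactly the one the abstract makes: once dependency and successor information crosses the SWMS/resource-manager boundary, even a simple priority rule can shorten the makespan.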