
    Multi-Objective Scientific-Workflow Scheduling With Data Movement Awareness in Cloud.

    Scheduling scientific workflows in dynamic environments such as cloud computing serves several purposes simultaneously and has therefore become a multi-objective problem. Among these objectives, cost and makespan are probably the two most fundamental. Another critical factor in large-scale scientific workflows is the tremendous amount of data moved during execution, so this work adds data movement as a third objective, given its major impact on network utilization and on the energy consumption of network equipment in cloud data centers. Considering these three objectives, this work proposes a scheduling framework that combines a new node-clustering technique for the Directed Acyclic Graph (DAG) workflow model, known as Multilevel Dependent Node Clustering (MDNC), with the multi-objective optimizer Extreme Nondominated Sorting Genetic Algorithm-III (E-NSGA-III), a recent extension of the Nondominated Sorting Genetic Algorithm-III (NSGA-III). Five well-known scientific workflows, CyberShake, Epigenomics, LIGO, Montage, and SIPHT, serve as testbeds, and the widely used hypervolume indicator is chosen as the performance metric. MDNC is also evaluated in combination with NSGA-III. Three approaches are compared: E-NSGA-III alone, E-NSGA-III with peer-to-peer clustering, and E-NSGA-III with MDNC. The superiority of the proposed framework among them and its limitations are discussed
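    The abstract's three minimized objectives (cost, makespan, data movement) and its hypervolume metric are easy to illustrate in miniature. Below is a minimal, self-contained sketch with invented schedule values showing Pareto-dominance filtering and a Monte Carlo hypervolume estimate; the paper's actual workflow models and E-NSGA-III machinery are not reproduced here.

```python
# Minimal sketch: Pareto filtering of candidate workflow schedules over the
# three minimized objectives named in the abstract (cost, makespan, data
# movement), plus a Monte Carlo hypervolume estimate. Values are invented.
import random

def dominates(a, b):
    """a dominates b: no worse in every objective, strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated (cost, makespan, data_moved) tuples."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

def hypervolume_mc(front, ref, n=200_000):
    """Estimate the volume dominated by `front` inside the box [0, ref]."""
    hits = 0
    for _ in range(n):
        p = tuple(random.uniform(0, r) for r in ref)
        if any(all(f[i] <= p[i] for i in range(len(ref))) for f in front):
            hits += 1
    volume = 1.0
    for r in ref:
        volume *= r
    return volume * hits / n

candidates = [
    (120.0, 340.0, 55.0),   # (cost $, makespan s, data moved GB)
    (100.0, 400.0, 60.0),
    (130.0, 350.0, 50.0),
    (150.0, 500.0, 80.0),   # dominated by the first candidate
]
front = pareto_front(candidates)
print(front)
print(hypervolume_mc(front, ref=(200.0, 600.0, 100.0)))
```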

    A comparative analysis of NSGA-II and NSGA-III for autoscaling parameter sweep experiments in the cloud

    The Cloud Computing paradigm is focused on the provisioning of reliable and scalable virtual infrastructures that deliver execution and storage services. This paradigm is particularly suitable for solving resource-greedy scientific computing applications such as parameter sweep experiments (PSEs). Through the implementation of autoscalers, the virtual infrastructure can be scaled up and down by acquiring or terminating instances of virtual machines (VMs) while application tasks are being scheduled. In this paper, we extend an existing study centered on a state-of-the-art autoscaler called the multiobjective evolutionary autoscaler (MOEA). MOEA uses a multiobjective optimization algorithm to determine the set of possible virtual infrastructure settings, so its performance is greatly influenced by the underlying optimization algorithm and its tuning. Therefore, we analyze two well-known multiobjective evolutionary algorithms (NSGA-II and NSGA-III) and how they impact the performance of the MOEA autoscaler. Simulated experiments with three real-world PSEs show that MOEA improves significantly when using NSGA-III instead of NSGA-II, because the former provides a better exploitation versus exploration trade-off.
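    The kind of head-to-head NSGA-II versus NSGA-III comparison the paper describes can be sketched with an off-the-shelf library. The snippet below assumes the pymoo library (API as of roughly version 0.6) and uses the DTLZ2 benchmark as a stand-in for the autoscaler's infrastructure model, which this listing does not include; it is a sketch, not the authors' experimental setup.

```python
# Sketch of an NSGA-II vs. NSGA-III comparison, assuming the pymoo library
# (~0.6 API). DTLZ2 is a placeholder for the MOEA autoscaler's actual model.
import numpy as np
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.algorithms.moo.nsga3 import NSGA3
from pymoo.indicators.hv import HV
from pymoo.optimize import minimize
from pymoo.problems import get_problem
from pymoo.util.ref_dirs import get_reference_directions

problem = get_problem("dtlz2")  # 3-objective benchmark placeholder

# NSGA-III needs reference directions; NSGA-II relies on crowding distance.
ref_dirs = get_reference_directions("das-dennis", 3, n_partitions=12)
algorithms = {
    "NSGA-II": NSGA2(pop_size=92),
    "NSGA-III": NSGA3(ref_dirs=ref_dirs),
}

hv = HV(ref_point=np.array([1.2, 1.2, 1.2]))  # hypervolume indicator
for name, algo in algorithms.items():
    res = minimize(problem, algo, ("n_gen", 200), seed=1, verbose=False)
    print(name, "hypervolume:", hv(res.F))
```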

    Multiobjective Level-Wise Scientific Workflow Optimization in IaaS Public Cloud Environment


    Energy-aware scheduling in distributed computing systems

    Distributed computing systems, such as data centers, are key for supporting modern computing demands. However, the energy consumption of data centers has become a major concern over the last decade. Their worldwide energy consumption in 2012 was estimated at around 270 TWh, and grim forecasts predict it will quadruple by 2030. Maximizing energy efficiency while also maximizing computing efficiency is a major challenge for modern data centers. This work addresses that challenge by scheduling the operation of modern data centers, using a multi-objective approach to optimize both efficiency objectives simultaneously. Multiple scenarios are studied, from scheduling a single data center to scheduling a federation of several geographically distributed data centers. Mathematical models are formulated for each scenario, modeling their most relevant components, such as computing resources, computing workload, the cooling system, networking, and green energy generators, among others. A set of accurate heuristic and metaheuristic algorithms is designed for addressing the scheduling problem. These scheduling algorithms are comprehensively studied and compared with each other, using statistical tools to evaluate their efficacy on realistic workloads and scenarios. Experimental results show that the designed scheduling algorithms significantly increase the energy efficiency of data centers compared to traditional scheduling methods, while providing a diverse set of trade-off solutions regarding the computing efficiency of the data center. These results confirm the effectiveness of the proposed algorithmic approaches for data center infrastructures
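    The energy/computing-efficiency trade-off at the heart of this thesis can be illustrated with a toy list-scheduling heuristic. The sketch below collapses the two objectives into a weighted sum controlled by alpha; the machine speeds and power draws are invented, and the thesis's actual models and metaheuristics are far richer.

```python
# Toy list-scheduling heuristic illustrating the energy/makespan trade-off.
# Each machine has a speed and an active power draw (invented figures).
# Sweeping alpha from 0 to 1 traces a set of trade-off schedules, here
# collapsed into a weighted sum rather than a true Pareto search.

machines = [  # (name, speed in ops/s, power in W)
    ("m1", 2.0e9, 180.0),
    ("m2", 1.2e9, 95.0),
    ("m3", 0.8e9, 60.0),
]
tasks = [3e10, 1.5e10, 4.5e10, 2e10, 1e10]  # task lengths in ops

def schedule(alpha):
    """Greedy assignment minimizing alpha*finish_time + (1-alpha)*energy."""
    finish = {name: 0.0 for name, _, _ in machines}
    energy = 0.0
    for ops in sorted(tasks, reverse=True):  # place largest tasks first
        best = min(
            machines,
            key=lambda m: alpha * (finish[m[0]] + ops / m[1])
                          + (1 - alpha) * (ops / m[1]) * m[2],
        )
        name, speed, power = best
        finish[name] += ops / speed
        energy += (ops / speed) * power
    return max(finish.values()), energy  # (makespan in s, energy in J)

for alpha in (0.0, 0.5, 1.0):
    makespan, joules = schedule(alpha)
    print(f"alpha={alpha:.1f}  makespan={makespan:.1f}s  energy={joules:.0f}J")
```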

    Highly scalable algorithms for scheduling tasks and provisioning machines on heterogeneous computing systems

    As high performance computing systems increase in size, new and more efficient algorithms are needed to schedule work on the machines, understand the performance trade-offs inherent in the system, and determine which machines to provision. The extreme scale of these newer systems requires unique task scheduling algorithms capable of handling millions of tasks and thousands of machines. A highly scalable scheduling algorithm is developed that computes high quality schedules, especially for large problem sizes. Large-scale computing systems also consume vast amounts of electricity, leading to high operating costs. Through the use of novel resource allocation techniques, system administrators can examine this trade-off space to quantify how much a given performance level will cost in electricity, or see what kind of performance can be expected under a given energy budget. Trading off energy and makespan is often difficult for companies because it is unclear how each affects profit. A monetary-based model of high performance computing is presented, and a highly scalable algorithm is developed to quickly find the schedule that maximizes profit per unit time. As more high performance computing needs are met with cloud computing, algorithms are needed to determine the types of machines best suited to a particular workload. An algorithm is designed to find the best set of computing resources to allocate to the workload, taking into account the uncertainty in task arrival rates, task execution times, and power consumption. Reward rate, cost, failure rate, and power consumption can be optimized, as desired, to optimally trade off these conflicting objectives
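    The profit-per-unit-time objective mentioned above reduces, in its simplest form, to revenue rate minus operating-cost rate as a function of how many machines are provisioned. The sketch below uses entirely hypothetical rates to show the shape of that optimization; the dissertation's model accounts for heterogeneity and uncertainty that this toy omits.

```python
# Sketch of the profit-per-unit-time idea: for each candidate provisioning
# level, estimate throughput and electricity cost and keep the level with
# the best profit rate. All rates below are hypothetical placeholders.

REVENUE_PER_TASK = 0.02       # $ earned per completed task (assumed)
TASK_RATE_PER_MACHINE = 50.0  # tasks/s one machine sustains (assumed)
POWER_PER_MACHINE_KW = 0.3    # average draw per machine (assumed)
ELECTRICITY_PER_KWH = 0.12    # $ per kWh (assumed)
ARRIVAL_RATE = 900.0          # tasks/s offered by the workload (assumed)

def profit_rate(n_machines):
    """Profit per second: completed-task revenue minus electricity cost."""
    throughput = min(ARRIVAL_RATE, n_machines * TASK_RATE_PER_MACHINE)
    revenue = throughput * REVENUE_PER_TASK
    cost = n_machines * POWER_PER_MACHINE_KW * ELECTRICITY_PER_KWH / 3600.0
    return revenue - cost

best = max(range(1, 101), key=profit_rate)
print(best, "machines ->", round(profit_rate(best), 4), "$/s")
```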

    A Step Toward Improving Healthcare Information Integration & Decision Support: Ontology, Sustainability and Resilience

    The healthcare industry is a complex system with numerous stakeholders, including patients, providers, insurers, and government agencies. To improve healthcare quality and population well-being, there is a growing need to leverage data and information technology (IT) to support better decision-making. Healthcare information systems (HIS) are developed to store, process, and disseminate healthcare data. One of the main challenges with HIS is effectively managing large amounts of data to support decision-making. This requires integrating data from disparate sources, such as electronic health records, clinical trials, and research databases. Ontology is one approach to addressing this challenge; however, understanding ontology in the healthcare domain is complex and difficult. Another challenge is using HIS for scheduling and resource allocation in a sustainable and resilient way that meets multiple conflicting objectives. This is especially important in times of crisis, when demand for resources may be high and supply may be limited. This thesis explores ontology theory and develops a methodology for constructing HIS that can effectively support better decision-making in scheduling and resource allocation while considering system resiliency and social sustainability. The objectives of the thesis are: (1) studying the theory of ontology in healthcare data and developing a deep model for constructing HIS; (2) advancing our understanding of healthcare system resiliency and social sustainability; (3) developing a methodology for multi-objective scheduling; and (4) developing a methodology for multi-objective resource allocation. The following conclusions can be drawn from the research results: (1) a data model with rich semantics and easy data integration can be created with a clearer definition of the scope and applicability of ontology; (2) a healthcare system's resilience and sustainability can be significantly increased by the suggested design principles; (3) through careful consideration of both efficiency and patients' experiences, and a novel optimization algorithm, a scheduling problem can be made more accessible to patients; (4) a systematic approach to evaluating efficiency, sustainability, and resilience enables the simultaneous optimization of all three criteria at the system design stage, leading to more efficient distributions of resources and locations for healthcare facilities. The contributions of the thesis can be summarized as follows. Scientifically, this work has expanded our knowledge of ontology and data modelling, as well as our comprehension of healthcare system resilience and sustainability. Technologically and methodologically, it has advanced the state of knowledge for system modelling and decision-making. Overall, this thesis examines the characteristics of healthcare systems from a system viewpoint. Three ideas in this thesis, the ontology-based data modelling approach, the multi-objective optimization models, and the algorithms for solving them, can be adapted and used to improve different aspects of disparate systems
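    The ontology-based integration idea can be sketched concretely with RDF. The snippet below assumes the rdflib library; the Patient/Encounter vocabulary is purely illustrative and is not the thesis's actual healthcare ontology. It shows how records from different source systems become queryable once they share a vocabulary.

```python
# Minimal sketch of ontology-based data integration, assuming rdflib.
# The classes and properties here are illustrative placeholders only.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/health#")
g = Graph()
g.bind("ex", EX)

# Tiny schema: an Encounter links a Patient to a facility.
g.add((EX.Encounter, RDF.type, RDFS.Class))
g.add((EX.Patient, RDF.type, RDFS.Class))

# Records that could come from two different source systems (EHR, billing).
g.add((EX.enc1, RDF.type, EX.Encounter))
g.add((EX.enc1, EX.patient, EX.p42))
g.add((EX.enc1, EX.facility, Literal("Clinic A")))
g.add((EX.p42, RDF.type, EX.Patient))
g.add((EX.p42, RDFS.label, Literal("Patient 42")))

# One SPARQL query now spans both sources through the shared vocabulary.
rows = g.query("""
    PREFIX ex: <http://example.org/health#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?enc ?name ?fac WHERE {
        ?enc a ex:Encounter ; ex:patient ?p ; ex:facility ?fac .
        ?p rdfs:label ?name .
    }""")
for enc, name, fac in rows:
    print(enc, name, fac)
```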

    Metaheuristic models for decision support in the software construction process

    Nowadays, software engineers have not only the responsibility of building systems that provide a particular functionality, but they also have to guarantee that these systems fulfil demanding non-functional requirements like high availability, efficiency or security. To achieve this, software engineers face a continuous decision process, as they have to evaluate system needs and existing technological alternatives to implement them. All this process should be oriented towards obtaining high-quality and reusable systems, also making future modifications and maintenance easier in such a competitive scenario. Software engineering, as a systematic method to build software, has provided a number of guidelines and tasks that, when done in a disciplined manner and properly adapted to the development context, allow the creation of high-quality software. More specifically, software analysis and design has acquired great relevance, being the phase in which the software structure is conceived in terms of its functional blocks and their interactions. In this phase, engineers have to make decisions about the most suitable architecture, including its constituent components. Such decisions are made according to the system requirements, either functional or non-functional, and will have a great impact on its future development. Therefore, the engineer has to rigorously analyse existing alternatives, their implications on the imposed quality criteria, and the need of establishing trade-offs among them. In this context, engineers are mostly guided by their own capabilities and experience, so providing them with decision support methods would represent a significant contribution. The application of artificial intelligence techniques in this area has experienced a growing interest in recent years. Particularly, software engineering represents a complex application domain for artificial intelligence, whose diverse techniques can help in the semi-automation of tasks traditionally performed manually. The union of both fields has led to the appearance of search-based software engineering, which proposes reformulating software engineering activities as optimisation problems. For their resolution, search techniques like metaheuristics can then be applied. This type of technique performs an "intelligent" exploration of the space of candidate solutions, often inspired by natural processes, as happens with evolutionary algorithms. Despite the novelty of this research field, there are proposals to automate a great variety of tasks within the software lifecycle, such as requirement prioritisation, resource planning, code refactoring or test case generation. Focusing on analysis and design, whose tasks require creativity and experience, trying to achieve full automation is not realistic. Therefore, solving design tasks by means of search approaches should be oriented towards the engineer's perspective, even promoting their interaction. Furthermore, design tasks are also characterised by a high level of abstraction and the difficulty of quantitatively evaluating design quality. All these aspects represent key challenges for the application of search techniques in early phases of the software construction process. The aim of this Ph.D. Thesis is to make significant contributions in search-based software engineering and, especially, in the area of software architecture optimisation. Although it is an area in which significant progress is being made, most of the current proposals are focused on generating low-level architectures or on selecting and deploying already developed artefacts. Therefore, there is a lack of proposals dealing with architectural modelling at a high level of abstraction. At this level, engineers do not have a deep understanding of the system yet, meaning that assisting them is even more difficult. As a case study, the discovery of component-based software architectures has been primarily addressed. The objective of this problem consists in the abstraction of the architectural blocks, and their interactions, that best define the current structure of a software system. This can be viewed as the first step an engineer would perform in order to further analyse and improve the system architecture. In this Ph.D. Thesis, the use of a great variety of search techniques has been explored. The suitability of these techniques has been studied, also making the necessary adaptations to cope with the aforementioned challenges. A first proposal has been focused on the formulation of software architecture discovery as an optimisation problem, which consists in the computational representation of its software artefacts and the definition of software metrics to evaluate their quality during the search process. Moreover, a single-objective evolutionary algorithm has been designed for its resolution, which has been validated using real software systems. The resulting model is comprehensible and flexible, since its components have been designed under software engineering standards and tools and are also configurable according to the engineer's needs. Next, the discovery of software architectures has been tackled from a multi-objective perspective, in which several software metrics, often in conflict, have to be simultaneously optimised. In this case, the problem is solved by applying eight state-of-the-art algorithms, including some recent many-objective approaches. These algorithms have been adapted to the problem and compared in an extensive experimental study, whose purpose is to analyse the influence of the number and combination of metrics when guiding the search process. Apart from the performance validation following usual practices within the field, this study provides a detailed analysis of the practical implications behind the optimisation of multiple objectives in the context of decision support. The last proposal is focused on interactively including the engineer's opinion in the search-based architecture discovery process. To do this, an interaction mechanism has been designed, which allows the engineer to express desired characteristics for the solutions (positive preferences), as well as those aspects that should be avoided (negative preferences). The gathered information is combined with the software metrics used up to that moment, thus making it possible to adapt the search as the engineer interacts. Due to the characteristics of the proposed model, engineers of different expertise in software development have participated in its validation with the aim of showing the suitability and utility of the approach. The knowledge acquired along the development of the Thesis, as well as the proposed approaches, have also been transferred to other search-based software engineering areas as a result of research collaborations. In this sense, it is worth noting the formalisation of interactive search-based software engineering as a cross-cutting discipline, which aims at promoting the active participation of the engineer during the search process. Furthermore, the use of many-objective algorithms has been explored in the context of service-oriented computing to address the so-called web service composition problem
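    The core formulation, encoding a component assignment as a chromosome and scoring it with cohesion/coupling-style metrics, can be sketched in a few lines. The toy below uses an invented class-dependency graph and a simple (1+1) evolutionary loop; the thesis's actual representation, metrics, and operators are considerably richer.

```python
# Toy sketch of search-based architecture discovery: a chromosome assigns
# each class to a component, and the fitness rewards intra-component edges
# (cohesion) while penalizing inter-component edges (coupling).
import random

N_CLASSES, N_COMPONENTS = 12, 3
random.seed(7)
DEPS = {(random.randrange(N_CLASSES), random.randrange(N_CLASSES))
        for _ in range(30)}  # invented class-dependency edges

def fitness(assign):
    """Cohesion minus coupling over the dependency edges."""
    internal = sum(1 for a, b in DEPS if assign[a] == assign[b])
    external = len(DEPS) - internal
    return internal - external

def mutate(assign):
    """Reassign one randomly chosen class to a random component."""
    child = list(assign)
    child[random.randrange(N_CLASSES)] = random.randrange(N_COMPONENTS)
    return child

# (1+1) evolutionary loop: keep the mutant only when it is at least as fit.
best = [random.randrange(N_COMPONENTS) for _ in range(N_CLASSES)]
for _ in range(2000):
    cand = mutate(best)
    if fitness(cand) >= fitness(best):
        best = cand
print("assignment:", best, "fitness:", fitness(best))
```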

    Stochastic Parameter Estimation of Poroelastic Processes Using Geomechanical Measurements

    Understanding the structure and material properties of hydrologic systems is important for a number of applications, including carbon dioxide injection for geological carbon storage or enhanced oil recovery, monitoring of hydraulic fracturing projects, mine dewatering, environmental remediation, and managing geothermal reservoirs. These applications require detailed knowledge of the geologic systems being impacted in order to optimize their operation and safety. To evaluate, monitor, and manage such hydrologic systems, a stochastic estimation framework was developed that is capable of characterizing the structure and physical parameters of the subsurface. This software framework uses a set of stochastic optimization algorithms to calibrate a heterogeneous subsurface flow model to available field data, and to construct an ensemble of models representing the range of system states that would explain the data. Many of these systems, such as oil reservoirs, are deep and hydraulically isolated from the shallow subsurface, making near-surface fluid pressure measurements uninformative. Near-surface strainmeter, tiltmeter, and extensometer signals were therefore evaluated in terms of their potential information content for calibrating poroelastic flow models. Such geomechanical signals are caused by mechanical deformation and therefore travel through hydraulically impermeable rock much more quickly. A numerical geomechanics model was developed using Geocentric, which couples subsurface flow and elastic deformation equations to simulate geomechanical signals (e.g., pressure, strain, tilt, and displacement) given a set of model parameters. A high-performance cluster computer performs this computationally expensive simulation for each set of parameters and compares the simulation results to measured data in order to evaluate the likelihood of each model. The set of data-model comparisons is then used to estimate each unknown parameter, as well as the uncertainty of each estimate. This uncertainty can be influenced by limitations in the measured dataset, such as random noise, instrument drift, and the number and location of sensors, as well as by conceptual model errors and false underlying assumptions. In this study we find that strain measurements taken from the shallow subsurface can be used to estimate the structure and material parameters of geologic layers much deeper in the subsurface. This can significantly mitigate the drilling and installation costs of monitoring wells, as well as reduce the risk of puncturing or fracturing a target reservoir. These parameter estimates were also used to develop an ensemble of calibrated hydromechanical models which can predict the range of system behavior and inform decision-making on the management of an aquifer or reservoir
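    The calibrate-and-quantify-uncertainty loop described above can be reduced to a toy example: a cheap forward model, a Gaussian likelihood against noisy observations, and a Metropolis sampler whose retained samples give both a parameter estimate and its uncertainty. The forward model and all coefficients below are invented stand-ins; the real framework runs the coupled Geocentric simulator on a cluster for each candidate parameter set.

```python
# Sketch of the stochastic-calibration loop, reduced to a toy 1-D forward
# model and a Metropolis sampler. A linear-in-k strain response with
# invented coefficients stands in for the coupled flow/geomechanics code.
import numpy as np

rng = np.random.default_rng(0)
TRUE_K = 2.5                      # "permeability-like" parameter to recover
times = np.linspace(0.0, 10.0, 40)

def forward(k):
    """Placeholder forward model: surface strain vs. time for parameter k."""
    return 1e-6 * k * np.sqrt(times)

SIGMA = 2e-7                      # assumed measurement noise level
data = forward(TRUE_K) + rng.normal(0.0, SIGMA, times.size)  # synthetic obs

def log_like(k):
    """Gaussian data-model misfit, as in the likelihood evaluation step."""
    return -0.5 * np.sum(((data - forward(k)) / SIGMA) ** 2)

# Metropolis random walk over k; the retained samples approximate the
# posterior, and their spread is the parameter-uncertainty estimate.
samples, k = [], 1.0
ll = log_like(k)
for _ in range(20000):
    k_new = k + rng.normal(0.0, 0.1)
    ll_new = log_like(k_new) if k_new > 0 else -np.inf
    if np.log(rng.uniform()) < ll_new - ll:
        k, ll = k_new, ll_new
    samples.append(k)
post = np.array(samples[5000:])   # discard burn-in
print(f"k estimate: {post.mean():.3f} +/- {post.std():.3f} (true {TRUE_K})")
```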

    Green Technologies for Production Processes

    This book focuses on original research works about Green Technologies for Production Processes, covering both discrete and process production, and addressing product, process, and system issues in production. The aim is to report the state of the art on relevant research topics and to highlight the barriers, challenges, and opportunities we are facing. The book includes 22 research papers covering energy saving and waste reduction in production processes, design and manufacturing of green products, low-carbon manufacturing and remanufacturing, management and policy for sustainable production, technologies for mitigating CO2 emissions, and other green technologies