
    Index handling and assign optimization for Algorithmic Differentiation reuse index managers

    For operator overloading Algorithmic Differentiation tools, the identification of primal variables and adjoint variables is usually done via indices. Two common schemes exist for their management and distribution. The linear approach is easy to implement and supports memory optimization with respect to copy statements. The reuse approach, on the other hand, requires more implementation effort but results in much smaller adjoint vectors, which are more suitable for the vector mode of Algorithmic Differentiation. In this paper, we present both approaches and how to implement them, and discuss their advantages, disadvantages, and the properties of the resulting Algorithmic Differentiation type. In addition, a new management scheme is presented which supports copy optimizations and the reuse of indices, thus combining the advantages of the other two. The implementations of all three schemes are compared on a simple synthetic example and on a real-world example using the computational fluid dynamics solver SU2. Comment: 20 pages, 14 figures, 4 tables.
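
    The difference between the two index management schemes can be made concrete with a small sketch. The following Python toy classes are illustrative only and not the paper's implementation, which targets operator overloading AD types in C++; the class and method names are assumptions. The linear manager hands out a fresh index for every assignment, while the reuse manager keeps a free list of indices released by destructed variables.

```python
class LinearIndexManager:
    """Hands out a fresh index for every assignment; indices are never reused,
    so the index range (and hence the adjoint vector) keeps growing."""

    def __init__(self):
        self.count = 0

    def assign_index(self, old_index=None):
        self.count += 1
        return self.count

    def free_index(self, index):
        pass  # nothing to recycle


class ReuseIndexManager:
    """Recycles indices of destructed variables, keeping the index range and
    therefore the adjoint vector small."""

    def __init__(self):
        self.count = 0
        self.free_list = []

    def assign_index(self, old_index=None):
        if old_index:            # the variable already owns an index: keep it
            return old_index
        if self.free_list:       # reuse an index released by a destructed variable
            return self.free_list.pop()
        self.count += 1          # otherwise extend the index range
        return self.count

    def free_index(self, index):
        if index:
            self.free_list.append(index)
```

    In the linear scheme, a copy statement can simply forward the index of its right-hand side, which is the copy optimization mentioned above and which plain index reuse does not directly support; the new scheme presented in the paper aims to combine copy optimization with index reuse.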

    Reverse-Mode Automatic Differentiation of Compiled Programs

    Tools for algorithmic differentiation (AD) provide accurate derivatives of computer-implemented functions for use in, e.g., optimization and machine learning (ML). However, they often require the source code of the function to be available in a restricted set of programming languages. As a step towards making AD accessible for code bases with cross-language or closed-source components, we recently presented the forward-mode AD tool Derivgrind. It inserts forward-mode AD logic into the machine code of a compiled program using the Valgrind dynamic binary instrumentation framework. This work extends Derivgrind, adding the capability to record the real-arithmetic evaluation tree, thus enabling operator-overloading-style reverse-mode AD for compiled programs. We maintain the high level of correctness reported for Derivgrind's forward mode, failing the same few test cases in an extensive test suite for the same well-understood reasons. Runtime-wise, the recording slows down the execution of a compiled 64-bit benchmark program by a factor of about 180. Comment: 17 pages, 5 figures, 1 listing.
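
    Derivgrind instruments machine code through Valgrind, but the operator-overloading-style reverse mode enabled by the recorded evaluation tree can be illustrated at the language level. The following Python sketch of a tape that records elementary operations and replays them backwards is a minimal illustration; the names Tape and Active are assumptions, not Derivgrind's API.

```python
class Tape:
    """Records the real-arithmetic evaluation tree for a later reverse sweep."""

    def __init__(self):
        self.nodes = []  # one entry per value: list of (parent_index, partial)

    def push(self, deps):
        self.nodes.append(deps)
        return len(self.nodes) - 1

    def gradient(self, seed_index):
        adjoint = [0.0] * len(self.nodes)
        adjoint[seed_index] = 1.0
        for i in reversed(range(len(self.nodes))):   # reverse sweep
            for parent, partial in self.nodes[i]:
                adjoint[parent] += partial * adjoint[i]
        return adjoint


class Active:
    """Overloaded real type; every arithmetic operation appends a tape node."""

    def __init__(self, value, tape):
        self.value, self.tape = value, tape
        self.index = tape.push([])

    def __add__(self, other):
        out = Active(self.value + other.value, self.tape)
        self.tape.nodes[out.index] = [(self.index, 1.0), (other.index, 1.0)]
        return out

    def __mul__(self, other):
        out = Active(self.value * other.value, self.tape)
        self.tape.nodes[out.index] = [(self.index, other.value),
                                      (other.index, self.value)]
        return out


# f(x, y) = x * y + x; the reverse sweep yields df/dx = y + 1 and df/dy = x.
tape = Tape()
x, y = Active(3.0, tape), Active(4.0, tape)
f = x * y + x
adjoints = tape.gradient(f.index)
print(adjoints[x.index], adjoints[y.index])  # 5.0 3.0
```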

    Forward-Mode Automatic Differentiation of Compiled Programs

    Algorithmic differentiation (AD) is a set of techniques that provide partial derivatives of computer-implemented functions. Such a function can be supplied to state-of-the-art AD tools via its source code, or via an intermediate representation produced while compiling its source code. We present the novel AD tool Derivgrind, which augments the machine code of compiled programs with forward-mode AD logic. Derivgrind leverages the Valgrind instrumentation framework for structured access to the machine code, and a shadow memory tool to store dot values. Access to the source code is required at most for the files in which input and output variables are defined. Derivgrind's versatility comes at the price of scaling the run time by a factor between 30 and 75, measured on a benchmark based on a numerical solver for a partial differential equation. Results of our extensive regression test suite indicate that Derivgrind produces correct results on GCC- and Clang-compiled programs, including a Python interpreter, with a small number of exceptions. While we provide a list of scenarios that Derivgrind does not handle correctly, nearly all of them are academic counterexamples or originate from highly optimized math libraries. As long as differentiating those is avoided, Derivgrind can be applied to an unprecedentedly wide range of cross-language or partially closed-source software with little integration effort. Comment: 21 pages, 3 figures, 3 tables, 5 listings.
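
    The per-operation rule applied in forward mode corresponds to dual-number arithmetic, with the dot values held alongside the original values (in Derivgrind's case, in shadow memory). A minimal Python analogue follows; the Dual class is purely illustrative and not part of Derivgrind.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """A value paired with its dot value (derivative w.r.t. the seeded input)."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        # sum rule
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # product rule applied per elementary operation
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)


# Seed the input's dot value with 1 and propagate it through the computation.
x = Dual(2.0, 1.0)      # d/dx x = 1
y = x * x + x           # f(x) = x^2 + x
print(y.val, y.dot)     # 6.0 and f'(2) = 2*2 + 1 = 5.0
```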

    Scalable Schedule-Aware Bundle Routing

    This thesis introduces approaches that provide scalable delay-/disruption-tolerant routing capabilities in scheduled space topologies. The solution is developed for requirements derived from use cases built according to predictions for future space topologies, such as the future Mars communications architecture report from the Interagency Operations Advisory Group. A novel routing algorithm is presented that provides optimized networking performance while avoiding the scalability issues inherent to state-of-the-art approaches. The thesis also proposes a recommendation that renders volume management concerns generic and easily exchangeable, including a simple management technique that increases volume-awareness accuracy while remaining adaptable to more specific use cases. Additionally, the thesis introduces a more robust and scalable approach to internetworking between subnetworks that increases throughput, reduces delays, and eases configuration thanks to its high flexibility.
    Contents:
    1 Introduction: 1.1 Motivation, 1.2 Problem statement, 1.3 Objectives, 1.4 Outline
    2 Requirements: 2.1 Use cases, 2.2 Requirements (2.2.1 Requirement analysis, 2.2.2 Requirements relative to the routing algorithm, 2.2.3 Requirements relative to the volume management, 2.2.4 Requirements relative to interregional routing)
    3 Fundamentals: 3.1 Delay-/disruption-tolerant networking (3.1.1 Architecture, 3.1.2 Opportunistic and deterministic DTNs, 3.1.3 DTN routing, 3.1.4 Contact plans, 3.1.5 Volume management, 3.1.6 Regions), 3.2 Contact graph routing (3.2.1 A non-replication routing scheme, 3.2.2 Route construction, 3.2.3 Route selection, 3.2.4 Enhancements and main features), 3.3 Graph theory and DTN routing (3.3.1 Mapping with DTN objects, 3.3.2 Shortest path algorithm, 3.3.3 Edge and vertex contraction), 3.4 Algorithmic determinism and predictability
    4 Preliminary analysis: 4.1 Node and contact graphs, 4.2 Scenario, 4.3 Route construction in ION-CGR, 4.4 Alternative route search (4.4.1 Yen's algorithm scalability, 4.4.2 Blocking issues with Yen, 4.4.3 Limiting contact approaches), 4.5 CGR-multicast and shortest-path tree search, 4.6 Volume management (4.6.1 Volume obstruction, 4.6.2 Contact sink, 4.6.3 Ghost queue, 4.6.4 Data rate variations), 4.7 Hierarchical interregional routing, 4.8 Other potential issues
    5 State-of-the-art and related work: 5.1 Taxonomy, 5.2 Opportunistic and probabilistic approaches (5.2.1 Flooding approaches, 5.2.2 PROPHET, 5.2.3 MaxProp, 5.2.4 Issues), 5.3 Deterministic approaches (5.3.1 Movement-aware routing over interplanetary networks, 5.3.2 Delay-tolerant link state routing, 5.3.3 DTN routing for quasi-deterministic networks, 5.3.4 Issues), 5.4 CGR variants and enhancements (5.4.1 CGR alternative routing table computation, 5.4.2 CGR-multicast, 5.4.3 CGR extensions, 5.4.4 RUCoP and CGR-hop, 5.4.5 Issues), 5.5 Interregional routing (5.5.1 Border gateway protocol, 5.5.2 Hierarchical interregional routing, 5.5.3 Issues), 5.6 Further approaches (5.6.1 Machine learning approaches, 5.6.2 Tropical geometry)
    6 Scalable schedule-aware bundle routing: 6.1 Overview, 6.2 Shortest-path tree routing for space networks (6.2.1 Structure, 6.2.2 Tree construction, 6.2.3 Tree management, 6.2.4 Tree caching), 6.3 Contact segmentation (6.3.1 Volume management interface, 6.3.2 Simple volume manager, 6.3.3 Enhanced volume manager), 6.4 Contact passageways (6.4.1 Regional border definition, 6.4.2 Virtual nodes, 6.4.3 Pathfinding and administration)
    7 Evaluation: 7.1 Methodology (7.1.1 Simulation tools, 7.1.2 Simulator extensions, 7.1.3 Algorithms and scenarios), 7.2 Offline analysis, 7.3 Eliminatory processing pressures, 7.4 Networking performance (7.4.1 Intraregional unicast routing tests, 7.4.2 Intraregional multicast tests, 7.4.3 Interregional routing tests, 7.4.4 Behavior with congestion), 7.5 Requirement fulfillment
    8 Summary and Outlook: 8.1 Conclusion, 8.2 Future works (8.2.1 Next development steps, 8.2.2 Contact graph routing)
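
    For context, contact graph routing searches over time-limited contacts from a contact plan rather than static links. The following Python sketch of an earliest-arrival search over a contact plan is a simplified illustration only: it assumes instantaneous transmission and ignores data rates and volume limits, which the thesis handles through its volume management interface.

```python
import heapq

# A contact is (sender, receiver, start_time, end_time). This toy search
# ignores transmission delays, data rates and volumes.
def earliest_arrival(contacts, source, destination, t0=0.0):
    best = {source: t0}
    queue = [(t0, source)]
    while queue:
        t, node = heapq.heappop(queue)
        if node == destination:
            return t
        if t > best.get(node, float("inf")):
            continue  # stale queue entry
        for sender, receiver, start, end in contacts:
            if sender != node or end < t:
                continue                     # contact not usable from here
            arrival = max(t, start)          # wait until the contact opens
            if arrival < best.get(receiver, float("inf")):
                best[receiver] = arrival
                heapq.heappush(queue, (arrival, receiver))
    return None                              # destination unreachable


plan = [("A", "B", 10, 20), ("B", "C", 15, 30), ("A", "C", 40, 50)]
print(earliest_arrival(plan, "A", "C"))      # 15: routing via B beats waiting to t=40
```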

    Parallel optimization algorithms for high performance computing: application to thermal systems

    The need for optimization is present in every field of engineering. Moreover, applications that require a multidisciplinary approach in order to take a step forward are increasing. This leads to the need to solve complex optimization problems that exceed the capacity of the human brain or intuition. A standard way of proceeding is to use evolutionary algorithms, among which genetic algorithms hold a prominent place. These are characterized by their robustness and versatility, as well as their high computational cost and low convergence speed. Many optimization packages are available under free software licenses and are representative of the current state of the art in optimization technology. However, the ability of optimization algorithms to adapt to massively parallel computers while reaching satisfactory efficiency levels is still an open issue. Even packages suited for multilevel parallelism encounter difficulties when dealing with objective functions involving long and variable simulation times. This variability is common in Computational Fluid Dynamics and Heat Transfer (CFD & HT), nonlinear mechanics, etc., and is nowadays a dominant concern for large-scale applications. Current research on improving the performance of evolutionary algorithms is mainly focused on developing new search algorithms. Nevertheless, there is a vast body of well-performing sequential algorithms suitable for implementation on parallel computers. The gap to be covered is efficient parallelization. Moreover, advances in the research of new search algorithms and of efficient parallelization are additive, so the enhancement of current state-of-the-art optimization software can be accelerated if both fronts are tackled simultaneously. The motivation of this Doctoral Thesis is to take a step forward towards the successful integration of Optimization and High Performance Computing capabilities, which has the potential to boost technological development by providing better designs, shortening product development times and minimizing the required resources. After a thorough study of the state of the art in mathematical optimization techniques available to date, a generic mathematical optimization tool has been developed, with a special focus on applying the library to the field of Computational Fluid Dynamics and Heat Transfer (CFD & HT). The main shortcomings of the standard parallelization strategies available for genetic algorithms and similar population-based optimization methods have then been analyzed. Computational load imbalance has been identified as the key factor causing the degradation of the optimization algorithm's scalability (i.e., parallel efficiency) when the average makespan of a batch of individuals is greater than the average time required by the optimizer for inter-processor communications. It occurs because processors are often unable to finish the evaluation of their queue of individuals simultaneously and need to be synchronized before the next batch of individuals is created. Consequently, the computational load imbalance is translated into idle time on some processors. Several load balancing algorithms have been proposed and exhaustively tested; they are extendable to any other population-based optimization method that needs to synchronize all processors after the evaluation of each batch of individuals.
Finally, a real-world engineering application that consists of optimizing the refrigeration system of a power electronic device has been presented as an illustrative example in which the use of the proposed load balancing algorithms is able to reduce the simulation time required by the optimization tool.
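
    The idle time described above stems from statically assigning each processor a fixed share of the batch. A common remedy is dynamic dispatch from a shared queue, sketched below in Python; this is an illustrative example with made-up evaluation times, not one of the load balancing algorithms proposed in the thesis.

```python
import random
import time
from concurrent.futures import ProcessPoolExecutor, as_completed


def evaluate(individual):
    """Stand-in for a CFD & HT simulation with a long, variable run time."""
    time.sleep(random.uniform(0.1, 0.5))
    return sum(individual)


def evaluate_batch(population, workers=4):
    """Dynamic dispatch: an idle worker pulls the next individual immediately,
    instead of waiting on a fixed, pre-assigned slice of the batch."""
    fitness = [None] * len(population)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(evaluate, ind): i
                   for i, ind in enumerate(population)}
        for done in as_completed(futures):
            fitness[futures[done]] = done.result()
    return fitness


if __name__ == "__main__":
    population = [[random.random() for _ in range(5)] for _ in range(16)]
    print(evaluate_batch(population))
```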

    Methodological review of multicriteria optimization techniques: applications in water resources

    Multi-criteria decision analysis (MCDA) is an umbrella approach that has been applied to a wide range of natural resource management situations. This report has two purposes. First, it aims to provide an overview of advanced multi-criteria approaches, methods and tools. The review seeks to lay out the nature of the models and their inherent strengths and limitations. Their applicability in supporting real-life decision-making processes is analyzed with respect to the requirements imposed by organizationally decentralized and economically specific spatial and temporal frameworks. Models are categorized based on different classification schemes and are reviewed by describing their general characteristics, approaches, and fundamental properties. The necessity of carefully structuring decision problems is discussed with regard to planning, staging and control aspects within the broader agricultural context, and in water management in particular. Special emphasis is given to the importance of manipulating decision elements by means of hierarching and clustering. The review goes beyond traditional MCDA techniques; it describes new modelling approaches. The second purpose is to describe new MCDA paradigms aimed at addressing the inherent complexity of managing water ecosystems, particularly with respect to multiple criteria integrated with biophysical models, multiple stakeholders, and lack of information. Comments about, and critical analysis of, the limitations of traditional models are made to point out the need for, and propose a call to, a new way of thinking about MCDA as applied to water and natural resources management planning. These new perspectives do not undermine the value of traditional methods; rather, they point to a shift in emphasis from methods for problem solving to methods for problem structuring. The literature review shows successful integrations of watershed management optimization models to efficiently screen a broad range of technical, economic, and policy management options within a watershed system framework and to select the optimal combination of management strategies and associated water allocations for designing a sustainable watershed management plan at least cost. Papers show applications of a watershed management model that integrates both natural and human elements of a watershed system, including the management of ground and surface water sources, water treatment and distribution systems, human demands, wastewater treatment and collection systems, water reuse facilities, non-potable water distribution infrastructure, aquifer storage and recharge facilities, storm water, and land use.
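
    As a point of reference for the traditional MCDA techniques discussed above, the weighted-sum method aggregates normalized criterion scores into a single ranking. The alternatives, criteria, scores and weights below are illustrative only and are not taken from the review.

```python
# Normalized scores per criterion (higher is better); illustrative values only.
alternatives = {
    "reservoir_expansion": {"cost": 0.4, "yield": 0.9, "ecology": 0.3},
    "water_reuse":         {"cost": 0.7, "yield": 0.6, "ecology": 0.8},
    "demand_management":   {"cost": 0.9, "yield": 0.4, "ecology": 0.9},
}
weights = {"cost": 0.3, "yield": 0.4, "ecology": 0.3}


def weighted_sum(scores, weights):
    """Aggregate criterion scores into a single value."""
    return sum(weights[criterion] * value for criterion, value in scores.items())


ranking = sorted(alternatives,
                 key=lambda a: weighted_sum(alternatives[a], weights),
                 reverse=True)
for name in ranking:
    print(f"{name}: {weighted_sum(alternatives[name], weights):.2f}")
```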

    Working Notes from the 1992 AAAI Workshop on Automating Software Design. Theme: Domain Specific Software Design

    The goal of this workshop is to identify different architectural approaches to building domain-specific software design systems and to explore issues unique to domain-specific (vs. general-purpose) software design. Some general issues that cut across the particular software design domain include: (1) knowledge representation, acquisition, and maintenance; (2) specialized software design techniques; and (3) user interaction and user interfaces.

    The quality-aware service selection problem: an adaptive evolutionary approach

    Quality of Service (QoS) is an important aspect in distributed, service-oriented systems. When several concrete services exist that implement the same functionality, the choice of a service instance among many can be made based on QoS considerations, objectives and constraints. Typically considered properties are performance, availability, and costs. In this thesis, aspects of the QoS-aware service selection problem are studied in the context of a distributed, service-oriented system from ATLAS, a high-energy physics experiment at CERN, the European Organization for Nuclear Research. In this so-called TAG system, data and modular services are distributed world-wide and need to be selected and composed on the fly as a user issues a request. There are two conflicting optimization viewpoints. The service selection is modeled as a dynamic multi-constrained optimal path problem, which allows considering QoS attributes of both the service instances (nodes) and the network (edges). The dynamic aspects of the system are included in the problem definition, as they represent a specific challenge and requirement for solution algorithms. To address these issues regarding dynamics and conflicting viewpoints, this work proposes a service selection optimization framework based on a multi-objective genetic algorithm capable of efficiently dealing with changing conditions by using a persistent memory of good solutions and a stepwise adaptation of the mutation rate. A system and QoS attribute ontology, together with a description of the dynamics of distributed systems, forms the basis of the framework.
The presented approach is evaluated in terms of optimization quality, adaptability to changes, runtime performance and scalability. Parts of the approach were finally integrated into the TAG system and evaluated there.
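
    The two mechanisms highlighted in the abstract, a persistent memory of good solutions and a stepwise adaptation of the mutation rate, can be sketched as follows. This Python toy is illustrative only; the fitness model, parameters and function names are assumptions and not the thesis's implementation.

```python
import random


def fitness(path, qos):
    """Toy QoS score: sum of the scores of the services chosen along the path."""
    return sum(qos.get(step, 0.0) for step in path)


def mutate(path, services, rate):
    return [random.choice(services) if random.random() < rate else step
            for step in path]


def evolve(services, qos, generations=50, pop_size=20, memory=None):
    # Persistent memory: seed the population with good solutions from earlier runs.
    population = list(memory or [])
    population += [[random.choice(services) for _ in range(4)]
                   for _ in range(pop_size - len(population))]
    rate, best_prev = 0.1, float("-inf")
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, qos), reverse=True)
        best = fitness(population[0], qos)
        # Stepwise adaptation: mutate more when progress stalls, less otherwise.
        rate = min(0.5, rate * 1.5) if best <= best_prev else max(0.01, rate / 1.5)
        best_prev = best
        elite = population[: pop_size // 4]
        population = elite + [mutate(random.choice(elite), services, rate)
                              for _ in range(pop_size - len(elite))]
    return population[0], elite  # best path found and the memory for the next run


services = ["s1", "s2", "s3", "s4"]
qos = {"s1": 0.9, "s2": 0.4, "s3": 0.7, "s4": 0.2}
best, memory = evolve(services, qos)
best, _ = evolve(services, {**qos, "s1": 0.1}, memory=memory)  # QoS has changed
print(best)
```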