129 research outputs found
Software for Exascale Computing - SPPEXA 2016-2019
This open access book summarizes the research done and results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG) presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer’s series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA’s first funding phase, and provides an overview of SPPEXA’s contributions towards exascale computing in today's sumpercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest
Status and Future Perspectives for Lattice Gauge Theory Calculations to the Exascale and Beyond
In this and a set of companion whitepapers, the USQCD Collaboration lays out
a program of science and computing for lattice gauge theory. These whitepapers
describe how calculation using lattice QCD (and other gauge theories) can aid
the interpretation of ongoing and upcoming experiments in particle and nuclear
physics, as well as inspire new ones.Comment: 44 pages. 1 of USQCD whitepapers
Next generation of Exascale-class systems: ExaNeSt project and the status of its interconnect and storage development
The ExaNeSt project started on December 2015 and is funded by EU H2020 research framework (call H2020-FETHPC-2014, n. 671553) to study the adoption of low-cost, Linux-based power-efficient 64-bit ARM processors clusters for Exascale-class systems. The ExaNeSt consortium pools partners with industrial and academic research expertise in storage, interconnects and applications that share a vision of an European Exascale-class supercomputer. The common goal is designing and implementing a physical rack prototype together with its cooling system, the non-volatile memory (NVM) architecture and a unified low-latency interconnect able to test different options for network and storage. Furthermore, the consortium goal is to provide real HPC applications to validate the system. In this paper we describe the unified data and storage network architecture, reporting on the status of development of different testbeds and highlighting preliminary benchmark results obtained through the execution of scientific, engineering and data analytics scalable application kernels
A differentiated proposal of three dimension i/o performance characterization model focusing on storage environments
The I/O bottleneck remains a central issue in high-performance environments. Cloud
computing, high-performance computing (HPC) and big data environments share many underneath difficulties to deliver data at a desirable time rate requested by high-performance
applications. This increases the possibility of creating bottlenecks throughout the application feeding process by bottom hardware devices located in the storage system layer.
In the last years, many researchers have been proposed solutions to improve the I/O
architecture considering different approaches. Some of them take advantage of hardware
devices while others focus on a sophisticated software approach. However, due to the
complexity of dealing with high-performance environments, creating solutions to improve
I/O performance in both software and hardware is challenging and gives researchers many
opportunities. Classifying these improvements in different dimensions allows researchers
to understand how these improvements have been built over the years and how it progresses. In addition, it also allows future efforts to be directed to research topics that
have developed at a lower rate, balancing the general development process. This research
present a three-dimension characterization model for classifying research works on I/O
performance improvements for large scale storage computing facilities. This classification
model can also be used as a guideline framework to summarize researches providing an
overview of the actual scenario. We also used the proposed model to perform a systematic
literature mapping that covered ten years of research on I/O performance improvements
in storage environments. This study classified hundreds of distinct researches identifying
which were the hardware, software, and storage systems that received more attention over
the years, which were the most researches proposals elements and where these elements
were evaluated. In order to justify the importance of this model and the development
of solutions that targets I/O performance improvements, we evaluated a subset of these
improvements using a a real and complete experimentation environment, the Grid5000.
Analysis over different scenarios using a synthetic I/O benchmark demonstrates how the
throughput and latency parameters behaves when performing different I/O operations
using distinct storage technologies and approaches.O gargalo de E/S continua sendo um problema central em ambientes de alto desempenho. Os ambientes de computação em nuvem, computação de alto desempenho (HPC) e big data compartilham muitas dificuldades para fornecer dados em uma taxa de tempo desejável solicitada por aplicações de alto desempenho. Isso aumenta a possibilidade de criar gargalos em todo o processo de alimentação de aplicativos pelos dispositivos de hardware inferiores localizados na camada do sistema de armazenamento. Nos últimos anos, muitos pesquisadores propuseram soluções para melhorar a arquitetura de E/S considerando diferentes abordagens. Alguns deles aproveitam os dispositivos de hardware, enquanto outros se concentram em uma abordagem sofisticada de software. No entanto, devido à complexidade de lidar com ambientes de alto desempenho, criar soluções para melhorar o desempenho de E/S em software e hardware é um desafio e oferece aos pesquisadores muitas oportunidades. A classificação dessas melhorias em diferentes dimensões permite que os pesquisadores entendam como essas melhorias foram construídas ao longo dos anos e como elas progridem. Além disso, também permite que futuros esforços sejam direcionados para tópicos de pesquisa que se desenvolveram em menor proporção, equilibrando o processo geral de desenvolvimento. Esta pesquisa apresenta um modelo de caracterização tridimensional para classificar trabalhos de pesquisa sobre melhorias de desempenho de E/S para instalações de computação de armazenamento em larga escala. Esse modelo de classificação também pode ser usado como uma estrutura de diretrizes para resumir as pesquisas, fornecendo uma visão geral do cenário real. Também usamos o modelo proposto para realizar um mapeamento sistemático da literatura que abrangeu dez anos de pesquisa sobre melhorias no desempenho de E/S em ambientes de armazenamento. Este estudo classificou centenas de pesquisas distintas, identificando quais eram os dispositivos de hardware, software e sistemas de armazenamento que receberam mais atenção ao longo dos anos, quais foram os elementos de proposta mais pesquisados e onde esses elementos foram avaliados. Para justificar a importância desse modelo e o desenvolvimento de soluções que visam melhorias no desempenho de E/S, avaliamos um subconjunto dessas melhorias usando um ambiente de experimentação real e completo, o Grid5000. Análises em cenários diferentes usando um benchmark de E/S sintética demonstra como os parâmetros de vazão e latência se comportam ao executar diferentes operações de E/S usando tecnologias e abordagens distintas de armazenamento
Recommended from our members
From Petascale to Exascale: Eight Focus Areas of R&D Challenges for HPC Simulation Environments
Programming models bridge the gap between the underlying hardware architecture and the supporting layers of software available to applications. Programming models are different from both programming languages and application programming interfaces (APIs). Specifically, a programming model is an abstraction of the underlying computer system that allows for the expression of both algorithms and data structures. In comparison, languages and APIs provide implementations of these abstractions and allow the algorithms and data structures to be put into practice - a programming model exists independently of the choice of both the programming language and the supporting APIs. Programming models are typically focused on achieving increased developer productivity, performance, and portability to other system designs. The rapidly changing nature of processor architectures and the complexity of designing an exascale platform provide significant challenges for these goals. Several other factors are likely to impact the design of future programming models. In particular, the representation and management of increasing levels of parallelism, concurrency and memory hierarchies, combined with the ability to maintain a progressive level of interoperability with today's applications are of significant concern. Overall the design of a programming model is inherently tied not only to the underlying hardware architecture, but also to the requirements of applications and libraries including data analysis, visualization, and uncertainty quantification. Furthermore, the successful implementation of a programming model is dependent on exposed features of the runtime software layers and features of the operating system. Successful use of a programming model also requires effective presentation to the software developer within the context of traditional and new software development tools. Consideration must also be given to the impact of programming models on both languages and the associated compiler infrastructure. Exascale programming models must reflect several, often competing, design goals. These design goals include desirable features such as abstraction and separation of concerns. However, some aspects are unique to large-scale computing. For example, interoperability and composability with existing implementations will prove critical. In particular, performance is the essential underlying goal for large-scale systems. A key evaluation metric for exascale models will be the extent to which they support these goals rather than merely enable them
Proceedings of the Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015) Krakow, Poland
Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015
Relajaciones de ejecución definidas por el usuario para la mejora de la programabilidad en computación paralela de altas prestaciones
Tesis de la Universidad Complutense de Madrid, Facultad de Informática, leída el 22-11-2019This thesis proposes the development and implementation of a new programming model basedon execution relaxations, and focused on High-Performance Parallel Computing. Specifically,the main goals of the thesis are:1. Advocate a development methodology in which users define the basic computing units(tasks), together with a set of relaxations in, possibly, multiple dimensions. These relaxationswill be translated, at execution time, into expanded (and complex) scheduling opportunitiesdepending on the underlying architectural features, yielding improvements in termsof desired output metrics (e.g., performance or energy consumption).2. Abstract away users from the complexity of the underlying heterogeneous hardware, delegatingthe proper exploitation of expanded scheduling choices to a system software component(typically referred as a runtime). This piece of software, armed with knowledge fromstatic architectural characteristics and dynamic status of the hardware at execution time,will exploit those combinations considered optimal among those relaxations proposed bythe user for each task ready for execution.3. Extend this abstraction in order to describe both computing systems, by means of executor/ allocator hierarchies that describe the heterogeneous computing architecture, and applications,in terms of sets of interdependent tasks. In addition, the relations between executorsand tasks are categorized into a new task-executor taxonomy, which motivates ambiguityfreeHPC programming frontends based on the STSE, Single Task - Single Executor classification,distinguished from fully-automated runtime backends.4. Propose a new programming model (STEEL) based on previous ideas, that gathers featuresconsidered to be basic for future task-based programming models, namely: performance,composability, expressivity and hard-to-misuse interfaces.5. Specify an API to support the STEEL programming model, and a runtime implementationleveraging techniques and programming paradigms supported by modern C++, illustratingits flexibility, ease of use and performance impact by means of simple use cases and examples.Hence, the proposed methodology stands for a clear and strict separation of concerns betweenthe involved actors in a parallel executions: user / codes and underlying hardware. This kind ofabstractions allows a delegation of the expert knowledge from the user toward the system software(runtime) in a systematic way, and facilitates the integration of mechanisms to automate optimizations,adapting performance to the specificities of the heterogeneous parallel architecture in whichthe code is instantiated and executed.From this perspective, the thesis designs, implements and validates mechanisms to perform aso-called complexity formalization, classifying many actions that are currently done by the userand building a framework in which these complexities can be delegated to the runtime system. Thedelegation of these decisions is already a step forward to next generation of programming modelsseeking performance, expressivity, programmability and portability...La presente tesis doctoral propone el diseño e implementación de un nuevo modelo de programación basado en relajaciones de ejecución y enfocado al ámbito de la Computación Paralela de Altas Prestaciones. Concretamente, los objetivos principales de la tesis son:1. Abogar por una metodología de desarrollo en la que el usuario define las unidades básicas de computo (tareas), junto con un conjunto de relajaciones en, posiblemente, múltiples dimensiones. Estas relajaciones se traducirán, en tiempo de ejecución, en oportunidades expandidas(y complejas) de planificación en función de la arquitectura subyacente, impactando así en métricas como rendimiento o consumo energético.2. Abstraer al usuario de la complejidad del hardware subyacente, delegando la correcta explotación de dichas posibilidades de planificación expandidas a un componente software de sistema (típicamente conocido como runtime). Dicho software, dotado de conocimiento tanto de las características estáticas de la arquitectura subyacente como del estado puntual de la misma en el momento de la ejecución, explotará las combinaciones consideradas optimas de entre las relajaciones propuestas por el usuario para cada tarea lista para set ejecutada.3. Extender dicha abstracción para describir tanto sistemas de cómputo, en forma de jerarquía de ejecutores y alojadores de memoria que en ´ultimo término describen una arquitectura de cómputo heterogénea, como aplicaciones, en forma de un conjunto de tareas interrelacionadas. Además, las relaciones entre ejecutores y tareas son clasificadas en una nueva taxonomía tarea-ejecutor, la cual motiva frontends de programación HPC sin ambigüedad basados en la clasificación STSE, Single Task - Single Executor, separada de backends runtime totalmente automatizados.4. Proponer un nuevo modelo de programación (STEEL) basado en la clasificación STSE que aglutine ciertas características consideradas básicas de cara al éxito de los futuros modelos de programación basados en tareas: rendimiento, facilidad de composición, expresividad e interfaces no permisivos ante fallos.5. Especificar una API que dé soporte al modelo de programación, así como una implementación runtime del mismo aprovechando técnicas y paradigmas soportados en el lenguaje C++ de última generación, e ilustrar su uso, flexibilidad e impacto en el rendimiento a través de ejemplos y casos de uso sencillos .La metodología que se propugna aboga por una clara y estricta separación de conceptos entre los actores básicos que componen una ejecución paralela: usuario / código y hardware subyacente. Este tipo de abstracciones permite delegar el conocimiento experto desde el usuario hacia el software de sistema, proporcionando así mecanismos para mecanizar y automatizar su optimización ,y adaptar su rendimiento a la arquitectura paralela sobre la que se instanciarán los códigos. Desde este punto de vista, la tesis diseña, implementa y valida mecanismos para llevar a cabo una formalización de la complejidad inherente a la programación paralela heterogénea, clasificando aquellas acciones que en la actualidad se llevan a cabo por parte del usuario en el proceso de desarrollo y optimización de código, y proporcionando un marco de trabajo en el que dicha complejidad puede ser delegada, de forma eficiente y consistente, a un runtime...Fac. de InformáticaTRUEunpu
Proceedings of the First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014): Porto, Portugal
Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014
- …