
    Abstraction Raising in General-Purpose Compilers


    Local time stepping on high performance computing architectures: mitigating CFL bottlenecks for large-scale wave propagation

    Modeling problems that require simulating hyperbolic PDEs (wave equations) on large heterogeneous domains face many potential bottlenecks. We attack this problem through two techniques: the massively parallel capabilities of graphics processing units (GPUs), and local time stepping (LTS) to mitigate CFL bottlenecks on a multiscale mesh. Many modern supercomputing centers are installing GPUs because of their high performance, and extending existing seismic wave-propagation software to use GPUs is vitally important to give application scientists the highest possible performance. In addition to this architectural optimization, LTS schemes avoid the performance losses incurred in meshes with localized areas of refinement. Coupled with the GPU performance optimizations, the derivation and implementation of a Newmark LTS scheme enables next-generation performance for real-world applications. This implementation includes work addressing the load-balancing problem inherent to multi-level LTS schemes, enabling scalability to hundreds and thousands of CPUs and GPUs. Together, these GPU, LTS, and scaling optimizations accelerate existing applications by a factor of 30 or more, and enable modeling scenarios previously made infeasible by the cost of standard explicit time-stepping schemes.
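    The core idea of a multi-level LTS scheme can be illustrated in a few lines. Below is a minimal sketch (not the thesis's actual Newmark/GPU implementation) of how elements of a multiscale mesh might be binned into power-of-two time-stepping levels from their local CFL limits; the per-element `dx` and `wave_speed` arrays and the level cap are assumptions for illustration.

```python
import numpy as np

def lts_levels(dx, wave_speed, cfl=0.5, max_levels=4):
    """Bin mesh elements into power-of-two time-stepping levels.

    Each element's stable step is limited by the CFL condition
    dt <= cfl * dx / c; level k then advances with step 2**k * dt_min,
    so coarse regions avoid paying for the finest element's step.
    """
    dt_local = cfl * np.asarray(dx) / np.asarray(wave_speed)
    dt_min = dt_local.min()                       # global CFL limit
    levels = np.floor(np.log2(dt_local / dt_min)).astype(int)
    return np.clip(levels, 0, max_levels - 1), dt_min

# e.g. a three-element mesh with one refined element:
levels, dt0 = lts_levels(np.array([1.0, 0.12, 0.5]), np.array([3.0, 3.0, 3.0]))
print(levels)  # coarse elements land on higher (larger-step) levels
```

    Balancing the amount of work per level across many CPUs and GPUs is exactly the load-balancing problem the abstract refers to.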

    Applications and Techniques for Fast Machine Learning in Science

    In this community review report, we discuss applications of and techniques for fast machine learning (ML) in science - the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present challenges that overlap across the scientific domains and for which common solutions can be found. The report is intended to give plenty of examples of, and inspiration for, scientific discovery through integrated and accelerated ML solutions, followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material that can enable these breakthroughs.
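    As one concrete instance of the resource-efficiency techniques such reports survey, post-training quantization shrinks a trained model for deployment close to the data source. A minimal PyTorch sketch (an illustrative stand-in, not an example taken from the report itself):

```python
import torch
import torch.nn as nn

# Toy model standing in for a trained network.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

# Dynamic post-training quantization: Linear weights are stored as
# int8 and dequantized on the fly, cutting memory and CPU latency.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)
print(qmodel)
```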

    EuroEXA - D2.6: Final ported application software

    This document describes the porting of the EuroEXA application software to the single-CRDB testbed and discusses the experience gained from the porting and optimization activities, which should be taken into account in future redesign and optimization work. The document accompanies the ported application software, found in the EuroEXA private repository (https://github.com/euroexa). In particular, it describes the status of the software for each of the EuroEXA applications, sketches the redesign and optimization strategy for each application, and discusses the issues and difficulties faced during the porting activities and the lessons learned. A few preliminary evaluation results are presented; the full evaluation will be discussed in deliverable 2.8.

    Application of Artificial Intelligence on IoT infrastructures to automate and optimize intensive greenhouse agriculture processes.

    The United Nations Sustainable Development Goals (SDG) agenda establishes a series of objectives aimed at eradicating poverty, protecting the planet, and ensuring prosperity for its citizens. Among these goals are: (6) "Ensure availability and sustainable management of water and sanitation for all", (13) "Take urgent action to combat climate change and its impacts", and (15) "Protect, restore and promote sustainable use of terrestrial ecosystems, sustainably manage forests, combat desertification, and halt and reverse land degradation and halt biodiversity loss". Industrial processes, and intensive agriculture in particular, are among the main threats to meeting the SDGs. However, technological advances in fields such as Artificial Intelligence (AI), High Performance Computing (HPC), and the Internet of Things (IoT) make it possible to increase the productivity of these processes while reducing their environmental and ecosystem impact. The research developed in this doctoral thesis aims to establish a framework that leverages the technological advances in these disciplines, i.e., AI, HPC, and IoT, to optimize and reduce the impact of the industrial processes most aggressive to the environment. Specifically, the thesis is set in the context of intensive greenhouse agriculture, a sector of great strategic, commercial, and even humanitarian value for guaranteeing access to food for all of humanity, and focuses on three key points: (1) the development of low-power AI techniques that can run on platforms with limited computing capabilities, such as IoT devices; (2) the creation of an infrastructure for training, deploying, and predicting with AI techniques that require large computing capacities on small IoT devices, thanks to real-time communication protocols such as MQTT; and (3) increasing the computing capacity and energy efficiency of IoT devices through remote GPU virtualization with rCUDA. The main results show that (1) the intersection of AI, HPC, and IoT is still very much in its infancy: machine-learning workloads keep growing and increasingly diverge from the computational resources available on the devices closest to data capture, i.e., edge-computing devices, which are not computationally capable of carrying out the most demanding tasks (such as training AI models), limiting the success of their application; (2) an auxiliary infrastructure can be built that enables real-time prediction on IoT devices, since the latency introduced by exchanging information between the infrastructure's nodes is small enough to be acceptable; and (3) the computational capabilities and energy efficiency of IoT devices can be extended through remote GPU virtualization, which notably increases the devices' energy efficiency by delegating the most computationally intensive operations to remote compute servers, although the total energy consumption of the infrastructure rises considerably because of the communication between edge and cloud devices. Finally, this thesis was developed within the retos-colaboración project "Desarrollo de infraestructuras IoT de altas prestaciones contra el cambio climático basadas en inteligencia artificial" (GLOBALoT), reference RTC2019-007159-5, funded by the Ministerio de Ciencia e Innovación / Agencia Estatal de Investigación. Given the project's strongly technological character, the knowledge obtained has been transferred by developing a functional prototype at TRL 3-4, deployed in a real greenhouse environment provided by one of the project partners, the company NUTRICONTROL. The results show clear interest in this technology, laying the groundwork for automating and optimizing processes through the Artificial Intelligence of Things (AIoT) to increase production and reduce environmental impact in smart greenhouses.
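    Point (2) above, an MQTT-based infrastructure for real-time prediction on constrained devices, can be sketched in a few lines. This is an illustrative sketch only, not the thesis's actual infrastructure: the broker address and topic names are hypothetical, and the snippet assumes the paho-mqtt 1.x client API.

```python
import json
import paho.mqtt.client as mqtt

BROKER = "broker.example.org"  # hypothetical broker address

def on_message(client, userdata, msg):
    # Predictions computed by a remote AI worker arrive here.
    prediction = json.loads(msg.payload)
    print("actuation command for greenhouse:", prediction)

client = mqtt.Client()  # paho-mqtt 1.x style constructor
client.on_message = on_message
client.connect(BROKER, 1883)
client.subscribe("greenhouse/1/prediction")

# The IoT node publishes raw sensor readings; a remote worker with the
# compute capacity to run the heavy model publishes predictions back.
reading = {"temp_c": 24.3, "humidity": 0.61, "co2_ppm": 410}
client.publish("greenhouse/1/sensors", json.dumps(reading))
client.loop_forever()
```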

    Simulation methodologies for mobile GPUs

    GPUs critically rely on a complex system software stack comprising kernel- and user-space drivers and JIT compilers. Yet existing GPU simulators typically abstract away details of the software stack and the GPU instruction set. Partly this is because GPU vendors rarely release sufficient information about their latest GPU products, but it is also due to the lack of an integrated CPU-GPU simulation framework complete and powerful enough to drive the complex GPU software environment. This has led to a situation where research on GPU architectures and compilers is largely based on outdated or greatly simplified architectures and software stacks, undermining the validity of the generated results. Making the situation even more dire, existing GPU simulation efforts are concentrated around desktop GPUs, so infrastructure for modelling mobile GPUs is virtually non-existent despite their surging importance in the GPU market. Still, mobile GPU designers face the challenge of evaluating design alternatives involving hundreds of architectural configuration options and micro-architectural improvements under tight time-to-market constraints, for which currently employed design flows involving detailed but slow simulations are not well suited. In this thesis we develop a full-system simulation environment for a mobile platform that enables users to run a complete and unmodified software stack for a state-of-the-art device powered by a mobile Arm CPU and a Mali Bifrost GPU, achieving 100% architectural accuracy across all available toolchains. We demonstrate the capability of our GPU simulation framework through a number of case studies exploring modern mobile GPU applications, and optimize them using functional simulation statistics unavailable with other approaches or hardware. Furthermore, we develop a trace-based performance model, allowing architects to rapidly model GPU configurations in early design space exploration.
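    A trace-based performance model, in its simplest form, replays instruction counts gathered from functional simulation against per-unit throughputs. The sketch below is a deliberately minimal illustration of the idea, not the model developed in the thesis; the instruction classes and throughput figures are made up.

```python
# Hypothetical per-class throughputs (instructions per cycle).
THROUGHPUT = {"fma": 2.0, "load": 1.0, "store": 1.0, "sfu": 0.25}

def estimate_cycles(trace_counts, throughput=THROUGHPUT):
    """Bound the kernel's runtime by its most contended unit:
    each instruction class is assumed to issue to its own
    functional unit, so the slowest unit dominates."""
    return max(n / throughput[cls] for cls, n in trace_counts.items())

# Instruction counts from a functional-simulation trace (made up).
trace = {"fma": 12_000, "load": 3_500, "store": 900, "sfu": 400}
print(f"estimated cycles: {estimate_cycles(trace):,.0f}")
```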

    Software for Exascale Computing - SPPEXA 2016-2019

    This open access book summarizes the research done and the results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG), presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer’s series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA’s first funding phase, and provides an overview of SPPEXA’s contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.

    Electronic systems for the restoration of the sense of touch in upper limb prosthetics

    In the last few years, research on active prostheses for upper limbs has focused on improving human functionality and control. New methods have been proposed for measuring the user's muscle activity and translating it into control commands for the prosthesis. Developing the feed-forward interface so that the prosthesis better follows the intention of the user is an important step towards improving the quality of life of people with limb amputation. However, prosthesis users can neither feel whether something or someone is touching them through the prosthesis nor perceive the temperature or roughness of objects: sight provides most of their information, and they cannot detect anything they do not see. Therefore, to foster prosthesis embodiment and utility, a prosthetic system must not only respond to the control signals provided by the user but also transmit back to the user information about the current state of the prosthesis. This thesis presents an electronic skin system to close the loop in prostheses towards the restoration of the sense of touch in prosthesis users. The proposed electronic skin system includes advanced distributed sensing (the electronic skin), a system for (i) signal conditioning, (ii) data acquisition, and (iii) data processing, and a stimulation system. The idea is to integrate all these components into a myoelectric prosthesis. Embedding the electronic system and the sensing materials is a critical issue in the development of new prostheses. In particular, processing the data originating from the electronic skin into low- or high-level information is the key issue to be addressed by the embedded electronic system. Recently, machine learning has proved to be a promising approach for processing tactile sensor information, and many studies have shown its effectiveness in classifying input touch modalities. More specifically, this thesis focuses on the stimulation system, which communicates a mechanical interaction from the electronic skin to the prosthesis user, and on the dedicated implementation of algorithms for processing the tactile data originating from the electronic skin. At the system level, the thesis provides the design of the experimental setup, the experimental protocol, and the algorithms to process tactile data. At the architectural level, it proposes a design flow for implementing digital circuits on both FPGAs and integrated circuits, and techniques for the power management of embedded systems running machine learning algorithms.
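    To make the classification task concrete: given frames from a tactile sensor array, a classifier maps each frame to a touch modality. Below is a minimal scikit-learn sketch with synthetic stand-in data; the array size, label set, and classifier choice are assumptions for illustration, and the thesis itself targets embedded FPGA/IC implementations rather than this desktop library.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in: 200 frames from a 4x4 tactile array,
# each labelled with one of 3 touch modalities.
X = rng.normal(size=(200, 16))
y = rng.integers(0, 3, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```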

    Artificial Intelligence Technology

    This open access book aims to give readers a basic outline of today’s research and technology developments in artificial intelligence (AI), help them gain a general understanding of this trend, and familiarize them with the current research hotspots as well as some of the fundamental and common theories and methodologies that are widely accepted in AI research and application. The book is written in comprehensible and plain language, featuring clearly explained theories and concepts and extensive analysis and examples. Some traditional findings are omitted from the narration while still giving a relatively comprehensive introduction to the evolution of artificial intelligence technology. The book provides a detailed elaboration of the basic concepts of AI and machine learning, as well as other relevant topics, including deep learning, deep learning frameworks, the Huawei MindSpore AI development framework, the Huawei Atlas computing platform, the Huawei AI open platform for smart terminals, and the Huawei CLOUD Enterprise Intelligence application platform. As the world’s leading provider of ICT (information and communication technology) infrastructure and smart terminals, Huawei offers products ranging from digital data communication, cyber security, wireless technology, data storage, cloud computing, and smart computing to artificial intelligence.