10 research outputs found

    SimFS: A Simulation Data Virtualizing File System Interface

    Full text link
    Nowadays simulations can produce petabytes of data to be stored in parallel filesystems or large-scale databases. This data is accessed over the course of decades, often by thousands of analysts and scientists. However, storing these volumes of data for long periods of time is not cost effective and, in some cases, practically impossible. We propose to transparently virtualize the simulation data, relaxing the storage requirements by not storing the full output and re-simulating the missing data on demand. We develop SimFS, a file system interface that exposes a virtualized view of the simulation output to the analysis applications and manages the re-simulations. SimFS monitors the access patterns of the analysis applications in order to (1) decide which data to keep stored for faster access and (2) employ prefetching strategies that reduce the access time of missing data. Virtualizing simulation data allows us to trade storage for computation: depending on the storage resources assigned to SimFS, this paradigm approaches traditional on-disk analysis (all data is stored) or in situ analysis (no data is stored). Overall, by exploiting the growing computing power and relaxing the storage capacity requirements, SimFS offers a viable path towards exascale simulations.
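    The core mechanism described above — serve a read from storage if the data is kept, otherwise re-simulate it on demand under a fixed storage budget — can be sketched as follows. This is a hypothetical illustration, not the actual SimFS API: the names (`read_step`, `resimulate`, `CAPACITY`) and the toy admission policy are invented for clarity, whereas real SimFS decisions are driven by monitored access patterns and prefetching.

```c
#define CAPACITY 4        /* steps the storage budget lets us keep */
#define NSTEPS   16       /* total simulation timesteps            */

static double stored[NSTEPS];   /* stand-in for the parallel filesystem */
static int    present[NSTEPS];  /* 1 if the step is currently on disk   */
static int    resim_count = 0;  /* how many times we re-simulated       */

/* Stand-in for restarting the simulator from a checkpoint. */
static double resimulate(int t) {
    resim_count++;
    return 2.0 * t;             /* toy "simulation" output */
}

/* Count how many steps are currently stored. */
static int stored_steps(void) {
    int n = 0;
    for (int i = 0; i < NSTEPS; i++)
        n += present[i];
    return n;
}

/* Virtualized read: serve from storage when present, otherwise
 * re-simulate on demand; keep the result only while the storage
 * budget allows it. */
double read_step(int t) {
    if (present[t])
        return stored[t];
    double v = resimulate(t);
    if (stored_steps() < CAPACITY) {
        stored[t] = v;
        present[t] = 1;
    }
    return v;
}
```

    A real deployment would restart the simulator from the nearest checkpoint rather than compute a closed-form value, and would evict as well as admit stored steps based on the observed access pattern.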

    The Roadmap of Computer Engineering at the End of Moore's Law and Dennard Scaling

    Get PDF
    This paper reviews the state of computer engineering at the beginning of the 2020s in order to outline some of the changes that should be made in higher education in this discipline. Given the slowing pace of Moore's law and the end of Dennard scaling, it highlights the great relevance of controlling energy consumption and of classification and optimization applications that require huge amounts of data (big data) and response times that are difficult to achieve with traditional computer engineering techniques. The article provides recent bibliographical references on the state of computer engineering and identifies the new requirements of the interfaces present in the layered hierarchy of computer systems, mainly those related to security, energy consumption, and the exploitation of heterogeneous parallelism. It also reflects on the theoretical limits that can be established for computation and on the expectations offered by quantum computing.
Universidad de Granada: Departamento de Arquitectura y Tecnología de Computadores

    Thermal Management of Electronics and Optoelectronics: From Heat Source Characterization to Heat Mitigation at the Device and Package Levels

    Full text link
    Thermal management of electronic and optoelectronic devices has become increasingly challenging. For electronic devices, the challenge arises primarily from the drive for miniaturized, high-performance devices, leading to escalating power density. For optoelectronics, the recent widespread use of organic light emitting diode (OLED) displays in mobile platforms and flexible electronics presents new challenges for heat dissipation. Furthermore, the performance and reliability of increasingly high-power semiconductor lasers used for telecommunications and other applications hinge on proper thermal management. For example, small, concentrated hotspots may trigger thermal runaway and premature device destruction. Emerging challenges in thermal management of devices require innovative methods to characterize and mitigate heat generation and temperature rise at the device level as well as the package level. The first part of this dissertation discusses device-level thermal management. A thermal imaging microscope with high spatial resolution (~450 nm) is created for hotspot detection in the context of diode lasers under back-irradiance (BI). Laser facet temperature maps reveal the existence of a critical BI spot location that increases the laser's active region temperature by nearly a factor of 3. An active solid-state cooling strategy that could scale down to the size of hotspots in modern devices is then explored, utilizing energy filtering at carbon nanotube (CNT) junctions as a means to provide thermionic cooling at nanometer spatial scales. The CNT cooler exhibits a large effective Seebeck coefficient of 386 μV/K and a relatively moderate thermal conductivity, together giving rise to a high cooling capacity (2.3 × 10⁶ W/cm²). Thermal management at the package level is then considered. Heat transfer in polymers is first studied, owing to their prevalence in thermal interface materials as well as organic devices (e.g., OLEDs).
Employing molecular design principles developed to engineer the thermal properties of polymers, molecular-scale electrostatic repulsive forces are utilized to modify chain morphologies in amorphous polymers, leading to spin-cast films that are free of ceramic or metallic fillers yet have thermal conductivities as high as 1.17 W m⁻¹ K⁻¹, approximately 6 times that of typical amorphous polymers. Electronics packaging designs incorporating phase change materials (PCMs) are then considered as a means to mitigate bursty heat sources; PCM incorporation in a packaged accelerator chip intended for large-scale object identification is found to suppress the peak die temperature by 17%.
Ph.D., Mechanical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/150013/1/chenlium_1.pd
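    The PCM result lends itself to a back-of-the-envelope estimate: a melting PCM can absorb a power burst for roughly t = m·L / (P_burst − P_steady) before it is fully molten. The sketch below uses this relation with illustrative numbers (a paraffin-like latent heat of 2 × 10⁵ J/kg); none of the values are taken from the dissertation.

```c
#include <math.h>

/* Rough time a phase change material (PCM) can buffer a power burst
 * before it is fully melted: t = m * L / (P_burst - P_steady).
 * mass_kg            - PCM mass in kilograms
 * latent_J_per_kg    - latent heat of fusion in J/kg
 * burst_W, steady_W  - burst power and steady heat-removal capacity */
double pcm_buffer_time(double mass_kg, double latent_J_per_kg,
                       double burst_W, double steady_W) {
    double excess_W = burst_W - steady_W; /* power the heat path cannot remove */
    if (excess_W <= 0.0)
        return INFINITY;                  /* sink keeps up; no melting needed */
    return mass_kg * latent_J_per_kg / excess_W;
}
```

    For example, 2 g of such a PCM buffers a 30 W burst over a 10 W steady cooling path for about 20 s, which is the kind of margin that lets a package ride out bursty workloads without a peak-temperature excursion.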

    SIMD@OpenMP: a programming model approach to leverage SIMD features

    Get PDF
    SIMD instruction sets are a key feature in current general purpose and high performance architectures. SIMD instructions apply in parallel the same operation to a group of data, commonly known as vector. A single SIMD/vector instruction can, thus, replace a sequence of scalar instructions. Consequently, the number of instructions can be greatly reduced leading to improved execution times. However, SIMD instructions are not widely exploited by the vast majority of programmers. In many cases, taking advantage of these instructions relies on the compiler. Nevertheless, compilers struggle with the automatic vectorization of codes. Advanced programmers are then compelled to exploit SIMD units by hand, using low-level hardware-specific intrinsics. This approach is cumbersome, error prone and not portable across SIMD architectures. This thesis targets OpenMP to tackle the underuse of SIMD instructions from three main areas of the programming model: language constructions, compiler code optimizations and runtime algorithms. We choose the Intel Xeon Phi coprocessor (Knights Corner) and its 512-bit SIMD instruction set for our evaluation process. We make four contributions aimed at improving the exploitation of SIMD instructions in this scope. Our first contribution describes a compiler vectorization infrastructure suitable for OpenMP. This infrastructure targets for-loops and whole functions. We define a set of attributes for expressions that determine how the code is vectorized. Our vectorization infrastructure also implements support for several advanced vector features. This infrastructure is proven to be effective in the vectorization of complex codes and it is the basis upon which we build the following two contributions. The second contribution introduces a proposal to extend OpenMP 3.1 with SIMD parallelism. Essential parts of this work have become key features of the SIMD proposal included in OpenMP 4.0. 
We define the "simd" and "simd for" directives that allow programmers to describe SIMD parallelism of loops and whole functions. Furthermore, we propose a set of optional clauses that lead the compiler to generate more efficient vector code. These SIMD extensions improve programming efficiency when exploiting SIMD resources. Our evaluation on the Intel Xeon Phi coprocessor shows that our SIMD proposal allows the compiler to efficiently vectorize codes that the Intel C/C++ compiler's automatic vectorizer handles poorly or not at all. In the third contribution, we propose a vector code optimization that enhances overlapped vector loads. These vector loads redundantly read from memory scalar elements already loaded by other vector loads. Our optimization improves the memory usage of these accesses by building a vector register cache and exploiting register-to-register instructions. Our proposal also includes a new clause (overlap) in the context of the SIMD extensions for OpenMP of our first contribution; this clause allows enabling, disabling and tuning the optimization on demand. The last contribution tackles the exploitation of SIMD instructions in the OpenMP barrier and reduction primitives. We propose a new combined barrier-and-reduction tree scheme specifically designed to make the most of SIMD instructions. Our barrier algorithm takes advantage of simultaneous multithreading (SMT) and utilizes SIMD memory instructions in the synchronization process. The four contributions of this thesis are an important step towards a more common and generalized use of SIMD instructions. Our work is having an outstanding impact on the whole OpenMP community, from users of the programming model to compiler and runtime implementations.
Our proposals in the context of OpenMP improve the programmability of the model, reduce the overhead of runtime services, and shorten the execution time of applications through a better use of SIMD instructions.
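    The loop-level part of this proposal survives in standard OpenMP today: since OpenMP 4.0, a "simd" construct (with clauses such as reduction) lets the programmer assert that a loop is safe to vectorize. A minimal example, using the standard directive rather than the thesis's exact syntax:

```c
/* OpenMP 4.0 "simd" construct: the pragma asserts the loop has no
 * unsafe dependences and names the reduction variable, so the
 * compiler may emit vector instructions for it. Without OpenMP
 * support the pragma is ignored and the loop runs scalar, with the
 * same result. */
double dot(const double *a, const double *b, int n) {
    double sum = 0.0;
    #pragma omp simd reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
```

    Compiled with -fopenmp (or GCC/Clang's -fopenmp-simd, which honors the simd pragmas without spawning threads), the directive licenses vectorization of the reduction loop.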

    21st Century Computer Architecture

    No full text
    Because most technology and computer architecture innovations were (intentionally) invisible to higher layers, application and other software developers could reap the benefits of this progress without engaging in it. Higher performance has both made more computationally demanding applications feasible (e.g., virtual assistants, computer vision) and made less demanding applications easier to develop by enabling higher-level programming abstractions (e.g., scripting languages and reusable components). Improvements in computer system cost-effectiveness enabled value creation that could never have been imagined by the field's founders (e.g., distributed web search sufficiently inexpensive so as to be covered by advertising links). The wide benefits of computer performance growth are clear. Recently, Danowitz et al. apportioned computer performance growth roughly equally between technology and architecture, with architecture credited with ~80x improvement since 1985. As semiconductor technology approaches its "end-of-the-road" (see below), computer architecture will need to play an increasing role in enabling future ICT innovation. But instead of asking "How can I make my chip run faster?", architects must now ask "How can I enable the 21st century infrastructure, from sensors to clouds, adding value from performance to privacy, but without the benefit of near-perfect technology scaling?" The challenges are many, but with appropriate investment, opportunities abound. Underlying these opportunities is a common theme: future architecture innovations will require the engagement of and investments from innovators in other ICT layers.

    21st Century Computer Architecture

    No full text

    21st century computer architecture

    No full text

    21st century computer architecture keynote at 2014 international conference on supercomputing (ICS)

    No full text