296 research outputs found
2022 Review of Data-Driven Plasma Science
Data-driven science and technology offer transformative tools and methods to science. This review article highlights the latest development and progress in the interdisciplinary field of data-driven plasma science (DDPS), i.e., plasma science whose progress is driven strongly by data and data analyses. Plasma is considered to be the most ubiquitous form of observable matter in the universe. Data associated with plasmas can, therefore, cover extremely large spatial and temporal scales, and often provide essential information for other scientific disciplines. Thanks to the latest technological developments, plasma experiments, observations, and computation now produce a large amount of data that can no longer be analyzed or interpreted manually. This trend now necessitates a highly sophisticated use of high-performance computers for data analyses, making artificial intelligence and machine learning vital components of DDPS. This article contains seven primary sections, in addition to the introduction and summary. Following an overview of fundamental data-driven science, five other sections cover widely studied topics of plasma science and technologies, i.e., basic plasma physics and laboratory experiments, magnetic confinement fusion, inertial confinement fusion and high-energy-density physics, space and astronomical plasmas, and plasma technologies for industrial and other applications. The final section before the summary discusses plasma-related databases that could significantly contribute to DDPS. Each primary section starts with a brief introduction to the topic, discusses the state-of-the-art developments in the use of data and/or data-scientific approaches, and presents the summary and outlook. Despite the recent impressive signs of progress, the DDPS is still in its infancy. This article attempts to offer a broad perspective on the development of this field and identify where further innovations are required
Design and management of image processing pipelines within CPS : Acquired experience towards the end of the FitOptiVis ECSEL Project
Cyber-Physical Systems (CPSs) are dynamic and reactive systems interacting with processes, environment and, sometimes, humans. They are often distributed with sensors and actuators, characterized for being smart, adaptive, predictive and react in real-time. Indeed, image- and video-processing pipelines are a prime source for environmental information for systems allowing them to take better decisions according to what they see. Therefore, in FitOptiVis, we are developing novel methods and tools to integrate complex image- and video-processing pipelines. FitOptiVis aims to deliver a reference architecture for describing and optimizing quality and resource management for imaging and video pipelines in CPSs both at design- and run-time. The architecture is concretized in low-power, high-performance, smart components, and in methods and tools for combined design-time and run-time multi-objective optimization and adaptation within system and environment constraints.Peer reviewe
Adaptable register file organization for vector processors
Today there are two main vector processors design trends. On the one hand, we have vector processors designed for long vectors lengths such as the SX-Aurora TSUBASA which implements vector lengths of 256 elements (16384-bits). On the other hand, we have vector processors designed for short vectors such as the Fujitsu A64FX that implements vector lengths of 8 elements (512-bit) ARM SVE. However, short vector designs are the most widely adopted in modern chips. This is because, to achieve high-performance with a very high-efficiency, applications executed on long vector designs must feature abundant DLP, then limiting the range of applications. On the contrary, short vector designs are compatible with a larger range of applications. In fact, in the beginnings, long vector length implementations were focused on the HPC market, while short vector length implementations were conceived to improve performance in multimedia tasks. However, those short vector length extensions have evolved to better fit the needs of modern applications. In that sense, we believe that this compatibility with a large range of applications featuring high, medium and low DLP is one of the main reasons behind the trend of building parallel machines with short vectors. Short vector designs are area efficient and are "compatible" with applications having long vectors; however, these short vector architectures are not as efficient as longer vector designs when executing high DLP code.
In this thesis, we propose a novel vector architecture that combines the area and resource efficiency characterizing short vector processors with the ability to handle large DLP applications, as allowed in long vector architectures. In this context, we present AVA, an Adaptable Vector Architecture designed for short vectors (MVL = 16 elements), capable of reconfiguring the MVL when executing applications with abundant DLP, achieving performance comparable to designs for long vectors. The design is based on three complementary concepts. First, a two-stage renaming unit based on a new type of registers termed as Virtual Vector Registers (VVRs), which are an intermediate mapping between the conventional logical and the physical and memory registers. In the first stage, logical registers are renamed to VVRs, while in the second stage, VVRs are renamed to physical registers. Second, a two-level VRF, that supports 64 VVRs whose MVL can be configured from 16 to 128 elements. The first level corresponds to the VVRs mapped in the physical registers held in the 8KB Physical Vector Register File (P-VRF), while the second level represents the VVRs mapped in memory registers held in the Memory Vector Register File (M-VRF). While the baseline configuration (MVL=16 elements) holds all the VVRs in the P-VRF, larger MVL configurations hold a subset of the total VVRs in the P-VRF, and map the remaining part in the M-VRF. Third, we propose a novel two-stage vector issue unit. In the first stage, the second level of mapping between the VVRs and physical registers is performed, while issuing to execute is managed in the second stage.
This thesis also presents a set of tools for designing and evaluating vector architectures. First, a parameterizable vector architecture model implemented on the gem5 simulator to evaluate novel ideas on vector architectures. Second, a Vector Architecture model implemented on the McPAT framework to evaluate power and area metrics. Finally, the RiVEC benchmark suite, a collection of ten vectorized applications from different domains focusing on benchmarking vector microarchitectures.Hoy en día existen dos tendencias principales en el diseño de procesadores vectoriales. Por un lado, tenemos procesadores vectoriales basados en vectores largos como el SX-Aurora TSUBASA que implementa vectores con 256 elementos (16384-bits) de longitud. Por otro lado, tenemos procesadores vectoriales basados en vectores cortos como el Fujitsu A64FX que implementa vectores de 8 elementos (512-bits) de longitud ARM SVE. Sin embargo, los diseños de vectores cortos son los más adoptados en los chips modernos. Esto es porque, para lograr alto rendimiento con muy alta eficiencia, las aplicaciones ejecutadas en diseños de vectores largos deben presentar abundante paralelismo a nivel de datos (DLP), lo que limita el rango de aplicaciones. Por el contrario, los diseños de vectores cortos son compatibles con un rango más amplio de aplicaciones. En sus orígenes, implementaciones basadas en vectores largos estaban enfocadas al HPC, mientras que las implementaciones basadas en vectores cortos estaban enfocadas en tareas de multimedia. Sin embargo, esas extensiones basadas en vectores cortos han evolucionado para adaptarse mejor a las necesidades de las aplicaciones modernas. Creemos que esta compatibilidad con un mayor rango de aplicaciones es una de las principales razones de construir máquinas paralelas basadas en vectores cortos. Los diseños de vectores cortos son eficientes en área y son compatibles con aplicaciones que soportan vectores largos; sin embargo, estos diseños de vectores cortos no son tan eficientes como los diseños de vectores largos cuando se ejecuta un código con alto DLP. En esta tesis, proponemos una novedosa arquitectura vectorial que combina la eficiencia de área y recursos que caracteriza a los procesadores vectoriales basados en vectores cortos, con la capacidad de mejorar en rendimiento cuando se presentan aplicaciones con alto DLP, como lo permiten las arquitecturas vectoriales basadas en vectores largos. En este contexto, presentamos AVA, una Arquitectura Vectorial Adaptable basada en vectores cortos (MVL = 16 elementos), capaz de reconfigurar el MVL al ejecutar aplicaciones con abundante DLP, logrando un rendimiento comparable a diseños basados en vectores largos. El diseño se basa en tres conceptos. Primero, una unidad de renombrado de dos etapas basada en un nuevo tipo de registros denominados registros vectoriales virtuales (VVR), que son un mapeo intermedio entre los registros lógicos y físicos y de memoria. En la primera etapa, los registros lógicos se renombran a VVR, mientras que, en la segunda etapa, los VVR se renombran a registros físicos. En segundo lugar, un VRF de dos niveles, que admite 64 VVR cuyo MVL se puede configurar de 16 a 128 elementos. El primer nivel corresponde a los VVR mapeados en los registros físicos contenidos en el banco de registros vectoriales físico (P-VRF) de 8 KB, mientras que el segundo nivel representa los VVR mapeados en los registros de memoria contenidos en el banco de registros vectoriales de memoria (M-VRF). Mientras que la configuración de referencia (MVL=16 elementos) contiene todos los VVR en el P-VRF, las configuraciones de MVL más largos contienen un subconjunto del total de VVR en el P-VRF y mapean la parte restante en el M-VRF. En tercer lugar, proponemos una novedosa unidad de colas de emisión de dos etapas. En la primera etapa se realiza el segundo nivel de mapeo entre los VVR y los registros físicos, mientras que en la segunda etapa se gestiona la emisión de instrucciones a ejecutar. Esta tesis también presenta un conjunto de herramientas para diseñar y evaluar arquitecturas vectoriales. Primero, un modelo de arquitectura vectorial parametrizable implementado en el simulador gem5 para evaluar novedosas ideas. Segundo, un modelo de arquitectura vectorial implementado en McPAT para evaluar las métricas de potencia y área. Finalmente, presentamos RiVEC, una colección de diez aplicaciones vectorizadas enfocadas en evaluar arquitecturas vectorialesPostprint (published version
Smart Sensor Technologies for IoT
The recent development in wireless networks and devices has led to novel services that will utilize wireless communication on a new level. Much effort and resources have been dedicated to establishing new communication networks that will support machine-to-machine communication and the Internet of Things (IoT). In these systems, various smart and sensory devices are deployed and connected, enabling large amounts of data to be streamed. Smart services represent new trends in mobile services, i.e., a completely new spectrum of context-aware, personalized, and intelligent services and applications. A variety of existing services utilize information about the position of the user or mobile device. The position of mobile devices is often achieved using the Global Navigation Satellite System (GNSS) chips that are integrated into all modern mobile devices (smartphones). However, GNSS is not always a reliable source of position estimates due to multipath propagation and signal blockage. Moreover, integrating GNSS chips into all devices might have a negative impact on the battery life of future IoT applications. Therefore, alternative solutions to position estimation should be investigated and implemented in IoT applications. This Special Issue, “Smart Sensor Technologies for IoT” aims to report on some of the recent research efforts on this increasingly important topic. The twelve accepted papers in this issue cover various aspects of Smart Sensor Technologies for IoT
Shader optimization and specialization
In the field of real-time graphics for computer games, performance has a significant effect on the player’s enjoyment and immersion. Graphics processing units (GPUs) are
hardware accelerators that run small parallelized shader programs to speed up computationally expensive rendering calculations. This thesis examines optimizing shader
programs and explores ways in which data patterns on both the CPU and GPU can be
analyzed to automatically speed up rendering in games.
Initially, the effect of traditional compiler optimizations on shader source-code
was explored. Techniques such as loop unrolling or arithmetic reassociation provided
speed-ups on several devices, but different GPU hardware responded differently to
each set of optimizations. Analyzing execution traces from numerous popular PC
games revealed that much of the data passed from CPU-based API calls to GPU-based
shaders is either unused, or remains constant. A system was developed to capture this
constant data and fold it into the shaders’ source-code. Re-running the game’s rendering code using these specialized shader variants resulted in performance improvements
in several commercial games without impacting their visual quality
Dynamic Resource Allocation in Embedded, High-Performance and Cloud Computing
The availability of many-core computing platforms enables a wide variety of technical solutions for systems across the embedded, high-performance and cloud computing domains. However, large scale manycore systems are notoriously hard to optimise. Choices regarding resource allocation alone can account for wide variability in timeliness and energy dissipation (up to several orders of magnitude). Dynamic Resource Allocation in Embedded, High-Performance and Cloud Computing covers dynamic resource allocation heuristics for manycore systems, aiming to provide appropriate guarantees on performance and energy efficiency. It addresses different types of systems, aiming to harmonise the approaches to dynamic allocation across the complete spectrum between systems with little flexibility and strict real-time guarantees all the way to highly dynamic systems with soft performance requirements. Technical topics presented in the book include: • Load and Resource Models• Admission Control• Feedback-based Allocation and Optimisation• Search-based Allocation Heuristics• Distributed Allocation based on Swarm Intelligence• Value-Based AllocationEach of the topics is illustrated with examples based on realistic computational platforms such as Network-on-Chip manycore processors, grids and private cloud environments
Recent Advances in Wireless Communications and Networks
This book focuses on the current hottest issues from the lowest layers to the upper layers of wireless communication networks and provides "real-time" research progress on these issues. The authors have made every effort to systematically organize the information on these topics to make it easily accessible to readers of any level. This book also maintains the balance between current research results and their theoretical support. In this book, a variety of novel techniques in wireless communications and networks are investigated. The authors attempt to present these topics in detail. Insightful and reader-friendly descriptions are presented to nourish readers of any level, from practicing and knowledgeable communication engineers to beginning or professional researchers. All interested readers can easily find noteworthy materials in much greater detail than in previous publications and in the references cited in these chapters
Telecommunications Networks
This book guides readers through the basics of rapidly emerging networks to more advanced concepts and future expectations of Telecommunications Networks. It identifies and examines the most pressing research issues in Telecommunications and it contains chapters written by leading researchers, academics and industry professionals. Telecommunications Networks - Current Status and Future Trends covers surveys of recent publications that investigate key areas of interest such as: IMS, eTOM, 3G/4G, optimization problems, modeling, simulation, quality of service, etc. This book, that is suitable for both PhD and master students, is organized into six sections: New Generation Networks, Quality of Services, Sensor Networks, Telecommunications, Traffic Engineering and Routing
- …