304 research outputs found

    Task-based Runtime Optimizations Towards High Performance Computing Applications

    Get PDF
    The last decades have witnessed a rapid improvement of computational capabilities in high-performance computing (HPC) platforms thanks to hardware technology scaling. HPC architectures benefit from mainstream advances on the hardware with many-core systems, deep hierarchical memory subsystem, non-uniform memory access, and an ever-increasing gap between computational power and memory bandwidth. This has necessitated continuous adaptations across the software stack to maintain high hardware utilization. In this HPC landscape of potentially million-way parallelism, task-based programming models associated with dynamic runtime systems are becoming more popular, which fosters developers’ productivity at extreme scale by abstracting the underlying hardware complexity. In this context, this dissertation highlights how a software bundle powered by a task-based programming model can address the heterogeneous workloads engendered by HPC applications., i.e., data redistribution, geospatial modeling and 3D unstructured mesh deformation here. Data redistribution aims to reshuffle data to optimize some objective for an algorithm, whose objective can be multi-dimensional, such as improving computational load balance or decreasing communication volume or cost, with the ultimate goal of increasing the efficiency and therefore reducing the time-to-solution for the algorithm. Geostatistical modeling, one of the prime motivating applications for exascale computing, is a technique for predicting desired quantities from geographically distributed data, based on statistical models and optimization of parameters. Meshing the deformable contour of moving 3D bodies is an expensive operation that can cause huge computational challenges in fluid-structure interaction (FSI) applications. Therefore, in this dissertation, Redistribute-PaRSEC, ExaGeoStat-PaRSEC and HiCMA-PaRSEC are proposed to efficiently tackle these HPC applications respectively at extreme scale, and they are evaluated on multiple HPC clusters, including AMD-based, Intel-based, Arm-based CPU systems and IBM-based multi-GPU system. This multidisciplinary work emphasizes the need for runtime systems to go beyond their primary responsibility of task scheduling on massively parallel hardware system for servicing the next-generation scientific applications

    Network-on-Chip

    Get PDF
    Limitations of bus-based interconnections related to scalability, latency, bandwidth, and power consumption for supporting the related huge number of on-chip resources result in a communication bottleneck. These challenges can be efficiently addressed with the implementation of a network-on-chip (NoC) system. This book gives a detailed analysis of various on-chip communication architectures and covers different areas of NoCs such as potentials, architecture, technical challenges, optimization, design explorations, and research directions. In addition, it discusses current and future trends that could make an impactful and meaningful contribution to the research and design of on-chip communications and NoC systems

    Architectures for Multinode Superconducting Quantum Computers

    Full text link
    Many proposals to scale quantum technology rely on modular or distributed designs where individual quantum processors, called nodes, are linked together to form one large multinode quantum computer (MNQC). One scalable method to construct an MNQC is using superconducting quantum systems with optical interconnects. However, a limiting factor of these machines will be internode gates, which may be two to three orders of magnitude noisier and slower than local operations. Surmounting the limitations of internode gates will require a range of techniques, including improvements in entanglement generation, the use of entanglement distillation, and optimized software and compilers, and it remains unclear how improvements to these components interact to affect overall system performance, what performance from each is required, or even how to quantify the performance of each. In this paper, we employ a `co-design' inspired approach to quantify overall MNQC performance in terms of hardware models of internode links, entanglement distillation, and local architecture. In the case of superconducting MNQCs with microwave-to-optical links, we uncover a tradeoff between entanglement generation and distillation that threatens to degrade performance. We show how to navigate this tradeoff, lay out how compilers should optimize between local and internode gates, and discuss when noisy quantum links have an advantage over purely classical links. Using these results, we introduce a roadmap for the realization of early MNQCs which illustrates potential improvements to the hardware and software of MNQCs and outlines criteria for evaluating the landscape, from progress in entanglement generation and quantum memory to dedicated algorithms such as distributed quantum phase estimation. While we focus on superconducting devices with optical interconnects, our approach is general across MNQC implementations.Comment: 23 pages, white pape

    Smart Sensor Technologies for IoT

    Get PDF
    The recent development in wireless networks and devices has led to novel services that will utilize wireless communication on a new level. Much effort and resources have been dedicated to establishing new communication networks that will support machine-to-machine communication and the Internet of Things (IoT). In these systems, various smart and sensory devices are deployed and connected, enabling large amounts of data to be streamed. Smart services represent new trends in mobile services, i.e., a completely new spectrum of context-aware, personalized, and intelligent services and applications. A variety of existing services utilize information about the position of the user or mobile device. The position of mobile devices is often achieved using the Global Navigation Satellite System (GNSS) chips that are integrated into all modern mobile devices (smartphones). However, GNSS is not always a reliable source of position estimates due to multipath propagation and signal blockage. Moreover, integrating GNSS chips into all devices might have a negative impact on the battery life of future IoT applications. Therefore, alternative solutions to position estimation should be investigated and implemented in IoT applications. This Special Issue, “Smart Sensor Technologies for IoT” aims to report on some of the recent research efforts on this increasingly important topic. The twelve accepted papers in this issue cover various aspects of Smart Sensor Technologies for IoT

    Reinforcement Learning based Fault-Tolerant Routing Algorithm for Mesh based NoC and its FPGA Implementation

    Get PDF
    Network-on-Chip (NoC) has emerged as the most promising on-chip interconnection framework in Multi-Processor System-on-Chips (MPSoCs) due to its efficiency and scalability. In the deep submicron level, NoCs are vulnerable to faults, which leads to the failure of network components such as links and routers. Failures in NoC components diminish system efficiency and reliability. This paper proposes a Reinforcement Learning based Fault-Tolerant Routing (RL-FTR) algorithm to tackle the routing issues caused by link and router faults in the mesh-based NoC architecture. The efficiency of the proposed RL-FTR algorithm is examined using System-C based cycle-accurate NoC simulator. Simulations are carried out by increasing the number of links and router faults in various sizes of mesh. Followed by simulations, real-time functioning of the proposed RL-FTR algorithm is observed using the FPGA implementation. Results of the simulation and hardware shows that the proposed RL-FTR algorithm provides an optimal routing path from the source router to the destination router.publishedVersio

    Trellis Decoding And Applications For Quantum Error Correction

    Get PDF
    Compact, graphical representations of error-correcting codes called trellises are a crucial tool in classical coding theory, establishing both theoretical properties and performance metrics for practical use. The idea was extended to quantum error-correcting codes by Ollivier and Tillich in 2005. Here, we use their foundation to establish a practical decoder able to compute the maximum-likely error for any stabilizer code over a finite field of prime dimension. We define a canonical form for the stabilizer group and use it to classify the internal structure of the graph. Similarities and differences between the classical and quantum theories are discussed throughout. Numerical results are presented which match or outperform current state-of-the-art decoding techniques. New construction techniques for large trellises are developed and practical implementations discussed. We then define a dual trellis and use algebraic graph theory to solve the maximum-likely coset problem for any stabilizer code over a finite field of prime dimension at minimum added cost. Classical trellis theory makes occasional theoretical use of a graph product called the trellis product. We establish the relationship between the trellis product and the standard graph products and use it to provide a closed form expression for the resulting graph, allowing it to be used in practice. We explore its properties and classify all idempotents. The special structure of the trellis allows us to present a factorization procedure for the product, which is much simpler than that of the standard products. Finally, we turn to an algorithmic study of the trellis and explore what coding-theoretic information can be extracted assuming no other information about the code is available. In the process, we present a state-of-the-art algorithm for computing the minimum distance for any stabilizer code over a finite field of prime dimension. We also define a new weight enumerator for stabilizer codes over F_2 incorporating the phases of each stabilizer and provide a trellis-based algorithm to compute it.Ph.D

    Parallel and Distributed Computing

    Get PDF
    The 14 chapters presented in this book cover a wide variety of representative works ranging from hardware design to application development. Particularly, the topics that are addressed are programmable and reconfigurable devices and systems, dependability of GPUs (General Purpose Units), network topologies, cache coherence protocols, resource allocation, scheduling algorithms, peertopeer networks, largescale network simulation, and parallel routines and algorithms. In this way, the articles included in this book constitute an excellent reference for engineers and researchers who have particular interests in each of these topics in parallel and distributed computing

    Ant Colony Optimization

    Get PDF
    Ant Colony Optimization (ACO) is the best example of how studies aimed at understanding and modeling the behavior of ants and other social insects can provide inspiration for the development of computational algorithms for the solution of difficult mathematical problems. Introduced by Marco Dorigo in his PhD thesis (1992) and initially applied to the travelling salesman problem, the ACO field has experienced a tremendous growth, standing today as an important nature-inspired stochastic metaheuristic for hard optimization problems. This book presents state-of-the-art ACO methods and is divided into two parts: (I) Techniques, which includes parallel implementations, and (II) Applications, where recent contributions of ACO to diverse fields, such as traffic congestion and control, structural optimization, manufacturing, and genomics are presented

    Advances in Grid Computing

    Get PDF
    This book approaches the grid computing with a perspective on the latest achievements in the field, providing an insight into the current research trends and advances, and presenting a large range of innovative research papers. The topics covered in this book include resource and data management, grid architectures and development, and grid-enabled applications. New ideas employing heuristic methods from swarm intelligence or genetic algorithm and quantum encryption are considered in order to explain two main aspects of grid computing: resource management and data management. The book addresses also some aspects of grid computing that regard architecture and development, and includes a diverse range of applications for grid computing, including possible human grid computing system, simulation of the fusion reaction, ubiquitous healthcare service provisioning and complex water systems

    GVSU Undergraduate and Graduate Catalog, 2021-2022

    Get PDF
    Grand Valley State University 2021-2022 undergraduate and graduate course catalog. Course catalogs are published annually to provide students with information and guidance for enrollment.https://scholarworks.gvsu.edu/course_catalogs/1096/thumbnail.jp
    • …
    corecore