180 research outputs found

    Code Generation and Global Optimization Techniques for a Reconfigurable PRAM-NUMA Multicore Architecture

    Full text link

    Algorithm Libraries for Multi-Core Processors

    Get PDF
    By providing parallelized versions of established algorithm libraries, we ease the exploitation of the multiple cores on modern processors for the programmer. The Multi-Core STL provides basic algorithms for internal memory, while the parallelized STXXL enables multi-core acceleration for algorithms on large data sets stored on disk. Some parallelized geometric algorithms are introduced into CGAL. Further, we design and implement sorting algorithms for huge data in distributed external memory

    Parallel and Distributed Computing

    Get PDF
    The 14 chapters presented in this book cover a wide variety of representative works ranging from hardware design to application development. Particularly, the topics that are addressed are programmable and reconfigurable devices and systems, dependability of GPUs (General Purpose Units), network topologies, cache coherence protocols, resource allocation, scheduling algorithms, peertopeer networks, largescale network simulation, and parallel routines and algorithms. In this way, the articles included in this book constitute an excellent reference for engineers and researchers who have particular interests in each of these topics in parallel and distributed computing

    Memory Subsystem Design for Explicit Multithreading Architectures

    Get PDF
    Explicit multithreading (XMT) is a parallel programming approach for exploiting on-chip parallelism. An important enabler for XMT is sufficient memory bandwidth to support parallelism. For targeted deep-submicron VLSI processes, chip designers will be faced with the widely acknowledged issues of rising interconnect RC delays and shortening clock periods. Comprehensive memory design for an XMT architecture has never before been rigorously studied. This thesis relies on an examination the implications of the XMT programming model on memory subsystem design to motivate a potential framework for on-chip memory interconnection. Many system-level issues are considered, and analytical electrical interconnect modeling is used to demonstrate the physical viability of new structures in future processes. It is estimated that a chip, built in a 2008 process with 1024 hardware execution contexts, may be capable of a sustained on-chip memory transaction throughput of 1430 GB/s

    Congruent Weak Conformance

    Get PDF
    This research addresses the problem of verifying implementations against specifications through an innovative logic approach. Congruent weak conformance, a formal relationship between agents and specifications, has been developed and proven to be a congruent partial order. This property arises from a set of relations called weak conformations. The largest, called weak conformance, is analogous to Milner\u27s observational equivalence. Weak conformance is not an equivalence, however, but rather an ordering relation among processes. Weak conformance allows behaviors in the implementation that are unreachable in the specification. Furthermore, it exploits output concurrencies and allows interleaving of extraneous output actions in the implementation. Finally, reasonable restrictions in CCS syntax strengthen weak conformance to a congruence, called congruent weak conformance. At present, congruent weak conformance is the best known formal relation for verifying implementations against specifications. This precongruence derives maximal flexibility and embodies all weaknesses in input, output, and no-connect signals while retaining a fully replaceable conformance to the specification. Congruent weak conformance has additional utility in verifying transformations between systems of incompatible semantics. This dissertation describes a hypothetical translator from the informal simulation semantics of VHDL to the bisimulation semantics of CCS. A second translator is described from VHDL to a broadcast-communication version of CCS. By showing that they preserve congruent weak conformance, both translators are verified

    Approaches to parallel performance prediction

    Get PDF
    • …
    corecore