30 research outputs found

    Toward Performance-Portable PETSc for GPU-based Exascale Systems

    The Portable Extensible Toolkit for Scientific computation (PETSc) library delivers scalable solvers for nonlinear time-dependent differential and algebraic equations and for numerical optimization. The PETSc design for performance portability addresses fundamental GPU accelerator challenges and stresses flexibility and extensibility by separating the programming model used by the application from that used by the library, and it enables application developers to use their preferred programming model, such as Kokkos, RAJA, SYCL, HIP, CUDA, or OpenCL, on upcoming exascale systems. A blueprint for using GPUs from PETSc-based codes is provided, and case studies emphasize the flexibility and high performance achieved on current GPU-based systems. Comment: 15 pages, 10 figures, 2 tables

    Gali’s Prize: A Treasure Hunt Game for the Textile Museum of Canada

    Gali’s Prize is an experimental treasure-hunting game that integrates tangible and multi-screen interactions. The game has been designed for the Textile Museum of Canada (TMC) to replace the old quiz-style, pen-and-paper scavenger hunt. Its goal is to provide an entertaining, educational experience for children on school trips. The learning journey begins with an initial engagement at the starting spot and continues by approaching and connecting with a couple of specific artifacts in the exhibition space. The whole experience blends self-directed curation with an augmented reality (AR) treasure-hunting experience. During their participation, children will learn the stories behind the artifacts they encounter and gain lasting memories of their visit. The investigation stands at the intersection of museum business, children’s learning experience, and digital technology, and explores the opportunities and challenges involved in using mixed technologies in museums and galleries in the near future. At the same time, it studies the engagements and interactions of visitors on site. These explorations can potentially create benefits for both museums and visitors. The prototype of Gali’s Prize was inspired by theoretical conclusions in existing literature, personal experiences in museums and galleries, and some studies of particular cases. It helps a specialized museum, the TMC, to experiment with a new solution that may solve their current issues. This paper explains the relevant critical thinking, documents the development process of Gali’s Prize, and provides discussion and reflection about the work.

    Eliminating stack overflow by abstract interpretation

    An important correctness criterion for software running on embedded microcontrollers is stack safety: a guarantee that the call stack does not overflow. Our first contribution is a method for statically guaranteeing stack safety of interrupt-driven embedded software using an approach based on context-sensitive dataflow analysis of object code. We have implemented a prototype stack analysis tool that targets software for Atmel AVR microcontrollers and tested it on embedded applications compiled from up to 30,000 lines of C. We experimentally validate the accuracy of the tool, which runs in under 10 seconds on the largest programs that we tested. The second contribution of this paper is the development of two novel ways to reduce stack memory requirements of embedded software.

    C4: Verified Transactional Objects

    A framework for Verified Transactional Objects in Coq.
    - Formalization of concurrent objects, linearizability, strict serializability, and associated proof techniques
    - Verified linearizable concurrent hash map
    - Verified strictly serializable TML
    - Verified strictly serializable transaction-predicated ma

    Mixed-size concurrency: ARM, POWER, C/C++11, and SC

    Previous work on the semantics of relaxed shared-memory concurrency has only considered the case in which each load reads the data of exactly one store. In practice, however, multiprocessors support mixed-size accesses, and these are used by systems software and (to some degree) exposed at the C/C++ language level. A semantic foundation for software, therefore, has to address them. We investigate the mixed-size behaviour of ARMv8 and IBM POWER architectures and implementations: by experiment, by developing semantic models, by testing the correspondence between these, and by discussion with ARM and IBM staff. This turns out to be surprisingly subtle, and on the way we have to revisit the fundamental concepts of coherence and sequential consistency, which change in this setting. In particular, we show that adding a memory barrier between each instruction does not restore sequential consistency. We go on to extend the C/C++11 model to support nonatomic mixed-size memory accesses, and prove the standard compilation scheme from C11 atomics to POWER remains sound. This is a necessary step towards semantics for real-world shared-memory concurrent code, beyond litmus tests

    Generation of reconfigurable circuits from machine code

    Integrated master's thesis. Electrical and Computer Engineering. Telecommunications. Universidade do Porto. Faculdade de Engenharia. 201