34 research outputs found

    A New System Architecture for Heterogeneous Compute Units

    The ongoing trend toward more heterogeneous systems forces us to rethink the design of systems. In this work, I study a new system design that considers heterogeneous compute units (general-purpose cores with different instruction sets, DSPs, FPGAs, fixed-function accelerators, etc.) from the beginning instead of as an afterthought. The goal is to treat all compute units (CUs) as first-class citizens, enabling (1) isolation and secure communication between all types of CUs, (2) direct interaction among all CUs, removing the conventional CPU from the critical path, and (3) access to operating system (OS) services such as file systems and network stacks for all CUs. To study this system design, I am using a hardware/software co-design based on two key ideas: (1) introduce a new hardware component next to each CU, used by the OS as the CUs' common interface, and (2) let the OS kernel control applications remotely from a different CU. The hardware component is called the data transfer unit (DTU) and offers the minimal set of features needed to reach the stated goals: secure message passing and memory access. The OS is called M³; it runs its kernel on a dedicated CU and runs the OS services and applications on the remaining CUs. The kernel is responsible for establishing DTU-based communication channels between services and applications. After a channel has been set up, services and applications communicate directly without involving the kernel. This approach makes it possible to support arbitrary CUs as first-class citizens, ranging from fixed-function accelerators to complex general-purpose cores.
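
    The channel model described in this abstract can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not the M³ or DTU interface: the names `Dtu`, `Kernel`, `create_channel`, `send`, and `receive` are hypothetical and chosen for illustration. It shows the one property the abstract states, namely that the kernel is involved only in channel setup and that later messages pass directly between the DTUs.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Dtu:
    """Hypothetical model of the data transfer unit next to one CU."""
    cu_name: str
    # endpoint id -> receiving Dtu; filled in only by the kernel (assumption)
    send_targets: dict = field(default_factory=dict)
    inbox: deque = field(default_factory=deque)

    def send(self, endpoint: int, payload: str) -> None:
        # Isolation: a CU can only use endpoints the kernel has configured.
        if endpoint not in self.send_targets:
            raise PermissionError(
                f"{self.cu_name}: endpoint {endpoint} is not configured")
        # Direct delivery to the peer DTU; the kernel is not on this path.
        self.send_targets[endpoint].inbox.append((self.cu_name, payload))

    def receive(self):
        # Returns the oldest pending (sender, payload) pair.
        return self.inbox.popleft()


class Kernel:
    """Hypothetical kernel on its own CU; it only sets up channels."""

    def __init__(self) -> None:
        self.setup_count = 0

    def create_channel(self, sender: Dtu, endpoint: int, receiver: Dtu) -> None:
        sender.send_targets[endpoint] = receiver
        self.setup_count += 1


kernel = Kernel()
app = Dtu("accelerator")
fs = Dtu("fs-service")

kernel.create_channel(app, 0, fs)   # one-time setup through the kernel
app.send(0, "read block 7")         # afterwards: direct CU-to-CU message
print(fs.receive())                 # ('accelerator', 'read block 7')
```

    In this toy model an unconfigured endpoint raises `PermissionError`, which stands in for the isolation goal; how the real DTU enforces this is not specified in the abstract.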

    Exploratory studies for the position-space approach to hadronic light-by-light scattering in the muon $g-2$

    The well-known discrepancy in the muon $g-2$ between experiment and theory demands further theory investigations in view of the upcoming new experiments. One of the leading uncertainties lies in the hadronic light-by-light scattering contribution (HLbL), which we address with our position-space approach. We focus on exploratory studies of the pion-pole contribution in a simple model and of the fermion loop without gluon exchanges, in the continuum and in infinite volume. These studies provide us with useful information for our planned computation of HLbL in the muon $g-2$ using full QCD. Comment: 8 pages, 11 figures, 1 table, Lattice 2017 proceedings, Granada, Spain

    Hadronic light-by-light scattering in the anomalous magnetic moment of the muon

    Hadronic light-by-light scattering in the anomalous magnetic moment of the muon, $a_\mu$, is one of two hadronic effects limiting the precision of the Standard Model prediction for this observable, and hence the new-physics discovery potential of direct experimental determinations of $a_\mu$. In this contribution, we report on recent progress in the calculation of this effect achieved via both dispersive and lattice QCD methods. Comment: 14 pages, 7 figures; submitted as proceedings contribution for the 15th International Workshop on Tau Lepton Physics

    Query processing on low-energy many-core processors

    Aside from performance, energy efficiency is an increasing challenge in database systems. To tackle both aspects in an integrated fashion, we pursue a hardware/software co-design approach. To fulfill the energy requirement from the hardware perspective, we utilize a low-energy processor design that allows us to place hundreds to millions of chips on a single board without any thermal restrictions. Furthermore, we address the performance requirement by developing several database-specific instruction set extensions to customize each core, although not every core has all extensions. Our hardware foundation is therefore a low-energy processor consisting of a high number of heterogeneous cores. In this paper, we introduce our hardware setup on a system level and present several challenges for query processing. Based on these challenges, we describe two implementation concepts and compare them. Finally, we conclude the paper with some lessons learned and an outlook on our upcoming research directions.

    The Orchestration Stack: The Impossible Task of Designing Software for Unknown Future Post-CMOS Hardware

    Future systems based on post-CMOS technologies will be wildly heterogeneous, with properties largely unknown today. This paper presents our design of a new hardware/software stack to address the challenge of preparing software development for such systems. It combines well-understood technologies from different areas, e.g., networks-on-chip, capability operating systems, flexible programming models, and model checking. We describe our approach and provide details on key technologies.

    A variance reduction technique for hadronic correlators with partially twisted boundary conditions

    Partially twisted boundary conditions are widely used for improving the momentum resolution in lattice computations of hadronic correlation functions. The method is, however, expensive, since every additional twist requires computing additional propagators. We propose a novel variance reduction technique that exploits statistical correlations to reduce the overall cost of computing correlators with additional twist angles. We explain and demonstrate the method for meson two-point and three-point functions.