4,178 research outputs found

    OpenCL + OpenSHMEM Hybrid Programming Model for the Adapteva Epiphany Architecture

    There is interest in exploring hybrid OpenSHMEM + X programming models to extend the applicability of the OpenSHMEM interface to more hardware architectures. We present a hybrid OpenCL + OpenSHMEM programming model for device-level programming for architectures like the Adapteva Epiphany many-core RISC array processor. The Epiphany architecture comprises a 2D array of low-power RISC cores with minimal uncore functionality connected by a 2D mesh Network-on-Chip (NoC). The Epiphany architecture offers high computational energy efficiency for integer and floating point calculations as well as parallel scalability. The Epiphany-III is available as a coprocessor in platforms that also utilize an ARM CPU host. OpenCL provides good functionality for supporting a co-design programming model in which the host CPU offloads parallel work to a coprocessor. However, the OpenCL memory model is inconsistent with the Epiphany memory architecture and lacks support for inter-core communication. We propose a hybrid programming model in which OpenSHMEM provides a better solution by replacing the non-standard OpenCL extensions introduced to achieve high performance with the Epiphany architecture. We demonstrate the proposed programming model for matrix-matrix multiplication based on Cannon's algorithm, showing that the hybrid model addresses the deficiencies of using OpenCL alone to achieve good benchmark performance. Comment: 12 pages, 5 figures, OpenSHMEM 2016: Third workshop on OpenSHMEM and Related Technologies
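    The abstract above names Cannon's algorithm without describing it. As a rough illustration only, the sketch below simulates the algorithm serially in pure Python on a q x q grid of blocks: each "core" holds one block of A and one of B, the blocks are skewed once, and then every step multiplies locally and shifts A left and B up by one position. This is not the paper's OpenCL + OpenSHMEM implementation; the function names (`cannon`, `split_blocks`, `block_matmul`) are hypothetical, and the list re-indexing stands in for the inter-core transfers a real implementation would perform.

```python
def block_matmul(a, b):
    """Dense product of two equally sized square blocks (lists of lists)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def split_blocks(m, q):
    """Cut an n x n matrix into a q x q grid of (n/q) x (n/q) blocks."""
    s = len(m) // q
    return [[[row[bj * s:(bj + 1) * s] for row in m[bi * s:(bi + 1) * s]]
             for bj in range(q)] for bi in range(q)]


def cannon(a, b, q):
    """Multiply n x n matrices a and b on a simulated q x q core grid."""
    n = len(a)
    if n % q != 0:
        raise ValueError("matrix size must be divisible by the grid size")
    s = n // q
    A, B = split_blocks(a, q), split_blocks(b, q)

    # Initial skew: block row i of A shifts left by i,
    # block column j of B shifts up by j.
    A = [[A[i][(j + i) % q] for j in range(q)] for i in range(q)]
    B = [[B[(i + j) % q][j] for j in range(q)] for i in range(q)]

    # One zero-initialised accumulator block per simulated core.
    C = [[[[0] * s for _ in range(s)] for _ in range(q)] for _ in range(q)]

    for _ in range(q):
        # Local multiply-accumulate on every simulated core.
        for i in range(q):
            for j in range(q):
                partial = block_matmul(A[i][j], B[i][j])
                for r in range(s):
                    for c in range(s):
                        C[i][j][r][c] += partial[r][c]
        # Shift A one block to the left and B one block up (with wraparound).
        A = [[A[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        B = [[B[(i + 1) % q][j] for j in range(q)] for i in range(q)]

    # Reassemble the block grid into a flat n x n result.
    return [[C[i // s][j // s][i % s][j % s] for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    b = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
    print(cannon(a, b, 2) == block_matmul(a, b))
```

    In this scheme each core only ever exchanges blocks with its nearest neighbours, which is why the algorithm suits a 2D mesh interconnect.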

    Asynchronous In Situ Processing with Gromacs: Taking Advantage of GPUs

    Numerical simulations using supercomputers are producing an ever-growing amount of data. Efficient production and analysis of these data are the key to future discoveries. The in situ paradigm is emerging as a promising solution to avoid the I/O bottleneck encountered in the file system for both the simulation and the analytics by treating the data as soon as they are produced in memory. Various strategies and implementations have been proposed in recent years to support in situ treatments with a low impact on the simulation performance. Yet, little effort has been made to perform in situ analytics with hybrid simulations supporting accelerators like GPUs. In this article, we propose a study of the in situ strategies with Gromacs, a molecular dynamics simulation code supporting multiple GPUs, as our application target. We specifically focus on how the simulation and the in situ analytics use the computational resources of the machine. Finally, we extend the usual in situ placement strategies to the case of in situ analytics running on a GPU and study their impact on both Gromacs performance and the resource usage of the machine. We show in particular that running in situ analytics on the GPU can be a more efficient solution than running them on the CPU, especially when the CPU is the bottleneck of the simulation.

    Bubble-Driven Detachment of Bacteria from Confined Microgeometries

    Moving air–liquid interfaces, for example, bubbles, play a significant role in the detachment and transport of colloids and microorganisms in confined systems as well as unsaturated porous media. Moreover, they can effectively prevent and/or postpone the development of mature biofilms on surfaces that are colonized by bacteria. Here we demonstrate the dynamics and quantify the effectiveness of this bubble-driven detachment process for the bacterial strain Staphylococcus aureus. We investigate the effects of interface velocity and geometrical factors through microfluidic experiments that mimic some of the confinement features of pore-scale geometries. Depending on the bubble velocity U, at least three different flow regimes are found. These operating flow regimes not only affect the efficiency of the detachment process but also modify the final distribution of the bacteria on the surface. We organize our results according to the capillary number, Ca = μU/γ, where μ and γ are the viscosity and the surface tension, respectively. Bubbles at very low velocities, corresponding to capillary numbers Ca 10–3, have lower detachment efficiencies and cause significant nonuniformities in the final distribution of the cells on the substrate. This effect is associated with the formation of a thin liquid film around the bubble at higher Ca. In general, at higher bubble velocities bacterial cells in the corners of the geometry are less influenced by the bubble passage compared to the central region.
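    For orientation, the capillary number used in the abstract above follows the standard definition Ca = μU/γ, the ratio of viscous to surface-tension forces. The short sketch below evaluates it; the function name and the example values (approximate room-temperature properties of water and an arbitrary bubble velocity) are illustrative assumptions, not figures from the study.

```python
def capillary_number(viscosity_pa_s, velocity_m_s, surface_tension_n_m):
    """Return Ca = mu * U / gamma (dimensionless) for SI inputs."""
    if surface_tension_n_m <= 0:
        raise ValueError("surface tension must be positive")
    return viscosity_pa_s * velocity_m_s / surface_tension_n_m


if __name__ == "__main__":
    # Assumed, illustrative values: water near room temperature,
    # with a bubble interface moving at 1 mm/s.
    mu = 1.0e-3      # dynamic viscosity, Pa*s
    gamma = 0.072    # surface tension, N/m
    u = 1.0e-3       # interface velocity, m/s
    print(f"Ca = {capillary_number(mu, u, gamma):.2e}")
```

    Because Ca scales linearly with U for a fixed liquid, sweeping the bubble velocity is the direct way to move between flow regimes.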

    Physiological effects of diet mixing on consumer fitness: a meta-analysis

    The degree of dietary generalism among consumers has important consequences for population, community, and ecosystem processes, yet the effects on consumer fitness of mixing food types have not been examined comprehensively. We conducted a meta-analysis of 161 peer-reviewed studies reporting 493 experimental manipulations of prey diversity to test whether diet mixing enhances consumer fitness based on the intrinsic nutritional quality of foods and consumer physiology. Averaged across studies, mixed diets conferred significantly higher fitness than the average of single-species diets, but not the best single prey species. More than half of individual experiments, however, showed maximal growth and reproduction on mixed diets, consistent with the predicted benefits of a balanced diet. Mixed diets including chemically defended prey were no better than the average prey type, opposing the prediction that a diverse diet dilutes toxins. Finally, mixed-model analysis showed that the effect of diet mixing was stronger for herbivores than for higher trophic levels. The generally weak evidence for the nutritional benefits of diet mixing in these primarily laboratory experiments suggests that diet generalism is not strongly favored by the inherent physiological benefits of mixing food types, but is more likely driven by ecological and environmental influences on consumer foraging

    Self-adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures

    Based on the premise that preconditioners needed for scientific computing are not only required to be robust in the numerical sense, but also scalable for up to thousands of light-weight cores, we argue that this twofold goal is achieved for the recently developed self-adaptive multi-elimination preconditioner. For this purpose, we revise the underlying idea and analyze the performance of implementations realized in the PARALUTION and MAGMA open-source software libraries on GPU architectures (using either CUDA or OpenCL), Intel’s Many Integrated Core Architecture, and Intel’s Sandy Bridge processor. The comparison with other well-established preconditioners like multi-colored Gauss-Seidel, ILU(0) and multi-colored ILU(0) shows that the twofold goal of a numerically stable cross-platform performant algorithm is achieved.

    Magnetic excitations of the Cu$^{2+}$ quantum spin chain in Sr$_3$CuPtO$_6$

    We report the magnetic excitation spectrum as measured by inelastic neutron scattering for a polycrystalline sample of Sr$_3$CuPtO$_6$. Modeling the data by the 2+4 spinon contributions to the dynamical susceptibility within the chains, and with interchain coupling treated in the random phase approximation, accounts for the major features of the powder-averaged structure factor. The magnetic excitations broaden considerably as temperature is raised, persisting up to above 100 K and displaying a broad transition as previously seen in the susceptibility data. No spin gap is observed in the dispersive spin excitations at low momentum transfer, which is consistent with the gapless spinon continuum expected from the coordinate Bethe ansatz. However, the temperature dependence of the excitation spectrum gives evidence of some very weak interchain coupling. Comment: 9 pages, 5 figures

    Contribution of thirdhand smoke to overall tobacco smoke exposure in pediatric patients: study protocol.

    Background: Thirdhand smoke (THS) is the persistent residue resulting from secondhand smoke (SHS) that accumulates in dust, objects, and on surfaces in homes where tobacco has been used, and is reemitted into air. Very little is known about the extent to which THS contributes to children's overall tobacco smoke exposure (OTS) levels, defined as their combined THS and SHS exposure. Even less is known about the effect of OTS and THS on children's health. This project will examine how different home smoking behaviors contribute to THS and OTS and if levels of THS are associated with respiratory illnesses in nonsmoking children. Methods: This project leverages the experimental design from an ongoing pediatric emergency department-based tobacco cessation trial of caregivers who smoke and their children (NIH R01HD083354). At baseline and follow-up, we will collect urine and handwipe samples from children and samples of dust and air from the homes of smokers who smoke indoors, have smoking bans or who have quit smoking. These samples will be analyzed to examine to what extent THS pollution at home contributes to OTS exposure over and above SHS and to what extent THS continues to persist and contribute to OTS in homes of smokers who have quit or have smoking bans. Targeted and nontargeted chemical analyses of home dust samples will explore which types of THS pollutants are present in homes. Electronic medical record review will examine if THS and OTS levels are associated with child respiratory illness. Additionally, a repository of child and environmental samples will be created. Discussion: The results of this study will be crucial to help close gaps in our understanding of the types, quantity, and clinical effects of OTS, THS exposure, and THS pollutants in a unique sample of tobacco smoke-exposed ill children and their homes. The potential impact of these findings is substantial, as currently the level of risk in OTS attributable to THS is unknown. This research has the potential to change how we protect children from OTS, by recognizing that SHS and THS exposure needs to be addressed separately and jointly as sources of pollution and exposure. Trial registration: ClinicalTrials.gov Identifier: NCT02531594. Date of registration: August 24, 2015

    Batch solution of small PDEs with the OPS DSL

    In this paper we discuss the challenges and optimisation opportunities when solving a large number of small, equally sized discretised PDEs on regular grids. We present an extension of the OPS (Oxford Parallel library for Structured meshes) embedded Domain Specific Language, and show how support can be added for solving multiple systems, and how OPS makes it easy to deploy a variety of transformations and optimisations. The new capabilities in OPS allow data structure transformations, as well as execution schedule transformations, to be applied automatically to deliver high performance on a variety of hardware platforms. We evaluate our work on an industrially representative finance simulation on Intel CPUs, as well as NVIDIA GPUs.

    Parallel processing area extraction and data transfer number reduction for automatic GPU offloading of IoT applications

    For Open IoT, we have proposed Tacit Computing, a technology that discovers on demand the devices holding the data users need and uses them dynamically, together with automatic GPU offloading as one of its elementary technologies. However, the existing offloading method improves only a limited range of applications because it optimizes only the extraction of parallelizable loop statements. In this paper, to improve the performance of more applications automatically, we propose an improved method that also reduces data transfer between the CPU and GPU. We evaluate the proposed offloading method by applying it to Darknet and find that it runs 3 times as fast as using only the CPU. Comment: 6 pages, 4 figures, in Japanese, IEICE Technical Report, SC2018-3