57,545 research outputs found

    An Efficient Thread Mapping Strategy for Multiprogramming on Manycore Processors

    Full text link
    The emergence of multicore and manycore processors is set to change the parallel computing world. Applications are shifting towards increased parallelism in order to utilise these architectures efficiently. This leads to a situation where every application creates its desirable number of threads, based on its parallel nature and the system resources allowance. Task scheduling in such a multithreaded multiprogramming environment is a significant challenge. In task scheduling, not only the order of the execution, but also the mapping of threads to the execution resources is of a great importance. In this paper we state and discuss some fundamental rules based on results obtained from selected applications of the BOTS benchmarks on the 64-core TILEPro64 processor. We demonstrate how previously efficient mapping policies such as those of the SMP Linux scheduler become inefficient when the number of threads and cores grows. We propose a novel, low-overhead technique, a heuristic based on the amount of time spent by each CPU doing some useful work, to fairly distribute the workloads amongst the cores in a multiprogramming environment. Our novel approach could be implemented as a pragma similar to those in the new task-based OpenMP versions, or can be incorporated as a distributed thread mapping mechanism in future manycore programming frameworks. We show that our thread mapping scheme can outperform the native GNU/Linux thread scheduler in both single-programming and multiprogramming environments.Comment: ParCo Conference, Munich, Germany, 201

    A Comparison of some recent Task-based Parallel Programming Models

    Get PDF
    The need for parallel programming models that are simple to use and at the same time efficient for current ant future parallel platforms has led to recent attention to task-based models such as Cilk++, Intel TBB and the task concept in OpenMP version 3.0. The choice of model and implementation can have a major impact on the final performance and in order to understand some of the trade-offs we have made a quantitative study comparing four implementations of OpenMP (gcc, Intel icc, Sun studio and the research compiler Mercurium/nanos mcc), Cilk++ and Wool, a high-performance task-based library developed at SICS. Abstract. We use microbenchmarks to characterize costs for task-creation and stealing and the Barcelona OpenMP Tasks Suite for characterizing application performance. By far Wool and Cilk++ have the lowest overhead in both spawning and stealing tasks. This is reflected in application performance when many tasks with small granularity are spawned where Cilk++ and, in particular, has the highest performance. For coarse granularity applications, the OpenMP implementations have quite similar performance as the more light-weight Cilk++ and Wool except for one application where mcc is superior thanks to a superior task scheduler. Abstract. The OpenMP implemenations are generally not yet ready for use when the task granularity becomes very small. There is no inherent reason for this, so we expect future implementations of OpenMP to focus on this issue

    GLB: Lifeline-based Global Load Balancing library in X10

    Full text link
    We present GLB, a programming model and an associated implementation that can handle a wide range of irregular paral- lel programming problems running over large-scale distributed systems. GLB is applicable both to problems that are easily load-balanced via static scheduling and to problems that are hard to statically load balance. GLB hides the intricate syn- chronizations (e.g., inter-node communication, initialization and startup, load balancing, termination and result collection) from the users. GLB internally uses a version of the lifeline graph based work-stealing algorithm proposed by Saraswat et al. Users of GLB are simply required to write several pieces of sequential code that comply with the GLB interface. GLB then schedules and orchestrates the parallel execution of the code correctly and efficiently at scale. We have applied GLB to two representative benchmarks: Betweenness Centrality (BC) and Unbalanced Tree Search (UTS). Among them, BC can be statically load-balanced whereas UTS cannot. In either case, GLB scales well-- achieving nearly linear speedup on different computer architectures (Power, Blue Gene/Q, and K) -- up to 16K cores

    NOViSE: a virtual natural orifice transluminal endoscopic surgery simulator

    Get PDF
    Purpose: Natural Orifice Transluminal Endoscopic Surgery (NOTES) is a novel technique in minimally invasive surgery whereby a flexible endoscope is inserted via a natural orifice to gain access to the abdominal cavity, leaving no external scars. This innovative use of flexible endoscopy creates many new challenges and is associated with a steep learning curve for clinicians. Methods: We developed NOViSE - the first force-feedback enabled virtual reality simulator for NOTES training supporting a flexible endoscope. The haptic device is custom built and the behaviour of the virtual flexible endoscope is based on an established theoretical framework – the Cosserat Theory of Elastic Rods. Results: We present the application of NOViSE to the simulation of a hybrid trans-gastric cholecystectomy procedure. Preliminary results of face, content and construct validation have previously shown that NOViSE delivers the required level of realism for training of endoscopic manipulation skills specific to NOTES Conclusions: VR simulation of NOTES procedures can contribute to surgical training and improve the educational experience without putting patients at risk, raising ethical issues or requiring expensive animal or cadaver facilities. In the context of an experimental technique, NOViSE could potentially facilitate NOTES development and contribute to its wider use by keeping practitioners up to date with this novel surgical technique. NOViSE is a first prototype and the initial results indicate that it provides promising foundations for further development
    • …
    corecore