7,964 research outputs found

    PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation

    Full text link
    High-performance computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important large-scale applications of computational science. However, exploiting this potential can be challenging, as one must adapt to the specialized and rapidly evolving computing environment currently exhibited by GPUs. One way of addressing this challenge is to embrace better techniques and develop tools tailored to their needs. This article presents one simple technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL, two open-source toolkits that support this technique. In introducing PyCUDA and PyOpenCL, this article proposes the combination of a dynamic, high-level scripting language with the massive performance of a GPU as a compelling two-tiered computing platform, potentially offering significant performance and productivity advantages over conventional single-tier, static systems. The concept of RTCG is simple and easily implemented using existing, robust infrastructure. Nonetheless it is powerful enough to support (and encourage) the creation of custom application-specific tools by its users. The premise of the paper is illustrated by a wide range of examples where the technique has been applied with considerable success.Comment: Submitted to Parallel Computing, Elsevie

    A theorem prover-based analysis tool for object-oriented databases

    Get PDF
    We present a theorem-prover based analysis tool for object-oriented database systems with integrity constraints. Object-oriented database specifications are mapped to higher-order logic (HOL). This allows us to reason about the semantics of database operations using a mechanical theorem prover such as Isabelle or PVS. The tool can be used to verify various semantics requirements of the schema (such as transaction safety, compensation, and commutativity) to support the advanced transaction models used in workflow and cooperative work. We give an example of method safety analysis for the generic structure editing operations of a cooperative authoring system

    Virtual Machine Support for Many-Core Architectures: Decoupling Abstract from Concrete Concurrency Models

    Get PDF
    The upcoming many-core architectures require software developers to exploit concurrency to utilize available computational power. Today's high-level language virtual machines (VMs), which are a cornerstone of software development, do not provide sufficient abstraction for concurrency concepts. We analyze concrete and abstract concurrency models and identify the challenges they impose for VMs. To provide sufficient concurrency support in VMs, we propose to integrate concurrency operations into VM instruction sets. Since there will always be VMs optimized for special purposes, our goal is to develop a methodology to design instruction sets with concurrency support. Therefore, we also propose a list of trade-offs that have to be investigated to advise the design of such instruction sets. As a first experiment, we implemented one instruction set extension for shared memory and one for non-shared memory concurrency. From our experimental results, we derived a list of requirements for a full-grown experimental environment for further research

    Emulating Digital Logic using Transputer Networks (Very High Parallelism = Simplicity = Performance)

    Get PDF
    Modern VLSI technology has changed the economic rules by which the balance between processing power, memory and communications is decided in computing systems. This will have a profound impact on the design rules for the controlling software. In particular, the criteria for judging efficiency of the algorithms will be somewhat different. This paper explores some of these implications through the development of highly parallel and highly distributable algorithms based on occam and transputer networks. The major results reported are a new simplicity for software designs, a corresponding ability to reason (formally and informally) about their properties, the reusability of their components and some real performance figures which demonstrate their practicality. Some guidelines to assist in these designs are also given. As a vehicle for discussion, an interactive simulator is developed for checking the functional and timing characteristics of digital logic circuits of arbitrary complexity

    Zero-gravity movement studies

    Get PDF
    The use of computer graphics to simulate the movement of articulated animals and mechanisms has a number of uses ranging over many fields. Human motion simulation systems can be useful in education, medicine, anatomy, physiology, and dance. In biomechanics, computer displays help to understand and analyze performance. Simulations can be used to help understand the effect of external or internal forces. Similarly, zero-gravity simulation systems should provide a means of designing and exploring the capabilities of hypothetical zero-gravity situations before actually carrying out such actions. The advantage of using a simulation of the motion is that one can experiment with variations of a maneuver before attempting to teach it to an individual. The zero-gravity motion simulation problem can be divided into two broad areas: human movement and behavior in zero-gravity, and simulation of articulated mechanisms

    Programming with Exceptions in JCilk

    Get PDF
    JCilk extends the Java language to provide call-return semantics for multithreading, much as Cilk does for C. Java's built-in thread model does not support the passing of exceptions or return values from one thread back to the "parent" thread that created it. JCilk imports Cilk's fork-join primitives spawn and sync into Java to provide procedure-call semantics for concurrent subcomputations. This paper shows how JCilk integrates exception handling with multithreading by defining semantics consistent with the existing semantics of Java's try and catch constructs, but which handle concurrency in spawned methods. JCilk's strategy of integrating multithreading with Java's exception semantics yields some surprising semantic synergies. In particular, JCilk extends Java's exception semantics to allow exceptions to be passed from a spawned method to its parent in a natural way that obviates the need for Cilk's inlet and abort constructs. This extension is "faithful" in that it obeys Java's ordinary serial semantics when executed on a single processor. When executed in parallel, however, an exception thrown by a JCilk computation signals its sibling computations to abort, which yields a clean semantics in which only a single exception from the enclosing try block is handled. The decision to implicitly abort side computations opens a Pandora's box of subsidiary linguistic problems to be resolved, however. For instance, aborting might cause a computation to be interrupted asynchronously, causing havoc in programmer understanding of code behavior. To minimize the complexity of reasoning about aborts, JCilk signals them "semisynchronously" so that abort signals do not interrupt ordinary serial code. In addition, JCilk propagates an abort signal throughout a subcomputation naturally with a built-in CilkAbort exception, thereby allowing programmers to handle clean-up by simply catching the CilkAbort exception. The semantics of JCilk allow programs with speculative computations to be programmed easily. Speculation is essential for parallelizing programs such as branch-and-bound or heuristic search. We show how JCilk's linguistic mechanisms can be used to program a solution to the "queens" problem and an implemention of a parallel alpha-beta search.Singapore-MIT Alliance (SMA
    • ā€¦
    corecore