4,379 research outputs found

    Parallel processors and nonlinear structural dynamics algorithms and software

    Get PDF
    The adaptation of a finite element program with explicit time integration to a massively parallel SIMD (single instruction multiple data) computer, the CONNECTION Machine is described. The adaptation required the development of a new algorithm, called the exchange algorithm, in which all nodal variables are allocated to the element with an exchange of nodal forces at each time step. The architectural and C* programming language features of the CONNECTION Machine are also summarized. Various alternate data structures and associated algorithms for nonlinear finite element analysis are discussed and compared. Results are presented which demonstrate that the CONNECTION Machine is capable of outperforming the CRAY XMP/14

    GeNN: a code generation framework for accelerated brain simulations

    Get PDF
    Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/

    Parallel processing and expert systems

    Get PDF
    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited

    Parallel processing and expert systems

    Get PDF
    Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 90's cannot enjoy an increased level of autonomy without the efficient use of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real time demands are met for large expert systems. Speed-up via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial labs in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems was surveyed. The survey is divided into three major sections: (1) multiprocessors for parallel expert systems; (2) parallel languages for symbolic computations; and (3) measurements of parallelism of expert system. Results to date indicate that the parallelism achieved for these systems is small. In order to obtain greater speed-ups, data parallelism and application parallelism must be exploited

    COLAB:A Collaborative Multi-factor Scheduler for Asymmetric Multicore Processors

    Get PDF
    Funding: Partially funded by the UK EPSRC grants Discovery: Pattern Discovery and Program Shaping for Many-core Systems (EP/P020631/1) and ABC: Adaptive Brokerage for Cloud (EP/R010528/1); Royal Academy of Engineering under the Research Fellowship scheme.Increasingly prevalent asymmetric multicore processors (AMP) are necessary for delivering performance in the era of limited power budget and dark silicon. However, the software fails to use them efficiently. OS schedulers, in particular, handle asymmetry only under restricted scenarios. We have efficient symmetric schedulers, efficient asymmetric schedulers for single-threaded workloads, and efficient asymmetric schedulers for single program workloads. What we do not have is a scheduler that can handle all runtime factors affecting AMP for multi-threaded multi-programmed workloads. This paper introduces the first general purpose asymmetry-aware scheduler for multi-threaded multi-programmed workloads. It estimates the performance of each thread on each type of core and identifies communication patterns and bottleneck threads. The scheduler then makes coordinated core assignment and thread selection decisions that still provide each application its fair share of the processor's time. We evaluate our approach using the GEM5 simulator on four distinct big.LITTLE configurations and 26 mixed workloads composed of PARSEC and SPLASH2 benchmarks. Compared to the state-of-the art Linux CFS and AMP-aware schedulers, we demonstrate performance gains of up to 25% and 5% to 15% on average depending on the hardware setup.Postprin

    HeteroCore GPU to exploit TLP-resource diversity

    Get PDF

    Comparisons of some large scientific computers

    Get PDF
    In 1975, the National Aeronautics and Space Administration (NASA) began studies to assess the technical and economic feasibility of developing a computer having sustained computational speed of one billion floating point operations per second and a working memory of at least 240 million words. Such a powerful computer would allow computational aerodynamics to play a major role in aeronautical design and advanced fluid dynamics research. Based on favorable results from these studies, NASA proceeded with developmental plans. The computer was named the Numerical Aerodynamic Simulator (NAS). To help insure that the estimated cost, schedule, and technical scope were realistic, a brief study was made of past large scientific computers. Large discrepancies between inception and operation in scope, cost, or schedule were studied so that they could be minimized with NASA's proposed new compter. The main computers studied were the ILLIAC IV, STAR 100, Parallel Element Processor Ensemble (PEPE), and Shuttle Mission Simulator (SMS) computer. Comparison data on memory and speed were also obtained on the IBM 650, 704, 7090, 360-50, 360-67, 360-91, and 370-195; the CDC 6400, 6600, 7600, CYBER 203, and CYBER 205; CRAY 1; and the Advanced Scientific Computer (ASC). A few lessons learned conclude the report
    • …
    corecore