9,621 research outputs found

    From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

    Full text link
    Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.Comment: 18 pages, 4 figures, accepted for publication in Scientific Programmin

    COMPARISON OF OBJECT-ORIENTED PROGRAMMING AND DATA-ORIENTED DESIGN FOR IMPLEMENTING TRADING STRATEGIES BACKTESTER

    Get PDF
    This research proposes a way to accelerate backtesting of trading strategies using data-oriented design (DOD). The research discusses the differences between DOD and object-oriented approach (OOP), which is the most popular at the current moment. Then, the paper proposes efficient way to parallelize a backtesting using DOD. Finally, this research provides a performance comparison between DOD and OOP backtester implementations on the example of typical technical indicators. The comparison shows that use of DOD can speed up the process of quantitative features calculation up to 33% and allows for parallelization scheme that better utilizes resources in multiprocessor systems

    Enhanced molecular dynamics performance with a programmable graphics processor

    Full text link
    Design considerations for molecular dynamics algorithms capable of taking advantage of the computational power of a graphics processing unit (GPU) are described. Accommodating the constraints of scalable streaming-multiprocessor hardware necessitates a reformulation of the underlying algorithm. Performance measurements demonstrate the considerable benefit and cost-effectiveness of such an approach, which produces a factor of 2.5 speed improvement over previous work for the case of the soft-sphere potential.Comment: 20 pages (v2: minor additions and changes; v3: corrected typos

    A Domain-Specific Language and Editor for Parallel Particle Methods

    Full text link
    Domain-specific languages (DSLs) are of increasing importance in scientific high-performance computing to reduce development costs, raise the level of abstraction and, thus, ease scientific programming. However, designing and implementing DSLs is not an easy task, as it requires knowledge of the application domain and experience in language engineering and compilers. Consequently, many DSLs follow a weak approach using macros or text generators, which lack many of the features that make a DSL a comfortable for programmers. Some of these features---e.g., syntax highlighting, type inference, error reporting, and code completion---are easily provided by language workbenches, which combine language engineering techniques and tools in a common ecosystem. In this paper, we present the Parallel Particle-Mesh Environment (PPME), a DSL and development environment for numerical simulations based on particle methods and hybrid particle-mesh methods. PPME uses the meta programming system (MPS), a projectional language workbench. PPME is the successor of the Parallel Particle-Mesh Language (PPML), a Fortran-based DSL that used conventional implementation strategies. We analyze and compare both languages and demonstrate how the programmer's experience can be improved using static analyses and projectional editing. Furthermore, we present an explicit domain model for particle abstractions and the first formal type system for particle methods.Comment: Submitted to ACM Transactions on Mathematical Software on Dec. 25, 201

    Real-time and fault tolerance in distributed control software

    Get PDF
    Closed loop control systems typically contain multitude of spatially distributed sensors and actuators operated simultaneously. So those systems are parallel and distributed in their essence. But mapping this parallelism onto the given distributed hardware architecture, brings in some additional requirements: safe multithreading, optimal process allocation, real-time scheduling of bus and network resources. Nowadays, fault tolerance methods and fast even online reconfiguration are becoming increasingly important. All those often conflicting requirements, make design and implementation of real-time distributed control systems an extremely difficult task, that requires substantial knowledge in several areas of control and computer science. Although many design methods have been proposed so far, none of them had succeeded to cover all important aspects of the problem at hand. [1] Continuous increase of production in embedded market, makes a simple and natural design methodology for real-time systems needed more then ever

    Shawn: A new approach to simulating wireless sensor networks

    Full text link
    We consider the simulation of wireless sensor networks (WSN) using a new approach. We present Shawn, an open-source discrete-event simulator that has considerable differences to all other existing simulators. Shawn is very powerful in simulating large scale networks with an abstract point of view. It is, to the best of our knowledge, the first simulator to support generic high-level algorithms as well as distributed protocols on exactly the same underlying networks.Comment: 10 pages, 2 figures, 2 tables, Latex, to appear in Design, Analysis, and Simulation of Distributed Systems 200
    • …
    corecore