3,212 research outputs found

    Identifying and Scheduling Loop Chains Using Directives

    Get PDF
    Exposing opportunities for parallelization while explicitly managing data locality is the primary challenge to porting and optimizing existing computational science simulation codes to improve performance and accuracy. OpenMP provides many mechanisms for expressing parallelism, but it primarily remains the programmer’s responsibility to group computations to improve data locality. The loopchain abstraction, where data access patterns are included with the specification of parallel loops, provides compilers with sufficient information to automate the parallelism versus data locality tradeoff. In this paper, we present a loop chain pragma and an extension to the omp for to enable the specification of loop chains and high-level specifications of schedules on loop chains. We show example usage of the extensions, describe their implementation, and show preliminary performance results for some simple examples

    Addressing information flow in lean production management and control in construction

    Get PDF
    Traditionally, production control on construction sites has been a challenging area, where the ad-hoc production control methods foster uncertainty - one of the biggest enemies of efficiency and smooth production flow. Lean construction methods such as the Last Planner System have partially tackled this problem by addressing the flow aspect through means such as constraints analysis and commitment planning. However, such systems have relatively long planning cycles to respond to the dynamic production requirements of construction, where almost daily if not hourly control is needed. New solutions have been designed by researchers to improve this aspect such as VisiLean, but again these types of software systems require the proximity and availability of computer devices to workers. Given this observation, there is a need for a communication system between the field and site office that is highly interoperable and provides real-time task status information. A High-level communication framework (using VisiLean) is presented in this paper, which aims to overcome the problems of system integration and improve the flow of information within the production system. The framework provides, among other things, generic and standardized interfaces to simplify the “push” and “pull” of the right (production) information, whenever needed, wherever needed, by whoever needs it. Overall, it is anticipated that the reliability of the production control will be improve

    Performance analysis of a hardware accelerator of dependence management for taskbased dataflow programming models

    Get PDF
    Along with the popularity of multicore and manycore, task-based dataflow programming models obtain great attention for being able to extract high parallelism from applications without exposing the complexity to programmers. One of these pioneers is the OpenMP Superscalar (OmpSs). By implementing dynamic task dependence analysis, dataflow scheduling and out-of-order execution in runtime, OmpSs achieves high performance using coarse and medium granularity tasks. In theory, for the same application, the more parallel tasks can be exposed, the higher possible speedup can be achieved. Yet this factor is limited by task granularity, up to a point where the runtime overhead outweighs the performance increase and slows down the application. To overcome this handicap, Picos was proposed to support task-based dataflow programming models like OmpSs as a fast hardware accelerator for fine-grained task and dependence management, and a simulator was developed to perform design space exploration. This paper presents the very first functional hardware prototype inspired by Picos. An embedded system based on a Zynq 7000 All-Programmable SoC is developed to study its capabilities and possible bottlenecks. Initial scalability and hardware consumption studies of different Picos designs are performed to find the one with the highest performance and lowest hardware cost. A further thorough performance study is employed on both the prototype with the most balanced configuration and the OmpSs software-only alternative. Results show that our OmpSs runtime hardware support significantly outperforms the software-only implementation currently available in the runtime system for finegrained tasks.This work is supported by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through TIN2015-65316-P project, by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272) and by the European Research Council RoMoL Grant Agreement number 321253. We also thank the Xilinx University Program for its hardware and software donations.Peer ReviewedPostprint (published version

    Smart technologies for effective reconfiguration: the FASTER approach

    Get PDF
    Current and future computing systems increasingly require that their functionality stays flexible after the system is operational, in order to cope with changing user requirements and improvements in system features, i.e. changing protocols and data-coding standards, evolving demands for support of different user applications, and newly emerging applications in communication, computing and consumer electronics. Therefore, extending the functionality and the lifetime of products requires the addition of new functionality to track and satisfy the customers needs and market and technology trends. Many contemporary products along with the software part incorporate hardware accelerators for reasons of performance and power efficiency. While adaptivity of software is straightforward, adaptation of the hardware to changing requirements constitutes a challenging problem requiring delicate solutions. The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) project aims at introducing a complete methodology to allow designers to easily implement a system specification on a platform which includes a general purpose processor combined with multiple accelerators running on an FPGA, taking as input a high-level description and fully exploiting, both at design time and at run time, the capabilities of partial dynamic reconfiguration. The goal is that for selected application domains, the FASTER toolchain will be able to reduce the design and verification time of complex reconfigurable systems providing additional novel verification features that are not available in existing tool flows

    Practical Parallelization of Scientific Applications

    Get PDF

    The hArtes Tool Chain

    Get PDF
    This chapter describes the different design steps needed to go from legacy code to a transformed application that can be efficiently mapped on the hArtes platform

    Design of the software development and verification system (SWDVS) for shuttle NASA study task 35

    Get PDF
    An overview of the Software Development and Verification System (SWDVS) for the space shuttle is presented. The design considerations, goals, assumptions, and major features of the design are examined. A scenario that shows three persons involved in flight software development using the SWDVS in response to a program change request is developed. The SWDVS is described from the standpoint of different groups of people with different responsibilities in the shuttle program to show the functional requirements that influenced the SWDVS design. The software elements of the SWDVS that satisfy the requirements of the different groups are identified
    • 

    corecore