3,212 research outputs found
Identifying and Scheduling Loop Chains Using Directives
Exposing opportunities for parallelization while explicitly managing data locality is the primary challenge to porting and optimizing existing computational science simulation codes to improve performance and accuracy. OpenMP provides many mechanisms for expressing parallelism, but it primarily remains the programmerâs responsibility to group computations to improve data locality. The loopchain abstraction, where data access patterns are included with the specification of parallel loops, provides compilers with sufficient information to automate the parallelism versus data locality tradeoff. In this paper, we present a loop chain pragma and an extension to the omp for to enable the specification of loop chains and high-level specifications of schedules on loop chains. We show example usage of the extensions, describe their implementation, and show preliminary performance results for some simple examples
Addressing information flow in lean production management and control in construction
Traditionally, production control on construction sites has been a challenging area,
where the ad-hoc production control methods foster uncertainty - one of the biggest
enemies of efficiency and smooth production flow. Lean construction methods such
as the Last Planner System have partially tackled this problem by addressing the flow
aspect through means such as constraints analysis and commitment planning.
However, such systems have relatively long planning cycles to respond to the
dynamic production requirements of construction, where almost daily if not hourly
control is needed. New solutions have been designed by researchers to improve this
aspect such as VisiLean, but again these types of software systems require the
proximity and availability of computer devices to workers. Given this observation,
there is a need for a communication system between the field and site office that is
highly interoperable and provides real-time task status information. A High-level
communication framework (using VisiLean) is presented in this paper, which aims to
overcome the problems of system integration and improve the flow of information
within the production system. The framework provides, among other things, generic
and standardized interfaces to simplify the âpushâ and âpullâ of the right (production)
information, whenever needed, wherever needed, by whoever needs it. Overall, it is
anticipated that the reliability of the production control will be improve
Performance analysis of a hardware accelerator of dependence management for taskbased dataflow programming models
Along with the popularity of multicore and manycore, task-based dataflow programming models obtain great attention for being able to extract high parallelism from applications without exposing the complexity to programmers. One of these pioneers is the OpenMP Superscalar (OmpSs). By implementing dynamic task dependence analysis, dataflow scheduling and out-of-order execution in runtime, OmpSs achieves high performance using coarse and
medium granularity tasks. In theory, for the same application, the more parallel tasks can be exposed, the higher possible speedup can be achieved. Yet this factor is limited by task granularity, up to a point where the runtime overhead outweighs the performance increase and slows down the application. To overcome this handicap, Picos
was proposed to support task-based dataflow programming models like OmpSs as a fast hardware accelerator for fine-grained task and dependence management, and a simulator was developed to perform design space exploration. This paper presents the very first functional hardware prototype inspired by Picos. An embedded system based on a Zynq 7000 All-Programmable SoC is developed to study its capabilities and possible bottlenecks. Initial scalability and hardware consumption studies of different Picos designs are performed to find the one with the highest performance and lowest hardware cost. A further thorough performance study is employed on both the prototype with the most balanced configuration and the OmpSs software-only alternative. Results show that our OmpSs runtime hardware support significantly outperforms the software-only implementation currently available in the runtime system for finegrained tasks.This work is supported by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through TIN2015-65316-P project, by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272) and by the European Research Council RoMoL Grant Agreement number 321253. We also thank the Xilinx University Program for its hardware and
software donations.Peer ReviewedPostprint (published version
Smart technologies for effective reconfiguration: the FASTER approach
Current and future computing systems increasingly require that their functionality stays flexible after the system is operational, in order to cope with changing user requirements and improvements in system features, i.e. changing protocols and data-coding standards, evolving demands for support of different user applications, and newly emerging applications in communication, computing and consumer electronics. Therefore, extending the functionality and the lifetime of products requires the addition of new functionality to track and satisfy the customers needs and market and technology trends. Many contemporary products along with the software part incorporate hardware accelerators for reasons of performance and power efficiency. While adaptivity of software is straightforward, adaptation of the hardware to changing requirements constitutes a challenging problem requiring delicate solutions. The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) project aims at introducing a complete methodology to allow designers to easily implement a system specification on a platform which includes a general purpose processor combined with multiple accelerators running on an FPGA, taking as input a high-level description and fully exploiting, both at design time and at run time, the capabilities of partial dynamic reconfiguration. The goal is that for selected application domains, the FASTER toolchain will be able to reduce the design and verification time of complex reconfigurable systems providing additional novel verification features that are not available in existing tool flows
The hArtes Tool Chain
This chapter describes the different design steps needed to go from legacy code to a transformed application that can be efficiently mapped on the hArtes platform
Design of the software development and verification system (SWDVS) for shuttle NASA study task 35
An overview of the Software Development and Verification System (SWDVS) for the space shuttle is presented. The design considerations, goals, assumptions, and major features of the design are examined. A scenario that shows three persons involved in flight software development using the SWDVS in response to a program change request is developed. The SWDVS is described from the standpoint of different groups of people with different responsibilities in the shuttle program to show the functional requirements that influenced the SWDVS design. The software elements of the SWDVS that satisfy the requirements of the different groups are identified
- âŠ