362 research outputs found

    Correct-by-construction microarchitectural pipelining

    Get PDF
    This paper presents a method for correct-by-construction microarchitectural pipelining that handles cyclic systems with dependencies between iterations. Our method combines previously known bypass and retiming transformations with a few transformations valid only for elastic systems with early evaluation (namely, empty FIFO insertion, FIFO capacity sizing, insertion of anti-tokens, and introducing early evaluation multiplexors). By converting the design to a synchronous elastic form and then applying this extended set of transformations, one can pipeline a functional specification with an automatically generated distributed controller that implements stalling logic resolving data hazards off the critical path of the design. We have developed an interactive toolkit for exploring elastic microarchitectural transformations. The method is illustrated by pipelining a few simple examples of instruction set architecture ISA specifications.Peer ReviewedPostprint (published version

    Elastic systems

    Get PDF
    Elastic systems provide tolerance to the variations in computation and communication delays. The incorporation of elasticity opens new opportunities for optimization using new correct-by-construction transformations that cannot be applied to rigid non-elastic systems. The basics of synchronous and asynchronous elastic systems will be reviewed. A set of behavior-preserving transformations will be presented: retiming, recycling, early evaluation, variable-latency units and speculative execution. The application of these transformations for performance and power optimization will be discussed. Finally, a novel framework for microarchitectural exploration will be introduced, showing that the optimal pipelining of a circuit can be automatically obtained by using the previous transformations.Peer ReviewedPostprint (published version

    Synthesis of synchronous elastic architectures

    Get PDF
    A simple protocol for latency-insensitive design is presented. The main features of the protocol are the efficient implementation of elastic communication channels and the automatable design methodology. With this approach, fine-granularity elasticity can be introduced at the level of functional units (e.g. ALUs, memories). A formal specification of the protocol is defined and an efficient scheme for the implementation of elasticity that involves no datapath overhead is presented. The opportunities this protocol opens for microarchitectural design are discussed.Peer ReviewedPostprint (author's final draft

    WEST-3 wind turbine simulator development. Volume 1: Summary

    Get PDF
    This report is a summary description of WEST-3, a new real-time wind turbine simulator developed by Paragon Pacific Inc. WEST-3 is an all digital, fully programmable, high performance parallel processing computer. Contained in the report are descriptions of the WEST-3 hardware and software. WEST-3 consists of a network of Computational Units (CUs) working in parallel. Each CU is a custom designed high speed digital processor operating independently of other CUs. The CU, which is the main building block of the system, is described in some detail. A major contributor to the high performance of the system is the use a unique method for transferring data among the CUs. The software aspects of WEST-3 covered in the report include the preparation of the simulation model (reformulation, scaling and normalization), and the use of the system software (Translator, Linker, Assembler and Loader). Also given is a description of the wind turbine simulation model used in WEST-3, and some sample results from a study conducted to validate the system. Finally, efforts currently underway to enhance the user friendliness of the system are outlined; these include the 32-bit floating point capability, and major improvements in system software

    Model-driven development of data intensive applications over cloud resources

    Get PDF
    The proliferation of sensors over the last years has generated large amounts of raw data, forming data streams that need to be processed. In many cases, cloud resources are used for such processing, exploiting their flexibility, but these sensor streaming applications often need to support operational and control actions that have real-time and low-latency requirements that go beyond the cost effective and flexible solutions supported by existing cloud frameworks, such as Apache Kafka, Apache Spark Streaming, or Map-Reduce Streams. In this paper, we describe a model-driven and stepwise refinement methodological approach for streaming applications executed over clouds. The central role is assigned to a set of Petri Net models for specifying functional and non-functional requirements. They support model reuse, and a way to combine formal analysis, simulation, and approximate computation of minimal and maximal boundaries of non-functional requirements when the problem is either mathematically or computationally intractable. We show how our proposal can assist developers in their design and implementation decisions from a performance perspective. Our methodology allows to conduct performance analysis: The methodology is intended for all the engineering process stages, and we can (i) analyse how it can be mapped onto cloud resources, and (ii) obtain key performance indicators, including throughput or economic cost, so that developers are assisted in their development tasks and in their decision taking. In order to illustrate our approach, we make use of the pipelined wavefront array

    Accurate depth from defocus estimation with video-rate implementation

    Get PDF
    The science of measuring depth from images at video rate using „defocus‟ has been investigated. The method required two differently focussed images acquired from a single view point using a single camera. The relative blur between the images was used to determine the in-focus axial points of each pixel and hence depth. The depth estimation algorithm researched by Watanabe and Nayar was employed to recover the depth estimates, but the broadband filters, referred as the Rational filters were designed using a new procedure: the Two Step Polynomial Approach. The filters designed by the new model were largely insensitive to object texture and were shown to model the blur more precisely than the previous method. Experiments with real planar images demonstrated a maximum RMS depth error of 1.18% for the proposed filters, compared to 1.54% for the previous design. The researched software program required five 2D convolutions to be processed in parallel and these convolutions were effectively implemented on a FPGA using a two channel, five stage pipelined architecture, however the precision of the filter coefficients and the variables had to be limited within the processor. The number of multipliers required for each convolution was reduced from 49 to 10 (79.5% reduction) using a Triangular design procedure. Experimental results suggested that the pipelined processor provided depth estimates comparable in accuracy to the full precision Matlab‟s output, and generated depth maps of size 400 x 400 pixels in 13.06msec, that is faster than the video rate. The defocused images (near and far-focused) were optically registered for magnification using Telecentric optics. A frequency domain approach based on phase correlation was employed to measure the radial shifts due to magnification and also to optimally position the external aperture. The telecentric optics ensured pixel to pixel registration between the defocused images was correct and provided more accurate depth estimates

    Speculation in elastic systems

    Get PDF
    Speculation is a well-known technique for increasing parallelism of the microprocessor pipelines and hence their performance. While implementing speculation in modern design practice is error-prone and mostly ad-hoc, this paper proposes a correct-by-construction method for implementing speculation in Elastic Systems. The technique is based on applying provably correct transformations such as early evaluation, insertion of anti-tokens and bubbles, retiming, and sharing. It allows to explore different micro-architectural solutions for better design trade-offs. The benefits of speculation are illustrated with two examples in which these transformations are systematically applied. The method proposed in this paper is amenable for automation in a synthesis flow.Postprint (published version

    High-fidelity trajectory optimization for aeroassisted vehicles using variable order pseudospectral method

    Get PDF
    In this study, the problem of time-optimal reconnaissance trajectory design for the aeroassisted vehicle is considered. Different from most works reported previously, we explore the feasibility of applying a high-order aeroassisted vehicle dynamic model to plan the optimal flight trajectory such that the gap between the simulated model and the real system can be narrowed. A highly-constrained optimal control model containing six-degree-of-freedom vehicle dynamics is established. To solve the formulated high-order trajectory planning model, a pipelined optimization strategy is illustrated. This approach is based on the variable order Radau pseudospectral method, indicating that the mesh grid used for discretizing the continuous system experiences several adaption iterations. Utilization of such a strategy can potentially smooth the flight trajectory and improve the algorithm convergence ability. Numerical simulations are reported to demonstrate some key features of the optimized flight trajectory. A number of comparative studies are also provided to verify the effectiveness of the applied method as well as the high-order trajectory planning model

    A bibliography on parallel and vector numerical algorithms

    Get PDF
    This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are listed also
    corecore