6 research outputs found

    Real-time wavefront processors for the next generation of adaptive optics systems: a design and analysis

    Get PDF
    Adaptive optics (AO) systems currently under investigation will require at least two orders of magitude increase in the number of actuators, which in turn translates to effectively a 104 increase in compute latency. Since the performance of an AO system invariably improves as the compute latency decreases, it is important to study how today's computer systems will scale to address this expected increase in actuator utilization. This paper answers this question by characterizing the performance of a single deformable mirror (DM) Shack-Hartmann natural guide star AO system implemented on the present-generation digital signal processor (DSP) TMS320C6701 from Texas Instruments. We derive the compute latency of such a system in terms of a few basic parameters, such as the number of DM actuators, the number of data channels used to read out the camera pixels, the number of DSPs, the available memory bandwidth, as well as the inter-processor communication (IPC) bandwidth and the pixel transfer rate. We show how the results would scale for future systems that utilizes multiple DMs and guide stars. We demonstrate that the principal performance bottleneck of such a system is the available memory bandwidth of the processors and to lesser extent the IPC bandwidth. This paper concludes with suggestions for mitigating this bottleneck

    Energy Minimization of System Pipelines Using Multiple Voltages

    Get PDF
    Modem computer and communication system design has to consider the timing constraints imposed by communication and system pipelines, and minimize the energy consumption. We adopt the recent proposed model for communication pipeline latency[23] and address the problem of how to minimize the power consumption in system-level pipelines under the latency constraints by selecting supply voltage for each pipeline stage using the variable voltage core-based system design methodology[l 11. We define the problem, solve it optimally under realistic assumptions and develop algorithms for power minimization of system pipeline designs based on our theoretical results. We apply this new approach on the 4- stage Myrinet GAM pipeline, with the appropriate voltage profiles, we achieve 93.4%, 91.3% and 26.9% power reduction on three pipeline stages over the traditional design

    Techniques for Energy-Efficient Communication Pipeline Design

    Get PDF
    The performance of many modern computer and communication systems is dictated by the latency of communication pipelines. At the same time, power/energy consumption is often another limiting factor in many portable systems. We address the problem of how to minimize the power consumption in system-level pipelines under latency constraints. In particular, we apply fragmentation technique to achieve parallelism and exploit advantages provided by variable voltage design methodology to optimally select voltage and, therefore, speed of each pipeline stage.We focus our study on the practical case when each pipeline stage operates at a fixed speed. Unlike the conventional pipeline system, where all stages run at the same speed, our system may have different stages running at different speeds to conserve energy while providing guaranteed latency. For a given latency requirement, we find explicit solutions for the most energy efficient fragmentation and voltage setting. We further study a less practical case when each stage can dynamically change its speed to get further energy saving. We define the problem and transform it to a nonlinear system whose solution provides a lower bound for energy consumption. We apply the obtained theoretical results to develop algorithms for power/energy minimization of computer and communication systems. The experimental result suggests that significant power/energy reduction, is possible without additional latency. In fact, we achieve almost 40% total energy saving over the combined minimal supply voltage selection and system shut-down technique and 85% if none of these two energy minimization methods is used

    Real-time wavefront processors for the next generation of adaptive optics systems: a design and analysis

    Get PDF
    Adaptive optics (AO) systems currently under investigation will require at least two orders of magitude increase in the number of actuators, which in turn translates to effectively a 104 increase in compute latency. Since the performance of an AO system invariably improves as the compute latency decreases, it is important to study how today's computer systems will scale to address this expected increase in actuator utilization. This paper answers this question by characterizing the performance of a single deformable mirror (DM) Shack-Hartmann natural guide star AO system implemented on the present-generation digital signal processor (DSP) TMS320C6701 from Texas Instruments. We derive the compute latency of such a system in terms of a few basic parameters, such as the number of DM actuators, the number of data channels used to read out the camera pixels, the number of DSPs, the available memory bandwidth, as well as the inter-processor communication (IPC) bandwidth and the pixel transfer rate. We show how the results would scale for future systems that utilizes multiple DMs and guide stars. We demonstrate that the principal performance bottleneck of such a system is the available memory bandwidth of the processors and to lesser extent the IPC bandwidth. This paper concludes with suggestions for mitigating this bottleneck

    Modeling Communication Pipeline Latency

    No full text
    In this paper, we study how to minimize the latency of a message through a network that consists of a number of store-and-forward stages. This research is especially relevant for today's low overhead communication systems that employ dedicated processing elements for protocol processing. We develop an abstract pipeline model that reveals a crucial performance tradeoff involving the effects of the overhead of the bottleneck stage and the bandwidth of the remaining stages. We exploit this tradeoff to develop a suite of fragmentation algorithms designed to minimize message latency. We also provide an experimental methodology that enables the construction of customized pipeline algorithms that can adapt to the specific system characteristics and application workloads. By applying this methodology to the MyrinetGAM system, we have improved its latency by up to 51%. Our theoretical framework is also applicable to pipelined systems beyond the context of high speed networks. 1 Introduction T..
    corecore