
    Atmospheric tomography with separate minimum variance laser and natural guide star mode control

    This paper introduces a novel, computationally efficient, and practical atmospheric tomography wavefront control architecture with separate minimum variance laser and natural guide star mode estimation. The architecture is applicable to all laser tomography systems, including multi-conjugate adaptive optics (MCAO), laser tomography adaptive optics (LTAO), and multi-object adaptive optics (MOAO) systems. Monte Carlo simulation results for the Thirty Meter Telescope (TMT) MCAO system demonstrate its benefit over a previously introduced “ad hoc” split MCAO architecture, calling for further in-depth analysis and simulations over a representative ensemble of natural guide star (NGS) asterisms with optimized loop frame rates and modal gains.
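
    For orientation, the minimum variance estimation referred to above is the standard MMSE tomographic reconstruction problem. A minimal sketch, assuming a linear measurement model with symbols chosen here for illustration; the paper's contribution concerns how this estimate is split between laser- and NGS-controlled modes, which is not reproduced here:

        % Measurements s of turbulence modes x through influence matrix H,
        % noise \eta, with covariances C_x and C_\eta:
        s = H x + \eta, \qquad
        \hat{x}_{\mathrm{MV}} = C_x H^{\mathsf{T}} \left( H C_x H^{\mathsf{T}} + C_\eta \right)^{-1} s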

    Hardware Accelerators for Animated Ray Tracing

    Future graphics processors are likely to incorporate hardware accelerators for real-time ray tracing, in order to render increasingly complex lighting effects in interactive applications. However, ray tracing poses difficulties when drawing scenes with dynamic content, such as animated characters and objects. In dynamic scenes, the spatial data structures used to accelerate ray tracing are invalidated on each animation frame and need to be rapidly updated. Tree update is a complex subtask in its own right, and becomes highly expensive in complex scenes. Both ray tracing and tree update are highly memory-intensive tasks, and rendering systems are increasingly bandwidth-limited, so research on accelerator hardware has focused on architectural techniques to optimize away off-chip memory traffic. Dynamic scene support is further complicated by the recent introduction of compressed trees, which use low-precision numbers for storage and computation. Such compression reduces both the arithmetic and memory bandwidth cost of ray tracing, but adds to the complexity of tree update. This thesis proposes methods to cope with dynamic scenes in hardware-accelerated ray tracing, with a focus on reducing traffic to external memory. Firstly, a hardware architecture is designed for linear bounding volume hierarchy construction, an algorithm which is a basic building block in most state-of-the-art software tree builders. The algorithm is rearranged into a streaming form which reduces traffic to one-third of that of software implementations of the same algorithm. Secondly, an algorithm is proposed for compressing bounding volume hierarchies in a streaming manner as they are output from a hardware builder, instead of performing compression as a postprocessing pass. As a result, with the proposed method, compression reduces the overall cost of tree update rather than increasing it. The last main contribution of this thesis is an evaluation of shallow bounding volume hierarchies, common in software ray tracing, for use in hardware pipelines. These are found to be more energy-efficient than binary hierarchies. The results in this thesis both confirm that dynamic scene support may become a bottleneck in real-time ray tracing, and add to the state of the art on tree update in terms of energy efficiency, as well as the complexity of scenes that can be handled in real time on resource-constrained platforms.
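
    As background for the linear BVH (LBVH) building block mentioned above: the core of the algorithm quantizes primitive centroids, interleaves their coordinate bits into Morton codes, and sorts so that spatially nearby primitives become adjacent. A minimal Python sketch of that step only, with all names chosen here for illustration; the thesis's streaming hardware formulation is not reproduced:

        def expand_bits(v):
            # Spread the 10 low bits of v so that two zero bits separate
            # consecutive bits (standard 30-bit 3D Morton-code trick).
            v = (v * 0x00010001) & 0xFF0000FF
            v = (v * 0x00000101) & 0x0F00F00F
            v = (v * 0x00000011) & 0xC30C30C3
            v = (v * 0x00000005) & 0x49249249
            return v

        def morton3d(x, y, z):
            # x, y, z are centroid coordinates normalized to [0, 1).
            xi = min(max(int(x * 1024), 0), 1023)
            yi = min(max(int(y * 1024), 0), 1023)
            zi = min(max(int(z * 1024), 0), 1023)
            return (expand_bits(xi) << 2) | (expand_bits(yi) << 1) | expand_bits(zi)

        def lbvh_order(centroids):
            # Sorting primitives by Morton code places spatially nearby primitives
            # next to each other; the hierarchy is then emitted by splitting the
            # sorted list at the highest differing code bits.
            return sorted(range(len(centroids)),
                          key=lambda i: morton3d(*centroids[i]))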

    Downlink and Uplink Decoupling: a Disruptive Architectural Design for 5G Networks

    Cell association in cellular networks has traditionally been based on the downlink received signal power only, despite the fact that uplink and downlink transmission powers and interference levels differ significantly. This approach was adequate in homogeneous networks where the macro base stations all have similar transmission power levels. However, with the growth of heterogeneous networks, where there is a large disparity in the transmit power of the different base station types, this approach is highly inefficient. In this paper, we study the notion of Downlink and Uplink Decoupling (DUDe), where the downlink cell association is based on the downlink received power while the uplink association is based on the pathloss. We present the motivation for and assess the gains of this 5G design approach with simulations that are based on Vodafone's LTE field trial network in a dense urban area, employing high-resolution ray-tracing pathloss predictions and realistic traffic maps based on live network measurements.
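
    To make the decoupled association rule concrete, a minimal Python sketch under the assumptions stated above (downlink cell chosen by strongest received power, uplink cell chosen by lowest pathloss); the object model and names are illustrative, not the paper's simulator:

        def associate(ue, base_stations):
            # Downlink: attach to the cell with the strongest received power,
            # which tends to favour high-power macro cells.
            dl_cell = max(base_stations,
                          key=lambda bs: bs.tx_power_dbm - bs.pathloss_db(ue))
            # Uplink: attach to the cell with the lowest pathloss, typically
            # the geometrically closest (often a low-power small) cell.
            ul_cell = min(base_stations, key=lambda bs: bs.pathloss_db(ue))
            return dl_cell, ul_cell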

    Doctor of Philosophy in Computer Science

    Ray tracing is becoming more widely adopted in offline rendering systems due to its natural support for high-quality lighting. Since quality is also a concern in most real-time systems, we believe ray tracing would be a welcome change in the real-time world, but it is avoided due to insufficient performance. Since power consumption is one of the primary factors limiting the increase of processor performance, it must be addressed as a foremost concern in any future ray tracing system design. This will require cooperating advances in both algorithms and architecture. In this dissertation I study ray tracing system designs from a data movement perspective, targeting the various memory resources that are the primary consumers of power on a modern processor. The result is a set of high-performance, low-energy ray tracing architectures.

    Pulse interspersing in static multipath chip environments for Impulse Radio communications

    Communication is becoming the bottleneck in the performance of Chip Multiprocessors (CMPs). To address this issue, the use of wireless communications within a chip has been proposed, since it offers low latency among nodes and high reconfigurability. The chip scenario has the particularity that it is static, so the multipath can be known a priori. Within this context, we propose in this paper a simple yet very efficient modulation technique, based on Impulse Radio On-Off Keying (IR-OOK), which significantly improves performance in Wireless Network-on-Chip (WNoC) as well as off-chip scenarios. The technique is based on interspersing information pulses among the reflected pulses in order to reduce the time between pulses, thus increasing the data rate. We show that the final data rate can be considerably increased without increasing the hardware complexity of the transceiver.
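
    To illustrate the interspersing idea in the simplest possible terms, the Python sketch below greedily selects pulse slots so that no new pulse coincides with a known static echo of an earlier pulse. It is a toy model of the scheduling constraint, not the paper's modulation design, and all names and numbers are illustrative:

        def intersperse_slots(echo_delays, frame_len):
            # Choose transmit slots such that no chosen slot coincides with a
            # delayed replica (echo) of an earlier chosen slot, i.e. no pairwise
            # slot difference equals one of the known static multipath delays.
            delays = set(echo_delays)
            chosen = []
            for slot in range(frame_len):
                if all((slot - earlier) not in delays for earlier in chosen):
                    chosen.append(slot)
            return chosen

        # Example: with echoes arriving 2 and 5 slots after each pulse and a
        # 12-slot frame, intersperse_slots([2, 5], 12) picks slots
        # [0, 1, 4, 7, 8, 11] instead of one pulse per 6-slot guard interval.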

    Exploring performance and power properties of modern multicore chips via simple machine models

    Modern multicore chips show complex behavior with respect to performance and power. Starting with the Intel Sandy Bridge processor, it has become possible to directly measure the power dissipation of a CPU chip and correlate this data with the performance properties of the running code. Going beyond a simple bottleneck analysis, we employ the recently published Execution-Cache-Memory (ECM) model to describe the single- and multi-core performance of streaming kernels. The model refines the well-known roofline model, since it can predict the scaling and saturation behavior of bandwidth-limited loop kernels on a multicore chip. The saturation point is especially relevant for considerations of energy consumption. From power dissipation measurements of benchmark programs with vastly different demands on the hardware, we derive a simple, phenomenological power model for the Sandy Bridge processor. Together with the ECM model, we are able to explain many peculiarities in the performance and power behavior of multicore processors, and derive guidelines for energy-efficient execution of parallel programs. Finally, we show that the ECM and power models can be successfully used to describe the scaling and power behavior of a lattice-Boltzmann flow solver code.
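
    As a point of reference for the modeling approach, a minimal Python sketch of the roofline bound that the ECM model refines, together with the bandwidth-saturation core count that makes saturation relevant for energy; the parameter names and the example numbers are illustrative placeholders, not the paper's measurements:

        import math

        def roofline_gflops(core_gflops, mem_bw_gbs, intensity_flop_per_byte, cores):
            # Classic roofline: performance is capped either by the cores'
            # aggregate arithmetic capability or by memory bandwidth times
            # the kernel's arithmetic intensity.
            return min(core_gflops * cores, mem_bw_gbs * intensity_flop_per_byte)

        def saturation_cores(core_gflops, mem_bw_gbs, intensity_flop_per_byte):
            # Smallest core count at which a bandwidth-limited kernel saturates
            # the memory interface; running more cores (or higher clocks) past
            # this point costs power without improving performance.
            return math.ceil(mem_bw_gbs * intensity_flop_per_byte / core_gflops)

        # Illustrative numbers: a streaming kernel at 0.125 flop/byte on a chip
        # with 40 GB/s memory bandwidth (a 5 Gflop/s bandwidth ceiling) and
        # 2 Gflop/s per-core performance saturates at ceil(5 / 2) = 3 cores.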