9 research outputs found

    Chaotic Routing: An Overview

    No full text
    Of the many router designs, most can be classified as either oblivious or adaptive, depending on whether the path selection is statically determined based on the network topology, o

    Heterogeneous wireless network management

    No full text
    Keywords: low-power, wireless, network, management. Today's wireless networks are highly heterogeneous, with mobile devices equipped with multiple wireless network interfaces (WNICs). Since battery lifetime is limited, power management of the interfaces has become essential. We develop an integrated approach for the management of power and performance of mobile devices in heterogeneous wireless environments. Our policy decides which WNIC to employ for a given application and optimizes its usage based on the current power and performance needs of the system. The policy dynamically switches between WNICs during program execution if data communication requirements and/or network conditions change. We have experimentally characterized Bluetooth and 802.11b wireless interfaces. Our policy has been implemented on HP's IPAQ portable device communicating with HP's HotSpot server [14]. The applications we tested range from MPEG video to email. The results show that our policy offers a large improvement in power savings compared to using 802.11b or Bluetooth alone, while also enhancing performance.

    Convolution Engine: Balancing Efficiency and Flexibility in Specialized Computing

    No full text
    General-purpose processors, while tremendously versatile, pay a huge cost for their flexibility by wasting over 99% of the energy in programmability overheads. We observe that reducing this waste requires tuning data storage and compute structures and their connectivity to the data-flow and data-locality patterns in the algorithms. Hence, by backing off from full programmability and instead targeting key data-flow patterns used in a domain, we can create efficient engines that can be programmed and reused across a wide range of applications within that domain. We present the Convolution Engine (CE), a programmable processor specialized for the convolution-like data-flow prevalent in computational photography, computer vision, and video processing. The CE achieves energy efficiency by capturing data-reuse patterns, eliminating data transfer overheads, and enabling a large number of operations per memory access. We demonstrate that the CE is within a factor of 2–3× of the energy and area efficiency of custom units optimized for a single kernel. The CE improves energy and area efficiency by 8–15× over data-parallel Single Instruction Multiple Data (SIMD) engines for most image processing applications.

    B.7.2 [Hardware]: Integrated Circuits – Design Aids

    No full text
    The drive for low-power, high-performance computation, coupled with the extremely high design costs of ASICs, has driven a number of designers to try to create a flexible, universal computing platform that will supersede the microprocessor. We argue that these flexible, general computing chips are trying to accomplish more than is commercially needed. Since design NRE costs are an order of magnitude larger than fabrication NRE costs, a two-step design system seems attractive. First, the users configure/program a flexible computing framework to run their application with the desired performance. Then, the system “compiles” the program and configuration, tailoring the original framework to create a chip that is optimized toward the desired set of applications. Thus the user gets the reduced development costs of a flexible solution with the efficiency of a custom chip.

    Understanding sources of inefficiency in general-purpose chips

    No full text
    Due to their high volume, general-purpose processors, and now chip multiprocessors (CMPs), are much more cost effective than ASICs, but lag significantly in terms of performance and energy efficiency. This paper explores the sources of these performance and energy overheads in general-purpose processing systems by quantifying the overheads of a 720p HD H.264 encoder running on a general-purpose CMP system. It then explores methods to eliminate these overheads by transforming the CPU into a specialized system for H.264 encoding. We evaluate the gains from customizations useful to broad classes of algorithms, such as SIMD units, as well as those specific to a particular computation, such as customized storage and functional units. The ASIC is 500x more energy efficient than our original four-processor CMP. Broadly applicable optimizations improve performance by 10x and energy by 7x. However, the very low energy costs of actual core ops (100s fJ in 90nm) mean that over 90% of the energy used in these solutions is still “overhead”. Achieving ASIC-like performance and efficiency requires algorithm-specific optimizations. For each sub-algorithm of H.264, we create a large, specialized functional unit that is capable of executing 100s of operations per instruction. This improves performance and energy by an additional 25x, and the final customized CMP matches an ASIC solution’s performance within 3x of its energy and within comparable area.