885 research outputs found

    Low power techniques and architectures for multicarrier wireless receivers

    Get PDF

    Power Modeling and Optimization for GPGPUs

    Get PDF
    Modern graphics processing units (GPUs) supports tens of thousands of parallel threads and delivers remarkably high computing throughput. General-Purpose computing on GPUs (GPGPUs) is becoming the attractive platform for general-purpose applications that request high computational performance such as scientific computing, financial applications, medical data processing, and so on. However, GPGPUs is facing severe power challenge due to the increasing number of cores placed on a single chip with decreasing feature size. In order to explore the power optimization techniques in GPGPUs, I first build a power model for GPGPUs, which is able to estimate both dynamic and leakage power of major microarchitecture structures in GPGPUs. I then target on the power-hungry structures (e.g. register file) to explore the energy-efficient GPGPUs. In order to hide the long latency operations, GPGPUs employs the fine-grained multi-threading among numerous active threads, leading to the sizeable register files with massive power consumption. The conventional method to reduce dynamic power consumption is the supply voltage scaling. And the inter-bank tunneling FETs (TFETs) is the promising candidate compared to CMOS for low voltage operations regarding to both leakage and performance. However, always executing at the low voltage will result in significant performance degradation. In this study, I propose the hybrid CMOS-TFET based register file and allocate TFET-based registers to threads whose execution progress can be delayed to some degree to avoid the memory contentions with other threads to reduce both dynamic and leakage power, and the CMOS-based registers are still used for threads requiring normal execution speed. My experimental results show that the proposed technique achieves 30% energy (including both dynamic and leakage) reduction in register files with negligible performance degradation compared to the baseline case equipped with naive power optimization technique

    Domain specific high performance reconfigurable architecture for a communication platform

    Get PDF

    Pulse stream VLSI circuits and techniques for the implementation of neural networks

    Get PDF

    Computer arithmetic based on the Continuous Valued Number System

    Get PDF

    Digitally Interfaced Analog Correlation Filter System for Object Tracking Applications

    Get PDF
    Advanced correlation filters have been employed in a wide variety of image processing and pattern recognition applications such as automatic target recognition and biometric recognition. Among those, object recognition and tracking have received more attention recently due to their wide range of applications such as autonomous cars, automated surveillance, human-computer interaction, and vehicle navigation.Although digital signal processing has long been used to realize such computational systems, they consume extensive silicon area and power. In fact, computational tasks that require low to moderate signal-to-noise ratios are more efficiently realized in analog than digital. However, analog signal processing has its own caveats. Mainly, noise and offset accumulation which degrades the accuracy, and lack of a scalable and standard input/output interface capable of managing a large number of analog data.Two digitally-interfaced analog correlation filter systems are proposed. While digital interfacing provided a standard and scalable way of communication with pre- and post-processing blocks without undermining the energy efficiency of the system, the multiply-accumulate operations were performed in analog. Moreover, non-volatile floating-gate memories are utilized as storage for coefficients. The proposed systems incorporate techniques to reduce the effects of analog circuit imperfections.The first system implements a 24x57 Gilbert-multiplier-based correlation filter. The I/O interface is implemented with low-power D/A and A/D converters and a correlated double sampling technique is implemented to reduce offset and lowfrequency noise at the output of analog array. The prototype chip occupies an area of 3.23mm2 and demonstrates a 25.2pJ/MAC energy-efficiency at 11.3 kVec/s and 3.2% RMSE.The second system realizes a 24x41 PWM-based correlation filter. Benefiting from a time-domain approach to multiplication, this system eliminates the need for explicit D/A and A/D converters. Careful utilization of clock and available hardware resources in the digital I/O interface, along with application of power management techniques has significantly reduced the circuit complexity and energy consumption of the system. Additionally, programmable transconductance amplifiers are incorporated at the output of the analog array for offset and gain error calibration. The prototype system occupies an area of 0.98mm2 and is expected to achieve an outstanding energy-efficiency of 3.6pJ/MAC at 319kVec/s with 0.28% RMSE

    Real-time neural signal processing and low-power hardware co-design for wireless implantable brain machine interfaces

    Get PDF
    Intracortical Brain-Machine Interfaces (iBMIs) have advanced significantly over the past two decades, demonstrating their utility in various aspects, including neuroprosthetic control and communication. To increase the information transfer rate and improve the devices’ robustness and longevity, iBMI technology aims to increase channel counts to access more neural data while reducing invasiveness through miniaturisation and avoiding percutaneous connectors (wired implants). However, as the number of channels increases, the raw data bandwidth required for wireless transmission also increases becoming prohibitive, requiring efficient on-implant processing to reduce the amount of data through data compression or feature extraction. The fundamental aim of this research is to develop methods for high-performance neural spike processing co-designed within low-power hardware that is scaleable for real-time wireless BMI applications. The specific original contributions include the following: Firstly, a new method has been developed for hardware-efficient spike detection, which achieves state-of-the-art spike detection performance and significantly reduces the hardware complexity. Secondly, a novel thresholding mechanism for spike detection has been introduced. By incorporating firing rate information as a key determinant in establishing the spike detection threshold, we have improved the adaptiveness of spike detection. This eventually allows the spike detection to overcome the signal degradation that arises due to scar tissue growth around the recording site, thereby ensuring enduringly stable spike detection results. The long-term decoding performance, as a consequence, has also been improved notably. Thirdly, the relationship between spike detection performance and neural decoding accuracy has been investigated to be nonlinear, offering new opportunities for further reducing transmission bandwidth by at least 30% with minor decoding performance degradation. In summary, this thesis presents a journey toward designing ultra-hardware-efficient spike detection algorithms and applying them to reduce the data bandwidth and improve neural decoding performance. The software-hardware co-design approach is essential for the next generation of wireless brain-machine interfaces with increased channel counts and a highly constrained hardware budget. The fundamental aim of this research is to develop methods for high-performance neural spike processing co-designed within low-power hardware that is scaleable for real-time wireless BMI applications. The specific original contributions include the following: Firstly, a new method has been developed for hardware-efficient spike detection, which achieves state-of-the-art spike detection performance and significantly reduces the hardware complexity. Secondly, a novel thresholding mechanism for spike detection has been introduced. By incorporating firing rate information as a key determinant in establishing the spike detection threshold, we have improved the adaptiveness of spike detection. This eventually allows the spike detection to overcome the signal degradation that arises due to scar tissue growth around the recording site, thereby ensuring enduringly stable spike detection results. The long-term decoding performance, as a consequence, has also been improved notably. Thirdly, the relationship between spike detection performance and neural decoding accuracy has been investigated to be nonlinear, offering new opportunities for further reducing transmission bandwidth by at least 30\% with only minor decoding performance degradation. In summary, this thesis presents a journey toward designing ultra-hardware-efficient spike detection algorithms and applying them to reduce the data bandwidth and improve neural decoding performance. The software-hardware co-design approach is essential for the next generation of wireless brain-machine interfaces with increased channel counts and a highly constrained hardware budget.Open Acces

    Novel Optimization to Reduce Power Drainage in Mobile Devices for Multicarrier-based Communication

    Get PDF
    With increasing adoption of multicarrier-based communications e.g. 3G and 4G, the users are significantly benefited with impressive data rate but at the cost of battery life of their mobile devices. We reviewed the existing techniques to find an open research gap in this regard. This paper presents a novel framework where an optimization is carried out with the objective function to maintain higher level of equilibrium between maximized data delivery and minimized transmit power. An analytical model considering multiple radio antennae in the mobile device is presented with constraint formulations of data quality and threshold power factor. The model outcome is evaluated with respect to amount of power being conserved as performance factor. The study was found to offer maximum energy conservation and the framework also suits well with existing communication system of mobile networks
    corecore