Parallel H.264/AVC Fast Rate-Distortion Optimized Motion Estimation using Graphics Processing Unit and Dedicated Hardware
Heterogeneous systems on a single chip composed of a CPU, a Graphics Processing Unit (GPU), and a Field Programmable Gate Array (FPGA) are expected to emerge in the near future. In this context, the System on Chip (SoC) can be dynamically adapted to employ different architectures for the execution of data-intensive applications. Motion estimation is one such task that can be accelerated using an FPGA and a GPU for a high-performance H.264/AVC encoder implementation. Most work on parallel implementation of motion estimation ignores the bit rate cost of the motion vectors. In contrast, this paper presents a fast rate-distortion optimized parallel motion estimation algorithm implemented on a GPU using OpenCL and on an FPGA/ASIC using VHDL. The predicted motion vectors are estimated from temporally preceding motion vectors and are used to evaluate the bit rate cost of the motion vectors simultaneously. The experimental results show that the proposed scheme achieves significant speedup on the GPU and FPGA, and has rate-distortion performance comparable to that of a sequential fast motion estimation algorithm.
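The rate-distortion idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a plain full search, a sum-of-absolute-differences (SAD) distortion, and a rate term taken as the signed Exp-Golomb bit length of the difference between a candidate motion vector and its predictor. The names `se_golomb_bits`, `sad`, `rd_motion_search`, and the toy frames are all hypothetical.

```python
def se_golomb_bits(v):
    """Bit length of the signed Exp-Golomb code for integer v."""
    # Signed-to-unsigned mapping: v > 0 -> 2v - 1, v <= 0 -> -2v.
    code_num = 2 * abs(v) - (1 if v > 0 else 0)
    return 2 * (code_num + 1).bit_length() - 1


def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block and its displaced reference."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def rd_motion_search(cur, ref, bx, by, n, search_range, pmv, lam):
    """Full search minimising J = SAD + lam * bits(mv - pmv).

    pmv is the predicted motion vector (assumed given, e.g. taken from a
    temporally preceding block); lam is the Lagrange multiplier.
    """
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block leaves the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            rate = se_golomb_bits(dx - pmv[0]) + se_golomb_bits(dy - pmv[1])
            cost = sad(cur, ref, bx, by, dx, dy, n) + lam * rate
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost


# Toy 8 x 8 frames: the current frame is the reference shifted by (2, 1).
ref = [[x * x + 17 * y for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
mv, cost = rd_motion_search(cur, ref, bx=2, by=2, n=4, search_range=2,
                            pmv=(0, 0), lam=0.0)
```

With `lam=0.0` the search reduces to pure distortion minimisation. Raising `lam` pulls the result toward the predictor `pmv`, which is the effect of accounting for the bit rate cost. Each candidate's cost is independent of the others, which is what makes this kind of search amenable to parallel evaluation.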
Data reduction algorithms to enable long-term monitoring from low-power miniaturised wireless EEG systems
Objectives: The weight and volume of battery-powered wireless electroencephalography
(EEG) systems are dominated by the batteries. Battery dimensions are in
turn determined by the required energy capacity, which is derived from the system
power consumption and required monitoring time. Data reduction may be carried
out to reduce the amount of data transmitted and thus proportionally reduce
the power consumption of the wireless transmitter, which dominates system power
consumption. This thesis presents two new data selection algorithms that, in addition
to achieving data reduction, also select EEG containing epileptic seizures and
spikes that are important in diagnosis.
Methods: The algorithms analyse short EEG sections, during monitoring, to
determine the presence of candidate seizures or spikes. Phase information from
different frequency components of the signal is used to detect spikes. For seizure
detection, frequencies below 10 Hz are investigated for a relative increase in frequency
and/or amplitude.
Significant attention has also been given to metrics in order to accurately evaluate
the performance of these algorithms for practical use in the proposed system.
Additionally, signal processing techniques to emphasize seizures within the EEG
and techniques to correct for broad-level amplitude variation in the EEG have been
investigated.
Results: The spike detection algorithm detected 80% of spikes whilst achieving
50% data reduction, when tested on 992 spikes from 105 hours of 10-channel scalp
EEG data obtained from 25 adults. The seizure detection algorithm identified 94%
of seizures selecting 80% of their duration for transmission and achieving 79% data
reduction. It was tested on 34 seizures with a total duration of 4158 s in a database
of over 168 hours of 16-channel scalp EEG obtained from 21 adults. These algorithms
show great potential for longer monitoring times from miniaturised wireless
EEG systems that would improve electroclinical diagnosis of patients.
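The link the abstract draws between data reduction, transmitter power, and monitoring time can be made concrete with a rough sketch. It assumes transmitter power scales linearly with the fraction of data sent and that the rest of the system draws a fixed power. The function name `monitoring_hours` and all numeric values below are hypothetical and are not taken from the thesis.

```python
def monitoring_hours(battery_mwh, tx_mw, other_mw, data_reduction):
    """Estimated monitoring time for a given fractional data reduction.

    battery_mwh    -- battery energy capacity in mWh (hypothetical value)
    tx_mw          -- transmitter power with no data reduction, in mW
    other_mw       -- power drawn by the rest of the system, in mW
    data_reduction -- fraction of data not transmitted, from 0.0 to 1.0
    """
    if not 0.0 <= data_reduction <= 1.0:
        raise ValueError("data_reduction must lie between 0 and 1")
    # Assumption: transmitter power is proportional to the data actually sent.
    total_mw = tx_mw * (1.0 - data_reduction) + other_mw
    if total_mw <= 0.0:
        raise ValueError("total power must be positive")
    return battery_mwh / total_mw


# Hypothetical system: 100 mWh battery, 4 mW transmitter, 1 mW for the rest.
baseline = monitoring_hours(100.0, 4.0, 1.0, 0.0)
with_spike_selection = monitoring_hours(100.0, 4.0, 1.0, 0.50)
with_seizure_selection = monitoring_hours(100.0, 4.0, 1.0, 0.79)
```

The reduction fractions 0.50 and 0.79 are the figures reported in the results above. Everything else is an illustrative assumption, so the outputs indicate only the shape of the trade-off, not the lifetime of a real device.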
Architecting Energy Efficient Servers.
This dissertation investigates how energy efficient servers can be architected using current and future technology. We leverage recent trends in packaging and device technology to deliver low power and high throughput. Specifically, at the package level, this dissertation looks at 3D stacking technology, which has emerged as a promising solution for achieving energy efficiency by delivering high throughput at a low cost, and shows how this new technology could be used in a datacenter. 3D stacking technology can be used to implement a simple, low-power, high-performance chip multiprocessor suitable for throughput processing. Our proposed architecture, PicoServer, employs 3D technology to bond one die containing several simple, slow processing cores to multiple memory dies sufficient for a primary memory. The multiple memory dies are composed of DRAM. 3D stacking technology also enables wide low-latency buses between processors and memory. These remove the need for an L2 cache, allowing its area to be re-allocated to additional simple cores. The additional cores allow the clock frequency to be lowered without impairing throughput. Lower clock frequency, along with the integration of memory, in turn reduces power and means that thermal constraints, a concern with 3D stacking, are easily satisfied. The PicoServer architecture targets server applications, which exhibit a high degree of thread-level parallelism. An architecture targeted to efficient throughput is ideal for this application domain.
At the memory device level, this dissertation investigates how the system memory could be re-architected to reduce the rising power consumption of system memory and disk drives. Flash memory has emerged as a strong candidate to reduce system memory power while remaining more cost effective than conventional system memory. This dissertation discusses how Flash could be integrated at the system level and provides insights on the architectural support for Flash in servers. Our architecture uses a two-level disk cache composed of a relatively small DRAM, which includes a primary disk cache, and a Flash-based secondary disk cache. Further, based on our observations, we found that the Flash-based disk caches should be split into a read-optimized disk cache and a write-optimized disk cache.
Ph.D.
Computer Science & Engineering
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/57602/2/tkgil_1.pd
Superconducting Quantum Circuits, Qubits and Computing
This paper gives an introduction to the physics and principles of operation
of quantized superconducting electrical circuits for quantum information
processing.
Comment: 59 pages, 68 figures. Prepared for Handbook of Theoretical and
Computational Nanotechnolog
Computing 3-D Motion in Custom Analog and Digital VLSI
This thesis examines a complete design framework for a real-time, autonomous system with specialized VLSI hardware for computing 3-D camera motion. In the proposed architecture, the first step is to determine point correspondences between two images. Two processors, a CCD array edge detector and a mixed analog/digital binary block correlator, are proposed for this task. The report is divided into three parts. Part I covers the algorithmic analysis; part II describes the design and test of a 32 × 32 CCD edge detector fabricated through MOSIS; and part III compares the design of the mixed analog/digital correlator to a fully digital implementation.
NASA Tech Briefs, July 1997
Topics: Mechanical Components; Electronic Components and Circuits; Electronic Systems; Physical Sciences; Materials; Computer Software; Mechanics; Machinery/Automation; Manufacturing/Fabrication; Life Sciences
Modeling and automated synthesis of reconfigurable interfaces
Stefan Ihmor
Paderborn, Univ., Diss., 200