30,309 research outputs found
Ultra-Low-Power Superconductor Logic
We have developed a new superconducting digital technology, Reciprocal
Quantum Logic, that uses AC power carried on a transmission line, which also
serves as a clock. Using simple experiments we have demonstrated zero static
power dissipation, thermally limited dynamic power dissipation, high clock
stability, high operating margins and low BER. These features indicate that the
technology is scalable to far more complex circuits at a significant level of
integration. On the system level, Reciprocal Quantum Logic combines the high
speed and low-power signal levels of Single-Flux- Quantum signals with the
design methodology of CMOS, including low static power dissipation, low latency
combinational logic, and efficient device count.Comment: 7 pages, 5 figure
A low complexity hardware architecture for motion estimation
This paper tackles the problem of accelerating motion estimation for video processing. A novel architecture using binary data is proposed, which attempts to reduce power consumption. The solution exploits redundant operations in the sum of absolute differences (SAD) calculation, by a mechanism known as early termination. Further data redundancies are exploited by using a run length coding addressing scheme, where access to pixels which do not contribute to the final SAD value is minimised. By using these two techniques operations and memory accesses are reduced by 93.29% and 69.17% respectively relative to a systolic array implementation
Formal Verification of an Iterative Low-Power x86 Floating-Point Multiplier with Redundant Feedback
We present the formal verification of a low-power x86 floating-point
multiplier. The multiplier operates iteratively and feeds back intermediate
results in redundant representation. It supports x87 and SSE instructions in
various precisions and can block the issuing of new instructions. The design
has been optimized for low-power operation and has not been constrained by the
formal verification effort. Additional improvements for the implementation were
identified through formal verification. The formal verification of the design
also incorporates the implementation of clock-gating and control logic. The
core of the verification effort was based on ACL2 theorem proving.
Additionally, model checking has been used to verify some properties of the
floating-point scheduler that are relevant for the correct operation of the
unit.Comment: In Proceedings ACL2 2011, arXiv:1110.447
Low Complexity Belief Propagation Polar Code Decoders
Since its invention, polar code has received a lot of attention because of
its capacity-achieving performance and low encoding and decoding complexity.
Successive cancellation decoding (SCD) and belief propagation decoding (BPD)
are two of the most popular approaches for decoding polar codes. SCD is able to
achieve good error-correcting performance and is less computationally expensive
as compared to BPD. However SCDs suffer from long latency and low throughput
due to the serial nature of the successive cancellation algorithm. BPD is
parallel in nature and hence is more attractive for high throughput
applications. However since it is iterative in nature, the required latency and
energy dissipation increases linearly with the number of iterations. In this
work, we borrow the idea of SCD and propose a novel scheme based on
sub-factor-graph freezing to reduce the average number of computations as well
as the average number of iterations required by BPD, which directly translates
into lower latency and energy dissipation. Simulation results show that the
proposed scheme has no performance degradation and achieves significant
reduction in computation complexity over the existing methods.Comment: 6 page
Enabling Fine-Grain Restricted Coset Coding Through Word-Level Compression for PCM
Phase change memory (PCM) has recently emerged as a promising technology to
meet the fast growing demand for large capacity memory in computer systems,
replacing DRAM that is impeded by physical limitations. Multi-level cell (MLC)
PCM offers high density with low per-byte fabrication cost. However, despite
many advantages, such as scalability and low leakage, the energy for
programming intermediate states is considerably larger than programing
single-level cell PCM. In this paper, we study encoding techniques to reduce
write energy for MLC PCM when the encoding granularity is lowered below the
typical cache line size. We observe that encoding data blocks at small
granularity to reduce write energy actually increases the write energy because
of the auxiliary encoding bits. We mitigate this adverse effect by 1) designing
suitable codeword mappings that use fewer auxiliary bits and 2) proposing a new
Word-Level Compression (WLC) which compresses more than 91% of the memory lines
and provides enough room to store the auxiliary data using a novel restricted
coset encoding applied at small data block granularities.
Experimental results show that the proposed encoding at 16-bit data
granularity reduces the write energy by 39%, on average, versus the leading
encoding approach for write energy reduction. Furthermore, it improves
endurance by 20% and is more reliable than the leading approach. Hardware
synthesis evaluation shows that the proposed encoding can be implemented
on-chip with only a nominal area overhead.Comment: 12 page
- …