A Link Quality Model for Generalised Frequency Division Multiplexing
5G systems aim to achieve extremely high data rates, low end-to-end latency
and ultra-low power consumption. Recently, there has been considerable interest
in the design of 5G physical layer waveforms. One important candidate is
Generalised Frequency Division Multiplexing (GFDM). In order to evaluate its
performance and features, system-level studies should be undertaken in a range
of scenarios. These studies, however, require highly complex computations if
they are performed using bit-level simulators. In this paper, the Mutual
Information (MI) based link quality model (PHY abstraction), which has been
regularly used to implement system-level studies for Orthogonal Frequency
Division Multiplexing (OFDM), is applied to GFDM. The performance of the GFDM
waveform predicted by this model is compared with bit-level simulation
performance over different channel types. Moreover, a system-level study of a
GFDM based LTE-A system in a realistic scenario is carried out using both a
bit-level simulator and this abstraction model, and the results are compared.
The results confirm the accuracy of this model using realistic channel data.
Based on these results,
the PHY abstraction technique can be applied to evaluate the performance of
GFDM based systems in an effective manner with low complexity. The maximum
difference in the Packet Error Rate (PER) and throughput results between the
abstraction and bit-level simulation does not exceed 4%, whilst the
abstraction reduces simulation time by a factor of around 62,000.
Comment: 5 pages, 8 figures, accepted in VTC- spring 201
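The core step of such a PHY abstraction, compressing many per-subcarrier SNRs into one effective SNR, can be sketched as follows. This is a minimal illustration, not the paper's model: it assumes the Shannon capacity curve `log2(1 + snr)` as a stand-in for the modulation-specific mutual information curve, and the function name `effective_snr_db` is hypothetical.

```python
import math

def effective_snr_db(snr_db_per_subcarrier):
    """Map per-subcarrier SNRs (dB) to one effective SNR (dB).

    Assumption: information per subcarrier is approximated by the Shannon
    capacity log2(1 + snr), not a constellation-specific MI curve.
    """
    snrs = [10 ** (s / 10) for s in snr_db_per_subcarrier]
    # Average the information measure across subcarriers.
    mean_info = sum(math.log2(1 + g) for g in snrs) / len(snrs)
    # Invert the mapping to get the SNR of an equivalent flat channel.
    return 10 * math.log10(2 ** mean_info - 1)

flat = effective_snr_db([10.0, 10.0, 10.0])
selective = effective_snr_db([0.0, 20.0])
```

In a full abstraction the effective SNR would then be looked up in a pre-computed AWGN PER curve, which replaces the bit-level simulation of each packet.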
Implementation of a CMOS Wallace-tree Multiplier
© ASEE 2009. As slow and expensive operation units, multipliers are often the bottleneck limiting the overall performance of many computational VLSI circuits. Various CMOS multiplier architectures are available, such as the array multiplier, carry-save multiplier, and Wallace-tree multiplier. The Wallace-tree multiplier has been a very popular design due to its high speed and ease of modularization and fabrication. In this paper, the design and simulation of an 8-bit Wallace-tree multiplier with PSPICE is presented. For comparison, an 8-bit CMOS array multiplier is also designed. The worst-case delays of both multiplier architectures are extracted, and the Wallace-tree multiplier demonstrates significant speed enhancement compared to the CMOS array multiplier. Some efforts are made to further improve the performance of the Wallace-tree multiplier. The revision in the circuit structure demonstrates effective speed improvement for the Wallace-tree multiplier.
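The Wallace-tree idea of reducing partial-product bits column by column with full and half adders can be sketched behaviourally. This is an illustrative model under assumptions, not the paper's PSPICE circuit: the function name `wallace_multiply` and the column-wise grouping of bits are choices made here, and operands are treated as unsigned.

```python
def wallace_multiply(a, b, n=8):
    """Unsigned n-bit multiply using column-wise 3:2 / 2:2 reduction."""
    # Partial-product bits a_i AND b_j, grouped by column weight i + j.
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))

    # Reduce in parallel stages until every column holds at most two bits.
    while any(len(c) > 2 for c in cols):
        nxt = [[] for _ in range(len(cols) + 1)]
        for w, col in enumerate(cols):
            k = 0
            while len(col) - k >= 3:
                x, y, z = col[k:k + 3]
                nxt[w].append(x ^ y ^ z)                          # full-adder sum
                nxt[w + 1].append((x & y) | (y & z) | (x & z))    # full-adder carry
                k += 3
            if len(col) - k == 2:
                x, y = col[k:k + 2]
                nxt[w].append(x ^ y)                              # half-adder sum
                nxt[w + 1].append(x & y)                          # half-adder carry
            else:
                nxt[w].extend(col[k:])                            # pass through
        cols = nxt

    # The two remaining rows go to a final carry-propagate adder.
    row0 = sum(c[0] << w for w, c in enumerate(cols) if len(c) >= 1)
    row1 = sum(c[1] << w for w, c in enumerate(cols) if len(c) == 2)
    return row0 + row1
```

Because every adder in a stage works in parallel, the depth of the reduction grows slowly with operand width, which is the source of the speed advantage over an array multiplier.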
Power Efficient SRAM Design with Integrated Bit Line Charge Pump
Bit line toggling of SRAM systems in write operations accounts for the largest portion of power dissipation. To reduce this power loss and achieve power-efficient memory, we propose a new SRAM design that integrates charge pump circuits to harvest and reuse bit line charge. In this work, a power-efficient charge recycling SRAM is designed and implemented in 180 nm CMOS technology. Post-layout simulation demonstrates an 11% power saving with a 3.8% area overhead when the SRAM bit width is 8. Alternatively, a 22% power reduction is obtained if the SRAM bit width is extended to 64. Compared with existing charge recycling SRAM schemes, the proposed SRAM is robust to process variation, demonstrates good read/write stability, and offers a better trade-off between design complexity and power reduction.
Design and Simulation of an 8-Bit Successive Approximation Register Charge-Redistribution Analog-To-Digital Converter
The thesis initially investigates the history of monolithic ADCs. The next chapter explores the different types of ADCs available in the market today. Next, the operation of a 4-bit SAR ADC is studied. Based on this analysis, an 8-bit charge-redistribution SAR ADC is designed and simulated with Multisim (National Instruments, Austin, TX). The design is divided into different blocks, which are individually implemented and tested. Level-1 SPICE MOSFET models representative of 5 μm devices were used wherever individual MOSFETs were used in the design. Finally, the power dissipation during the conversion period was also estimated. The supply voltage for the ADC is 5 V and the clock frequency is 500 kHz.
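The successive-approximation search itself is a binary search over output codes, and can be sketched behaviourally. This sketch assumes an ideal DAC and comparator, and assumes the reference equals the 5 V supply; the function name `sar_convert` is hypothetical and nothing here models the charge-redistribution capacitor array.

```python
def sar_convert(vin, vref=5.0, bits=8):
    """Ideal SAR conversion: decide one bit per step, MSB first."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)            # tentatively set this bit
        vdac = vref * trial / (1 << bits)    # ideal DAC output for the trial code
        if vin >= vdac:                      # comparator decision
            code = trial                     # keep the bit
    return code

mid_scale = sar_convert(2.5)   # exactly half of vref
one_volt = sar_convert(1.0)
```

If each bit decision takes one clock cycle, eight decisions at 500 kHz would take 16 µs, ignoring any separate sampling phase.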
Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks
Fully realizing the potential of acceleration for Deep Neural Networks (DNNs)
requires understanding and leveraging algorithmic properties. This paper builds
upon the algorithmic insight that bitwidth of operations in DNNs can be reduced
without compromising their classification accuracy. However, to prevent
accuracy loss, the bitwidth varies significantly across DNNs and it may even be
adjusted for each layer. Thus, a fixed-bitwidth accelerator would either offer
limited benefits to accommodate the worst-case bitwidth requirements, or lead
to a degradation in final accuracy. To alleviate these deficiencies, this work
introduces dynamic bit-level fusion/decomposition as a new dimension in the
design of DNN accelerators. We explore this dimension by designing Bit Fusion,
a bit-flexible accelerator, that constitutes an array of bit-level processing
elements that dynamically fuse to match the bitwidth of individual DNN layers.
This flexibility in the architecture enables minimizing the computation and the
communication at the finest granularity possible with no loss in accuracy. We
evaluate the benefits of Bit Fusion using eight real-world feed-forward and
recurrent DNNs. The proposed microarchitecture is implemented in Verilog and
synthesized in 45 nm technology. Using the synthesis results and cycle-accurate
simulation, we compare the benefits of Bit Fusion to two state-of-the-art DNN
accelerators, Eyeriss and Stripes. In the same area, frequency, and process
technology, Bit Fusion offers 3.9x speedup and 5.1x energy savings over Eyeriss.
Compared to Stripes, Bit Fusion provides 2.6x speedup and 3.9x energy reduction
at the 45 nm node when Bit Fusion area and frequency are set to those of Stripes.
Scaling to the GPU technology node of 16 nm, Bit Fusion almost matches the
performance of a 250-Watt Titan Xp, which uses 8-bit vector instructions, while
consuming merely 895 milliwatts of power.
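The arithmetic behind fusing narrow processing elements into a wider multiply can be sketched as follows. This is an illustration under assumptions, not the paper's microarchitecture: the 2-bit chunk width, the unsigned operands, and the names `split_chunks` and `fused_multiply` are all choices made for this sketch.

```python
def split_chunks(x, bits, chunk=2):
    """Split an unsigned value into chunk-bit pieces, least significant first."""
    mask = (1 << chunk) - 1
    return [(x >> s) & mask for s in range(0, bits, chunk)]

def fused_multiply(a, b, a_bits, b_bits, chunk=2):
    """Compose a wide product from narrow chunk x chunk products.

    Returns (product, number of narrow multipliers used).
    """
    total = 0
    used = 0
    for i, ca in enumerate(split_chunks(a, a_bits, chunk)):
        for j, cb in enumerate(split_chunks(b, b_bits, chunk)):
            # Each narrow product is shifted to its weight and accumulated.
            total += (ca * cb) << (chunk * (i + j))
            used += 1
    return total, used

wide = fused_multiply(200, 173, 8, 8)    # 8-bit x 8-bit
narrow = fused_multiply(13, 3, 4, 2)     # 4-bit x 2-bit
```

The count of narrow multipliers falls as operand widths shrink, which is why matching the hardware to each layer's bitwidth saves computation.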
Design of Low Power Vedic Multiplier Based on Reversible Logic
Reversible logic is a new technique for reducing power dissipation. A reversible circuit loses no information: it
produces a unique output for each specified input and vice versa. Because no bits are lost, the power
dissipation is reduced. In this paper, a new design for a high-speed, low-power and area-efficient 8-bit Vedic
multiplier using the Urdhva Tiryakbhyam (UT) Sutra (an ancient methodology of Indian mathematics) is introduced and
implemented using reversible logic to generate products with low power dissipation. The UT Sutra generates partial
products and sums in a single step with fewer adder units than conventional Booth and array
multipliers, which reduces the delay and area utilized, while reversible logic reduces the power dissipation. An
8-bit Vedic multiplier is realized using a 4-bit Vedic multiplier and modified ripple carry adders. The proposed
logic blocks are implemented in Verilog HDL and simulated using Xilinx ISE software.
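The "vertical and crosswise" pattern and the composition of an 8-bit product from 4-bit blocks can be sketched behaviourally. This sketch makes several assumptions: plain integer addition stands in for the modified ripple carry adders, reversible gates are not modelled, operands are unsigned, and the names `ut_multiply` and `vedic8` are hypothetical.

```python
def ut_multiply(a, b, n):
    """Urdhva Tiryakbhyam: column k sums the cross products a_i * b_j, i + j == k."""
    result = 0
    for k in range(2 * n - 1):
        lo, hi = max(0, k - n + 1), min(k, n - 1)
        column = sum(((a >> i) & 1) & ((b >> (k - i)) & 1)
                     for i in range(lo, hi + 1))
        result += column << k      # column sum at weight 2**k; carries ripple upward
    return result

def vedic8(a, b):
    """8-bit product built from four 4-bit UT products and adders."""
    al, ah = a & 0xF, a >> 4
    bl, bh = b & 0xF, b >> 4
    ll = ut_multiply(al, bl, 4)
    lh = ut_multiply(al, bh, 4)
    hl = ut_multiply(ah, bl, 4)
    hh = ut_multiply(ah, bh, 4)
    # Adders combine the four partial results at their weights.
    return ll + ((lh + hl) << 4) + (hh << 8)
```

All column sums can be formed in parallel, which is the property the abstract credits for the reduced delay.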
Active inductor shunt peaking in high-speed VCSEL driver design
An all-transistor active inductor shunt peaking structure has been used in a
prototype of an 8-Gbps high-speed VCSEL driver, which is designed for the
optical link in the ATLAS liquid Argon calorimeter upgrade. The VCSEL driver is
fabricated in a commercial 0.25-um Silicon-on-Sapphire (SoS) CMOS process for
radiation tolerance. The all-transistor active inductor shunt peaking is used
to overcome the bandwidth limitation of the CMOS process. The peaking structure
has the same peaking effect as the passive one, but takes a small area, does
not need linear resistors and can compensate for process variation by adjusting
the peaking strength via an external control. The design has been taped out, and
the prototype has been verified by the preliminary electrical test results and
bit error ratio test results. The driver achieves an 8-Gbps data rate, as
simulated with the peaking. We present the all-transistor active inductor shunt
peaking structure, simulation and test results in this paper.
Comment: 4 pages, 6 figures and 1 table, Submitted to 'Chinese Physics C
Design of 370-ps Delay Floating-Voltage Level Shifters With 30-V/ns Power Supply Slew Tolerance
A new design method for producing high-performance and power-rail slew-tolerant floating-voltage level shifters is presented, offering increased speed, reduced power consumption, and smaller layout area compared with previous designs. The method uses an energy-saving pulse-triggered input, a high-bandwidth current mirror, and a simple full latch composed of two inverters. A number of optimizations are explored in detail, resulting in a presented design with a dVdd slew immunity of 30 V/ns, and near-zero static power dissipation in a 180-nm technology. Experimental results show a delay of below 370 ps for a level-shift range of 8-20 V. Postlayout simulation puts the energy consumption at 2.6 pJ/bit at 4 V and 7.2 pJ/bit at 20 V, with near symmetric rise and fall delays