7,206 research outputs found

    Multi-standard programmable baseband modulator for next generation wireless communication

    Full text link
    Considerable research has taken place in recent times in the area of parameterization of software defined radio (SDR) architecture. Parameterization decreases the size of the software to be downloaded and also limits the hardware reconfiguration time. The present paper is based on the design and development of a programmable baseband modulator that perform the QPSK modulation schemes and as well as its other three commonly used variants to satisfy the requirement of several established 2G and 3G wireless communication standards. The proposed design has been shown to be capable of operating at a maximum data rate of 77 Mbps on Xilinx Virtex 2-Pro University field programmable gate array (FPGA) board. The pulse shaping root raised cosine (RRC) filter has been implemented using distributed arithmetic (DA) technique in the present work in order to reduce the computational complexity, and to achieve appropriate power reduction and enhanced throughput. The designed multiplier-less programmable 32-tap FIR-based RRC filter has been found to withstand a peak inter-symbol interference (ISI) distortion of -41 dB

    PULP-HD: Accelerating Brain-Inspired High-Dimensional Computing on a Parallel Ultra-Low Power Platform

    Full text link
    Computing with high-dimensional (HD) vectors, also referred to as hypervectors\textit{hypervectors}, is a brain-inspired alternative to computing with scalars. Key properties of HD computing include a well-defined set of arithmetic operations on hypervectors, generality, scalability, robustness, fast learning, and ubiquitous parallel operations. HD computing is about manipulating and comparing large patterns-binary hypervectors with 10,000 dimensions-making its efficient realization on minimalistic ultra-low-power platforms challenging. This paper describes HD computing's acceleration and its optimization of memory accesses and operations on a silicon prototype of the PULPv3 4-core platform (1.5mm2^2, 2mW), surpassing the state-of-the-art classification accuracy (on average 92.4%) with simultaneous 3.7×\times end-to-end speed-up and 2×\times energy saving compared to its single-core execution. We further explore the scalability of our accelerator by increasing the number of inputs and classification window on a new generation of the PULP architecture featuring bit-manipulation instruction extensions and larger number of 8 cores. These together enable a near ideal speed-up of 18.4×\times compared to the single-core PULPv3

    System-on-chip Computing and Interconnection Architectures for Telecommunications and Signal Processing

    Get PDF
    This dissertation proposes novel architectures and design techniques targeting SoC building blocks for telecommunications and signal processing applications. Hardware implementation of Low-Density Parity-Check decoders is approached at both the algorithmic and the architecture level. Low-Density Parity-Check codes are a promising coding scheme for future communication standards due to their outstanding error correction performance. This work proposes a methodology for analyzing effects of finite precision arithmetic on error correction performance and hardware complexity. The methodology is throughout employed for co-designing the decoder. First, a low-complexity check node based on the P-output decoding principle is designed and characterized on a CMOS standard-cells library. Results demonstrate implementation loss below 0.2 dB down to BER of 10^{-8} and a saving in complexity up to 59% with respect to other works in recent literature. High-throughput and low-latency issues are addressed with modified single-phase decoding schedules. A new "memory-aware" schedule is proposed requiring down to 20% of memory with respect to the traditional two-phase flooding decoding. Additionally, throughput is doubled and logic complexity reduced of 12%. These advantages are traded-off with error correction performance, thus making the solution attractive only for long codes, as those adopted in the DVB-S2 standard. The "layered decoding" principle is extended to those codes not specifically conceived for this technique. Proposed architectures exhibit complexity savings in the order of 40% for both area and power consumption figures, while implementation loss is smaller than 0.05 dB. Most modern communication standards employ Orthogonal Frequency Division Multiplexing as part of their physical layer. The core of OFDM is the Fast Fourier Transform and its inverse in charge of symbols (de)modulation. Requirements on throughput and energy efficiency call for FFT hardware implementation, while ubiquity of FFT suggests the design of parametric, re-configurable and re-usable IP hardware macrocells. In this context, this thesis describes an FFT/IFFT core compiler particularly suited for implementation of OFDM communication systems. The tool employs an accuracy-driven configuration engine which automatically profiles the internal arithmetic and generates a core with minimum operands bit-width and thus minimum circuit complexity. The engine performs a closed-loop optimization over three different internal arithmetic models (fixed-point, block floating-point and convergent block floating-point) using the numerical accuracy budget given by the user as a reference point. The flexibility and re-usability of the proposed macrocell are illustrated through several case studies which encompass all current state-of-the-art OFDM communications standards (WLAN, WMAN, xDSL, DVB-T/H, DAB and UWB). Implementations results are presented for two deep sub-micron standard-cells libraries (65 and 90 nm) and commercially available FPGA devices. Compared with other FFT core compilers, the proposed environment produces macrocells with lower circuit complexity and same system level performance (throughput, transform size and numerical accuracy). The final part of this dissertation focuses on the Network-on-Chip design paradigm whose goal is building scalable communication infrastructures connecting hundreds of core. A low-complexity link architecture for mesochronous on-chip communication is discussed. The link enables skew constraint looseness in the clock tree synthesis, frequency speed-up, power consumption reduction and faster back-end turnarounds. The proposed architecture reaches a maximum clock frequency of 1 GHz on 65 nm low-leakage CMOS standard-cells library. In a complex test case with a full-blown NoC infrastructure, the link overhead is only 3% of chip area and 0.5% of leakage power consumption. Finally, a new methodology, named metacoding, is proposed. Metacoding generates correct-by-construction technology independent RTL codebases for NoC building blocks. The RTL coding phase is abstracted and modeled with an Object Oriented framework, integrated within a commercial tool for IP packaging (Synopsys CoreTools suite). Compared with traditional coding styles based on pre-processor directives, metacoding produces 65% smaller codebases and reduces the configurations to verify up to three orders of magnitude

    Arithmetic coding revisited

    Get PDF
    Over the last decade, arithmetic coding has emerged as an important compression tool. It is now the method of choice for adaptive coding on multisymbol alphabets because of its speed, low storage requirements, and effectiveness of compression. This article describes a new implementation of arithmetic coding that incorporates several improvements over a widely used earlier version by Witten, Neal, and Cleary, which has become a de facto standard. These improvements include fewer multiplicative operations, greatly extended range of alphabet sizes and symbol probabilities, and the use of low-precision arithmetic, permitting implementation by fast shift/add operations. We also describe a modular structure that separates the coding, modeling, and probability estimation components of a compression system. To motivate the improved coder, we consider the needs of a word-based text compression program. We report a range of experimental results using this and other models. Complete source code is available

    Designing Flexible, Energy Efficient and Secure Wireless Solutions for the Internet of Things

    Full text link
    The Internet of Things (IoT) is an emerging concept where ubiquitous physical objects (things) consisting of sensor, transceiver, processing hardware and software are interconnected via the Internet. The information collected by individual IoT nodes is shared among other often heterogeneous devices and over the Internet. This dissertation presents flexible, energy efficient and secure wireless solutions in the IoT application domain. System design and architecture designs are discussed envisioning a near-future world where wireless communication among heterogeneous IoT devices are seamlessly enabled. Firstly, an energy-autonomous wireless communication system for ultra-small, ultra-low power IoT platforms is presented. To achieve orders of magnitude energy efficiency improvement, a comprehensive system-level framework that jointly optimizes various system parameters is developed. A new synchronization protocol and modulation schemes are specified for energy-scarce ultra-small IoT nodes. The dynamic link adaptation is proposed to guarantee the ultra-small node to always operate in the most energy efficiency mode, given an operating scenario. The outcome is a truly energy-optimized wireless communication system to enable various new applications such as implanted smart-dust devices. Secondly, a configurable Software Defined Radio (SDR) baseband processor is designed and shown to be an efficient platform on which to execute several IoT wireless standards. It is a custom SIMD execution model coupled with a scalar unit and several architectural optimizations: streaming registers, variable bitwidth, dedicated ALUs, and an optimized reduction network. Voltage scaling and clock gating are employed to further reduce the power, with a more than a 100% time margin reserved for reliable operation in the near-threshold region. Two upper bound systems are evaluated. A comprehensive power/area estimation indicates that the overhead of realizing SDR flexibility is insignificant. The benefit of baseband SDR is quantified and evaluated. To further augment the benefits of a flexible baseband solution and to address the security issue of IoT connectivity, a light-weight Galois Field (GF) processor is proposed. This processor enables both energy-efficient block coding and symmetric/asymmetric cryptography kernel processing for a wide range of GF sizes (2^m, m = 2, 3, ..., 233) and arbitrary irreducible polynomials. Program directed connections among primitive GF arithmetic units enable dynamically configured parallelism to efficiently perform either four-way SIMD GF operations, including multiplicative inverse, or a long bit-width GF product in a single cycle. This demonstrates the feasibility of a unified architecture to enable error correction coding flexibility and secure wireless communication in the low power IoT domain.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/137164/1/yajchen_1.pd
    corecore