234 research outputs found

    A Scalable VLSI Architecture for Soft-Input Soft-Output Depth-First Sphere Decoding

    Full text link
    Multiple-input multiple-output (MIMO) wireless transmission imposes huge challenges on the design of efficient hardware architectures for iterative receivers. A major challenge is soft-input soft-output (SISO) MIMO demapping, often approached by sphere decoding (SD). In this paper, we introduce the - to our best knowledge - first VLSI architecture for SISO SD applying a single tree-search approach. Compared with a soft-output-only base architecture similar to the one proposed by Studer et al. in IEEE J-SAC 2008, the architectural modifications for soft input still allow a one-node-per-cycle execution. For a 4x4 16-QAM system, the area increases by 57% and the operating frequency degrades by 34% only.Comment: Accepted for IEEE Transactions on Circuits and Systems II Express Briefs, May 2010. This draft from April 2010 will not be updated any more. Please refer to IEEE Xplore for the final version. *) The final publication will appear with the modified title "A Scalable VLSI Architecture for Soft-Input Soft-Output Single Tree-Search Sphere Decoding

    High-Throughput Soft-Output MIMO Detector Based on Path-Preserving Trellis-Search Algorithm

    Get PDF
    In this paper, we propose a novel path-preserving trellis-search (PPTS) algorithm and its high-speed VLSI architecture for soft-output multiple-input-multiple-output (MIMO) detection. We represent the search space of the MIMO signal with an unconstrained trellis, where each node in stage of the trellis maps to a possible complex-valued symbol transmitted by antenna. Based on the trellis model, we convert the soft-output MIMO detection problem into a multiple shortest paths problem subject to the constraint that every trellis node must be covered in this set of paths. The PPTS detector is guaranteed to have soft information for every possible symbol transmitted on every antenna so that the log-likelihood ratio (LLR) for each transmitted data bit can be more accurately formed. Simulation results show that the PPTS algorithm can achieve near-optimal error performance with a low search complexity. The PPTS algorithm is a hardware-friendly data-parallel algorithm because the search operations are evenly distributed among multiple trellis nodes for parallel processing. As a case study, we have designed and synthesized a fully-parallel systolic-array detector and two folded detectors for a 4x4 16-QAM system using a 1.08 V TSMC 65-nm CMOS technology.With a 1.18 mm2 core area, the folded detector can achieve a throughput of 2.1 Gbps.With a 3.19 mm2 core area, the fully-parallel systolic-array detector can achieve a throughput of 6.4 Gbps

    FlexCore: Massively Parallel and Flexible Processing for Large MIMO Access Points

    Get PDF
    Large MIMO base stations remain among wireless network designers’ best tools for increasing wireless throughput while serving many clients, but current system designs, sacrifice throughput with simple linear MIMO detection algorithms. Higher-performance detection techniques are known, but remain off the table because these systems parallelize their computation at the level of a whole OFDM subcarrier, sufficing only for the less demanding linear detection approaches they opt for. This paper presents FlexCore, the first computational architecture capable of parallelizing the detection of large numbers of mutually-interfering information streams at a granularity below individual OFDM subcarriers, in a nearly-embarrassingly parallel manner while utilizing any number of available processing elements. For 12 clients sending 64-QAM symbols to a 12-antenna base station, our WARP testbed evaluation shows similar network throughput to the state-of-the-art while using an order of magnitude fewer processing elements. For the same scenario, our combined WARP-GPU testbed evaluation demonstrates a 19x computational speedup, with 97% increased energy efficiency when compared with the state of the art. Finally, for the same scenario, an FPGA-based comparison between FlexCore and the state of the art shows that FlexCore can achieve up to 96% better energy efficiency, and can offer up to 32x the processing throughput

    Improving MIMO Sphere Detection Through Antenna Detection Order Scheduling

    Get PDF
    This paper proposes a novel scalable Multiple-Input Multiple-Output (MIMO) detector that does not require preprocessing to achieve good bit error rate (BER) performance. MIMO processing is a key technology in broadband wireless technologies such as 3G LTE, WiMAX, and 802.11n. Existing detectors such as Flexsphere use preprocessing before MIMO detection to improve performance. Instead of costly preprocessing, the proposed detector schedules multiple search passes, where each search pass detects the transmit stream with a different permuted detection order. By changing the number of parallel search passes, we show that this scalable detector can achieve comparable performance to Flexsphere with reduced resource requirement, or can eliminate LLR clipping and achieve BER performance within 0.25 dB of exhaustive search with increased resource requirement

    An Iterative Soft Decision Based LR-Aided MIMO Detector

    Get PDF
    The demand for wireless and high-rate communication system is increasing gradually and multiple-input-multiple-output (MIMO) is one of the feasible solutions to accommodate the growing demand for its spatial multiplexing and diversity gain. However, with high number of antennas, the computational and hardware complexity of MIMO increases exponentially. This accumulating complexity is a paramount problem in MIMO detection system directly leading to large power consumption. Hence, the major focus of this dissertation is algorithmic and hardware development of MIMO decoder with reduced complexity for both real and complex domain, which can be a beneficial solution with power efficiency and high throughput. Both hard and soft domain MIMO detectors are considered. The use of lattice reduction (LR) algorithm and on-demand-child-expansion for the reduction of noise propagation and node calculation respectively are the two of the key features of our developed architecture, presented in this literature. The real domain iterative soft MIMO decoding algorithm, simulated for 4 × 4 MIMO with different modulation scheme, achieves 1.1 to 2.7 dB improvement over Lease Sphere Decoder (LSD) and more than 8x reduction in list size, K as well as complexity of the detector. Next, the iterative real domain K-Best decoder is expanded to the complex domain with new detection scheme. It attains 6.9 to 8.0 dB improvement over real domain K-Best decoder and 1.4 to 2.5 dB better performance over conventional complex decoder for 8 × 8 MIMO with 64 QAM modulation scheme. Besides K, a new adjustable parameter, Rlimit has been introduced in order to append re-configurability trading-off between complexity and performance. After that, a novel low-power hardware architecture of complex decoder is developed for 8 × 8 MIMO and 64 QAM modulation scheme. The total word length of only 16 bits has been adopted limiting the bit error rate (BER) degradation to 0.3 dB with K and Rlimit equal to 4. The proposed VLSI architecture is modeled in Verilog HDL using Xilinx and synthesized using Synopsys Design Vision in 45 nm CMOS technology. According to the synthesize result, it achieves 1090.8 Mbps throughput with power consumption of 580 mW and latency of 0.33 us. The maximum frequency the design proposed is 181.8 MHz. All of the proposed decoders mentioned above are bounded by the fixed K. Hence, an adaptive real domain K-Best decoder is further developed to achieve the similar performance with less K, thereby reducing the computational complexity of the decoder. It does not require accurate SNR measurement to perform the initial estimation of list size, K. Instead, the difference between the first two minimal distances is considered, which inherently eliminates complexity. In summary, a novel iterative K-Best detector for both real and complex domain with efficient VLSI design is proposed in this dissertation. The results from extensive simulation and VHDL with analysis using Synopsys tool are also presented for justification and validation of the proposed works

    An Iterative Soft Decision Based LR-Aided MIMO Detector

    Get PDF
    The demand for wireless and high-rate communication system is increasing gradually and multiple-input-multiple-output (MIMO) is one of the feasible solutions to accommodate the growing demand for its spatial multiplexing and diversity gain. However, with high number of antennas, the computational and hardware complexity of MIMO increases exponentially. This accumulating complexity is a paramount problem in MIMO detection system directly leading to large power consumption. Hence, the major focus of this dissertation is algorithmic and hardware development of MIMO decoder with reduced complexity for both real and complex domain, which can be a beneficial solution with power efficiency and high throughput. Both hard and soft domain MIMO detectors are considered. The use of lattice reduction (LR) algorithm and on-demand-child-expansion for the reduction of noise propagation and node calculation respectively are the two of the key features of our developed architecture, presented in this literature. The real domain iterative soft MIMO decoding algorithm, simulated for 4 × 4 MIMO with different modulation scheme, achieves 1.1 to 2.7 dB improvement over Lease Sphere Decoder (LSD) and more than 8x reduction in list size, K as well as complexity of the detector. Next, the iterative real domain K-Best decoder is expanded to the complex domain with new detection scheme. It attains 6.9 to 8.0 dB improvement over real domain K-Best decoder and 1.4 to 2.5 dB better performance over conventional complex decoder for 8 × 8 MIMO with 64 QAM modulation scheme. Besides K, a new adjustable parameter, Rlimit has been introduced in order to append re-configurability trading-off between complexity and performance. After that, a novel low-power hardware architecture of complex decoder is developed for 8 × 8 MIMO and 64 QAM modulation scheme. The total word length of only 16 bits has been adopted limiting the bit error rate (BER) degradation to 0.3 dB with K and Rlimit equal to 4. The proposed VLSI architecture is modeled in Verilog HDL using Xilinx and synthesized using Synopsys Design Vision in 45 nm CMOS technology. According to the synthesize result, it achieves 1090.8 Mbps throughput with power consumption of 580 mW and latency of 0.33 us. The maximum frequency the design proposed is 181.8 MHz. All of the proposed decoders mentioned above are bounded by the fixed K. Hence, an adaptive real domain K-Best decoder is further developed to achieve the similar performance with less K, thereby reducing the computational complexity of the decoder. It does not require accurate SNR measurement to perform the initial estimation of list size, K. Instead, the difference between the first two minimal distances is considered, which inherently eliminates complexity. In summary, a novel iterative K-Best detector for both real and complex domain with efficient VLSI design is proposed in this dissertation. The results from extensive simulation and VHDL with analysis using Synopsys tool are also presented for justification and validation of the proposed works

    Efficient VLSI Implementation of Soft-input Soft-output Fixed-complexity Sphere Decoder

    Get PDF
    Fixed-complexity sphere decoder (FSD) is one of the most promising techniques for the implementation of multipleinput multiple-output (MIMO) detection, with relevant advantages in terms of constant throughput and high flexibility of parallel architecture. The reported works on FSD are mainly based on software level simulations and a few details have been provided on hardware implementation. The authors present the study based on a four-nodes-per-cycle parallel FSD architecture with several examples of VLSI implementation in 4 × 4 systems with both 16-quadrature amplitude modulation (QAM) and 64-QAM modulation and both real and complex signal models. The implementation aspects and details of the architecture are analysed in order to provide a variety of performance-complexity trade-offs. The authors also provide a parallel implementation of loglikelihood- ratio (LLR) generator with optimised algorithm to enhance the proposed FSD architecture to be a soft-input softoutput (SISO) MIMO detector. To the authors best knowledge, this is the first complete VLSI implementation of an FSD based SISO MIMO detector. The implementation results show that the proposed SISO FSD architecture is highly efficient and flexible, making it very suitable for real application
    • …
    corecore