7 research outputs found
Massive MIMO Systems With Low-Resolution ADCs: Baseband Energy Consumption vs. Symbol Detection Performance
In massive multiple-input multiple-output (MIMO) systems with a large number of antennas, connecting a high-resolution analog-to-digital converter (ADC) to each antenna element is difficult because of the high cost and energy consumption. To resolve these issues, there has been much work on implementing symbol detectors and channel estimators using low-resolution ADCs for massive MIMO systems. Although it is intuitive that low-resolution ADCs can save a large amount of energy in massive MIMO systems, the relationship between the energy consumption with low-resolution ADCs and the detection performance has not yet been properly analyzed. In this paper, the tradeoff between different detectors and the total baseband energy consumption, including flexible ADCs, is thoroughly analyzed, taking into account the optimal fixed-point operations performed during the detection processes. To minimize the energy consumption for a given channel condition, the proposed scheme selects the best mode among various processing options while supporting the target frame error rate. Numerous case studies reveal that the proposed work substantially reduces the energy consumption of massive MIMO processing compared with existing schemes.
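The mode-selection idea in the abstract above (pick the lowest-energy processing option that still meets the target frame error rate) can be sketched as follows. This is only an illustration under stated assumptions: the `Mode` record, its field names, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Mode:
    """One hypothetical processing option (detector plus ADC resolution)."""
    detector: str
    adc_bits: int
    energy_nj: float  # assumed baseband energy per frame, in nJ
    fer: float        # assumed frame error rate at the current channel condition


def select_mode(modes: Sequence[Mode], target_fer: float) -> Optional[Mode]:
    """Return the lowest-energy mode whose FER meets the target, or None."""
    feasible = [m for m in modes if m.fer <= target_fer]
    if not feasible:
        return None
    return min(feasible, key=lambda m: m.energy_nj)


# Made-up example table, for illustration only.
MODES = [
    Mode("MRC", 2, energy_nj=10.0, fer=0.20),
    Mode("ZF", 3, energy_nj=25.0, fer=0.008),
    Mode("MMSE", 5, energy_nj=60.0, fer=0.001),
]

best = select_mode(MODES, target_fer=0.01)
```

With this made-up table and a target of 0.01, the cheapest mode is rejected because its error rate is too high, and the second one is chosen.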
Ultralow-Latency Successive Cancellation Polar Decoding Architecture Using Tree-Level Parallelism
Achieving attractive error-correcting capability with a simple decoder structure, the polar code with successive cancellation (SC) decoding is expected to be deployed in resource-limited IoT and embedded communication systems. However, existing SC decoders normally suffer from long processing latency caused by the serialized processing steps, limiting the practical applications of polar codes. In this article, to solve this latency problem, we present a new low-complexity merging operation that increases the number of parallel factors to realize tree-level parallelism. We also modify the previous pruning method to further reduce the number of visited nodes in the parallel SC decoding scenario. In addition, a novel parallel partial-sum calculator (PSC) architecture is introduced to update the partial-sum registers with multiple decoded bits in only one processing cycle. Implementation results show that the proposed 8-parallel SC polar decoder in 28-nm CMOS requires only 0.140 μs to decode a (1024, 512) codeword of the 5G system, substantially reducing the decoding latency compared to state-of-the-art designs.
Low-Latency Polar Decoder Using Overlapped SCL Processing
In this paper, we present a novel scheduling method that significantly reduces the latency of polar decoders. Unlike prior pruning-based successive cancellation list (SCL) decoding, which suffers from a number of idle cycles, the proposed overlapped SCL scheme begins node operations immediately without waiting for the list to be sorted, avoiding those idle cycles. All possible candidates for the next node operations are precomputed in parallel with the pruning operations and are readily selected to minimize the latency. For 5G New Radio systems, the proposed method shortens the decoding latency of state-of-the-art approaches by up to 22% without degrading the error-correcting performance.
High-Throughput and Low-Latency Digital Baseband Architecture for Energy-Efficient Wireless VR Systems
This paper presents a novel baseband architecture that supports high-speed wireless VR solutions using 60 GHz RF circuits. Based on experimental observations from our previous 60 GHz transceiver circuits, an efficient baseband architecture is proposed to enhance the quality of transmission. To achieve zero-latency transmission, we define a (106,920, 95,040) interleaved-BCH error-correction code (ECC), which removes the iterative processing steps of the previous LDPC ECC standardized for near-field wireless communication. By introducing block-level interleaving, the proposed baseband processing scatters burst errors across the small-sized component codes and recovers up to 1080 consecutive bit errors in a data frame of 106,920 bits. To support the high-speed wireless VR system, we also design a massively parallel BCH encoder and decoder, which are tightly connected to the block-level interleaver and de-interleaver. Including the high-speed analog interfaces for external devices, the proposed baseband architecture is designed in 65 nm CMOS, supporting a data rate of up to 12.8 Gbps. Experimental results show that the proposed wireless VR solution can transfer up to 4K high-resolution video streams without time-consuming compression and decompression, achieving a transfer latency of 1 ms.
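How block-level interleaving scatters a burst across component codewords can be illustrated with a generic row-in, column-out block interleaver. This is a minimal sketch and not the paper's design: the function names and the toy frame size and depth are assumptions, and the actual component-code parameters are not given in the abstract.

```python
def interleave(bits, depth):
    """Write `depth` codewords row by row, then read the array column by column."""
    if len(bits) % depth:
        raise ValueError("frame length must be a multiple of depth")
    width = len(bits) // depth  # bits per component codeword
    return [bits[r * width + c] for c in range(width) for r in range(depth)]


def deinterleave(bits, depth):
    """Inverse of interleave(): restore the row-by-row codeword order."""
    if len(bits) % depth:
        raise ValueError("frame length must be a multiple of depth")
    width = len(bits) // depth
    out = [0] * len(bits)
    for i, b in enumerate(bits):
        c, r = divmod(i, depth)  # column and row of the i-th transmitted bit
        out[r * width + c] = b
    return out


def errors_per_codeword(frame_len, depth, burst_start, burst_len):
    """Count how many bits of a channel burst land in each component codeword."""
    channel_errors = [0] * frame_len
    for i in range(burst_start, burst_start + burst_len):
        channel_errors[i] = 1
    scattered = deinterleave(channel_errors, depth)
    width = frame_len // depth
    return [sum(scattered[r * width:(r + 1) * width]) for r in range(depth)]


# Toy parameters (hypothetical): 4 component codewords of 6 bits each.
counts = errors_per_codeword(frame_len=24, depth=4, burst_start=5, burst_len=4)
```

In this toy setup a burst no longer than the interleaver depth leaves at most one error in each component codeword, so a component code that corrects a single error would suffice. In general a burst of length B over depth D places at most ceil(B / D) errors in any one codeword.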
Design and Evaluation Frameworks for Advanced RISC-based Ternary Processor
In this paper, we introduce design and verification frameworks for developing a fully functional emerging ternary processor. Based on the existing compiling environments for binary processors, the software-level framework provides an efficient way to convert given programs into ternary assembly code for the given ternary instructions. We also present a hardware-level framework to rapidly evaluate the performance of a ternary processor implemented in an arbitrary design technology. As a case study, the fully functional 9-trit advanced RISC-based ternary (ART-9) core is newly developed using the proposed frameworks. Utilizing 24 custom ternary instructions, the 5-stage ART-9 prototype architecture is successfully verified by a number of test programs, including the Dhrystone benchmark in a ternary domain, achieving processing efficiencies of 57.8 DMIPS/W and 3.06 × 10^6 DMIPS/W in the FPGA-level ternary-logic emulations and the emerging CNTFET ternary gates, respectively.
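To give a sense of what a 9-trit word can hold, the sketch below converts integers to and from a 9-trit representation. It assumes a balanced ternary encoding with digits -1, 0, and +1, which the abstract does not state, and the function names are hypothetical.

```python
TRITS = 9  # word width of the ART-9 core, per the abstract


def to_balanced_ternary(value, trits=TRITS):
    """Encode an integer as balanced-ternary digits, least significant first."""
    limit = (3 ** trits - 1) // 2  # largest magnitude that fits in `trits` trits
    if not -limit <= value <= limit:
        raise ValueError(f"value out of range for {trits} trits")
    digits = []
    for _ in range(trits):
        r = value % 3          # Python's % yields 0, 1, or 2 here
        if r == 2:
            r = -1             # represent 2 as -1 with a carry into the next trit
        digits.append(r)
        value = (value - r) // 3
    return digits


def from_balanced_ternary(digits):
    """Decode balanced-ternary digits (least significant first) to an integer."""
    return sum(d * 3 ** i for i, d in enumerate(digits))


word = to_balanced_ternary(5)
```

Under this assumption a 9-trit word covers the 3^9 = 19,683 integers from -9841 to 9841, and negation is just a trit-wise sign flip.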