
    An Iteratively Decodable Tensor Product Code with Application to Data Storage

    The error pattern correcting code (EPCC) can be constructed to provide a syndrome decoding table targeting the dominant error events of an inter-symbol interference channel at the output of the Viterbi detector. For the syndrome table to remain manageable and the list of targeted error events to stay reasonably short, the EPCC codeword length must be short; however, the rate of such a short code is too low for hard-drive applications. To accommodate the required large redundancy, it is possible to record only a highly compressed function of the EPCC parity bits, obtained by taking the tensor product of EPCC with a symbol-correcting code. In this paper, we show that the proposed tensor error-pattern correcting code (T-EPCC) is linear-time encodable, and we devise a low-complexity soft iterative decoding algorithm for the tensor product of EPCC with a q-ary LDPC code (T-EPCC-qLDPC). Simulation results show that T-EPCC-qLDPC achieves nearly the same performance as a single-level qLDPC code with a 1/2 KB sector at a 50% reduction in decoding complexity. Moreover, a 1 KB T-EPCC-qLDPC surpasses the performance of a 1/2 KB single-level qLDPC at the same decoder complexity.
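
    As a rough, illustrative sketch of the tensor-product construction that underlies T-EPCC (the paper's specific EPCC and qLDPC component codes are not reproduced here), the snippet below forms the parity-check matrix of a simplified, all-binary Wolf-style tensor product code as the Kronecker product of two small placeholder component parity-check matrices over GF(2).

```python
import numpy as np

# Hypothetical binary component codes, chosen only for illustration:
# H_inner stands in for the short error-pattern correcting code's checks,
# H_outer for the symbol-correcting code it is combined with.
H_inner = np.array([[1, 1, 0, 1],
                    [0, 1, 1, 1]], dtype=np.uint8)
H_outer = np.array([[1, 0, 1],
                    [0, 1, 1]], dtype=np.uint8)

# Simplified tensor-product construction: the combined parity-check matrix is
# the Kronecker product of the component matrices (arithmetic mod 2).  The
# result has far fewer checks than applying both codes in full, which is what
# keeps the overall rate high.
H_tp = np.kron(H_outer, H_inner) % 2
print(H_tp.shape)            # (4, 12)

def syndrome(H, word):
    """Binary syndrome of `word` under the parity-check matrix `H`."""
    return H.dot(word) % 2

word = np.zeros(12, dtype=np.uint8)
print(syndrome(H_tp, word))  # all-zero syndrome: `word` satisfies the checks
```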

    A low-power, high-performance speech recognition accelerator

    Automatic Speech Recognition (ASR) is becoming increasingly ubiquitous, especially in the mobile segment. Fast and accurate ASR comes at a high energy cost that the tight power budgets of mobile devices cannot afford. Hardware acceleration reduces the energy consumption of ASR systems while delivering high performance. In this paper, we present an accelerator for large-vocabulary, speaker-independent, continuous speech recognition. It focuses on the Viterbi search algorithm, which represents the main bottleneck in an ASR system. The proposed design consists of innovative techniques to improve the memory subsystem, since memory is the main bottleneck for performance and power in these accelerators. It includes a prefetching scheme tailored to the needs of ASR systems that hides main-memory latency for a large fraction of memory accesses with a negligible impact on area. Additionally, we introduce a novel bandwidth-saving technique that reduces off-chip memory accesses by 20 percent. Finally, we present a power-saving technique that significantly reduces the leakage power of the accelerator's scratchpad memories, providing between 8.5 and 29.2 percent reduction in total power dissipation. Overall, the proposed design outperforms implementations running on the CPU by orders of magnitude, and achieves speedups between 1.7x and 5.9x for different speech decoders over a highly optimized CUDA implementation running on a GeForce GTX 980 GPU, while reducing energy by 123-454x.
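
    The Viterbi search that such an accelerator targets is, at its core, a frame-synchronous dynamic program over a graph of states. A minimal software sketch of that inner loop is given below; the toy graph, scores, and state names are invented for illustration, and none of the hardware techniques (prefetching, bandwidth filtering, scratchpad power gating) are modeled.

```python
import math

# Hypothetical toy decoding graph: state -> list of (next_state, transition_log_prob).
# A real ASR decoder traverses a graph with millions of states; this is only a sketch.
GRAPH = {
    "s0": [("s0", math.log(0.6)), ("s1", math.log(0.4))],
    "s1": [("s1", math.log(0.5)), ("s2", math.log(0.5))],
    "s2": [("s2", math.log(1.0))],
}

def viterbi_search(frames, acoustic_score, start="s0"):
    """Frame-synchronous Viterbi search.

    `acoustic_score(state, frame)` returns the acoustic log-likelihood of
    `frame` under `state` and is supplied by the caller.  Returns the best
    accumulated log-score per surviving state after the last frame.
    """
    scores = {start: 0.0}
    for frame in frames:
        new_scores = {}
        for state, score in scores.items():
            for nxt, trans in GRAPH[state]:        # expand successor arcs
                cand = score + trans + acoustic_score(nxt, frame)
                if cand > new_scores.get(nxt, -math.inf):
                    new_scores[nxt] = cand         # keep the best token per state
        scores = new_scores
    return scores

# Dummy acoustic model that slightly favours "s2", just to make the example run.
print(viterbi_search(range(3), lambda s, f: -0.5 if s == "s2" else -1.0))
```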

    Bandwidth efficient CCSDS coding standard proposals

    The basic concatenated coding system for the space telemetry channel consists of a Reed-Solomon (RS) outer code, a symbol interleaver/deinterleaver, and a bandwidth-efficient trellis inner code. A block diagram of this configuration is shown. The system may operate with or without the outer code and interleaver. In this recommendation, the outer code remains the (255,223) RS code over GF(2^8), with an error-correcting capability of t = 16 eight-bit symbols. This code's excellent performance and the existence of fast, cost-effective decoders justify its continued use. The purpose of the interleaver/deinterleaver is to distribute burst errors out of the inner decoder over multiple codewords of the outer code. This utilizes the error-correcting capability of the outer code more efficiently and reduces the probability of an RS decoder failure. Since the space telemetry channel is not considered bursty, the required interleaving depth is primarily a function of the inner decoding method. A diagram of an interleaver with depth 4 that is compatible with the (255,223) RS code is shown. Specific interleaver requirements are discussed after the inner code recommendations.
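
    The interleaver described above spreads a burst of symbol errors out of the inner decoder over several RS codewords. A minimal sketch of such a block symbol interleaver (depth 4, written by rows and read by columns; the exact layout of the CCSDS interleaver is not reproduced) is shown below.

```python
def interleave(symbols, depth=4, codeword_len=255):
    """Block symbol interleaver: write `depth` RS codewords row by row,
    then read the array out column by column."""
    assert len(symbols) == depth * codeword_len
    rows = [symbols[i * codeword_len:(i + 1) * codeword_len] for i in range(depth)]
    return [rows[r][c] for c in range(codeword_len) for r in range(depth)]

def deinterleave(symbols, depth=4, codeword_len=255):
    """Inverse operation: reassemble the `depth` codewords in their original order."""
    assert len(symbols) == depth * codeword_len
    rows = [[None] * codeword_len for _ in range(depth)]
    for idx, s in enumerate(symbols):
        rows[idx % depth][idx // depth] = s
    return [s for row in rows for s in row]

# Four RS(255,223) codewords worth of 8-bit symbols (dummy values).
data = list(range(4 * 255))
assert deinterleave(interleave(data)) == data   # round trip restores the order
# After interleaving, a burst of up to 4 consecutive channel symbols touches
# each RS codeword at most once, so the outer code sees isolated errors.
```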

    Efficient Embedded Speech Recognition for Very Large Vocabulary Mandarin Car-Navigation Systems

    Automatic speech recognition (ASR) for a very large vocabulary of isolated words is a difficult task on a resource-limited embedded device. This paper presents a novel fast decoding algorithm for a Mandarin speech recognition system that can simultaneously process hundreds of thousands of items while maintaining high recognition accuracy. The proposed algorithm constructs a semi-tree search network based on Mandarin pronunciation rules to avoid duplicate syllable matching and save redundant memory. Building on a two-stage fixed-width beam-search baseline system, the algorithm employs a variable beam-width pruning strategy and a frame-synchronous word-level pruning strategy to significantly reduce recognition time. The algorithm is aimed at an in-car navigation system in China and is simulated on a standard PC workstation. The experimental results show that the proposed method reduces recognition time nearly 6-fold and memory size nearly 2-fold compared to the baseline system, with less than 1% accuracy degradation on a 200,000-word recognition task.
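
    The pruning strategies mentioned above amount to discarding, at every frame, hypotheses whose scores fall too far below the current best, with the allowed gap adapted to the number of active hypotheses. A minimal sketch of frame-synchronous beam pruning with a variable beam width follows; the hypotheses, scores, and the rule for varying the beam are placeholders, not the paper's actual settings.

```python
def prune_beam(hypotheses, beam_width):
    """Keep only hypotheses whose log-score is within `beam_width` of the best."""
    best = max(hypotheses.values())
    return {h: s for h, s in hypotheses.items() if s >= best - beam_width}

def variable_beam_width(num_active, base=10.0, target=2000):
    """Tighten the beam when too many hypotheses are active (placeholder rule)."""
    return base * min(1.0, target / max(num_active, 1))

# One decoding frame: hypothesis expansion is omitted, only pruning is shown.
hyps = {"zhong": -5.0, "zhou": -9.0, "zhu": -30.0}
beam = variable_beam_width(len(hyps))
print(prune_beam(hyps, beam))   # "zhu" falls outside the beam and is dropped
```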

    On decoding of multi-level MPSK modulation codes

    The decoding problem of multi-level block modulation codes is investigated. The hardware design of a soft-decision Viterbi decoder for some short-length 8-PSK block modulation codes is presented. An effective way to reduce the hardware complexity of the decoder, by shrinking the branch and path metrics through a non-uniform floating-point-to-integer mapping scheme, is proposed and discussed. The simulation results of the design are presented. The multi-stage decoding (MSD) of multi-level modulation codes is also investigated. The cases of soft-decision and hard-decision MSD are considered, and their performance is evaluated for several codes of different lengths and different minimum squared Euclidean distances. It is shown that soft-decision MSD reduces the decoding complexity drastically, at the cost of being suboptimal. Hard-decision MSD simplifies the decoding further while still maintaining a reasonable coding gain over the uncoded system, provided the component codes are chosen properly. Finally, some basic 3-level 8-PSK modulation codes using BCH codes as component codes are constructed, and their coding gains are found for hard-decision multistage decoding.
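
    The metric-reduction idea mentioned above replaces floating-point squared-Euclidean branch metrics with small integers via a non-uniform table, so the decoder's adders and comparators can be narrow. The sketch below quantizes 8-PSK branch metrics with a square-root companding curve; the curve and the integer range are assumptions for illustration, not the specific mapping used in the design.

```python
import cmath
import math

# 8-PSK constellation points on the unit circle.
PSK8 = [cmath.exp(2j * math.pi * k / 8) for k in range(8)]

def branch_metrics(received):
    """Squared Euclidean distance from the received sample to every 8-PSK point."""
    return [abs(received - p) ** 2 for p in PSK8]

def quantize(metrics, levels=15):
    """Non-uniform floating-point-to-integer mapping (sqrt companding assumed):
    small distances keep fine resolution, large distances are binned coarsely."""
    max_m = max(metrics)
    return [round(levels * math.sqrt(m / max_m)) for m in metrics]

rx = 0.9 + 0.2j                        # a noisy received sample near one point
print(quantize(branch_metrics(rx)))    # small integers fit in narrow adders
```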

    Reliable Memory Storage by Natural Redundancy

    Non-volatile memories are becoming the dominant type of storage devices in modern computers because of their fast speed, physical robustness and high data density. However, there still exist many challenges, such as data reliability issues due to noise. An important example is the memristor, which uses programmable resistance to store data. Memristor memories use the crossbar architecture and suffer from the sneak-path problem: when a memristor cell of high resistance is read, it can be mistakenly read as a low-resistance cell due to low-resistance sneak paths in the crossbar that are parallel to the cell. In this work, we study new ways to correct errors using the inherent redundancy in stored data (called Natural Redundancy), and combine them with conventional error-correcting codes. In particular, we define a Huffman encoding for the English language based on a repository of books. In addition, we study data stored using convolutional codes and use natural redundancy to verify whether decoded codewords are valid or invalid. We present statistics on the Viterbi Algorithm and its ability to decode convolutional codewords, then discuss Yen's Algorithm, an augmentation of the Viterbi Algorithm. Finally, we present an efficient algorithm to search for a list of the most likely codewords, choosing as the decoding solution a codeword that meets the criteria of both natural redundancy and the ECC. We find that this algorithm is no more powerful than Yen's Algorithm at decoding noisy convolutional codewords, but it does present some interesting ideas for further exploration across multiple fields of study.
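
    As a rough sketch of the kind of corpus-driven Huffman code referred to here (the paper builds its code from a repository of English books; the tiny corpus below is only a placeholder), the following builds a character-level Huffman code table from observed frequencies.

```python
import heapq
from collections import Counter

def build_huffman(corpus):
    """Build a character-level Huffman code table from a text corpus."""
    freq = Counter(corpus)
    # Heap entries are (weight, tie_breaker, tree); a tree is either a single
    # character or a (left, right) pair of subtrees.
    heap = [(w, i, ch) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, tie, (t1, t2)))
        tie += 1
    _, _, tree = heap[0]
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"   # degenerate single-symbol corpus
    walk(tree, "")
    return codes

# Placeholder corpus standing in for a repository of English books.
table = build_huffman("the quick brown fox jumps over the lazy dog")
# More frequent characters never receive longer codewords than rarer ones.
print(table[" "], table["z"])
```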

    Convolutional coded dual header pulse interval modulation for line of sight photonic wireless links.

    The analysis and simulation of a convolutional coded dual-header pulse interval modulation (CC-DH-PIM) scheme, using a rate ½ convolutional code with a constraint length of 3, are presented. Decoding is implemented using the Viterbi algorithm with hard decisions. A mathematical analysis of the slot error rate (SER) upper bounds is presented, and the results are compared with simulated data for a number of different modulation techniques. The authors show that the coded DH-PIM outperforms the pulse position modulation (PPM) scheme and offers >4 dB coding gain at a SER of 10^-4 compared to the standard DH-PIM. Results presented show that CC-DH-PIM with a higher constraint length of 7 offers a coding gain of 2 dB at a SER of 10^-5 compared to CC-DH-PIM with a constraint length of 3. However, in CC-DH-PIM the improvement in error performance is achieved at the cost of reduced transmission throughput compared to the standard DH-PIM.
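
    For reference, a rate ½, constraint-length-3 convolutional encoder of the kind used above can be sketched in a few lines; the generator polynomials (7, 5 in octal) are the common textbook choice and are assumed here, since the paper's exact generators are not quoted in the abstract.

```python
def conv_encode(bits, g1=0b111, g2=0b101, k=3):
    """Rate-1/2 convolutional encoder with constraint length `k`.

    `g1` and `g2` are the generator polynomials (assumed to be 7 and 5 in
    octal); each input bit produces two output bits from the shift register.
    """
    state = 0                                     # k-1 = 2 memory bits
    out = []
    for b in bits:
        reg = (b << (k - 1)) | state              # newest bit enters at the MSB
        out.append(bin(reg & g1).count("1") % 2)  # parity of the taps for g1
        out.append(bin(reg & g2).count("1") % 2)  # parity of the taps for g2
        state = reg >> 1                          # shift the register by one
    return out

# Two tail (flush) bits return the encoder to the all-zero state, which is
# what a hard-decision Viterbi decoder conventionally assumes at termination.
message = [1, 0, 1, 1] + [0, 0]
print(conv_encode(message))   # twice as many coded bits as input bits
```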