    Design of Soft Viterbi Algorithm Decoder Enhanced With Non-Transmittable Codewords for Storage Media

    The Viterbi Algorithm decoder enhanced with Non-Transmittable Codewords (NTCs) is one of the most effective decoding algorithms for improving forward error correction performance. However, a Viterbi decoder enhanced with NTCs has not yet been designed for storage media devices. Currently, the Reed-Solomon (RS) algorithm dominates error correction in storage media. Nevertheless, recent studies show that data reliability in storage media remains low while the demand for storage media increases drastically. This study proposes a design of a Soft Viterbi Algorithm decoder enhanced with Non-Transmittable Codewords (SVAD-NTCs) for error correction in storage media. A Matlab simulation was used to investigate the behavior and effectiveness of SVAD-NTCs in correcting errors in data retrieved from storage media. Sample data of one million bits were randomly generated, Additive White Gaussian Noise (AWGN) was used as the data-distortion model, and Binary Phase-Shift Keying (BPSK) was applied for modulation. Results show that SVAD-NTC performance improves as the number of NTCs increases, but beyond six NTCs there is no significant change; the SVAD-NTC design drastically reduces the total residual errors from 216,878 for Reed-Solomon to 23,900.
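    The following is a minimal NumPy sketch of the simulation pipeline the abstract describes: random bits, convolutional encoding, BPSK mapping, an AWGN channel, and soft-decision Viterbi decoding. The rate-1/2 (7,5) code, block length, and noise level are illustrative assumptions, and the zero tail bits only stand in for the paper's non-transmittable codewords (known symbols that steer the trellis back to a known state); the exact SVAD-NTC construction is not reproduced here.

```python
# Minimal sketch: BPSK over AWGN with soft-decision Viterbi decoding of a
# rate-1/2, constraint-length-3 convolutional code. Parameters are assumptions
# for illustration, not values taken from the paper.
import numpy as np

G = [(1, 1, 1), (1, 0, 1)]          # generator taps for the (7, 5) octal code

def conv_encode(bits):
    """Encode with two zero tail bits; these play a trellis-terminating role
    similar to the paper's known non-transmittable codewords."""
    state = [0, 0]
    out = []
    for b in list(bits) + [0, 0]:
        for g in G:
            out.append((g[0] * b + g[1] * state[0] + g[2] * state[1]) % 2)
        state = [b, state[0]]
    return np.array(out)

def viterbi_soft(received):
    """Soft-decision Viterbi decoding using squared-Euclidean branch metrics."""
    n_states = 4
    n_steps = len(received) // 2
    metric = np.full(n_states, np.inf)
    metric[0] = 0.0                              # encoder starts in the all-zero state
    paths = [[] for _ in range(n_states)]
    for t in range(n_steps):
        r = received[2 * t: 2 * t + 2]
        new_metric = np.full(n_states, np.inf)
        new_paths = [None] * n_states
        for s in range(n_states):
            if metric[s] == np.inf:
                continue
            s0, s1 = (s >> 1) & 1, s & 1         # last two input bits
            for b in (0, 1):
                expected = np.array([(g[0]*b + g[1]*s0 + g[2]*s1) % 2 for g in G])
                branch = np.sum((r - (1 - 2 * expected)) ** 2)   # BPSK: 0 -> +1, 1 -> -1
                ns = (b << 1) | s0
                m = metric[s] + branch
                if m < new_metric[ns]:
                    new_metric[ns] = m
                    new_paths[ns] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    best = int(np.argmin(metric))
    return np.array(paths[best][:-2])            # drop the two tail bits

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, 1000)
tx = 1 - 2 * conv_encode(bits)                   # BPSK mapping
rx = tx + rng.normal(0.0, 0.7, tx.shape)         # AWGN channel (assumed noise level)
decoded = viterbi_soft(rx)
print("residual bit errors:", int(np.sum(decoded != bits)))
```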

    Use of CUDA for the Continuous Space Language Model

    The training phase of the Continuous Space Language Model (CSLM) was implemented in NVIDIA's Compute Unified Device Architecture (CUDA) hardware/software architecture. The implementation was accomplished using a combination of CUBLAS library routines and CUDA kernel calls on three CUDA-enabled devices of varying compute capability, and a time savings over the traditional CPU approach was demonstrated.
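    As a rough illustration of why this workload maps well onto CUBLAS, the NumPy sketch below shows the shape of one CSLM mini-batch forward pass, which is dominated by dense matrix products (the kind of work the paper offloads to CUBLAS routines and CUDA kernels). The layer sizes, vocabulary size, and single hidden layer are assumptions for illustration, not the paper's configuration.

```python
# Rough NumPy sketch of a CSLM-style forward pass: every step below is either
# a gather or a dense GEMM, i.e. exactly what a cuBLAS sgemm call accelerates.
import numpy as np

rng = np.random.default_rng(0)
V, ctx, P, H, B = 5000, 3, 128, 256, 64     # vocab, context, projection, hidden, batch (assumed)

emb = rng.normal(0, 0.01, (V, P))           # shared projection (embedding) matrix
W1 = rng.normal(0, 0.01, (ctx * P, H))      # projection -> hidden
W2 = rng.normal(0, 0.01, (H, V))            # hidden -> output vocabulary

context = rng.integers(0, V, (B, ctx))      # one mini-batch of n-gram histories

x = emb[context].reshape(B, ctx * P)        # look up and concatenate projections
h = np.tanh(x @ W1)                         # hidden layer   (GEMM)
logits = h @ W2                             # output layer   (GEMM)
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)   # softmax over the vocabulary
print(probs.shape)                          # (64, 5000)
```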

    A Tile-based Parallel Viterbi Algorithm for Biological Sequence Alignment on GPU with CUDA

    The Viterbi algorithm is the compute-intensive kernel in Hidden Markov Model (HMM) based sequence alignment applications. In this paper, we investigate extending several parallel methods, such as the wave-front and streaming methods for the Smith-Waterman algorithm, to achieve a significant speed-up on a GPU. The wave-front method can take advantage of the computing power of the GPU, but it cannot handle long sequences because of the physical GPU memory limit. On the other hand, the streaming method can process long sequences, but with increased overhead due to the additional data transmission between CPU and GPU. To further improve performance on the GPU, we propose a new tile-based parallel algorithm. We take advantage of homologous segments to divide long sequences into many short pieces, so that each piece pair (tile) can be fully held in the GPU's memory. By reorganizing the computational kernel of the Viterbi algorithm, the basic computing unit can be divided into two parts: an independent part and a dependent part. All of the independent parts are executed with a balanced load in an optimized, coalesced memory-accessing manner, which significantly improves the Viterbi algorithm's performance on the GPU. The experimental results show that our new tile-based parallel Viterbi algorithm outperforms the wave-front and streaming methods. For the long-sequence alignment problem in particular, the best performance of the tile-based algorithm is on average about an order of magnitude faster than the serial Viterbi algorithm.
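    The NumPy sketch below illustrates the independent/dependent decomposition on a simplified single-sequence HMM rather than the paper's pairwise alignment HMM: emission scores for a whole tile are computed in one data-parallel step, while the predecessor-max recursion is swept sequentially. The HMM sizes and tile length are illustrative assumptions; the authors implement this with CUDA kernels and coalesced memory access, not NumPy.

```python
# Sketch of a tiled log-domain Viterbi recurrence: the emission part of each
# tile is independent across columns (parallelizable), while the max-over-
# predecessors part carries the sequential dependency.
import numpy as np

def viterbi_tiled(log_A, log_B, log_pi, obs, tile_len=256):
    """Return the best final state and its log score, processing `obs` tile by tile."""
    delta = log_pi + log_B[:, obs[0]]
    for start in range(1, len(obs), tile_len):
        tile = obs[start:start + tile_len]
        # Independent part: emission scores for every (state, position) in the tile.
        emit = log_B[:, tile]                                  # shape (n_states, tile length)
        # Dependent part: the sequential max-over-predecessors recursion.
        for t in range(emit.shape[1]):
            delta = np.max(delta[:, None] + log_A, axis=0) + emit[:, t]
    return int(np.argmax(delta)), float(np.max(delta))

# Tiny usage example with random model parameters (assumed sizes).
rng = np.random.default_rng(1)
N, M, T = 8, 4, 1000                      # states, observation symbols, sequence length
A = rng.dirichlet(np.ones(N), size=N)     # transition probabilities
B = rng.dirichlet(np.ones(M), size=N)     # emission probabilities
pi = rng.dirichlet(np.ones(N))            # initial distribution
obs = rng.integers(0, M, T)
state, score = viterbi_tiled(np.log(A), np.log(B), np.log(pi), obs)
print(state, score)
```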

    Identifying and Harnessing Concurrency for Parallel and Distributed Network Simulation

    Although computer networks are inherently parallel systems, the parallel execution of network simulations on interconnected processors frequently yields only limited benefits. In this thesis, methods are proposed to estimate and understand the parallelization potential of network simulations. Further, mechanisms and architectures for exploiting the massively parallel processing resources of modern graphics cards to accelerate network simulations are proposed and evaluated
