5 research outputs found

    Area efficient parallel LFSR for cyclic redundancy check

    The Cyclic Redundancy Check (CRC), an error-detecting code, finds many applications in digital communication, data storage, control systems, and data compression. CRC encoding is carried out using a Linear Feedback Shift Register (LFSR). A serial CRC implementation requires a number of clock cycles equal to the message length plus the degree of the generator polynomial, whereas a parallel implementation needs only one clock cycle if the whole message is applied at once. In previous work on parallel LFSRs, the hardware complexity of the architecture was reduced using a technique called state-space transformation. This paper presents a search algorithm, with a detailed account of its implementation, and a new technique to find the number of XOR gates required for different CRC algorithms. The comparison between the proposed and previous architectures shows that the number of XOR gates is reduced, which improves hardware efficiency. The search algorithm and all matrix computations were performed using MATLAB simulations.
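    To make the serial-versus-parallel trade-off concrete, the following is a minimal C sketch of the bit-serial LFSR computation (one register step per message bit). The CRC-32 polynomial 0x04C11DB7 and the function name are illustrative assumptions, not the architecture proposed in the paper.

        /* Bit-serial CRC: each inner-loop iteration mirrors one clock
           cycle of the serial LFSR hardware. The CRC-32 polynomial
           0x04C11DB7 is used purely as an example. */
        #include <stddef.h>
        #include <stdint.h>

        uint32_t crc32_bitserial(const uint8_t *msg, size_t len)
        {
            uint32_t reg = 0xFFFFFFFFu;            /* shift-register state */
            for (size_t i = 0; i < len; i++) {
                reg ^= (uint32_t)msg[i] << 24;     /* feed next 8 message bits */
                for (int b = 0; b < 8; b++)        /* one LFSR step per bit */
                    reg = (reg << 1) ^ ((reg & 0x80000000u) ? 0x04C11DB7u : 0u);
            }
            return ~reg;                           /* final inversion */
        }

    A parallel architecture collapses w of these register steps into a single clock cycle by precomputing the combined XOR network; the size of that network is the XOR-gate count that the paper's search algorithm minimizes.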

    An Optimization Technique for CRC Generation

    Abstract: In networking environments, the cyclic redundancy check (CRC) is widely used to determine whether errors have been introduced during transmission over physical links. In this paper, we present a fast CRC algorithm that performs the CRC computation for a message of arbitrary length in parallel. The paper proposes a 64-bit parallel CRC architecture based on the F matrix, with a generator polynomial of order 32, and shows through Xilinx simulation that the 64-bit parallel architecture has lower latency and higher throughput than the 32-bit parallel architecture.
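    The F-matrix formulation has a direct software analogue: the net effect of clocking w input bits through the degree-32 register is a fixed linear map over GF(2) (the w-th power of the state-transition matrix F), which can be precomputed. The sketch below is a hedged illustration rather than the paper's architecture: it precomputes the eight-step map into a 256-entry table and advances the register a byte at a time; the CRC-32 polynomial and the names are assumptions.

        #include <stddef.h>
        #include <stdint.h>

        static uint32_t step8[256];   /* step8[b]: net effect of 8 LFSR steps */

        void crc32_init_table(void)
        {
            for (uint32_t b = 0; b < 256; b++) {
                uint32_t r = b << 24;
                for (int i = 0; i < 8; i++)        /* eight serial steps */
                    r = (r << 1) ^ ((r & 0x80000000u) ? 0x04C11DB7u : 0u);
                step8[b] = r;                      /* software analogue of F^8 */
            }
        }

        uint32_t crc32_bytewise(const uint8_t *msg, size_t len)
        {
            uint32_t reg = 0xFFFFFFFFu;
            for (size_t i = 0; i < len; i++)       /* 8 bits per iteration */
                reg = (reg << 8) ^ step8[(reg >> 24) ^ msg[i]];
            return ~reg;
        }

    Widening the per-cycle input from 32 to 64 bits, as the paper proposes, doubles the bits consumed per clock at the cost of a larger XOR network, which is the latency/throughput trade-off evaluated in the Xilinx simulation.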

    HARDWARE INSTRUCTION BASED CRC32C, A BETTER ALTERNATIVE TO THE TCP ONE’S COMPLEMENT CHECKSUM

    End-to-end data integrity is of utmost importance when sending data through a communication network, and a common way to ensure this is by appending a few bits for error detection (e.g., a checksum or cyclic redundancy check) to the data sent. Data can be corrupted at the sending or receiving hosts, in one of the intermediate systems (e.g., routers and switches), in the network interface card, or on the transmission link. The Internet’s Transmission Control Protocol (TCP) uses a 16-bit one’s complement checksum for end-to-end error detection of each TCP segment [1]. The TCP protocol specification dates back to the 1970s, and better error detection alternatives exist (e.g., the Fletcher checksum, the Adler checksum, and the Cyclic Redundancy Check (CRC)) that provide higher error detection efficiency; nevertheless, the one’s complement checksum is still in use today as part of the TCP standard. The TCP checksum has low computational complexity compared to software implementations of the other algorithms. Some of the original reasons for selecting the 16-bit one’s complement checksum are its simple calculation and the property that its computation on big- and little-endian machines results in the same checksum, byte-swapped. The latter characteristic does not hold for a two’s complement checksum. A negative characteristic of one’s and two’s complement checksums is that reordering the data does not affect the checksum. In [2], the authors collected two years of data and concluded after analysis that the TCP checksum “will fail to detect errors for roughly one in 16 million to 10 billion packets.” While some of the sources responsible for TCP checksum errors have decreased in the nearly 20 years since this study was published (e.g., the ACK-of-FIN TCP software bug), it is not clear what we would find if the study were repeated. It would also be difficult to repeat this study today because of privacy concerns. The advent of hardware CRC32C instructions on Intel x86 and ARM CPUs offers the promise of significantly improved error detection (probability of undetected errors proportional to 2^-32 versus 2^-16) at a CPU time comparable to the one’s complement checksum. The goal of this research is to compare the execution time of the following error detection algorithms: CRC32C (using generator polynomial 0x1EDC6F41), the Adler checksum, the Fletcher checksum, and the one’s complement checksum, using both software and special hardware instructions. For CRC32C, the software implementations tested were the bit-wise, nibble-wise, byte-wise, slicing-by-4, and slicing-by-8 algorithms. Intel’s CRC32 and PCLMULQDQ instructions and ARM’s CRC32C instruction were also used as part of testing hardware instruction implementations. A comparative study of all these algorithms on an Intel Core i3-2330M shows that the CRC32C hardware instruction implementation is approximately 38% faster than the 16-bit TCP one’s complement checksum at 1500 bytes, while the one’s complement checksum is roughly 11% faster than the hardware-instruction CRC32C at 64 bytes. On the ARM Cortex-A53, the hardware CRC32C algorithm is approximately 20% faster than the one’s complement checksum at 64 bytes, while the one’s complement checksum is roughly 13% faster than the hardware-instruction CRC32C at 1500 bytes.
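    As a concrete illustration of the hardware-instruction approach, here is a minimal sketch of CRC32C over a buffer using Intel's SSE4.2 intrinsics (compile with -msse4.2); the chunking strategy and function name are our own, not the implementation benchmarked above.

        #include <nmmintrin.h>      /* _mm_crc32_u64, _mm_crc32_u8 */
        #include <stddef.h>
        #include <stdint.h>
        #include <string.h>

        uint32_t crc32c_hw(const uint8_t *buf, size_t len)
        {
            uint32_t crc = 0xFFFFFFFFu;             /* CRC32C initial value */
            while (len >= 8) {                      /* 8 bytes per instruction */
                uint64_t chunk;
                memcpy(&chunk, buf, 8);             /* avoid unaligned-access UB */
                crc = (uint32_t)_mm_crc32_u64(crc, chunk);
                buf += 8;
                len -= 8;
            }
            while (len--)                           /* residual tail */
                crc = _mm_crc32_u8(crc, *buf++);
            return ~crc;                            /* final inversion */
        }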
    Because the hardware CRC32C instruction is commonly available on most Intel processors and a growing number of ARM processors, we argue that it is time to reconsider adding a TCP Option to use hardware CRC32C. The primary impediments to replacing the TCP one’s complement checksum with CRC32C are Network Address Translation (NAT) and TCP checksum offload. NAT requires the recalculation of the TCP checksum in the NAT device because the IPv4 address, and possibly the TCP port number, change when packets move through a NAT device. These NAT devices are able to compute the new checksum incrementally thanks to the properties of the one’s complement checksum. The eventual transition to IPv6 will hopefully eliminate the need for NAT. Most Ethernet Network Interface Cards (NICs) support TCP checksum offload, where the TCP checksum is computed in the NIC rather than on the host CPU. There is a risk of undetected errors with this approach since the error detection is no longer end-to-end; nevertheless, it is the default configuration in many operating systems, including Windows 10 [3] and macOS. CRC32C has been implemented in some NICs to support the iSCSI protocol, so it is possible that TCP CRC32C offload could be supported in the future. In the near term, our proposal is to include a TCP Option for CRC32C in addition to the one’s complement checksum, for higher reliability.
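    The incremental recalculation that NAT devices rely on follows the one's complement update rule HC' = ~(~HC + ~m + m') from RFC 1624. A hedged sketch of patching a checksum after a single 16-bit field (e.g., a port number) changes:

        #include <stdint.h>

        /* Recompute a one's complement checksum after one 16-bit field
           changes from old_field to new_field (RFC 1624 update rule). */
        uint16_t checksum_update(uint16_t hc, uint16_t old_field,
                                 uint16_t new_field)
        {
            uint32_t sum = (uint16_t)~hc;          /* ~HC  */
            sum += (uint16_t)~old_field;           /* + ~m */
            sum += new_field;                      /* + m' */
            sum = (sum & 0xFFFFu) + (sum >> 16);   /* fold end-around carries */
            sum = (sum & 0xFFFFu) + (sum >> 16);
            return (uint16_t)~sum;
        }

    This three-term addition is what makes checksum rewriting cheap in NAT devices, and it is a property a CRC32C TCP Option would have to work around.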

    High-Performance Hardware and Software Implementations of the Cyclic Redundancy Check Computation

    The Cyclic Redundancy Check (CRC) is an error detection code used in many digital transmission and storage systems. The two major research areas surrounding CRCs concern developing computation approaches and studying error detection properties. This thesis aims to explore the various aspects of the CRC computation, with the primary objective being to propose novel computation approaches which outperform the existing ones. The work begins with a thorough examination of the formulations found throughout the literature. Then, their subsequent realizations as hardware architectures and software algorithms are investigated. During this investigation, some improvements are presented, including optimizations of the state-space transformed and primitive architectures. Afterward, novel formulations are derived, and the most significant contribution consists of a matrix decomposition that gives rise to a high-performance software algorithm. Simulation and implementation results are gathered for both hardware and software deployments of the investigated computation approaches. The theoretical results obtained by simulations are validated with implementation experiments. The proposed algorithm is shown to outperform the existing comparable low-memory algorithm in terms of time complexity.
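    For context, the low-memory algorithms compared here are typically nibble-oriented table methods. A hedged sketch of a nibble-wise CRC-32, which trades the usual 1 KiB byte-wise table for a 16-entry, 64-byte one (polynomial and names are illustrative assumptions):

        #include <stddef.h>
        #include <stdint.h>

        static uint32_t nib[16];    /* 64-byte table: effect of 4 LFSR steps */

        void crc32_nibble_init(void)
        {
            for (uint32_t v = 0; v < 16; v++) {
                uint32_t r = v << 28;
                for (int i = 0; i < 4; i++)
                    r = (r << 1) ^ ((r & 0x80000000u) ? 0x04C11DB7u : 0u);
                nib[v] = r;
            }
        }

        uint32_t crc32_nibblewise(const uint8_t *msg, size_t len)
        {
            uint32_t reg = 0xFFFFFFFFu;
            for (size_t i = 0; i < len; i++) {
                reg ^= (uint32_t)msg[i] << 24;     /* fold in next byte */
                reg = (reg << 4) ^ nib[reg >> 28]; /* high nibble: 4 steps */
                reg = (reg << 4) ^ nib[reg >> 28]; /* low nibble: 4 steps */
            }
            return ~reg;
        }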