3,057 research outputs found

    Investigation of sequence features of hinge-bending regions in proteins with domain movements using kernel logistic regression

    Get PDF
    Background: Hinge-bending movements in proteins comprising two or more domains form a large class of functional movements. Hinge-bending regions demarcate protein domains and collectively control the domain movement. Consequently, the ability to recognise sequence features of hinge-bending regions and to be able to predict them from sequence alone would benefit various areas of protein research. For example, an understanding of how the sequence features of these regions relate to dynamic properties in multi-domain proteins would aid in the rational design of linkers in therapeutic fusion proteins. Results: The DynDom database of protein domain movements comprises sequences annotated to indicate whether the amino acid residue is located within a hinge-bending region or within an intradomain region. Using statistical methods and Kernel Logistic Regression (KLR) models, this data was used to determine sequence features that favour or disfavour hinge-bending regions. This is a difficult classification problem as the number of negative cases (intradomain residues) is much larger than the number of positive cases (hinge residues). The statistical methods and the KLR models both show that cysteine has the lowest propensity for hinge-bending regions and proline has the highest, even though it is the most rigid amino acid. As hinge-bending regions have been previously shown to occur frequently at the terminal regions of the secondary structures, the propensity for proline at these regions is likely due to its tendency to break secondary structures. The KLR models also indicate that isoleucine may act as a domain-capping residue. We have found that a quadratic KLR model outperforms a linear KLR model and that improvement in performance occurs up to very long window lengths (eighty residues) indicating long-range correlations. Conclusion: In contrast to the only other approach that focused solely on interdomain hinge-bending regions, the method provides a modest and statistically significant improvement over a random classifier. An explanation of the KLR results is that in the prediction of hinge-bending regions a long-range correlation is at play between a small number amino acids that either favour or disfavour hinge-bending regions. The resulting sequence-based prediction tool, HingeSeek, is available to run through a webserver at hingeseek.cmp.uea.ac.uk

    A new class of codes for Boolean masking of cryptographic computations

    Full text link
    We introduce a new class of rate one-half binary codes: {\bf complementary information set codes.} A binary linear code of length 2n2n and dimension nn is called a complementary information set code (CIS code for short) if it has two disjoint information sets. This class of codes contains self-dual codes as a subclass. It is connected to graph correlation immune Boolean functions of use in the security of hardware implementations of cryptographic primitives. Such codes permit to improve the cost of masking cryptographic algorithms against side channel attacks. In this paper we investigate this new class of codes: we give optimal or best known CIS codes of length <132.<132. We derive general constructions based on cyclic codes and on double circulant codes. We derive a Varshamov-Gilbert bound for long CIS codes, and show that they can all be classified in small lengths ≀12\le 12 by the building up construction. Some nonlinear permutations are constructed by using Z4\Z_4-codes, based on the notion of dual distance of an unrestricted code.Comment: 19 pages. IEEE Trans. on Information Theory, to appea

    Pruned Bit-Reversal Permutations: Mathematical Characterization, Fast Algorithms and Architectures

    Full text link
    A mathematical characterization of serially-pruned permutations (SPPs) employed in variable-length permuters and their associated fast pruning algorithms and architectures are proposed. Permuters are used in many signal processing systems for shuffling data and in communication systems as an adjunct to coding for error correction. Typically only a small set of discrete permuter lengths are supported. Serial pruning is a simple technique to alter the length of a permutation to support a wider range of lengths, but results in a serial processing bottleneck. In this paper, parallelizing SPPs is formulated in terms of recursively computing sums involving integer floor and related functions using integer operations, in a fashion analogous to evaluating Dedekind sums. A mathematical treatment for bit-reversal permutations (BRPs) is presented, and closed-form expressions for BRP statistics are derived. It is shown that BRP sequences have weak correlation properties. A new statistic called permutation inliers that characterizes the pruning gap of pruned interleavers is proposed. Using this statistic, a recursive algorithm that computes the minimum inliers count of a pruned BR interleaver (PBRI) in logarithmic time complexity is presented. This algorithm enables parallelizing a serial PBRI algorithm by any desired parallelism factor by computing the pruning gap in lookahead rather than a serial fashion, resulting in significant reduction in interleaving latency and memory overhead. Extensions to 2-D block and stream interleavers, as well as applications to pruned fast Fourier transforms and LTE turbo interleavers, are also presented. Moreover, hardware-efficient architectures for the proposed algorithms are developed. Simulation results demonstrate 3 to 4 orders of magnitude improvement in interleaving time compared to existing approaches.Comment: 31 page

    A STUDY OF LINEAR ERROR CORRECTING CODES

    Get PDF
    Since Shannon's ground-breaking work in 1948, there have been two main development streams of channel coding in approaching the limit of communication channels, namely classical coding theory which aims at designing codes with large minimum Hamming distance and probabilistic coding which places the emphasis on low complexity probabilistic decoding using long codes built from simple constituent codes. This work presents some further investigations in these two channel coding development streams. Low-density parity-check (LDPC) codes form a class of capacity-approaching codes with sparse parity-check matrix and low-complexity decoder Two novel methods of constructing algebraic binary LDPC codes are presented. These methods are based on the theory of cyclotomic cosets, idempotents and Mattson-Solomon polynomials, and are complementary to each other. The two methods generate in addition to some new cyclic iteratively decodable codes, the well-known Euclidean and projective geometry codes. Their extension to non binary fields is shown to be straightforward. These algebraic cyclic LDPC codes, for short block lengths, converge considerably well under iterative decoding. It is also shown that for some of these codes, maximum likelihood performance may be achieved by a modified belief propagation decoder which uses a different subset of 7^ codewords of the dual code for each iteration. Following a property of the revolving-door combination generator, multi-threaded minimum Hamming distance computation algorithms are developed. Using these algorithms, the previously unknown, minimum Hamming distance of the quadratic residue code for prime 199 has been evaluated. In addition, the highest minimum Hamming distance attainable by all binary cyclic codes of odd lengths from 129 to 189 has been determined, and as many as 901 new binary linear codes which have higher minimum Hamming distance than the previously considered best known linear code have been found. It is shown that by exploiting the structure of circulant matrices, the number of codewords required, to compute the minimum Hamming distance and the number of codewords of a given Hamming weight of binary double-circulant codes based on primes, may be reduced. A means of independently verifying the exhaustively computed number of codewords of a given Hamming weight of these double-circulant codes is developed and in coiyunction with this, it is proved that some published results are incorrect and the correct weight spectra are presented. Moreover, it is shown that it is possible to estimate the minimum Hamming distance of this family of prime-based double-circulant codes. It is shown that linear codes may be efficiently decoded using the incremental correlation Dorsch algorithm. By extending this algorithm, a list decoder is derived and a novel, CRC-less error detection mechanism that offers much better throughput and performance than the conventional ORG scheme is described. Using the same method it is shown that the performance of conventional CRC scheme may be considerably enhanced. Error detection is an integral part of an incremental redundancy communications system and it is shown that sequences of good error correction codes, suitable for use in incremental redundancy communications systems may be obtained using the Constructions X and XX. Examples are given and their performances presented in comparison to conventional CRC schemes

    Constructions of Rank Modulation Codes

    Full text link
    Rank modulation is a way of encoding information to correct errors in flash memory devices as well as impulse noise in transmission lines. Modeling rank modulation involves construction of packings of the space of permutations equipped with the Kendall tau distance. We present several general constructions of codes in permutations that cover a broad range of code parameters. In particular, we show a number of ways in which conventional error-correcting codes can be modified to correct errors in the Kendall space. Codes that we construct afford simple encoding and decoding algorithms of essentially the same complexity as required to correct errors in the Hamming metric. For instance, from binary BCH codes we obtain codes correcting tt Kendall errors in nn memory cells that support the order of n!/(log⁥2n!)tn!/(\log_2n!)^t messages, for any constant t=1,2,...t= 1,2,... We also construct families of codes that correct a number of errors that grows with nn at varying rates, from Θ(n)\Theta(n) to Θ(n2)\Theta(n^{2}). One of our constructions gives rise to a family of rank modulation codes for which the trade-off between the number of messages and the number of correctable Kendall errors approaches the optimal scaling rate. Finally, we list a number of possibilities for constructing codes of finite length, and give examples of rank modulation codes with specific parameters.Comment: Submitted to IEEE Transactions on Information Theor
    • 

    corecore