
    Pruned Bit-Reversal Permutations: Mathematical Characterization, Fast Algorithms and Architectures

    A mathematical characterization of serially-pruned permutations (SPPs) employed in variable-length permuters and their associated fast pruning algorithms and architectures are proposed. Permuters are used in many signal processing systems for shuffling data and in communication systems as an adjunct to coding for error correction. Typically only a small set of discrete permuter lengths is supported. Serial pruning is a simple technique to alter the length of a permutation to support a wider range of lengths, but results in a serial processing bottleneck. In this paper, parallelizing SPPs is formulated in terms of recursively computing sums involving integer floor and related functions using integer operations, in a fashion analogous to evaluating Dedekind sums. A mathematical treatment for bit-reversal permutations (BRPs) is presented, and closed-form expressions for BRP statistics are derived. It is shown that BRP sequences have weak correlation properties. A new statistic called permutation inliers that characterizes the pruning gap of pruned interleavers is proposed. Using this statistic, a recursive algorithm that computes the minimum inliers count of a pruned BR interleaver (PBRI) in logarithmic time complexity is presented. This algorithm enables parallelizing a serial PBRI algorithm by any desired parallelism factor by computing the pruning gap in a lookahead rather than a serial fashion, resulting in a significant reduction in interleaving latency and memory overhead. Extensions to 2-D block and stream interleavers, as well as applications to pruned fast Fourier transforms and LTE turbo interleavers, are also presented. Moreover, hardware-efficient architectures for the proposed algorithms are developed. Simulation results demonstrate 3 to 4 orders of magnitude improvement in interleaving time compared to existing approaches. Comment: 31 pages
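
    To make the serial bottleneck described in this abstract concrete, the following is a minimal sketch of a serially pruned bit-reversal interleaver. It is not the paper's lookahead algorithm; the function names `bit_reverse` and `pruned_bit_reversal` are hypothetical, and the choice of the smallest power-of-two parent length is an assumption made for illustration.

```python
def bit_reverse(index: int, n_bits: int) -> int:
    """Reverse the lowest n_bits bits of index."""
    reversed_index = 0
    for _ in range(n_bits):
        reversed_index = (reversed_index << 1) | (index & 1)
        index >>= 1
    return reversed_index


def pruned_bit_reversal(length: int) -> list[int]:
    """Serially prune a power-of-two bit-reversal permutation to `length`.

    Assumption: the parent permutation has the smallest power-of-two
    size that is >= length (with at least one bit).
    """
    if length < 1:
        raise ValueError("length must be positive")
    n_bits = max(1, (length - 1).bit_length())
    output = []
    # Walk the parent permutation in order and drop out-of-range targets.
    # The position of each surviving element depends on how many earlier
    # elements were pruned, which is what makes this loop inherently serial.
    for i in range(1 << n_bits):
        j = bit_reverse(i, n_bits)
        if j < length:
            output.append(j)
    return output


if __name__ == "__main__":
    print(pruned_bit_reversal(5))  # [0, 4, 2, 1, 3]
```

    In this sketch, finding the k-th output requires counting the pruned entries that precede it, which is the quantity the abstract proposes to compute by lookahead instead.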

    Test-retest reliability of the magnetic mismatch negativity response to sound duration and omission deviants

    Mismatch negativity (MMN) is a neurophysiological measure of auditory novelty detection that could serve as a translational biomarker of psychiatric disorders, such as schizophrenia. However, the replicability of its magnetoencephalographic (MEG) counterpart (MMNm) has been insufficiently addressed. In the current study, test-retest reliability of the MMNm response to both duration and omission deviants was evaluated over two MEG sessions in 16 healthy adults. MMNm amplitudes and latencies were obtained at both sensor- and source-level using a cortically-constrained minimum-norm approach. Intraclass correlations (ICC) were derived to assess stability of MEG responses over time. In addition, signal-to-noise ratios (SNR) and within-subject statistics were obtained in order to determine MMNm detectability in individual participants. ICC revealed robust values at both sensor- and source-level for both duration and omission MMNm amplitudes (ICC = 0.81-0.90), in particular in the right hemisphere, while moderate to strong values were obtained for duration MMNm and omission MMNm peak latencies (ICC = 0.74-0.88). Duration MMNm was robustly identified in individual participants with high SNR, whereas omission MMNm responses were only observed in half of the participants. Our data indicate that MMNm responses to unexpected duration changes and omitted sounds are highly reproducible, providing support for the use of MEG parameters in basic and clinical research.

    Throughput analysis for a high-performance FPGA-accelerated real-time search application

    We propose an FPGA design for the relevancy computation part of a high-throughput real-time search application. The application matches terms in a stream of documents against a static profile, held in off-chip memory. We present a mathematical analysis of the throughput of the application and apply it to the problem of scaling the Bloom filter used to discard nonmatches.
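
    As an illustration of how a Bloom filter can discard nonmatching terms before a costly profile lookup, here is a minimal software sketch. It does not reflect the FPGA design in the abstract; the class `BloomFilter`, its parameters, and the SHA-256 double-hashing scheme are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Bit-array membership filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, term: str) -> list[int]:
        # Derive all bit positions from one digest via double hashing
        # (an illustrative choice, not taken from the abstract).
        digest = hashlib.sha256(term.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + k * h2) % self.num_bits for k in range(self.num_hashes)]

    def add(self, term: str) -> None:
        for pos in self._positions(term):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, term: str) -> bool:
        # False means the term is definitely absent, so the lookup can be skipped.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(term))


if __name__ == "__main__":
    profile_filter = BloomFilter(num_bits=1024, num_hashes=4)
    for profile_term in ("fpga", "throughput", "bloom"):
        profile_filter.add(profile_term)
    print(profile_filter.might_contain("fpga"))
```

    Only terms for which `might_contain` returns true would need the off-chip profile access; enlarging `num_bits` lowers the false-positive rate at the cost of more memory.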

    Efficient Execution of Sequential Instructions Streams by Physical Machines

    Any computational model which relies on a physical system is likely to be subject to the fact that information density and speed have intrinsic, ultimate limits. The RAM model, and in particular the underlying assumption that memory accesses can be carried out in time independent of memory size itself, is not physically implementable. This work has developed in the field of limiting technology machines, in which it is somewhat provocatively assumed that technology has achieved the physical limits. The ultimate goal is to tackle the problem of the intrinsic latencies of physical systems by encouraging scalable organizations for processors and memories. An algorithmic study is presented, which depicts the implementation of high concurrency programs for SP and SPE, sequential machine models able to compute direct-flow programs in optimal time. Then, a novel pipelined, hierarchical memory organization is presented, with optimal latency and bandwidth for a physical system. In order to both take full advantage of the memory capabilities and exploit the available instruction level parallelism of the code to be executed, a novel processor model is developed. Particular care is put in devising an efficient information flow within the processor itself. Both designs are extremely scalable, as they are based on fixed capacity and fixed size nodes, which are connected as a multidimensional array. Performance analysis on the resulting machine design has led to the discovery that latencies internal to the processor can be the dominating source of complexity in instruction flow execution, which adds to the effects of processor-memory interaction. A characterization of instruction flows is then developed, which is based on the topology induced by instruction dependences.