Pruned Bit-Reversal Permutations: Mathematical Characterization, Fast Algorithms and Architectures
A mathematical characterization of serially-pruned permutations (SPPs)
employed in variable-length permuters and their associated fast pruning
algorithms and architectures are proposed. Permuters are used in many signal
processing systems for shuffling data and in communication systems as an
adjunct to coding for error correction. Typically only a small set of discrete
permuter lengths is supported. Serial pruning is a simple technique to alter
the length of a permutation to support a wider range of lengths, but results in
a serial processing bottleneck. In this paper, parallelizing SPPs is formulated
in terms of recursively computing sums involving integer floor and related
functions using integer operations, in a fashion analogous to evaluating
Dedekind sums. A mathematical treatment for bit-reversal permutations (BRPs) is
presented, and closed-form expressions for BRP statistics are derived. It is
shown that BRP sequences have weak correlation properties. A new statistic
called permutation inliers that characterizes the pruning gap of pruned
interleavers is proposed. Using this statistic, a recursive algorithm that
computes the minimum inliers count of a pruned BR interleaver (PBRI) in
logarithmic time complexity is presented. This algorithm enables parallelizing
a serial PBRI algorithm by any desired parallelism factor by computing the
pruning gap in lookahead rather than a serial fashion, resulting in significant
reduction in interleaving latency and memory overhead. Extensions to 2-D block
and stream interleavers, as well as applications to pruned fast Fourier
transforms and LTE turbo interleavers, are also presented. Moreover,
hardware-efficient architectures for the proposed algorithms are developed.
Simulation results demonstrate 3 to 4 orders of magnitude improvement in
interleaving time compared to existing approaches. Comment: 31 pages
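To make the serial-pruning idea in the abstract above concrete, here is a minimal Python sketch of a serially pruned bit-reversal permutation. It illustrates only the serial baseline that the abstract describes as a bottleneck, not the paper's logarithmic-time inlier-counting algorithm. The function names `bit_reverse` and `pruned_bit_reversal`, and the choice of the smallest power of two at or above the target length as the mother permutation size, are assumptions made for illustration.

```python
def bit_reverse(i: int, n_bits: int) -> int:
    """Reverse the n_bits-bit binary representation of i."""
    r = 0
    for _ in range(n_bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def pruned_bit_reversal(length: int) -> list[int]:
    """Serially pruned bit-reversal permutation of the given length.

    Assumption: the mother permutation has size 2**n_bits, the smallest
    power of two that is >= length. Addresses outside [0, length) are
    skipped one by one, which is the serial step the abstract refers to.
    """
    if length <= 0:
        return []
    n_bits = max(1, (length - 1).bit_length())
    out = []
    for i in range(1 << n_bits):
        j = bit_reverse(i, n_bits)
        if j < length:  # keep only addresses inside the pruned range
            out.append(j)
    return out


if __name__ == "__main__":
    print(pruned_bit_reversal(5))
```

In this sketch, the output position of a given input index depends on how many earlier addresses were pruned, so the loop cannot be split across workers without knowing those counts in advance. That dependency is what a lookahead computation of the pruning gap would remove.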
Test-retest reliability of the magnetic mismatch negativity response to sound duration and omission deviants
Mismatch negativity (MMN) is a neurophysiological measure of auditory novelty detection that could serve as a translational biomarker of psychiatric disorders, such as schizophrenia. However, the replicability of its magnetoencephalographic (MEG) counterpart (MMNm) has been insufficiently addressed. In the current study, test-retest reliability of the MMNm response to both duration and omission deviants was evaluated over two MEG sessions in 16 healthy adults. MMNm amplitudes and latencies were obtained at both sensor- and source-level using a cortically-constrained minimum-norm approach. Intraclass correlations (ICC) were derived to assess stability of MEG responses over time. In addition, signal-to-noise ratios (SNR) and within-subject statistics were obtained in order to determine MMNm detectability in individual participants. ICC revealed robust values at both sensor- and source-level for both duration and omission MMNm amplitudes (ICC = 0.81-0.90), in particular in the right hemisphere, while moderate to strong values were obtained for duration MMNm and omission MMNm peak latencies (ICC = 0.74-0.88). Duration MMNm was robustly identified in individual participants with high SNR, whereas omission MMNm responses were only observed in half of the participants. Our data indicate that MMNm responses to unexpected duration changes and omitted sounds are highly reproducible, providing support for the use of MEG parameters in basic and clinical research.
Throughput analysis for a high-performance FPGA-accelerated real-time search application
We propose an FPGA design for the relevancy computation part of a high-throughput real-time search application. The application matches terms in a stream of documents against a static profile, held in off-chip memory. We present a mathematical analysis of the throughput of the application and apply it to the problem of scaling the Bloom filter used to discard nonmatches
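As a software illustration of how a Bloom filter can discard non-matching terms before a more expensive profile lookup, here is a minimal Python sketch. It is not the FPGA design from the abstract above. The class name `BloomFilter`, the parameters `n_bits` and `n_hashes`, and the use of SHA-256 with double hashing to derive bit positions are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, n_bits: int, n_hashes: int) -> None:
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, term: str):
        # Double hashing: derive n_hashes positions from two 64-bit values.
        digest = hashlib.sha256(term.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for k in range(self.n_hashes):
            yield (h1 + k * h2) % self.n_bits

    def add(self, term: str) -> None:
        for p in self._positions(term):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, term: str) -> bool:
        # False means the term is definitely absent and can be discarded.
        return all(self.bits[p >> 3] & (1 << (p & 7))
                   for p in self._positions(term))


if __name__ == "__main__":
    profile = BloomFilter(n_bits=1024, n_hashes=3)
    for term in ("fpga", "throughput", "search"):
        profile.add(term)
    print(profile.might_contain("fpga"))
```

A document term for which `might_contain` returns `False` can be dropped without touching the profile in slower memory. A `True` result still needs the full lookup, because the filter can report false positives, and sizing `n_bits` against the number of profile terms controls how often that happens.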
Efficient Execution of Sequential Instructions Streams by Physical Machines
Any computational model which relies on a physical system is likely to be subject to the fact that information density and speed have intrinsic, ultimate limits. The RAM model, and in particular the underlying assumption that memory accesses can be carried out in time independent from memory size itself, is not physically implementable.
This work has developed in the field of limiting technology machines, in which it is somewhat provocatively assumed that technology has achieved the physical limits. The ultimate goal for this is to tackle the problem of the intrinsic latencies of physical systems by encouraging scalable organizations for processors and memories.
An algorithmic study is presented, which depicts the implementation of high concurrency programs for SP and SPE, sequential machine models able to compute direct-flow programs in optimal time.
Then, a novel pipelined, hierarchical memory organization is presented, with optimal latency and bandwidth for a physical system.
In order to both take full advantage of the memory capabilities and exploit the available instruction level parallelism of the code to be executed, a novel processor model is developed. Particular care is put in devising an efficient information flow within the processor itself.
Both designs are extremely scalable, as they are based on fixed capacity and fixed size nodes, which are connected as a multidimensional array.
Performance analysis on the resulting machine design has led to the discovery that latencies internal to the processor can be the dominating source of complexity in instruction flow execution, which adds to the effects of processor-memory interaction. A characterization of instruction flows is then developed, which is based on the topology induced by instruction dependences