85 research outputs found

    Pruned Bit-Reversal Permutations: Mathematical Characterization, Fast Algorithms and Architectures

    A mathematical characterization of serially-pruned permutations (SPPs) employed in variable-length permuters and their associated fast pruning algorithms and architectures are proposed. Permuters are used in many signal processing systems for shuffling data and in communication systems as an adjunct to coding for error correction. Typically only a small set of discrete permuter lengths are supported. Serial pruning is a simple technique to alter the length of a permutation to support a wider range of lengths, but results in a serial processing bottleneck. In this paper, parallelizing SPPs is formulated in terms of recursively computing sums involving integer floor and related functions using integer operations, in a fashion analogous to evaluating Dedekind sums. A mathematical treatment for bit-reversal permutations (BRPs) is presented, and closed-form expressions for BRP statistics are derived. It is shown that BRP sequences have weak correlation properties. A new statistic called permutation inliers that characterizes the pruning gap of pruned interleavers is proposed. Using this statistic, a recursive algorithm that computes the minimum inliers count of a pruned BR interleaver (PBRI) in logarithmic time complexity is presented. This algorithm enables parallelizing a serial PBRI algorithm by any desired parallelism factor by computing the pruning gap in lookahead rather than a serial fashion, resulting in significant reduction in interleaving latency and memory overhead. Extensions to 2-D block and stream interleavers, as well as applications to pruned fast Fourier transforms and LTE turbo interleavers, are also presented. Moreover, hardware-efficient architectures for the proposed algorithms are developed. Simulation results demonstrate 3 to 4 orders of magnitude improvement in interleaving time compared to existing approaches. Comment: 31 page
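
    The serial pruning baseline that this abstract describes can be sketched as follows. This is only an illustration of plain serial pruning, not the paper's lookahead algorithm; the function names bit_reverse and pruned_bit_reversal are hypothetical.

```python
def bit_reverse(x: int, n_bits: int) -> int:
    """Reverse the n_bits least-significant bits of x."""
    result = 0
    for _ in range(n_bits):
        result = (result << 1) | (x & 1)
        x >>= 1
    return result


def pruned_bit_reversal(length: int) -> list[int]:
    """Serially pruned bit-reversal permutation of the given length.

    Walks the full power-of-two bit-reversal permutation in order and
    skips (prunes) every address that falls outside [0, length).
    """
    if length <= 0:
        return []
    n_bits = max(1, (length - 1).bit_length())
    out = []
    for i in range(1 << n_bits):
        addr = bit_reverse(i, n_bits)
        if addr < length:  # keep only in-range addresses
            out.append(addr)
    return out


print(pruned_bit_reversal(5))
```

    Finding the j-th output this way requires knowing how many earlier addresses were pruned, which is the serial bottleneck the abstract refers to.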

    Time-frequency warped waveforms for well-contained massive machine type communications

    This paper proposes a novel time-frequency warped waveform for short symbols, massive machine-type communication (mMTC), and internet of things (IoT) applications. The waveform is composed of asymmetric raised cosine (RC) pulses to increase the signal containment in time and frequency domains. The waveform has low power tails in the time domain, hence better performance in the presence of delay spread and time offsets. The time-axis warping unitary transform is applied to control the waveform occupancy in time-frequency space and to compensate for the usage of high roll-off factor pulses at the symbol edges. The paper explains a step-by-step analysis for determining the roll-off factors profile and the warping functions. Gains are presented over the conventional Zero-tail Discrete Fourier Transform-spread-Orthogonal Frequency Division Multiplexing (ZT-DFT-s-OFDM), and Cyclic prefix (CP) DFT-s-OFDM schemes in the simulations section. United States Department of Energy (DOE); Office of Advanced Scientific Computing Research; National Science Foundation (NSF

    Algebraic Signal Processing Theory: Cooley-Tukey Type Algorithms for DCTs and DSTs

    This paper presents a systematic methodology based on the algebraic theory of signal processing to classify and derive fast algorithms for linear transforms. Instead of manipulating the entries of transform matrices, our approach derives the algorithms by stepwise decomposition of the associated signal models, or polynomial algebras. This decomposition is based on two generic methods or algebraic principles that generalize the well-known Cooley-Tukey FFT and make the algorithms' derivations concise and transparent. Application to the 16 discrete cosine and sine transforms yields a large class of fast algorithms, many of which have not been found before. Comment: 31 pages, more information at http://www.ece.cmu.edu/~smar

    Optimising Linear Key Recovery Attacks with Affine Walsh Transform Pruning

    Linear cryptanalysis [25] is one of the main families of key recovery attacks on block ciphers. Several publications [16,19] have drawn attention towards the possibility of reducing their time complexity using the fast Walsh transform. These previous contributions ignore the structure of the key recovery rounds, which are treated as arbitrary boolean functions. In this paper, we optimise the time and memory complexities of these algorithms by exploiting zeroes in the Walsh spectra of these functions using a novel affine pruning technique for the Walsh Transform. These new optimisation strategies are then showcased with two application examples: an improved attack on the DES [1] and the first known attack on 29-round PRESENT-128 [9].
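
    For orientation, the standard (unpruned) fast Walsh transform that such attacks build on can be sketched as below. This is the textbook butterfly algorithm, not the affine pruning technique proposed in the paper; the function name fast_walsh_transform is hypothetical.

```python
def fast_walsh_transform(values):
    """Return the Walsh-Hadamard transform of a power-of-two-length sequence.

    Uses the usual in-place butterfly structure, which needs n*log2(n)
    additions and subtractions instead of the n*n of a direct
    matrix product.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y  # butterfly
        h *= 2
    return a


print(fast_walsh_transform([1, 0, 0, 0]))
```

    Applying the transform twice returns the input scaled by its length, which gives a quick sanity check.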

    Low-Complexity Multicarrier Waveform Processing Schemes for Future Wireless Communications

    Wireless communication systems deliver enormous variety of services and applications. Nowadays, wireless communications play a key-role in many fields, such as industry, social life, education, and home automation. The growing demand for wireless services and applications has motivated the development of the next generation cellular radio access technology called fifth-generation new radio (5G-NR). The future networks are required to magnify the delivered user data rates to gigabits per second, reduce the communication latency below 1 ms, and enable communications for massive number of simple devices. Those main features of the future networks come with new demands for the wireless communication systems, such as enhancing the efficiency of the radio spectrum use at below 6 GHz frequency bands, while supporting various services with quite different requirements for the waveform related key parameters. The current wireless systems lack the capabilities to handle those requirements. For example, the long-term evolution (LTE) employs the cyclic-prefix orthogonal frequency-division multiplexing (CP-OFDM) waveform, which has critical drawbacks in the 5G-NR context. The basic drawback of CP-OFDM waveform is the lack of spectral localization. Therefore, spectrally enhanced variants of CP-OFDM or other multicarrier waveforms with well localized spectrum should be considered. This thesis investigates spectrally enhanced CP-OFDM (E-OFDM) schemes to suppress the out-of-band (OOB) emissions, which are normally produced by CP-OFDM. Commonly, the weighted overlap-and-add (WOLA) scheme applies smooth time-domain window on the CP-OFDM waveform, providing spectrally enhanced subcarriers and reducing the OOB emissions with very low additional computational complexity. Nevertheless, the suppression performance of WOLA-OFDM is not sufficient near the active subband. Another technique is based on filtering the CP-OFDM waveform, which is referred to as F-OFDM.
F-OFDM is able to provide well-localized spectrum, however, with significant increase in the computational complexity in the basic scheme with time-domain filters. Also filter-bank multicarrier (FBMC) waveforms are included in this study. FBMC has been widely studied as a potential post-OFDM scheme with nearly ideal subcarrier spectrum localization. However, this scheme has quite high computational complexity while being limited to uniformly distributed subbands. Anyway, filter-bank based waveform processing is one of the main topics of this work. Instead of traditional polyphase network (PPN) based uniform filter banks, the focus is on fast-convolution filter banks (FC-FBs), which utilize fast Fourier transform (FFT) domain processing to realize effectively filter-banks with high flexibility in terms of subcarrier bandwidths and center frequencies. FC-FBs are applied for both FBMC and F-OFDM waveform generation and processing with greatly increased flexibility and significantly reduced computational complexity. This study proposes novel structures for FC-FB processing based on decomposition of the FC-FB structure consisting of forward and inverse discrete Fourier transforms (DFT and IDFT). The decomposition of multirate FC provides means of reducing the computational complexity in some important specific scenarios. A generic FC decomposition model is proposed and analyzed. This scheme is mathematically equivalent to the corresponding direct FC implementation, with exactly the same performance. The benefits of the optimized decomposition structure appear mainly in communication scenarios with relatively narrow active transmission band, resulting in significantly reduced computational complexity compared to the direct FC structure. The narrowband scenarios find their places in the recent 3GPP specification of cellular low-power wide-area (LPWA) access technology called narrowband internet-of-things (NB-IoT).
NB-IoT aims at introducing the IoT to LTE and GSM frequency bands in coexistence with those technologies. NB-IoT uses CP-OFDM based waveforms with parameters compatible with the LTE. However, additional means are needed also for NB-IoT transmitters to improve the spectrum localization. For NB-IoT user devices, it is important to consider ultra-low complexity solutions, and a look-up table (LUT) based approach is proposed to implement NB-IoT uplink transmitters with filtered waveforms. This approach provides completely multiplication-free digital baseband implementations and the addition rates are similar or smaller than in the basic NB-IoT waveform generation without the needed elements for spectrum enhancement. The basic idea includes storing full or partial waveforms for all possible data symbol combinations. Then the transmitted waveform is composed through summation of needed stored partial waveforms and trivial phase rotations. The LUT based scheme is developed with different variants tackling practical implementations issues of NB-IoT device transmitters, considering also the effects of nonlinear power amplifier. Moreover, a completely multiplication and addition-free LUT variant is proposed and found to be feasible for very narrowband transmission, with up to 3 subcarriers. The finite-wordlength performance of LUT variants is evaluated through simulations.
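
    The look-up-table idea in this abstract (stored partial waveforms combined by summation and trivial phase rotations) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: QPSK-like symbols restricted to quarter-turn rotations, unfiltered complex exponentials as the stored waveforms, and the hypothetical names N_SAMPLES, SUBCARRIERS, build_lut, rotate_quarter_turns and synthesize. It is not the thesis's transmitter design.

```python
import cmath

N_SAMPLES = 16            # hypothetical number of samples per symbol
SUBCARRIERS = (0, 1, 2)   # hypothetical active subcarrier indices


def build_lut():
    """Precompute one partial waveform per subcarrier (done once, offline)."""
    return {
        k: [cmath.exp(2j * cmath.pi * k * n / N_SAMPLES) for n in range(N_SAMPLES)]
        for k in SUBCARRIERS
    }


def rotate_quarter_turns(z: complex, m: int) -> complex:
    """Multiply z by j**m using only swaps and sign changes."""
    m %= 4
    if m == 0:
        return z
    if m == 1:
        return complex(-z.imag, z.real)
    if m == 2:
        return complex(-z.real, -z.imag)
    return complex(z.imag, -z.real)


def synthesize(lut, quarter_turns):
    """Sum the rotated stored waveforms; no multiplications at run time."""
    out = [0j] * N_SAMPLES
    for k, m in zip(SUBCARRIERS, quarter_turns):
        for n, sample in enumerate(lut[k]):
            out[n] += rotate_quarter_turns(sample, m)
    return out


lut = build_lut()
waveform = synthesize(lut, (1, 2, 3))  # data symbols j, -1, -j
print(len(waveform))
```

    The run-time path uses only table reads, swaps, sign changes and additions, which is the property the abstract highlights.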

    Toatie : functional hardware description with dependent types

    Describing correct circuits remains a tall order, despite four decades of evolution in Hardware Description Languages (HDLs). Many enticing circuit architectures require recursive structures or complex compile-time computation — two patterns that prove difficult to capture in traditional HDLs. In a signal processing context, the Fast FIR Algorithm (FFA) structure for efficient parallel filtering proves to be naturally recursive, and most Multiple Constant Multiplication (MCM) blocks decompose multiplications into graphs of simple shifts and adds using demanding compile time computation. Generalised versions of both remain mostly in academic folklore. The implementations which do exist are often ad hoc circuit generators, written in software languages. These pose challenges for verification and are resistant to composition. Embedded functional HDLs, that represent circuits as data, allow for these descriptions at the cost of forcing the designer to work at the gate-level. A promising alternative is to use a stand-alone compiler, representing circuits as plain functions, exemplified by the CλaSH HDL. This, however, raises new challenges in capturing a circuit’s staging — which expressions in the single language should be reduced during compile-time elaboration, and which should remain in the circuit’s run-time? To better reflect the physical separation between circuit phases, this work proposes a new functional HDL (representing circuits as functions) with first-class staging constructs. Orthogonal to this, there are also long-standing challenges in the verification of parameterised circuit families. Industry surveys have consistently reported that only a slim minority of FPGA projects reach production without non-trivial bugs. While a healthy growth in the adoption of automatic formal methods is also reported, the majority of testing remains dynamic — presenting difficulties for testing entire circuit families at once. 
This research offers an alternative verification methodology via the combination of dependent types and automatic synthesis of user-defined data types. Given precise enough types for synthesisable data, this environment can be used to develop circuit families with full functional verification in a correct-by-construction fashion. This approach allows for verification of entire circuit families (not just one concrete member) and side-steps the state-space explosion of model checking methods. Beyond the existing work, this research offers synthesis of combinatorial circuits, not just a software model of their behaviour. This additional step requires careful consideration of staging, erasure & irrelevance, deriving bit representations of user-defined data types, and a new synthesis scheme. This thesis contributes steps towards HDLs with sufficient expressivity for awkward, combinatorial signal processing structures, allowing for a correct-by-construction approach, and a prototype compiler for netlist synthesis.

    Low power FFT processor design considerations for OFDM communications

    Today's emerging communication technologies require fast processing as well as efficient use of resources. This project specifically addresses the power-efficient design of an FFT processor as it relates to OFDM communications such as cognitive radio. The Fast Fourier Transform (FFT) processor is what enables the efficient modulation in OFDM. As the FFT processor is the most computationally intensive component in OFDM communication, the power efficiency improvement of this component can have great impacts on the overall system. These impacts are significant considering the number of mobile and remote communication devices that rely on limited battery-powered operation. This project explores current FFT processor algorithms and architectures as well as optimization techniques that aim to reduce the power consumption of these devices. A floating point as well as a fixed point dynamically size-configurable FFT processor was designed in VHDL for FPGA applications, and power-saving modifications were implemented while analyzing the results.
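
    As a software reference point for the kind of computation such a processor performs, a minimal iterative radix-2 FFT is sketched below. It is a generic textbook decimation-in-time algorithm in floating point, not the VHDL design described in the project; the function name fft_radix2 is hypothetical.

```python
import cmath


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT for power-of-two lengths."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1

    # Reorder the input into bit-reversed index order.
    a = [0j] * n
    for i in range(n):
        rev = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        a[rev] = complex(x[i])

    # log2(n) butterfly stages, doubling the sub-transform size each time.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w
                a[start + k] = u + v
                a[start + k + half] = u - v
                w *= w_step
        size *= 2
    return a


print(fft_radix2([0, 1, 0, 0]))
```

    Because the length is a run-time argument, the same routine covers any power-of-two size, loosely mirroring the size-configurable design mentioned in the abstract.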