969 research outputs found

    Summed Parallel Infinite Impulse Response (SPIIR) Filters For Low-Latency Gravitational Wave Detection

    With the upgrade of current gravitational wave detectors, the first detection of gravitational wave signals is expected to occur in the next decade. Low-latency gravitational wave triggers will be necessary to make fast follow-up electromagnetic observations of events related to their source, e.g., prompt optical emission associated with short gamma-ray bursts. In this paper we present a new time-domain low-latency algorithm for identifying the presence of gravitational waves produced by compact binary coalescence events in noisy detector data. Our method calculates the signal to noise ratio from the summation of a bank of parallel infinite impulse response (IIR) filters. We show that our summed parallel infinite impulse response (SPIIR) method can retrieve the signal to noise ratio to greater than 99% of that produced from the optimal matched filter. We emphasise the benefits of the SPIIR method for advanced detectors, which will require larger template banks. Comment: 9 pages, 6 figures, for PR
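
As a rough illustration of the summed-parallel idea in the abstract above (not the authors' implementation), the sketch below runs a small bank of first-order complex IIR filters, each with its own input delay, and sums their outputs. The function name `spiir_sum` and the `(a1, b0, delay)` triples are assumptions made for this example; how the coefficients are chosen so that the summed response approximates a matched-filter template is not shown.

```python
from typing import List, Sequence, Tuple

# Each hypothetical filter is (a1, b0, delay): feedback coefficient,
# input gain, and integer input delay in samples.
Filter = Tuple[complex, complex, int]


def spiir_sum(x: Sequence[float], bank: Sequence[Filter]) -> List[complex]:
    """Sum the outputs of parallel first-order IIR filters.

    Each filter obeys y[k] = a1 * y[k-1] + b0 * x[k - delay].
    """
    total = [0j] * len(x)
    for a1, b0, delay in bank:
        y_prev = 0j
        for k in range(len(x)):
            # Samples before the start of the data are treated as zero.
            x_delayed = x[k - delay] if k >= delay else 0.0
            y_prev = a1 * y_prev + b0 * x_delayed
            total[k] += y_prev
    return total


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 7
    bank = [(0.9 + 0.1j, 1.0 + 0j, 0), (0.8 - 0.2j, 0.5j, 2)]
    for k, value in enumerate(spiir_sum(impulse, bank)):
        print(k, value)
```

Each filter costs one multiply-add per sample regardless of template length, which is why a time-domain bank of this kind suits low-latency processing.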

    Efficient implementation of 90 degrees phase shifter in FPGA

    In this article, we present an efficient way of implementing a 90° phase shifter in an FPGA using a Hilbert transformer with canonic signed digit (CSD) coefficients. It is implemented as a 27-tap symmetric finite impulse response (FIR) filter. Representing the filter coefficients in CSD eliminates the need for multipliers, so the filter is implemented using only shifters and adders/subtractors. The simulated frequency responses of the Hilbert transformer with infinite-precision coefficients and with CSD coefficients agree with each other. The proposed architecture requires less hardware than a conventional CSD FIR filter implementation, because one adder is saved for every negative coefficient. It also offers high phase-shift accuracy.
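
To make the "shifters and adders/subtractors" point concrete, here is a minimal sketch of CSD recoding and shift-add multiplication for integer (fixed-point) coefficients. The function names `to_csd` and `csd_multiply` are assumptions for this example, and the code does not reproduce the article's architecture or its adder-saving trick for negative coefficients.

```python
from typing import List


def to_csd(value: int) -> List[int]:
    """Return CSD digits (each -1, 0 or +1), least significant first."""
    digits = []
    while value != 0:
        if value & 1:
            # Choose +1 or -1 so the remainder is divisible by 4, which
            # forces the next digit to be zero (no adjacent non-zeros).
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def csd_multiply(sample: int, coeff_digits: List[int]) -> int:
    """Multiply a sample by a CSD coefficient using shifts and adds only."""
    acc = 0
    for shift, digit in enumerate(coeff_digits):
        if digit == 1:
            acc += sample << shift
        elif digit == -1:
            acc -= sample << shift
    return acc


if __name__ == "__main__":
    coeff = 23
    digits = to_csd(coeff)
    print(digits)
    print(csd_multiply(10, digits), 10 * coeff)
```

Every zero digit is an adder that does not need to be built, which is where the hardware saving of a CSD coefficient set comes from.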

    Application of evolutionary computing in the design of high throughput digital filters.

    An Efficient Design of 2-D Digital Filters Using Singular Value Decomposition and Genetic Algorithm with Canonical Signed Digit (CSD) Coefficients

    In this thesis, the design of 2-D filters by singular value decomposition (SVD) is proposed. This technique reduces the complexity of the designed 2-D digital filter by decomposing it into a set of 1-D digital filters in z1 and z2 connected in cascade. The SVD-based design can be improved by varying the order of the 1-D digital filters in each section according to their corresponding singular values. It is shown that assigning higher-order filters to the sections with greater singular values (SVs), and lower-order filters to the sections with lower SVs, yields a sizable reduction in the total number of required multiplications. A Genetic Algorithm (GA) is used to design each of the 1-D filters instead of classical optimization. The canonical signed digit (CSD) system is used to represent the filters' coefficients. CSD helps to improve the efficiency of multiplications and thus increases the throughput rate. Examples are provided to demonstrate the effectiveness and usefulness of the proposed technique.
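
The decomposition described above can be written schematically as follows. The symbols are assumptions introduced here, not the thesis's notation: A is a matrix of samples of the desired 2-D response, and F_i, G_i are the 1-D filters fitted to the i-th pair of singular vectors.

```latex
A = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{\mathsf{T}},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0,
\qquad
H(z_1, z_2) \approx \sum_{i=1}^{K} F_i(z_1)\, G_i(z_2), \quad K \le r .
```

Under this reading, each section is a cascade of one filter in z1 and one in z2. Sections with large \sigma_i contribute most to the response, which is consistent with giving them the higher filter orders.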

    Approximation of Löwdin Orthogonalization to a Spectrally Efficient Orthogonal Overlapping PPM Design for UWB Impulse Radio

    In this paper we consider the design of spectrally efficient time-limited pulses for ultrawideband (UWB) systems using an overlapping pulse position modulation scheme. For this we investigate an orthogonalization method, which was developed in 1950 by Per-Olov Löwdin. Our objective is to obtain a set of N orthogonal (Löwdin) pulses, which remain time-limited and spectrally efficient for UWB systems, from a set of N equidistant translates of a time-limited optimal spectral designed UWB pulse. We derive an approximate Löwdin orthogonalization (ALO) by using circulant approximations for the Gram matrix to obtain a practical filter implementation. We show that the centered ALO and Löwdin pulses converge pointwise to the same Nyquist pulse as N tends to infinity. The set of translates of the Nyquist pulse forms an orthonormal basis for the shift-invariant space generated by the initial spectral optimal pulse. The ALO transform provides a closed-form approximation of the Löwdin transform, which can be implemented in an analog fashion without the need for analog-to-digital conversions. Furthermore, we investigate the interplay between the optimization and the orthogonalization procedure by using methods from the theory of shift-invariant spaces. Finally we develop a connection between our results and wavelet and frame theory. Comment: 33 pages, 11 figures. Accepted for publication 9 Sep 201
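
For orientation, the standard form of Löwdin (symmetric) orthogonalization and of a circulant approximation to it can be sketched as below. The notation is assumed for this note rather than taken from the paper: p_k are the N translates with spacing T, G is their Gram matrix, C is a circulant matrix approximating G, and F is the unitary DFT matrix.

```latex
p_k(t) = p(t - kT), \qquad
G_{jk} = \langle p_j, p_k \rangle, \qquad
\tilde{p}_k = \sum_{j=1}^{N} \bigl[ G^{-1/2} \bigr]_{jk} \, p_j ,
\qquad
G \approx C = F^{*} \Lambda F
\;\Longrightarrow\;
C^{-1/2} = F^{*} \Lambda^{-1/2} F .
```

Because a circulant matrix is diagonalized by the DFT, its inverse square root has a closed form and acts as a convolution, which is why such an approximation lends itself to a filter implementation.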

    Multiplierless CSD techniques for high performance FPGA implementation of digital filters.

    I leverage FastCSD to develop a new, high performance iterative multiplierless structure based on a novel real-time CSD recoding, so that more zero partial products are introduced. Up to 66.7% zero partial products occur compared to 50% in the traditional modified Booth's recoding. Also, this structure reduces the non-zero partial products to a minimum. As a result, the number of arithmetic operations in the carry-save structure is reduced. Thus, an overall speed-up, as well as low-power consumption can be achieved. Furthermore, because the proposed structure involves real time CSD recoding and does not require a fixed value for the multiplier input to be known a priori, the proposed multiplier can be applied to implement digital filters with non-fixed filter coefficients, such as adaptive filters. My work is based on a dramatic new technique for converting between 2's complement and CSD number systems, and results in high-performance structures that are particularly effective for implementing adaptive systems in reconfigurable logic. My research focus is on two key ideas for improving DSP performance: (1) Develop new high performance, efficient shift-add techniques ("multiplierless") to implement the multiply-add operations without the need for a traditional multiplier structure. (2) There is a growing trend toward design prototyping and even production in FPGAs as opposed to dedicated DSP processors or ASICs; leverage this trend synergistically with the new multiplierless structures to improve performance. Implementation of digital signal processing (DSP) algorithms in hardware, such as field programmable gate arrays (FPGAs), requires a large number of multipliers. Fast, low area multiply-adds have become critical in modern commercial and military DSP applications.
    In many contemporary real-time DSP and multimedia applications, system performance is severely impacted by the limitations of currently available speed, energy efficiency, and area requirement of an onboard silicon multiplier. I also introduce a new multi-input Canonical Signed Digit (CSD) multiplier unit, which requires fewer shift/add/subtract operations and reduced CSD number conversion overhead compared to existing techniques. This results in reduced power consumption and area requirements in the hardware implementation of DSP algorithms. Furthermore, because all the products are produced simultaneously, the multiplication speed and thus the throughput are improved. The multi-input multiplier unit is applied to implement digital filters with non-fixed filter coefficients, such as adaptive filters. The implementation cost of these digital filters can be further reduced by limiting the wordlength of the input signal with little or no sacrifice to the filter performance, which is confirmed by my simulation results. The proposed multiplier unit can also be applied to other DSP algorithms, such as digital filter banks or matrix and vector multiplications. Finally, the tradeoff between filter order and coefficient length in the design and implementation of high-performance filters in Field Programmable Gate Arrays (FPGAs) is discussed. Non-minimum order FIR filters are designed for implementation using Canonical Signed Digit (CSD) multiplierless implementation techniques. By increasing the filter order, the length of the coefficients can be decreased without reducing the filter performance. Thus, an overall hardware savings can be achieved. Adaptive system implementations require real-time conversion of coefficients to Canonical Signed Digit (CSD) or similar representations to benefit from multiplierless techniques for implementing filters. Multiplierless approaches are used to reduce the hardware and increase the throughput.
    This dissertation introduces the first non-iterative hardware algorithm to convert 2's complement numbers to their CSD representations (FastCSD) using a fixed number of shift and logic operations. As a result, the power consumption and area requirements required for hardware implementation of DSP algorithms in which the coefficients are not known a priori can be greatly reduced. Because all CSD digits are produced simultaneously, the conversion speed and thus the throughput are improved when compared to overlap-and-scan techniques such as Booth's recoding.
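
The dissertation's FastCSD algorithm is not described in enough detail here to reproduce. As a stand-in, the sketch below uses a well-known word-parallel formula for the non-adjacent form, which yields the same digit set as CSD, to show what producing all digits at once with a fixed operation count can look like. Note that it uses one addition besides shifts and logic operations. The function name `csd_parallel` and the restriction to non-negative inputs are assumptions of this example.

```python
from typing import Tuple


def csd_parallel(x: int) -> Tuple[int, int]:
    """Return (plus_mask, minus_mask) with x == plus_mask - minus_mask.

    Bit i of plus_mask marks a +1 digit at weight 2**i and bit i of
    minus_mask marks a -1 digit. There is no per-digit loop.
    """
    if x < 0:
        raise ValueError("this sketch handles non-negative integers only")
    half = x >> 1
    triple_half = x + half          # 3x/2, rounded down
    changed = half ^ triple_half    # positions that hold a non-zero digit
    plus_mask = triple_half & changed
    minus_mask = half & changed
    return plus_mask, minus_mask


if __name__ == "__main__":
    for value in (7, 11, 23, 255):
        plus, minus = csd_parallel(value)
        print(value, bin(plus), bin(minus))
```

The non-zero positions never touch each other, so at most about half of the digit positions can be non-zero, and each zero digit is a partial product that a shift-add multiplier can skip.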

    Equalization of Third-Order Intermodulation Products in Wideband Direct Conversion Receivers

    This paper reports a SAW-less direct-conversion receiver which utilizes a mixed-signal feedforward path to regenerate and adaptively cancel IM3 products, thus accomplishing system-level linearization. The receiver system performance is dominated by a custom integrated RF front end implemented in 130-nm CMOS and achieves an uncorrected out-of-band IIP3 of -7.1 dBm under the worst-case UMTS FDD Region 1 blocking specifications. Under IM3 equalization, the receiver achieves an effective IIP3 of +5.3 dBm and meets the UMTS BER sensitivity requirement with 3.7 dB of margin.

    Maximum-likelihood estimation of delta-domain model parameters from noisy output signals

    Fast sampling is desirable to describe signal transmission through wide-bandwidth systems. The delta-operator provides an ideal discrete-time modeling description for such fast-sampled systems. However, the estimates of delta-domain model parameters are usually biased when the delta-transformations are applied directly to a sampled signal corrupted by additive measurement noise. This problem is solved here by expectation-maximization, where the delta-transformations of the true signal are estimated and then used to obtain the model parameters. A numerical example shows that the method is more accurate than a shift-operator approach when the sample rate is fast.
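
A small numerical sketch may help show why the delta operator suits fast sampling. It assumes the usual definition delta = (q - 1) / T, with q the forward shift and T the sampling period, and a first-order system dx/dt = a*x chosen purely for illustration. The function names are hypothetical, and the expectation-maximization estimator from the abstract is not shown.

```python
import math


def shift_pole(a: float, T: float) -> float:
    """Pole p of the sampled model x[k+1] = p * x[k] for dx/dt = a * x."""
    return math.exp(a * T)


def delta_pole(a: float, T: float) -> float:
    """Pole of the same model in the delta domain, (p - 1) / T."""
    # expm1 avoids cancellation when a * T is very small.
    return math.expm1(a * T) / T


if __name__ == "__main__":
    a = -2.0  # continuous-time pole, an arbitrary example value
    for T in (1e-1, 1e-2, 1e-3, 1e-4):
        print(T, shift_pole(a, T), delta_pole(a, T))
```

As T shrinks, the shift-domain pole crowds toward 1 for every system, while the delta-domain pole approaches the continuous-time value a, so the delta parameters stay well scaled and remain distinguishable.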

    Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions

    Recent advances in attention-free sequence models rely on convolutions as alternatives to the attention operator at the core of Transformers. In particular, long convolution sequence models have achieved state-of-the-art performance in many domains, but incur a significant cost during auto-regressive inference workloads -- naively requiring a full pass (or caching of activations) over the input sequence for each generated token -- similarly to attention-based models. In this paper, we seek to enable O(1) compute and memory cost per token in any pre-trained long convolution architecture to reduce memory footprint and increase throughput during generation. Concretely, our methods consist in extracting low-dimensional linear state-space models from each convolution layer, building upon rational interpolation and model-order reduction techniques. We further introduce architectural improvements to convolution-based layers such as Hyena: by weight-tying the filters across channels into heads, we achieve higher pre-training quality and reduce the number of filters to be distilled. The resulting model achieves 10x higher throughput than Transformers and 1.5x higher than Hyena at 1.3B parameters, without any loss in quality after distillation.
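
The reason a state-space form gives constant cost per token can be seen in a toy example. The sketch assumes a filter that is already exactly a sum of decaying exponentials, h[t] = sum_n R_n * lambda_n**t, with real residues and poles chosen arbitrarily. The paper's distillation step, which fits such a form to a trained filter, is not reproduced, and all function names are hypothetical.

```python
from typing import List, Sequence


def modal_filter(residues: Sequence[float], poles: Sequence[float],
                 length: int) -> List[float]:
    """Materialize h[t] = sum_n residues[n] * poles[n] ** t."""
    return [sum(r * p ** t for r, p in zip(residues, poles))
            for t in range(length)]


def causal_convolution(h: Sequence[float], u: Sequence[float]) -> List[float]:
    """Direct causal convolution: work per output grows with t."""
    return [sum(h[t - s] * u[s] for s in range(t + 1))
            for t in range(len(u))]


def recurrent_generate(residues: Sequence[float], poles: Sequence[float],
                       u: Sequence[float]) -> List[float]:
    """Same outputs from a fixed-size state: constant work per token."""
    state = [0.0] * len(poles)
    outputs = []
    for u_t in u:
        # One state update per mode: x_n <- lambda_n * x_n + u_t.
        state = [p * x + u_t for p, x in zip(poles, state)]
        outputs.append(sum(r * x for r, x in zip(residues, state)))
    return outputs


if __name__ == "__main__":
    residues, poles = [1.0, -0.5], [0.9, 0.5]
    u = [1.0, 2.0, 0.0, -1.0, 3.0]
    h = modal_filter(residues, poles, len(u))
    print(causal_convolution(h, u))
    print(recurrent_generate(residues, poles, u))
```

The recurrence keeps only one number per mode, so memory does not grow with sequence length, whereas the direct convolution must revisit the whole input history for each new token.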

    The design and multiplier-less realization of software radio receivers with reduced system delay

    This paper studies the design and multiplier-less realization of a new software radio receiver (SRR) with reduced system delay. It employs low-delay finite-impulse response (FIR) and digital allpass filters to effectively reduce the system delay of the multistage decimators in SRRs. The optimal least-square and minimax designs of these low-delay FIR and allpass-based filters are formulated as a semidefinite programming (SDP) problem, which allows a zero magnitude constraint at ω = π to be incorporated readily as additional linear matrix inequalities (LMIs). By implementing the sampling rate converter (SRC) using a variable digital filter (VDF) immediately after the integer decimators, the need for an expensive programmable FIR filter in the traditional SRR is avoided. A new method for the optimal minimax design of this VDF-based SRC using SDP is also proposed and compared with the traditional weighted least squares method. Other implementation issues, including the multiplier-less and digital signal processor (DSP) realizations of the SRR and the generation of the clock signal in the SRC, are also studied. Design results show that the system delay and implementation complexities (especially in terms of high-speed variable multipliers) of the proposed architecture are considerably reduced compared with conventional approaches. © 2004 IEEE.