3,091 research outputs found

    Low-complexity RLS algorithms using dichotomous coordinate descent iterations

    In this paper, we derive low-complexity recursive least squares (RLS) adaptive filtering algorithms. We express the RLS problem in terms of auxiliary normal equations with respect to increments of the filter weights and apply this approach to the exponentially weighted and sliding window cases to derive new RLS techniques. For solving the auxiliary equations, line search methods are used. We first consider conjugate gradient iterations with a complexity of O(N^2) operations per sample, N being the number of filter weights. To reduce the complexity and make the algorithms more suitable for finite precision implementation, we propose a new dichotomous coordinate descent (DCD) algorithm and apply it to the auxiliary equations. This results in a transversal RLS adaptive filter with complexity as low as 3N multiplications per sample, which is only slightly higher than the complexity of the least mean squares (LMS) algorithm (2N multiplications). Simulations are used to compare the performance of the proposed algorithms against the classical RLS and known advanced adaptive algorithms. Fixed-point FPGA implementation of the proposed DCD-based RLS algorithm is also discussed and results of such implementation are presented.
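The abstract above does not spell out the DCD iterations, so the following is only a minimal sketch of one common cyclic form of dichotomous coordinate descent for a symmetric positive-definite system R h = beta. The function name `dcd_solve` and the parameters `amplitude`, `n_bits`, and `max_updates` are assumptions made for illustration, not the authors' notation. The step size is always a power-of-two fraction of `amplitude`, which is why a fixed-point implementation can replace the multiplications by bit shifts.

```python
def dcd_solve(R, beta, amplitude=1.0, n_bits=16, max_updates=64):
    """Approximately solve R h = beta by cyclic dichotomous coordinate descent.

    Assumes R is symmetric positive definite and that the solution entries
    lie within [-amplitude, amplitude]. Returns (h, residual).
    """
    n = len(beta)
    h = [0.0] * n          # solution estimate
    r = list(beta)         # residual beta - R h
    step = amplitude
    updates = 0
    for _ in range(n_bits):
        step /= 2.0        # halve the step: one more bit of resolution
        changed = True
        while changed:     # repeat passes at this step until no update fires
            changed = False
            for i in range(n):
                # Update coordinate i only if it reduces the quadratic cost.
                if abs(r[i]) > 0.5 * step * R[i][i]:
                    s = 1.0 if r[i] > 0 else -1.0
                    h[i] += s * step
                    for j in range(n):
                        r[j] -= s * step * R[j][i]
                    changed = True
                    updates += 1
                    if updates >= max_updates:
                        return h, r
    return h, r
```

For example, with `R = [[2.0, 0.5], [0.5, 1.0]]` and `beta = [0.25, -0.375]`, the sketch needs two coordinate updates to reach the solution `[0.25, -0.5]`.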

    Minimum entropy restoration using FPGAs and high-level techniques

    One of the greatest perceived barriers to the widespread use of FPGAs in image processing is the difficulty that application specialists face in developing algorithms on reconfigurable hardware. Minimum entropy deconvolution (MED) techniques have been shown to be effective in the restoration of star-field images. This paper reports on an attempt to implement a MED algorithm using simulated annealing, first on a microprocessor, then on an FPGA. The FPGA implementation uses DIME-C, a C-to-gates compiler, coupled with a low-level core library to simplify the design task. Analysis of the C code and output from the DIME-C compiler guided the code optimisation. The paper reports on the design effort that this entailed and the resultant performance improvements.

    A Multi-Grid Iterative Method for Photoacoustic Tomography

    Inspired by recent advances in minimizing nonsmooth or bound-constrained convex functions on models using varying degrees of fidelity, we propose a line search multigrid (MG) method for full-wave iterative image reconstruction in photoacoustic tomography (PAT) in heterogeneous media. To compute the search direction at each iteration, we decide between the gradient at the target level and an approximate error correction at a coarser level, relying on some predefined criteria. To incorporate absorption and dispersion, we derive the analytical adjoint directly from the first-order acoustic wave system. The effectiveness of the proposed method is tested on a total-variation penalized Iterative Shrinkage Thresholding algorithm (ISTA) and its accelerated variant (FISTA), which have been used in many studies of image reconstruction in PAT. The results show the great potential of the proposed method in improving the speed of iterative image reconstruction.

    Hardware Based Projection onto The Parity Polytope and Probability Simplex

    This paper is concerned with the adaptation to hardware of methods for Euclidean norm projections onto the parity polytope and probability simplex. We first refine recent efforts to develop efficient methods of projection onto the parity polytope. Our resulting algorithm can be configured to have either average computational complexity $\mathcal{O}(d)$ or worst-case complexity $\mathcal{O}(d \log d)$ on a serial processor, where $d$ is the dimension of the projection space. We show how to adapt our projection routine to hardware. Our projection method uses a sub-routine that involves another Euclidean projection, onto the probability simplex. We therefore explain how to adapt to hardware a well-known simplex projection algorithm. The hardware implementations of both projection algorithms achieve area scalings of $\mathcal{O}(d (\log d)^2)$ at a delay of $\mathcal{O}((\log d)^2)$. Finally, we present numerical results in which we evaluate the fixed-point accuracy and resource scaling of these algorithms when targeting a modern FPGA.
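The abstract does not say which simplex projection algorithm is adapted. As an illustration only, the sketch below uses the widely known sort-based method, whose cost on a serial processor is dominated by the O(d log d) sort. The function name `project_to_simplex` is hypothetical, and the code is a floating-point software reference, not the hardware design described in the paper.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x_i >= 0, sum(x) = 1}.

    Sort-based method: find the threshold theta such that
    sum(max(v_i - theta, 0)) = 1, then clip.
    """
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        # The condition holds for every k up to the support size and fails
        # afterwards, so the last accepted candidate is the threshold.
        if u_k - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]
```

For instance, `project_to_simplex([2.0, 0.0])` puts all mass on the first coordinate, while a point already on the simplex such as `[0.5, 0.5]` is returned unchanged.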

    Real-Time Dense Stereo Matching With ELAS on FPGA Accelerated Embedded Devices

    For many applications in low-power real-time robotics, stereo cameras are the sensors of choice for depth perception as they are typically cheaper and more versatile than their active counterparts. Their biggest drawback, however, is that they do not directly sense depth maps; instead, these must be estimated through data-intensive processes. Therefore, appropriate algorithm selection plays an important role in achieving the desired performance characteristics. Motivated by applications in space and mobile robotics, we implement and evaluate an FPGA-accelerated adaptation of the ELAS algorithm. Despite offering one of the best trade-offs between efficiency and accuracy, ELAS has only been shown to run at 1.5-3 fps on a high-end CPU. Our system preserves all intriguing properties of the original algorithm, such as the slanted plane priors, but can achieve a frame rate of 47 fps whilst consuming under 4 W of power. Unlike previous FPGA-based designs, we take advantage of both components on the CPU/FPGA System-on-Chip to showcase the strategy necessary to accelerate more complex and computationally diverse algorithms for such low-power, real-time systems. Comment: 8 pages, 7 figures, 2 tables

    Multi-Level Pre-Correlation RFI Flagging for Real-Time Implementation on UniBoard

    Because of the denser active use of the spectrum, and because of the higher sensitivity of radio telescopes, radio frequency interference (RFI) mitigation has become a sensitive topic for current and future radio telescope designs. Even though quite sophisticated approaches have been proposed in recent years, the majority of operational RFI mitigation procedures are based on flagging corrupted data after correlation. Moreover, given the huge amount of data delivered by current and next-generation radio telescopes, all these RFI detection procedures have to be at least automatic and, if possible, real-time. In this paper, the implementation of a real-time pre-correlation RFI detection and flagging procedure on generic high-performance computing platforms based on Field Programmable Gate Arrays (FPGAs) is described, simulated and tested. One of these boards, UniBoard, developed under a Joint Research Activity in the RadioNet FP7 European programme, is based on eight FPGAs interconnected by a high-speed transceiver mesh. It provides up to ~4 TMACs with Altera Stratix IV FPGAs and a 160 Gbps data rate for the input data stream. Considering the high in-out data rate in the pre-correlation stages, only real-time and go-through detectors (i.e. no iterative processing) can be implemented. In this paper, a real-time and adaptive detection scheme is described. An ongoing case study has been set up with the Electronic Multi-Beam Radio Astronomy Concept (EMBRACE) radio telescope facility at Nançay Observatory. The objective is to evaluate the performance of this concept in terms of hardware complexity, detection efficiency and additional RFI metadata rate cost. The UniBoard implementation scheme is described. Comment: 16 pages, 13 figures

    Efficient FPGA implementation of Recursive Least Square adaptive filter using non-restoring division algorithm

    In this paper, Recursive Least Square (RLS) and Affine Projection (AP) adaptive filters are designed using Xilinx System Generator and implemented on the Spartan6 xc6slx16-2csg324 FPGA platform. The FPGA platform uses the non-restoring division algorithm and the COordinate Rotation DIgital Computer (CORDIC) division algorithm to perform the division task of the RLS and AP adaptive filters. The non-restoring division algorithm demonstrates efficient performance in terms of convergence speed and signal-to-noise ratio. The CORDIC division algorithm requires 31 cycles for division initialization, whereas the non-restoring algorithm initializes division in just one cycle. To validate the effectiveness of the proposed filters, a set of ten ECG records from the MIT-BIH database is used to test their ability to remove Power Line Interference (PLI) noise from the ECG signal. The proposed adaptive filters are compared with various adaptive algorithms in terms of Signal-to-Noise Ratio (SNR), convergence speed, residual noise, steady-state Mean Square Error (MSE), and complexity.
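For readers unfamiliar with non-restoring division, the following is a small software sketch of the textbook unsigned integer version. It is not the fixed-point divider from the paper; the function name `non_restoring_divide` and the `n_bits` parameter are assumptions for illustration. Each iteration performs one shift and one add or subtract, chosen by the sign of the partial remainder, with a single correction step at the end.

```python
def non_restoring_divide(dividend, divisor, n_bits=16):
    """Unsigned non-restoring division; returns (quotient, remainder)."""
    if divisor <= 0 or dividend < 0:
        raise ValueError("expects dividend >= 0 and divisor > 0")
    if dividend >= (1 << n_bits) or divisor >= (1 << n_bits):
        raise ValueError("operands must fit in n_bits")
    remainder = 0
    quotient = 0
    for i in range(n_bits - 1, -1, -1):
        bit = (dividend >> i) & 1
        # Shift in the next dividend bit, then subtract the divisor if the
        # partial remainder is non-negative, or add it back if negative.
        if remainder >= 0:
            remainder = 2 * remainder + bit - divisor
        else:
            remainder = 2 * remainder + bit + divisor
        # The quotient bit is 1 exactly when the new remainder is non-negative.
        quotient = (quotient << 1) | (1 if remainder >= 0 else 0)
    if remainder < 0:
        remainder += divisor   # final correction step
    return quotient, remainder
```

Unlike restoring division, a negative partial remainder is never undone inside the loop, so every iteration costs exactly one addition or subtraction.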