45 research outputs found

    Final report of the Working Group on Nephrops Survey (WGNEPS)

    WGNEPS is a coordination group for the Expert Groups on trawl and underwater imaging surveys aimed at estimating Nephrops abundance in the ICES area and, on an exploratory basis, in the Mediterranean.

    Approximate computing design exploration through data lifetime metrics

    When designing an approximate computing system, the selection of the resources to modify is key. It is important that the error introduced in the system remains reasonable, but the size of the design exploration space can make this extremely difficult. In this paper, we propose to exploit a new metric for this selection: data lifetime. The concept comes from the field of reliability, where it can guide selective hardening: the more often a resource handles "live" data, the more critical it becomes and the more important it is to protect it. We propose to use this same metric in a new way: to identify the less critical resources as approximation targets, in order to minimize the impact of approximation on the global system behavior while increasing gains on other criteria.

    Design of High Speed Memory-Based FFT Processor Using 90nm Technology

    The Fast Fourier Transform (FFT) is an important operation in digital signal processing (DSP) systems and has been extensively studied in order to enhance performance. State-of-the-art transmission technology uses orthogonal frequency-division multiplexing (OFDM), whose primary operation is the FFT. This work presents the design of a high-speed memory-based FFT processor using 90 nm technology. A novel hybrid multiplier and a novel hybrid adder are used. The main objective is to develop an efficient, memory-efficient FFT processor that requires less area. The proposed FFT processor was designed and implemented in 90 nm CMOS (complementary metal-oxide-semiconductor) technology. With reduced processing time, the proposed FFT processor outperforms prior memory-based FFT processors both in speed and in the number of LUTs required, which reduces area and memory utilization.

    Advanced Wireless Digital Baseband Signal Processing Beyond 100 Gbit/s

    The continuing trend towards higher data rates in wireless communication systems will, in addition to higher spectral efficiency and lower signal processing latencies, lead to throughput requirements for the digital baseband signal processing beyond 100 Gbit/s, which is at least one order of magnitude higher than the tens of Gbit/s targeted in the 5G standardization. At the same time, advances in silicon technology due to shrinking feature sizes and increased performance parameters alone won't provide the necessary gain, especially in energy efficiency for wireless transceivers, which have tightly constrained power and energy budgets. In this paper, we highlight the challenges for wireless digital baseband signal processing beyond 100 Gbit/s and the limitations of today's architectures. Our focus lies on channel decoding and MIMO detection, which are major sources of complexity in digital baseband signal processing. We discuss techniques at the algorithmic and architectural levels that aim to close this gap. For the first time, we show Turbo-Code decoding techniques towards 100 Gbit/s and a complete MIMO receiver beyond 100 Gbit/s in 28 nm technology.

    Impact of fast-converging PEVD algorithms on broadband AoA estimation

    Polynomial matrix eigenvalue decomposition (PEVD) algorithms have been shown to enable a solution to the broadband angle of arrival (AoA) estimation problem. A parahermitian cross-spectral density (CSD) matrix can be generated from samples gathered by multiple array elements. The application of the PEVD to this CSD matrix leads to a paraunitary matrix which can be used within the spatio-spectral polynomial multiple signal classification (SSP-MUSIC) AoA estimation algorithm. Here, we demonstrate that the recent low-complexity divide-and-conquer sequential matrix diagonalisation (DC-SMD) algorithm, when paired with SSP-MUSIC, is able to provide superior AoA estimation versus traditional PEVD methods for the same algorithm execution time. We also provide results that quantify the performance trade-offs that DC-SMD offers for various algorithm parameters, and show that algorithm convergence speed can be increased at the expense of increased decomposition error and poorer AoA estimation performance

    Exploring Hardware Fault Impacts on Different Real Number Representations of the Structural Resilience of TCUs in GPUs

    The most recent generations of graphics processing units (GPUs) boost the execution of convolutional operations required by machine learning applications by resorting to specialized and efficient on-chip accelerators (Tensor Core Units or TCUs) that operate on matrix multiplication tiles. Unfortunately, modern cutting-edge semiconductor technologies are increasingly prone to hardware defects, and the trend to highly stress TCUs during the execution of safety-critical and high-performance computing (HPC) applications increases the likelihood of TCUs producing different kinds of failures. In fact, the intrinsic resiliency to hardware faults of arithmetic units plays a crucial role in safety-critical applications using GPUs (e.g., in automotive, space, and autonomous robotics). Recently, new arithmetic formats have been proposed, particularly those suited to neural network execution. However, the reliability characterization of TCUs supporting different arithmetic formats was still lacking. In this work, we quantitatively assessed the impact of hardware faults in TCU structures while employing two distinct formats (floating-point and posit) and using two different configurations (16 and 32 bits) to represent real numbers. For the experimental evaluation, we resorted to an architectural description of a TCU core (PyOpenTCU) and performed 120 fault simulation campaigns, injecting around 200,000 faults per campaign and requiring around 32 days of computation. Our results demonstrate that the posit format of TCUs is less affected by faults than the floating-point one (by up to three orders of magnitude for 16 bits and up to twenty orders of magnitude for 32 bits). We also identified the most sensitive fault locations (i.e., those that produce the largest errors), thus paving the way to adopting smart hardening solutions.
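As a rough illustration of why the location of a fault matters, the sketch below flips single bits in the stored pattern of an IEEE 754 32-bit float using Python's `struct` module. It is not the PyOpenTCU fault model described in the abstract and it does not cover posits; the helper name `flip_bit` is hypothetical.

```python
import math
import struct


def flip_bit(value, bit):
    """Return `value` (stored as a 32-bit float) with one bit inverted.

    Bit 31 is the sign, bits 30..23 the exponent, bits 22..0 the fraction.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    (faulty,) = struct.unpack("<f", struct.pack("<I", bits ^ (1 << bit)))
    return faulty


# The same single-bit fault has very different effects depending on position.
fraction_lsb = flip_bit(1.0, 0)    # tiny error: 1 + 2**-23
exponent_lsb = flip_bit(1.0, 23)   # value is halved
exponent_msb = flip_bit(1.0, 30)   # exponent becomes all ones: infinity
sign_bit = flip_bit(1.0, 31)       # sign is inverted

for name, v in [("fraction LSB", fraction_lsb), ("exponent LSB", exponent_lsb),
                ("exponent MSB", exponent_msb), ("sign", sign_bit)]:
    print(f"{name:13s} -> {v!r}")
```

Sweeping `bit` over 0..31 for a set of representative operands gives a crude per-bit sensitivity profile of a number format, which is the kind of information a selective hardening strategy would need.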

    Mathematical tools for processing broadband multi-sensor signals

    Spatial information in broadband array signals is embedded in the relative delay with which sources illuminate different sensors. Therefore, second order statistics, on which cost functions such as the mean square error rest, must include such delays. Typically, a space-time covariance matrix therefore arises, which can be represented as a Laurent polynomial matrix. The optimisation of a cost function then requires extending the utility of the eigenvalue decomposition from narrowband covariance matrices to the broadband case of operating on a space-time covariance matrix. This overview paper summarises efforts in performing such factorisations, and demonstrates via the exemplar application of a broadband beamformer how well-known narrowband solutions can thus be extended to the broadband case using polynomial matrices and their factorisations.
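To make the idea of a space-time covariance matrix concrete, here is a minimal sketch that estimates R[tau] = E{x[n] x[n - tau]^T} from real-valued multichannel data and shows that the inter-sensor delay appears as the lag of the cross-correlation peak. The function name `space_time_covariance`, the two-sensor setup, and the 3-sample delay are assumptions chosen for illustration, not taken from the paper.

```python
import random


def space_time_covariance(x, max_lag):
    """Estimate R[tau] = E{x[n] x[n - tau]^T} for real-valued data.

    x is a list of M channels, each a list of N samples. Returns a dict
    mapping every lag tau in [-max_lag, max_lag] to an M x M matrix.
    """
    m, n = len(x), len(x[0])
    R = {}
    for tau in range(-max_lag, max_lag + 1):
        lo, hi = max(0, tau), min(n, n + tau)  # keep k and k - tau in range
        R[tau] = [[sum(x[i][k] * x[j][k - tau] for k in range(lo, hi)) / n
                   for j in range(m)] for i in range(m)]
    return R


# Two sensors observe the same white source; sensor 1 sees it 3 samples late.
random.seed(0)
delay = 3
s = [random.gauss(0.0, 1.0) for _ in range(2000)]
x = [s, [0.0] * delay + s[:-delay]]

R = space_time_covariance(x, max_lag=5)

# The cross term between sensor 1 and sensor 0 peaks at the delay.
peak_lag = max(R, key=lambda tau: abs(R[tau][1][0]))
print("estimated delay:", peak_lag)
```

For real data the estimate satisfies R[tau] = R[-tau]^T, the real-valued form of the parahermitian symmetry that polynomial matrix factorisations rely on. Stacking the matrices R[tau] as coefficients of z^(-tau) gives the Laurent polynomial matrix mentioned in the abstract.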

    Customizing Fixed-Point and Floating-Point Arithmetic - A Case Study in K-Means Clustering

    This paper presents a comparison between custom fixed-point (FxP) and floating-point (FlP) arithmetic, applied to the two-dimensional K-means clustering algorithm. After a discussion on the K-means clustering algorithm and arithmetic characteristics, hardware implementations of FxP and FlP arithmetic operators are compared in terms of area, delay and energy, for different bitwidths, using the ApxPerf2.0 framework. Finally, both are compared in the context of K-means clustering. The direct comparison shows the large difference between 8-to-16-bit FxP and FlP operators, FlP adders consuming 5-12× more energy than FxP adders, and multipliers 2-10× more. However, when applied to the K-means clustering algorithm, the gap between FxP and FlP tightens. Indeed, the accuracy improvements brought by FlP make the computation more accurate and lead to an accuracy equivalent to FxP with fewer iterations of the algorithm, proportionally reducing the global energy spent. The 8-bit version of the algorithm becomes more profitable using FlP, which is 80% more accurate with only 1.6× more energy. This paper finally discusses the potential of custom FlP for low-energy general-purpose computation, thanks to its ease of use, supported by an energy overhead lower than what could have been expected.
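The following sketch illustrates, under simplified assumptions, how the fixed-point bitwidth can change the outcome of the K-means assignment step. It does not use the ApxPerf2.0 framework and does not model energy; the helpers `quantize`, `nearest_centroid` and `fixed_labels`, the sample points, and the chosen formats are all hypothetical.

```python
def quantize(value, frac_bits, total_bits):
    """Round to a signed fixed-point grid with saturation; return the integer code."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, round(value * scale)))


def nearest_centroid(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    def dist2(c):
        return sum((p - q) ** 2 for p, q in zip(point, c))
    return min(range(len(centroids)), key=lambda i: dist2(centroids[i]))


points = [(0.10, 0.20), (0.90, 0.80), (0.52, 0.49), (0.48, 0.51)]
centroids = [(0.25, 0.25), (0.75, 0.75)]

# Reference assignment in double-precision floating point.
float_labels = [nearest_centroid(p, centroids) for p in points]


def fixed_labels(frac_bits, total_bits=8):
    """Assignment step computed entirely on quantized integer codes."""
    qp = [tuple(quantize(v, frac_bits, total_bits) for v in p) for p in points]
    qc = [tuple(quantize(v, frac_bits, total_bits) for v in c) for c in centroids]
    return [nearest_centroid(p, qc) for p in qp]


print("float       :", float_labels)
print("7 frac bits :", fixed_labels(7))
print("2 frac bits :", fixed_labels(2))
```

With 7 fractional bits the assignments match the floating-point reference, while with 2 fractional bits the points near the decision boundary collapse onto the same grid cell and one of them is assigned to a different cluster. Errors of this kind are what additional iterations, or a more accurate number format, would have to compensate for.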