1,376 research outputs found

    Hardware-software co-design for low-cost AI processing in space processors

    Get PDF
    In the recent years there has been an increasing interest in artificial intelligence (AI) and machine learning (ML). The advantages of such applications are widespread across many areas and have drawn the attention of different sectors, such as aerospace. However, these applications require much more performance than the one provided by space processors. In space the environment is not ideal for high-performance cutting-edge processors, due to radiation. For this reason, radiation hardened or radiation tolerant processors are required, which use older technologies and redundant logic, reducing the available die resources that can be exploited. In order to accelerate demanding AI applications in space processors, this thesis presents SPARROW, a low-cost SIMD accelerator for AI operations. SPARROW has been designed following a hardware-software co-design approach by analyzing the requirements of common AI applications in order to improve the efficiency of the module. The design of such module does not use any existing vector extension and instead has in its portability one of the key advantages over other implementations. Furthermore, SPARROW reuses the integer register file of the processor avoiding complex managing of the data while reducing significantly the hardware cost of the module, which is specially interesting in the space domain due to the constraints in the processor area. SPARROW operates with 8-bit integer vector components in two different stages, performing parallel computations in the first and reduction operations in the second. This design is integrated within the baseline processor not requiring any additional pipeline stage nor a modification of the processor frequency. SPARROW also includes swizzling and masking capabilities for the input vectors as well as saturation to work with 8 bits without overflow. SPARROW has been integrated with the LEON3 and NOEL-V space-grade processors, both distributed by Cobham Gaisler. Since each of the baseline processors has a different architecture set, software support for SPARROW has been provided for both SPARC v8 and RISC-V ISAs, showing the portability of the design. Software support been developed using two well established compilers, LLVM and GCC allowing for a comparison of the cost of developing support for each of them. The modifications have included the SPARROW instructions in the assembly language of each architecture and with the use of inline assembly and macros allow a programming model similar to SIMD intrinsics. LEON3 and NOEL-V extended with SPARROW have been implemented on a FPGA to evaluate the performance increase provided by our proposal. In order to compare the performance with the scalar version of the processor, different AI related applications have been tested such as matrix multiplication and image filters, which are essential building blocks for convolutional neural networks. With the use of SPARROW a speed-ups of 6x and up to 15x have been achieved

    Verification of Magnitude and Phase Responses in Fixed-Point Digital Filters

    Full text link
    In the digital signal processing (DSP) area, one of the most important tasks is digital filter design. Currently, this procedure is performed with the aid of computational tools, which generally assume filter coefficients represented with floating-point arithmetic. Nonetheless, during the implementation phase, which is often done in digital signal processors or field programmable gate arrays, the representation of the obtained coefficients can be carried out through integer or fixed-point arithmetic, which often results in unexpected behavior or even unstable filters. The present work addresses this issue and proposes a verification methodology based on the digital-system verifier (DSVerifier), with the goal of checking fixed-point digital filters w.r.t. implementation aspects. In particular, DSVerifier checks whether the number of bits used in coefficient representation will result in a filter with the same features specified during the design phase. Experimental results show that errors regarding frequency response and overflow are likely to be identified with the proposed methodology, which thus improves overall system's reliability

    SMT-Based Bounded Model Checking of Fixed-Point Digital Controllers

    Full text link
    Digital controllers have several advantages with respect to their flexibility and design's simplicity. However, they are subject to problems that are not faced by analog controllers. In particular, these problems are related to the finite word-length implementation that might lead to overflows, limit cycles, and time constraints in fixed-point processors. This paper proposes a new method to detect design's errors in digital controllers using a state-of-the art bounded model checker based on satisfiability modulo theories. The experiments with digital controllers for a ball and beam plant demonstrate that the proposed method can be very effective in finding errors in digital controllers than other existing approaches based on traditional simulations tools

    Fast Linear Programming through Transprecision Computing on Small and Sparse Data

    Get PDF
    A plethora of program analysis and optimization techniques rely on linear programming at their heart. However, such techniques are often considered too slow for production use. While today’s best solvers are optimized for complex problems with thousands of dimensions, linear programming, as used in compilers, is typically applied to small and seemingly trivial problems, but to many instances in a single compilation run. As a result, compilers do not benefit from decades of research on optimizing large-scale linear programming. We design a simplex solver targeted at compilers. A novel theory of transprecision computation applied from individual elements to full data-structures provides the computational foundation. By carefully combining it with optimized representations for small and sparse matrices and specialized small-coefficient algorithms, we (1) reduce memory traffic, (2) exploit wide vectors, and (3) use low-precision arithmetic units effectively. We evaluate our work by embedding our solver into a state-of-the-art integer set library and implement one essential operation, coalescing, on top of our transprecision solver. Our evaluation shows more than an order-of-magnitude speedup on the core simplex pivot operation and a mean speedup of 3.2x (vs. GMP) and 4.6x (vs. IMath) for the optimized coalescing operation. Our results demonstrate that our optimizations exploit the wide SIMD instructions of modern microarchitectures effectively. We expect our work to provide foundations for a future integer set library that uses transprecision arithmetic to accelerate compiler analyses.ISSN:2475-142

    Integer multipliers with overflow detection

    Full text link
    corecore