2 research outputs found

    PULP-HD: Accelerating Brain-Inspired High-Dimensional Computing on a Parallel Ultra-Low Power Platform

    Computing with high-dimensional (HD) vectors, also referred to as hypervectors, is a brain-inspired alternative to computing with scalars. Key properties of HD computing include a well-defined set of arithmetic operations on hypervectors, generality, scalability, robustness, fast learning, and ubiquitous parallel operations. HD computing is about manipulating and comparing large patterns (binary hypervectors with 10,000 dimensions), which makes its efficient realization on minimalistic ultra-low-power platforms challenging. This paper describes HD computing's acceleration and its optimization of memory accesses and operations on a silicon prototype of the PULPv3 4-core platform (1.5 mm², 2 mW), surpassing the state-of-the-art classification accuracy (on average 92.4%) with a simultaneous 3.7× end-to-end speed-up and 2× energy saving compared to its single-core execution. We further explore the scalability of our accelerator by increasing the number of inputs and the classification window on a new generation of the PULP architecture featuring bit-manipulation instruction extensions and a larger number of cores (8). Together, these enable a near-ideal speed-up of 18.4× compared to the single-core PULPv3.
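    The abstract mentions arithmetic operations on 10,000-dimensional binary hypervectors without listing them. As a rough illustration only, the sketch below uses operations commonly described in the HD computing literature (XOR binding, bitwise-majority bundling, Hamming-distance comparison). The function names and the choice to store each hypervector as a Python integer are assumptions made for this example and are not taken from the paper or the PULP implementation.

```python
import random

D = 10_000  # hypervector dimensionality quoted in the abstract


def random_hv(rng):
    """Draw a random binary hypervector, stored as a D-bit Python int (assumed encoding)."""
    return rng.getrandbits(D)


def bind(a, b):
    """Bind two hypervectors with bitwise XOR; binding with the same vector twice undoes it."""
    return a ^ b


def bundle(hvs):
    """Bundle hypervectors by bitwise majority vote (use an odd count to avoid ties)."""
    out = 0
    for i in range(D):
        ones = sum((hv >> i) & 1 for hv in hvs)
        if 2 * ones > len(hvs):
            out |= 1 << i
    return out


def hamming(a, b):
    """Count the bit positions in which two hypervectors differ."""
    return bin(a ^ b).count("1")


rng = random.Random(0)
a, b, c, unrelated = (random_hv(rng) for _ in range(4))
prototype = bundle([a, b, c])
# A bundled prototype stays much closer to its members than to an unrelated vector.
print(hamming(prototype, a), hamming(prototype, unrelated))
```

    In this sketch classification would amount to choosing the stored prototype with the smallest Hamming distance to a query; the paper's actual encoding and classification pipeline may differ.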

    A wide tuning-range ADFLL for mW-SoCs with dithering-enhanced accuracy in 65 nm CMOS

    We present an integer-N all-digital frequency-locked loop (ADFLL) suitable for dynamic voltage and frequency scaling in system-on-chips targeting mW-consumption. The proposed ADFLL operates with a 32 kHz clock reference and offers a large clock multiplication factor of 32786, resulting in a wide tuning range from 19 kHz to 1.048 GHz at 1.2 V and to 250 MHz at 0.8 V. It incorporates a jitter reduction technique enabling the generation of accurate low-rate clocks in ADFLLs, combining clock division and dithering based on a 1st-order digital ΣΔ-modulator. The measured reduction of the peak cycle-to-cycle (C2C) jitter, which depends on the clock division factor, was between 40% and 70% at a 200 MHz DCO clock. The lowest peak C2C jitter of 0.14% was measured at a 3.15 MHz output clock derived from an 800 MHz DCO clock. A prototype in UMC 65 nm CMOS occupies 0.013 mm² of area, and at 100 MHz consumes 605 µW (scaling with 3 µW/MHz) at 1.2 V, and 205 µW (scaling with 1.2 µW/MHz) at 0.8 V.
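    The abstract combines clock division with dithering from a 1st-order digital ΣΔ-modulator but does not describe the circuit. The sketch below shows only the generic textbook idea, under assumed parameters: an accumulator-based first-order modulator switches an integer divider between N and N+1 so that the average division factor becomes fractional. The function name, word width, and example values are hypothetical and do not come from the paper.

```python
def sigma_delta_divider(n_int, frac, bits, cycles):
    """Return the per-cycle division factors chosen by a first-order ΣΔ accumulator.

    n_int  : integer part of the target division factor
    frac   : fractional part, expressed as an integer in [0, 2**bits)
    bits   : accumulator width (assumed value, not from the abstract)
    cycles : number of output cycles to simulate
    """
    modulus = 1 << bits
    acc = 0
    factors = []
    for _ in range(cycles):
        acc += frac
        if acc >= modulus:
            # Accumulator overflow: divide by N+1 for this cycle.
            acc -= modulus
            factors.append(n_int + 1)
        else:
            factors.append(n_int)
    return factors


# Hypothetical target: divide by 63.5 using an 8-bit accumulator.
seq = sigma_delta_divider(n_int=63, frac=128, bits=8, cycles=256)
print(sum(seq) / len(seq))
```

    Over a full accumulator period the average equals n_int + frac / 2**bits, while each individual cycle still uses an integer division factor.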