2,259 research outputs found
DeSyRe: on-Demand System Reliability
The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect-free and fault-free system would heavily, even prohibitively, impact the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints.
Stochastic rounding and reduced-precision fixed-point arithmetic for solving neural ordinary differential equations
Although double-precision floating-point arithmetic currently dominates
high-performance computing, there is increasing interest in smaller and simpler
arithmetic types. The main reasons are potential improvements in energy
efficiency and memory footprint and bandwidth. However, simply switching to
lower-precision types typically results in increased numerical errors. We
investigate approaches to improving the accuracy of reduced-precision
fixed-point arithmetic types, using examples in an important domain for
numerical computation in neuroscience: the solution of Ordinary Differential
Equations (ODEs). The Izhikevich neuron model is used to demonstrate that
rounding has an important role in producing accurate spike timings from
explicit ODE solution algorithms. In particular, fixed-point arithmetic with
stochastic rounding consistently results in smaller errors compared to single
precision floating-point and fixed-point arithmetic with round-to-nearest
across a range of neuron behaviours and ODE solvers. A computationally much
cheaper alternative is also investigated, inspired by the concept of dither
that is a widely understood mechanism for providing resolution below the least
significant bit (LSB) in digital signal processing. These results will have
implications for the solution of ODEs in other subject areas, and should also
be directly relevant to the huge range of practical problems that are
represented by Partial Differential Equations (PDEs). Comment: Submitted to Philosophical Transactions of the Royal Society
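The role of rounding described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 8-bit fractional format, the step size, and the function names `round_nearest`, `round_stochastic`, and `accumulate` are all assumptions chosen for illustration. The sketch repeatedly adds an increment smaller than half an LSB, as a small explicit ODE update might, and compares round-to-nearest with stochastic rounding.

```python
import math
import random

FRAC_BITS = 8            # assumed number of fractional bits (illustrative only)
SCALE = 1 << FRAC_BITS   # one LSB equals 1 / SCALE


def round_nearest(x):
    """Quantize x to the fixed-point grid using round-to-nearest."""
    return math.floor(x * SCALE + 0.5) / SCALE


def round_stochastic(x, rng):
    """Quantize x, rounding up with probability equal to the discarded fraction."""
    scaled = x * SCALE
    low = math.floor(scaled)
    frac = scaled - low
    return (low + (1 if rng.random() < frac else 0)) / SCALE


def accumulate(step, n, rounder):
    """Add `step` n times, quantizing the running total after every addition."""
    total = 0.0
    for _ in range(n):
        total = rounder(total + step)
    return total


rng = random.Random(0)   # fixed seed so the run is repeatable
step = 0.001             # smaller than half an LSB (1/512)
n = 10000                # exact sum would be 10.0

rn_total = accumulate(step, n, round_nearest)
sr_total = accumulate(step, n, lambda v: round_stochastic(v, rng))

print("round-to-nearest:", rn_total)
print("stochastic      :", sr_total)
```

Under round-to-nearest every sub-LSB increment is discarded, so the total never leaves zero. Stochastic rounding keeps the increments in expectation, so the total lands close to the exact sum.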
Monitoring and Data Quality assessments for the ATLAS Liquid Argon Calorimeter at the LHC
The ATLAS detector at the Large Hadron Collider is expected to collect an unprecedented wealth of data at a completely new energy scale. In particular, its Liquid Argon (LAr) electromagnetic and hadronic calorimeters will play an essential role in measuring final states with electrons and photons and in contributing to the measurement of jets and missing transverse energy. The ATLAS LAr calorimeter is a system of three sampling calorimeters (electromagnetic barrel, hadronic endcaps and forward calorimeters) with LAr as the sensitive medium. It is composed of 182,468 readout channels and covers a pseudo-rapidity region up to 4.9. Efficient monitoring will be crucial from the earliest data taking onward and at multiple levels of the electronic readout and triggering systems. Detection of serious data integrity issues along the read-out chain during data taking will be essential so that quick action can be taken. Moreover, by providing essential information about the performance of each sub-detector, the quality of the data collected (hot or dead channels, alignment and calibration problems, timing problems...) and their impact on physics measurements, the monitoring will be critical in guaranteeing that data is ready for physics analysis in due time. Software tools and criteria for monitoring the LAr data during the cosmic muon runs, which have been taking place since October 2006, are discussed. The further extension of the strategy to monitoring the collision data expected at the end of 2009 is also described.
Continuous-Time and Companding Digital Signal Processors Using Adaptivity and Asynchronous Techniques
The fully synchronous approach has been the norm for digital signal processors (DSPs) for many decades. Due to its simplicity, the classical DSP structure has been used in many applications. However, due to its rigid discrete-time operation, a classical DSP has limited efficiency or inadequate resolution for some emerging applications, such as processing of multimedia and biological signals. This thesis proposes fundamentally new approaches to designing DSPs, which are different from the classical scheme. The defining characteristic of all new DSPs examined in this thesis is the notion of "adaptivity" or "adaptability." Adaptive DSPs dynamically change their behavior to adjust to some property of their input stream, for example the rate of change of the input. This thesis presents both enhancements to existing adaptive DSPs and new adaptive DSPs. The main class of DSPs examined throughout the thesis is continuous-time (CT) DSPs. CT DSPs are clock-less and event-driven; they naturally adapt their activity and power consumption to the rate of their inputs. The absence of a clock also completely avoids aliasing in the frequency domain, and hence improves signal fidelity. The core of this thesis deals with the complete and systematic design of a truly general-purpose CT DSP. A scalable design methodology for CT DSPs is presented. This leads to the main contribution of this thesis, namely a new CT DSP chip. This chip is the first general-purpose CT DSP chip, able to process many different classes of CT and synchronous signals. The chip can handle various types of signals, i.e. various digital modulations, both synchronous and asynchronous, without requiring any reconfiguration; this property is presented for the first time in CT DSPs and is impossible for classical DSPs.
As opposed to previous CT DSPs, which were limited to using only one type of digital format, and whose design was hard to scale for different bandwidths and bit-widths, this chip has a formal, robust and scalable design, due to the systematic usage of asynchronous design techniques. The second contribution of this thesis is a complete methodology to design adaptive delay lines. In particular, it is shown how to make the granularity, i.e. the number of stages, adaptive in a real-time delay line. Adaptive granularity brings about a significant improvement in the line's power consumption, up to 70% as reported by simulations on two design examples. This enhancement can have a large, direct impact on the power of any CT DSP, since a delay line consumes the majority of a CT DSP's power. The robust methodology presented in this thesis allows safe dynamic reconfiguration of the line's granularity, on-the-fly and according to the input traffic. As a final contribution, the thesis also examines two additional DSPs: one operating in the CT domain and one using the companding technique. The former operates only on level-crossing samples; the proposed methodology shows a potential for high-quality outputs by using a complex interpolation function. Finally, a companding DSP is presented for MPEG audio. Companding DSPs adapt their dynamic range to the amplitude of their input; the resulting DSP can offer high-quality outputs even for small inputs. By applying companding to MPEG DSPs, it is shown how the DSP distortion can be made almost inaudible, without requiring complex arithmetic hardware.
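The event-driven behavior described in this abstract, where activity follows the rate of change of the input, can be illustrated with a rough sketch of level-crossing sampling. It is not taken from the thesis: it approximates a continuous-time signal with a densely sampled list, and the function name `level_crossing_events`, the level spacing `lsb`, and the test signals are all hypothetical.

```python
import math


def level_crossing_events(samples, lsb):
    """Return (time, level) events emitted whenever the signal moves into a new
    quantization band of width `lsb`. `samples` is a list of (time, value) pairs
    standing in for a continuous-time input (an approximation, not true CT)."""
    events = []
    prev = math.floor(samples[0][1] / lsb)
    for t, v in samples[1:]:
        cur = math.floor(v / lsb)
        if cur != prev:
            # The input left its previous band: emit one event and remember the band.
            events.append((t, cur * lsb))
            prev = cur
    return events


# Hypothetical inputs: a constant signal and a 5 Hz sine, both observed for 1 s.
times = [i * 0.001 for i in range(1000)]
quiet = [(t, 0.3) for t in times]
active = [(t, math.sin(2 * math.pi * 5 * t)) for t in times]

quiet_events = level_crossing_events(quiet, lsb=0.25)
active_events = level_crossing_events(active, lsb=0.25)

print("events for constant input:", len(quiet_events))
print("events for 5 Hz sine     :", len(active_events))
```

A constant input produces no events at all, while the sine produces events at a rate set by how quickly it sweeps across levels. This is the sense in which such a processor's activity tracks its input.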
Calibration-free and hardware-efficient neural spike detection for brain machine interfaces
Recent translational efforts in brain-machine interfaces (BMI) are demonstrating the potential to help people with neurological disorders. The current trend in BMI technology is to increase the number of recording channels to the thousands, resulting in the generation of vast amounts of raw data. This in turn places high bandwidth requirements on data transmission, which increases power consumption and thermal dissipation of implanted systems. On-implant compression and/or feature extraction are therefore becoming essential to limiting this increase in bandwidth, but add further power constraints: the power required for data reduction must remain less than the power saved through bandwidth reduction. Spike detection is a common feature extraction technique used for intracortical BMIs. In this paper, we develop a novel firing-rate-based spike detection algorithm that requires no external training and is hardware efficient and therefore ideally suited for real-time applications. Key performance and implementation metrics such as detection accuracy, adaptability in chronic deployment, power consumption, area utilization, and channel scalability are benchmarked against existing methods using various datasets. The algorithm is first validated using a reconfigurable hardware (FPGA) platform and then ported to a digital ASIC implementation in both 65 nm and 0.18 µm CMOS technologies. The 128-channel ASIC design implemented in a 65 nm CMOS technology occupies 0.096 mm² silicon area and consumes 4.86 µW from a 1.2 V power supply. The adaptive algorithm achieves a 96% spike detection accuracy on a commonly used synthetic dataset, without the need for any prior training.
Array Architectures and Physical Layer Design for Millimeter-Wave Communications Beyond 5G
Ever-increasing demands on mobile data rates have resulted in the exploration of millimeter-wave (mmW) frequencies for the next generation (5G) of wireless networks. Communication at mmW frequencies presents two key challenges. Firstly, high propagation loss requires base stations (BSs) and user equipment (UEs) to use a large number of antennas and narrow beams to close the link with sufficient received signal power. Consequently, communication using narrow beams creates a new challenge in channel estimation and link establishment based on fine angular probing. Current mmW systems use analog phased arrays that can probe only one angle at a time, which results in high latency during link establishment and channel tracking. It is desirable to design low-latency beam training by exploring both physical layer designs and array architectures that could replace current 5G approaches and pave the way to communications in the higher mmW bands and the sub-THz region, where larger antenna arrays and communication bandwidths can be exploited. To this end, we propose novel signal processing techniques that exploit unique properties of the mmW channel, and show theoretically, in simulation, and in experiments their advantages over conventional approaches. Secondly, we explore different array architecture designs and analyze their trade-offs between spectral efficiency, power consumption, and area. For a comprehensive comparison, we have developed a methodology for optimal design of system parameters for different array architecture candidates based on the spectral efficiency target, and use these parameters to estimate the array area and power consumption based on the circuits reported in the literature.
We show that the hybrid analog and digital architectures have severe scalability concerns in radio frequency signal distribution with increased array size and spatial multiplexing levels, while the fully-digital array architectures have the best performance and power/area trade-offs. The developed approaches are based on cross-disciplinary research that combines innovation in model-based signal processing, machine learning, and radio hardware. This work is the first to apply compressive sensing (CS), a signal processing tool that exploits the sparsity of the mmW channel model, to accelerate beam training of mmW cellular systems. The algorithm is designed to address practical issues including the requirement of cell discovery and synchronization, which involves estimation of the angular channel together with carrier frequency offset and timing offsets. We have analyzed the algorithm performance in a 5G-compliant simulation and shown that an order of magnitude saving is achieved in initial access latency for the desired channel estimation accuracy. Moreover, we are the first to develop and implement a neural network assisted compressive beam alignment to deal with hardware impairments in mmW radios. We have used a 60 GHz mmW testbed to perform experiments and show that the neural network approach enhances the alignment rate compared to CS. To further accelerate beam training, we propose novel frequency-selective probing beams using the true-time-delay (TTD) analog array architecture. Our approach utilizes different subcarriers to scan different directions, and achieves a single-shot beam alignment, the fastest approach reported to date. Our comprehensive analysis of different array architectures and exploration of emerging architectures enabled us to develop order-of-magnitude faster and energy-efficient approaches for initial access and channel estimation in mmW systems.
GVSoC: A Highly Configurable, Fast and Accurate Full-Platform Simulator for RISC-V based IoT Processors
Bruschi, Nazareno; Haugou, Germain; Tagliavini, Giuseppe; Conti, Francesco; Benini, Luca; Rossi, Davide
- …