135 research outputs found
An Inexact Ultra-low Power Bio-signal Processing Architecture With Lightweight Error Recovery
The energy efficiency of digital architectures is tightly linked to the voltage level (Vdd) at which they operate. Aggressive voltage scaling is therefore mandatory when ultra-low power processing is required. Nonetheless, the lowest admissible Vdd is often bounded by reliability concerns, especially since static and dynamic non-idealities are exacerbated in the near-threshold region, imposing costly guard-bands to guarantee correctness under worst-case conditions. A striking alternative, explored in this paper, waives the requirement for unconditional correctness and adopts more relaxed constraints instead. First, after a run-time failure, processing correctly resumes at a later point in time. Second, failures induce a limited Quality-of-Service (QoS) degradation. We focus our investigation on the practical scenario of embedded bio-signal analysis, a domain in which energy efficiency is key, while applications are inherently error-tolerant to a certain degree. Targeting a domain-specific multi-core platform, we present a study of the impact of inexactness on application-visible errors. Then, we introduce a novel methodology to manage them, which requires minimal hardware resources and a negligible energy overhead. Experimental evidence shows that, by tolerating 900 errors/hour, the resulting inexact platform can achieve an efficiency increase of up to 24%, with a QoS degradation of less than 3%.
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
The challenging deployment of compute-intensive applications from domains
such as Artificial Intelligence (AI) and Digital Signal Processing (DSP) forces
the community of computing systems to explore new design approaches.
Approximate Computing appears as an emerging solution, allowing the
quality of results to be tuned in the design of a system in order to improve the energy
efficiency and/or performance. This radical paradigm shift has attracted
interest from both academia and industry, resulting in significant research on
approximation techniques and methodologies at different design layers (from
system down to integrated circuits). Motivated by the wide appeal of
Approximate Computing over the last 10 years, we conduct a two-part survey to
cover key aspects (e.g., terminology and applications) and review the
state-of-the-art approximation techniques from all layers of the traditional
computing stack. In Part II of our survey, we classify and present the
technical details of application-specific and architectural approximation
techniques, which both target the design of resource-efficient
processors/accelerators & systems. Moreover, we present a detailed analysis of
the application spectrum of Approximate Computing and discuss open challenges
and future directions.
Comment: Under Review at ACM Computing Survey
Hardware / Software Architectural and Technological Exploration for Energy-Efficient and Reliable Biomedical Devices
Nowadays, the ubiquity of smart appliances in our everyday lives is increasingly strengthening the links between humans and machines. Beyond making our lives easier and more convenient, smart devices are now playing an important role in personalized healthcare delivery. This technological breakthrough is particularly relevant in a world where population aging and unhealthy habits have made non-communicable diseases the leading cause of death worldwide, according to international public health organizations. In this context, smart health monitoring systems, termed Wireless Body Sensor Nodes (WBSNs), represent a paradigm shift in the healthcare landscape by greatly lowering the cost of long-term monitoring of chronic diseases, as well as improving patients' lifestyles. WBSNs are able to autonomously acquire biological signals and embed on-node Digital Signal Processing (DSP) capabilities to deliver clinically-accurate health diagnoses in real-time, even outside of a hospital environment. Energy efficiency and reliability are fundamental requirements for WBSNs, since they must operate for extended periods of time while relying on compact batteries. These constraints, in turn, call for carefully designed hardware and software architectures to host the execution of complex biomedical applications. In this thesis, I develop and explore novel solutions at the architectural and technological level of the integrated circuit design domain, to enhance the energy efficiency and reliability of current WBSNs. Firstly, following a top-down approach driven by the characteristics of biomedical algorithms, I perform an architectural exploration of a heterogeneous and reconfigurable computing platform devoted to bio-signal analysis. By interfacing a shared Coarse-Grained Reconfigurable Array (CGRA) accelerator, this domain-specific platform can achieve higher performance and energy savings, beyond the capabilities offered by a baseline multi-processor system.
More precisely, I propose three CGRA architectures, each contributing differently to maximizing application parallelization. The proposed Single, Multi and Interleaved-Datapath CGRA designs allow the developed platform to achieve substantial energy savings of up to 37% when executing complex biomedical applications, with respect to a multi-core-only platform. Secondly, I investigate how the modeling of technology reliability issues in logic and memory components can be exploited to adequately adjust the frequency and supply voltage of a circuit, with the aim of optimizing its computing performance and energy efficiency. To this end, I propose a novel framework for analysing the workload-dependent impact of Bias Temperature Instability (BTI) on the quality of biomedical application results. Remarkably, the framework is able to determine the range of safe circuit operating frequencies without introducing worst-case guard bands. Experiments highlight the possibility to safely raise the frequency up to 101% above the maximum obtained with classical static timing analysis. Finally, through the study of several well-known biomedical algorithms, I propose an approach that saves energy by dynamically and unequally protecting an under-powered data memory, in a way that differs from regular error protection schemes. This solution relies on the Dynamic eRror compEnsation And Masking (DREAM) technique, which reduces by approximately 21% the energy consumed by traditional error correction codes.
A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits
Given the stringent requirements of energy efficiency for Internet-of-Things
edge devices, approximate multipliers, as a basic component of many processors
and accelerators, have been constantly proposed and studied for decades,
especially in error-resilient applications. The computation error and energy
efficiency largely depend on how and where the approximation is introduced into
a design. Thus, this article aims to provide a comprehensive review of the
approximation techniques in multiplier designs ranging from algorithms and
architectures to circuits. We have implemented representative approximate
multiplier designs in each category to understand the impact of the design
techniques on accuracy and efficiency. The designs can then be effectively
deployed in high-level applications, such as machine learning, to gain energy
efficiency at the cost of slight accuracy loss.
Comment: 38 pages, 37 figure
Investigation of reconfigurable-accuracy approximate adder designs for image processing applications
Ph. D. Thesis.
In recent decades, the scaling of CMOS integrated circuits has faced
growing challenges from increased power density and
power dissipation. Meanwhile, the high-performance requirements of
current and future applications place rapidly growing demands on
computing resources such as power. This design conflict has driven
much effort into the search for high-performance and energy-efficient
design approaches, such as approximate computing.
Approximate computing exploits the error resilience of compute-intensive
applications, such as image processing, to
implement approximation design techniques at different levels
of abstraction and scalability. The basic principle is to relax the
strict accuracy requirements in favour of a lower design complexity,
thereby achieving higher computational performance (i.e., speed)
and energy savings. The adder is considered one
of the essential computational blocks in most applications.
As such, much effort has gone into exploring new, efficient
approximate adder designs.
This thesis presents an investigation into design enhancements,
novel approximate adder designs and implementation approaches.
The first approach introduces a modification to the error detection
technique of a popular configurable-accuracy approximate adder
design. The proposed lightweight error detection technique reduces
the number of gates required by the error detection circuit, thus mitigating
the design area overhead. Furthermore, in the error correction
process of the adder, we propose extended error detection
that activates more than one correction stage concurrently. As a
result, the outputs achieve optimum accuracy for
the worst case of quality requirements.
In general, approximate (speculative) adder designs use the
segmentation technique to divide the adder into multiple short
sub-adders which operate in parallel. This limits the
long chains of carry propagation and results in better
performance. However, using overlapped parts of sub-adders
for better carry speculation, and hence more accuracy,
poses the significant challenge of a large design area overhead. The
second approach mitigates this challenge by
presenting a novel and simpler technique for dividing the adder into a number of
sub-adders. The new method uses what is known as the carry-kill
signal both for limiting the carry propagation and for applying adder
segmentation. Further, between every two adjacent sub-adders,
one AND gate and one XOR gate are used for carry speculation
and error (i.e., carry propagation) detection, respectively. Thus, a
significant reduction of the design overhead has been achieved, yet
with acceptable levels of output accuracy. In the third and final
approach, simple logic OR gates are used to build the approximate
adder in place of the conventional full adder operation.
The resulting approximate adder design presents very low complexity,
high speed, and low power consumption. Furthermore, instead
of adding an error recovery circuit, short bit-length exact adders
are used as correction stages to control the general level of output
quality (i.e., without error detection overhead). At the final correction
stage, the proposed design operates the same as an exact
adder.
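The OR-gate idea in the third approach can be illustrated with a minimal software sketch. It is not the thesis's circuit: it assumes a simple scheme in which the low `approx_bits` bit positions are combined with a bitwise OR (no carries) and the remaining upper bits are added exactly, and it models each additional correction stage as a smaller `approx_bits`. The names `approx_add` and `max_error` are hypothetical and chosen for illustration only.

```python
def approx_add(a: int, b: int, approx_bits: int) -> int:
    """Approximately add two non-negative integers.

    Assumed scheme (not taken from the thesis): the low `approx_bits`
    bit positions are combined with a bitwise OR, so no carries are
    generated there; the remaining upper bits use an exact addition.
    """
    low_mask = (1 << approx_bits) - 1
    low = (a | b) & low_mask                      # OR gates replace full adders
    high = ((a >> approx_bits) + (b >> approx_bits)) << approx_bits
    return high | low                             # low bits of `high` are zero


def max_error(width: int, approx_bits: int) -> int:
    """Largest absolute error over all pairs of `width`-bit operands."""
    limit = 1 << width
    return max(
        abs((a + b) - approx_add(a, b, approx_bits))
        for a in range(limit)
        for b in range(limit)
    )


if __name__ == "__main__":
    # Fewer approximate bits stands in for more correction stages.
    for k in (6, 4, 2, 0):
        print(f"approx_bits={k}: max error = {max_error(8, k)}")
```

In this sketch the error of a single addition equals the bitwise AND of the two operands' low parts, so it shrinks as `approx_bits` decreases, and with `approx_bits=0` the function behaves as an exact adder, mirroring the statement about the final correction stage.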
To validate the efficiency of these approaches, a number of adders
with different bit-widths are designed and synthesized, showing
considerable reductions in the critical delay and silicon area and greater
savings in energy consumption compared to other existing
approaches, in addition to acceptable levels of output error, which
are extensively analysed for each proposed design.
In this study, the proposed configurable adder designs exhibit
energy/quality trade-offs at different numbers of correction stages.
These trade-offs can be effectively exploited to implement adders
in applications where energy can be gracefully minimised within
the envelope of quality requirements. As such, the designs were implemented
in an image processing application, a Gaussian blur
filter, demonstrating the loss in image quality
at each error correction stage. The output images showed promising
results for using the proposed designs in more energy-efficient
applications, where output quality requirements can be relaxed.
Mutah Universit
Approximation Opportunities in Edge Computing Hardware : A Systematic Literature Review
With the increasing popularity of the Internet of Things and massive Machine Type Communication technologies, the number of connected devices is rising. However, while these devices bring valuable benefits to our lives, bandwidth and latency constraints challenge Cloud processing of the associated volumes of data. A promising solution to these challenges is the combination of Edge and approximate computing techniques that allows for data processing nearer to the user. This paper aims to survey the potential benefits of these paradigms’ intersection. We provide a state-of-the-art review of circuit-level and architecture-level hardware techniques and popular applications. We also outline essential future research directions.
publishedVersion
Peer reviewe
Tailoring SVM Inference for Resource-Efficient ECG-Based Epilepsy Monitors
Event detection and classification algorithms are resilient towards aggressive resource-aware optimisations. In this paper, we leverage this characteristic in the context of smart health monitoring systems. In more detail, we study the attainable benefits resulting from tailoring Support Vector Machine (SVM) inference engines devoted to the detection of epileptic seizures from ECG-derived features. We conceive and explore multiple optimisations, each effectively reducing resource budgets while minimally impacting classification performance. These strategies can be seamlessly combined, which results in 12.5X and 16X gains in energy and area, respectively, with a negligible loss of 3.2% in classification performance.
Electronic systems for the restoration of the sense of touch in upper limb prosthetics
In the last few years, research on active prosthetics for upper limbs has focused
on improving human functionality and control. New methods have
been proposed for measuring the user muscle activity and translating it into
the prosthesis control commands. Developing the feed-forward interface so
that the prosthesis better follows the intention of the user is an important
step towards improving the quality of life of people with limb amputation.
However, prosthesis users can neither feel if something or someone is
touching them over the prosthesis nor perceive the temperature or
roughness of objects. Prosthesis users are helped by looking at an object,
but they cannot detect anything otherwise. Their sight gives them most
information. Therefore, to foster the prosthesis embodiment and utility,
it is necessary to have a prosthetic system that not only responds to the
control signals provided by the user, but also transmits back to the user
the information about the current state of the prosthesis.
This thesis presents an electronic skin system to close the loop in prostheses
towards the restoration of the sense of touch in prosthesis users. The
proposed electronic skin system includes advanced distributed sensing
(electronic skin), a system for (i) signal conditioning, (ii) data acquisition,
and (iii) data processing, and a stimulation system. The idea is to integrate
all these components into a myoelectric prosthesis.
Embedding the electronic system and the sensing materials is a critical issue
in the development of new prostheses. In particular, processing
the data originating from the electronic skin into low- or high-level information
is the key issue to be addressed by the embedded electronic system.
Recently, Machine Learning has been shown to be a promising
approach for processing tactile sensor information. Many studies have
shown the effectiveness of Machine Learning in the classification of input
touch modalities.
More specifically, this thesis is focused on the stimulation system, allowing
the communication of a mechanical interaction from the electronic skin
to prosthesis users, and the dedicated implementation of algorithms for
processing tactile data originating from the electronic skin. At the system
level, the thesis provides the design of the experimental setup, the experimental
protocol, and the algorithms to process tactile data. At the architectural level,
the thesis proposes a design flow for the implementation of digital circuits
for both FPGA and integrated circuits, and techniques for the power
management of embedded systems for Machine Learning algorithms
- …