Weighted p-bits for FPGA implementation of probabilistic circuits
Probabilistic spin logic (PSL) is a recently proposed computing paradigm
based on unstable stochastic units called probabilistic bits (p-bits) that can
be correlated to form probabilistic circuits (p-circuits). These p-circuits can
be used to solve problems of optimization, inference and also to implement
precise Boolean functions in an "inverted" mode, where a given Boolean circuit
can operate in reverse to find the input combinations that are consistent with
a given output. In this paper we present a scalable FPGA implementation of such
invertible p-circuits. We implement a "weighted" p-bit that combines stochastic
units with localized memory structures. We also present a generalized tile of
weighted p-bits to which a large class of problems beyond invertible Boolean
logic can be mapped, and we show how invertibility can be applied to interesting
problems such as the NP-complete Subset Sum Problem by solving a small instance
of this problem in hardware.
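To illustrate the "inverted" mode described in the abstract above, the following is a minimal software sketch of a three-p-bit AND gate. It is not the paper's FPGA design. It assumes the behavioral p-bit update commonly used in the PSL literature, m_i = sgn(tanh(I_i) - r) with r uniform in [-1, 1], and the coupling matrix `J`, bias vector `h`, scale `i0`, and all function names are hypothetical choices made for this example. The `ground_states` helper checks by enumeration that the lowest-energy states of these weights are exactly the AND truth table.

```python
import itertools
import math
import random
from collections import Counter

# Hypothetical couplings and biases for p-bits ordered (A, B, C), C = A AND B.
# Bits take values -1 (logic 0) and +1 (logic 1).
J = [[0, -1, 2],
     [-1, 0, 2],
     [2, 2, 0]]
h = [1, 1, -2]


def energy(m):
    """Ising-style energy of state m; lower energy means more probable."""
    pair = sum(J[i][j] * m[i] * m[j] for i in range(3) for j in range(i + 1, 3))
    bias = sum(h[i] * m[i] for i in range(3))
    return -(pair + bias)


def ground_states():
    """Enumerate all 8 states and return those with minimum energy."""
    states = list(itertools.product((-1, 1), repeat=3))
    e_min = min(energy(s) for s in states)
    return {s for s in states if energy(s) == e_min}


def sample(clamped, steps=2000, i0=2.0, seed=0):
    """Sequentially update the free p-bits; `clamped` maps index -> fixed value."""
    rng = random.Random(seed)
    m = [clamped.get(i, rng.choice((-1, 1))) for i in range(3)]
    counts = Counter()
    for _ in range(steps):
        for i in range(3):
            if i in clamped:
                continue
            # Input to p-bit i: scaled weighted sum of the other bits plus bias.
            inp = i0 * (sum(J[i][j] * m[j] for j in range(3)) + h[i])
            # Assumed p-bit rule: +1 with probability (1 + tanh(inp)) / 2.
            m[i] = 1 if math.tanh(inp) > rng.uniform(-1, 1) else -1
        counts[tuple(m)] += 1
    return counts
```

Clamping the output, for example `sample({2: 1})`, runs the gate in reverse: the free input bits settle on the combination consistent with C = 1. Clamping nothing lets the network wander among the four truth-table rows.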
Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions
In the past decade, Convolutional Neural Networks (CNNs) have demonstrated
state-of-the-art performance in various Artificial Intelligence tasks. To
accelerate the experimentation and development of CNNs, several software
frameworks have been released, primarily targeting power-hungry CPUs and GPUs.
In this context, reconfigurable hardware in the form of FPGAs constitutes a
potential alternative platform that can be integrated in the existing deep
learning ecosystem to provide a tunable balance between performance, power
consumption and programmability. In this paper, a survey of the existing
CNN-to-FPGA toolflows is presented, comprising a comparative study of their key
characteristics which include the supported applications, architectural
choices, design space exploration methods and achieved performance. Moreover,
major challenges and objectives introduced by the latest trends in CNN
algorithmic research are identified and presented. Finally, a uniform
evaluation methodology is proposed, aiming at the comprehensive, complete and
in-depth evaluation of CNN-to-FPGA toolflows.
Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal,
201
Electronic systems for the restoration of the sense of touch in upper limb prosthetics
In the last few years, research on active upper-limb prosthetics has focused
on improving functionality and control. New methods have
been proposed for measuring the user's muscle activity and translating it into
prosthesis control commands. Developing the feed-forward interface so
that the prosthesis better follows the intention of the user is an important
step towards improving the quality of life of people with limb amputation.
However, prosthesis users can neither feel whether something or someone is
touching the prosthesis nor perceive the temperature or
roughness of objects. They rely mainly on sight: looking at an object helps,
but without visual contact they cannot detect anything.
Therefore, to foster prosthesis embodiment and utility,
it is necessary to have a prosthetic system that not only responds to the
control signals provided by the user, but also transmits back to the user
information about the current state of the prosthesis.
This thesis presents an electronic skin system that closes the loop in prostheses,
towards the restoration of the sense of touch in prosthesis users. The
proposed electronic skin system includes advanced distributed sensing
(the electronic skin), a system for (i) signal conditioning, (ii) data acquisition,
and (iii) data processing, and a stimulation system. The idea is to integrate
all these components into a myoelectric prosthesis.
Embedding the electronic system and the sensing materials is a critical issue
in the development of new prostheses. In particular, processing
the data originating from the electronic skin into low- or high-level information
is the key issue to be addressed by the embedded electronic system.
Recently, Machine Learning has been shown to be a promising
approach for processing tactile sensor information. Many studies have
shown the effectiveness of Machine Learning in the classification of input
touch modalities.
More specifically, this thesis is focused on the stimulation system, allowing
the communication of a mechanical interaction from the electronic skin
to prosthesis users, and the dedicated implementation of algorithms for
processing tactile data originating from the electronic skin. At the system
level, the thesis provides the design of the experimental setup, the experimental
protocol, and the algorithms to process tactile data. At the architectural level,
the thesis proposes a design flow for the implementation of digital circuits
for both FPGAs and integrated circuits, and techniques for the power
management of embedded systems for Machine Learning algorithms.
Approximate FPGA-based LSTMs under Computation Time Constraints
Recurrent Neural Networks and in particular Long Short-Term Memory (LSTM)
networks have demonstrated state-of-the-art accuracy in several emerging
Artificial Intelligence tasks. However, the models are becoming increasingly
demanding in terms of computational and memory load. Emerging latency-sensitive
applications including mobile robots and autonomous vehicles often operate
under stringent computation time constraints. In this paper, we address the
challenge of deploying computationally demanding LSTMs at a constrained time
budget by introducing an approximate computing scheme that combines iterative
low-rank compression and pruning, along with a novel FPGA-based LSTM
architecture. Combined in an end-to-end framework, the approximation method's
parameters are optimised and the architecture is configured to address the
problem of high-performance LSTM execution in time-constrained applications.
Quantitative evaluation on a real-life image captioning application indicates
that the proposed methods required up to 6.5x less time to achieve the same
application-level accuracy compared to a baseline method, while achieving an
average of 25x higher accuracy under the same computation time constraints.
Comment: Accepted at the 14th International Symposium in Applied
Reconfigurable Computing (ARC) 201
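The LSTM abstract above combines low-rank compression with pruning but does not spell out the scheme. As a rough illustration of the two ingredients only, the sketch below approximates a weight matrix by a single rank-1 term found with power iteration, then zeroes the smallest-magnitude entries of a factor. Both steps are generic stand-ins rather than the paper's method, and every function name and parameter here is hypothetical.

```python
import math


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def transpose(w):
    return [list(col) for col in zip(*w)]


def normalize(x):
    """Return (unit vector, original Euclidean norm)."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    return [xi / norm for xi in x], norm


def rank1_approx(w, iters=50):
    """Power iteration: return (sigma, u, v) with w ~= sigma * outer(u, v).

    Assumes the all-ones start vector is not orthogonal to the dominant
    right singular vector of w; otherwise normalization divides by zero.
    """
    wt = transpose(w)
    v, _ = normalize([1.0] * len(w[0]))
    sigma, u = 0.0, []
    for _ in range(iters):
        u, _ = normalize(matvec(w, v))
        v, sigma = normalize(matvec(wt, u))
    return sigma, u, v


def prune_smallest(x, keep):
    """Zero all but the `keep` largest-magnitude entries of x."""
    top = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:keep]
    return [xi if i in top else 0.0 for i, xi in enumerate(x)]


def reconstruct(sigma, u, v):
    """Rebuild the dense rank-1 matrix sigma * outer(u, v)."""
    return [[sigma * ui * vj for vj in v] for ui in u]
```

Storing `sigma`, `u`, and `v` takes m + n + 1 numbers instead of m * n, and pruning the factors reduces the count further. The price is approximation error, which is the accuracy-versus-time trade-off the abstract refers to.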