728 research outputs found
Instruction Set Extension of a Low-End Reconfigurable Microcontroller in Bit-Sorting Implementation
The microcontroller-based system is currently having a tremendous boost with the revelation of platforms such as the Internet of Things. Low-end families of microcontroller architecture are still in demand albeit less technologically advanced due to its better I/O better application and control. However, there is clearly a lack of computational capability of the low-end architecture that will affect the pre-processing stage of the received data. The purpose of this research is to combine the best feature of an 8-bit microcontroller architecture together with the computationally complex operations without incurring extra resources. The modules’ integration is implemented using instruction set architecture (ISA) extension technique and is developed on the Field Programmable Gate Array (FPGA). Extensive simulations were performed with the and a comprehensive methodology is proposed. It was found that the ISA extension from 12-bit to 16-bit has produced a faster execution time with fewer resource utilization when implementing the bit-sorting algorithm. The overall development process used in this research is flexible enough for further investigation either by extending its module to more complex algorithms or evaluating other designs of its components
Deep Learning-Based Multiple Object Visual Tracking on Embedded System for IoT and Mobile Edge Computing Applications
Compute and memory demands of state-of-the-art deep learning methods are
still a shortcoming that must be addressed to make them useful at IoT
end-nodes. In particular, recent results depict a hopeful prospect for image
processing using Convolutional Neural Netwoks, CNNs, but the gap between
software and hardware implementations is already considerable for IoT and
mobile edge computing applications due to their high power consumption. This
proposal performs low-power and real time deep learning-based multiple object
visual tracking implemented on an NVIDIA Jetson TX2 development kit. It
includes a camera and wireless connection capability and it is battery powered
for mobile and outdoor applications. A collection of representative sequences
captured with the on-board camera, dETRUSC video dataset, is used to exemplify
the performance of the proposed algorithm and to facilitate benchmarking. The
results in terms of power consumption and frame rate demonstrate the
feasibility of deep learning algorithms on embedded platforms although more
effort to joint algorithm and hardware design of CNNs is needed.Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessibl
Recent Advances in Embedded Computing, Intelligence and Applications
The latest proliferation of Internet of Things deployments and edge computing combined with artificial intelligence has led to new exciting application scenarios, where embedded digital devices are essential enablers. Moreover, new powerful and efficient devices are appearing to cope with workloads formerly reserved for the cloud, such as deep learning. These devices allow processing close to where data are generated, avoiding bottlenecks due to communication limitations. The efficient integration of hardware, software and artificial intelligence capabilities deployed in real sensing contexts empowers the edge intelligence paradigm, which will ultimately contribute to the fostering of the offloading processing functionalities to the edge. In this Special Issue, researchers have contributed nine peer-reviewed papers covering a wide range of topics in the area of edge intelligence. Among them are hardware-accelerated implementations of deep neural networks, IoT platforms for extreme edge computing, neuro-evolvable and neuromorphic machine learning, and embedded recommender systems
Reconfigurable Architectures and Systems for IoT Applications
abstract: Internet of Things (IoT) has become a popular topic in industry over the recent years, which describes an ecosystem of internet-connected devices or things that enrich the everyday life by improving our productivity and efficiency. The primary components of the IoT ecosystem are hardware, software and services. While the software and services of IoT system focus on data collection and processing to make decisions, the underlying hardware is responsible for sensing the information, preprocess and transmit it to the servers. Since the IoT ecosystem is still in infancy, there is a great need for rapid prototyping platforms that would help accelerate the hardware design process. However, depending on the target IoT application, different sensors are required to sense the signals such as heart-rate, temperature, pressure, acceleration, etc., and there is a great need for reconfigurable platforms that can prototype different sensor interfacing circuits.
This thesis primarily focuses on two important hardware aspects of an IoT system: (a) an FPAA based reconfigurable sensing front-end system and (b) an FPGA based reconfigurable processing system. To enable reconfiguration capability for any sensor type, Programmable ANalog Device Array (PANDA), a transistor-level analog reconfigurable platform is proposed. CAD tools required for implementation of front-end circuits on the platform are also developed. To demonstrate the capability of the platform on silicon, a small-scale array of 24×25 PANDA cells is fabricated in 65nm technology. Several analog circuit building blocks including amplifiers, bias circuits and filters are prototyped on the platform, which demonstrates the effectiveness of the platform for rapid prototyping IoT sensor interfaces.
IoT systems typically use machine learning algorithms that run on the servers to process the data in order to make decisions. Recently, embedded processors are being used to preprocess the data at the energy-constrained sensor node or at IoT gateway, which saves considerable energy for transmission and bandwidth. Using conventional CPU based systems for implementing the machine learning algorithms is not energy-efficient. Hence an FPGA based hardware accelerator is proposed and an optimization methodology is developed to maximize throughput of any convolutional neural network (CNN) based machine learning algorithm on a resource-constrained FPGA.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201
FPGA based technical solutions for high throughput data processing and encryption for 5G communication: A review
The field programmable gate array (FPGA) devices are ideal solutions for high-speed processing applications, given their flexibility, parallel processing capability, and power efficiency. In this review paper, at first, an overview of the key applications of FPGA-based platforms in 5G networks/systems is presented, exploiting the improved performances offered by such devices. FPGA-based implementations of cloud radio access network (C-RAN) accelerators, network function virtualization (NFV)-based network slicers, cognitive radio systems, and multiple input multiple output (MIMO) channel characterizers are the main considered applications that can benefit from the high processing rate, power efficiency and flexibility of FPGAs. Furthermore, the implementations of encryption/decryption algorithms by employing the Xilinx Zynq Ultrascale+MPSoC ZCU102 FPGA platform are discussed, and then we introduce our high-speed and lightweight implementation of the well-known AES-128 algorithm, developed on the same FPGA platform, and comparing it with similar solutions already published in the literature. The comparison results indicate that our AES-128 implementation enables efficient hardware usage for a given data-rate (up to 28.16 Gbit/s), resulting in higher efficiency (8.64 Mbps/slice) than other considered solutions. Finally, the applications of the ZCU102 platform for high-speed processing are explored, such as image and signal processing, visual recognition, and hardware resource management
Hardware Implementations of a Deep Learning Approach to Optimal Configuration of Reconfigurable Intelligence Surfaces
Reconfigurable intelligent surfaces (RIS) offer the potential to customize the radio propagation environment for wireless networks, and will be a key element for 6G communications. However, due to the unique constraints in these systems, the optimization problems associated to RIS configuration are challenging to solve. This paper illustrates a new approach to the RIS configuration problem, based on the use of artificial intelligence (AI) and deep learning (DL) algorithms. Concretely, a custom convolutional neural network (CNN) intended for edge computing is presented, and implementations on different representative edge devices are compared, including the use of commercial AI-oriented devices and a field-programmable gate array (FPGA) platform. This FPGA option provides the best performance, with x20 performance increase over the closest FP32, GPU-accelerated option, and almost x3 performance advantage when compared with the INT8-quantized, TPU-accelerated implementation. More noticeably, this is achieved even when high-level synthesis (HLS) tools are used and no custom accelerators are developed. At the same time, the inherent reconfigurability of FPGAs opens a new field for their use as enabler hardware in RIS applications.This work is part of the project TED2021-129938B-I00, funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR
Deploying RIOT operating system on a reconfigurable Internet of Things end-device
Dissertação de mestrado integrado em Engenharia Eletrónica Industrial e ComputadoresThe Internet of Everything (IoE) is enabling the connection of an infinity of
physical objects to the Internet, and has the potential to connect every single
existing object in the world. This empowers a market with endless opportunities
where the big players are forecasting, by 2020, more than 50 billion connected
devices, representing an 8 trillion USD market.
The IoE is a broad concept that comprises several technological areas and will
certainly, include more in the future. Some of those already existing fields are the
Internet of Energy related with the connectivity of electrical power grids, Internet
of Medical Things (IoMT), for instance, enables patient monitoring, Internet of
Industrial Things (IoIT), which is dedicated to industrial plants, and the Internet
of Things (IoT) that focus on the connection of everyday objects (e.g. home
appliances, wearables, transports, buildings, etc.) to the Internet.
The diversity of scenarios where IoT can be deployed, and consequently the
different constraints associated to each device, leads to a heterogeneous network
composed by several communication technologies and protocols co-existing on the
same physical space. Therefore, the key requirements of an IoT network are
the connectivity and the interoperability between devices. Such requirement is
achieved by the adoption of standard protocols and a well-defined lightweight network
stack. Due to the adoption of a standard network stack, the data processed
and transmitted between devices tends to increase. Because most of the devices
connected are resource constrained, i.e., low memory, low processing capabilities,
available energy, the communication can severally decrease the device’s performance.
Hereupon, to tackle such issues without sacrificing other important requirements,
this dissertation aims to deploy an operating system (OS) for IoT, the
RIOT-OS, while providing a study on how network-related tasks can benefit from
hardware accelerators (deployed on reconfigurable technology), specially designed
to process and filter packets received by an IoT device.O conceito Internet of Everything (IoE) permite a conexão de uma infinidade
de objetos à Internet e tem o potencial de conectar todos os objetos existentes no
mundo. Favorecendo assim o aparecimento de novos mercados e infinitas possibilidades,
em que os grandes intervenientes destes mercados preveem até 2020 a
conexão de mais de 50 mil milhões de dispositivos, representando um mercado de
8 mil milhões de dólares.
IoE é um amplo conceito que inclui várias áreas tecnológicas e irá certamente
incluir mais no futuro. Algumas das áreas já existentes são: a Internet of Energy
relacionada com a conexão de redes de transporte e distribuição de energia à
Internet; Internet of Medical Things (IoMT), que possibilita a monotorização de
pacientes; Internet of Industrial Things (IoIT), dedicada a instalações industriais
e a Internet of Things (IoT), que foca na conexão de objetos do dia-a-dia (e.g.
eletrodomésticos, wearables, transportes, edifícios, etc.) à Internet.
A diversidade de cenários à qual IoT pode ser aplicado, e consequentemente,
as diferentes restrições aplicadas a cada dispositivo, levam à criação de uma rede
heterogénea composto por diversas tecnologias de comunicação e protocolos a coexistir
no mesmo espaço físico. Desta forma, os requisitos chave aplicados às redes
IoT são a conectividade e interoperabilidade entre dispositivos. Estes requisitos
são atingidos com a adoção de protocolos standard e pilhas de comunicação bem
definidas. Com a adoção de pilhas de comunicação standard, a informação processada
e transmitida entre dispostos tende a aumentar. Visto que a maioria dos
dispositivos conectados possuem escaços recursos, i.e., memória reduzida, baixa
capacidade de processamento, pouca energia disponível, o aumento da capacidade
de comunicação pode degradar o desempenho destes dispositivos.
Posto isto, para lidar com estes problemas e sem sacrificar outros requisitos importantes,
esta dissertação pretende fazer o porting de um sistema operativo IoT,
o RIOT, para uma solução reconfigurável, o CUTE mote. O principal objetivo
consiste na realização de um estudo sobre os benefícios que as tarefas relacionadas
com as camadas de rede podem ter ao serem executadas em hardware via aceleradores
dedicados. Estes aceleradores são especialmente projetados para processar
e filtrar pacotes de dados provenientes de uma interface radio em redes IoT periféricas
An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics
Near-sensor data analytics is a promising direction for IoT endpoints, as it
minimizes energy spent on communication and reduces network load - but it also
poses security concerns, as valuable data is stored or sent over the network at
various stages of the analytics pipeline. Using encryption to protect sensitive
data at the boundary of the on-chip analytics engine is a way to address data
security issues. To cope with the combined workload of analytics and encryption
in a tight power envelope, we propose Fulmine, a System-on-Chip based on a
tightly-coupled multi-core cluster augmented with specialized blocks for
compute-intensive data processing and encryption functions, supporting software
programmability for regular computing tasks. The Fulmine SoC, fabricated in
65nm technology, consumes less than 20mW on average at 0.8V achieving an
efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to
25MIPS/mW in software. As a strong argument for real-life flexible application
of our platform, we show experimental results for three secure analytics use
cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN
consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with
secured remote recognition in 5.74pJ/op; and seizure detection with encrypted
data collection from EEG within 12.7pJ/op.Comment: 15 pages, 12 figures, accepted for publication to the IEEE
Transactions on Circuits and Systems - I: Regular Paper
- …