808 research outputs found
A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems
Recent technological advances have greatly improved the performance and
features of embedded systems. With the number of just mobile devices now
reaching nearly equal to the population of earth, embedded systems have truly
become ubiquitous. These trends, however, have also made the task of managing
their power consumption extremely challenging. In recent years, several
techniques have been proposed to address this issue. In this paper, we survey
the techniques for managing power consumption of embedded systems. We discuss
the need of power management and provide a classification of the techniques on
several important parameters to highlight their similarities and differences.
This paper is intended to help the researchers and application-developers in
gaining insights into the working of power management techniques and designing
even more efficient high-performance embedded systems of tomorrow
Architecture and Design of Medical Processor Units for Medical Networks
This paper introduces analogical and deductive methodologies for the design
medical processor units (MPUs). From the study of evolution of numerous earlier
processors, we derive the basis for the architecture of MPUs. These specialized
processors perform unique medical functions encoded as medical operational
codes (mopcs). From a pragmatic perspective, MPUs function very close to CPUs.
Both processors have unique operation codes that command the hardware to
perform a distinct chain of subprocesses upon operands and generate a specific
result unique to the opcode and the operand(s). In medical environments, MPU
decodes the mopcs and executes a series of medical sub-processes and sends out
secondary commands to the medical machine. Whereas operands in a typical
computer system are numerical and logical entities, the operands in medical
machine are objects such as such as patients, blood samples, tissues, operating
rooms, medical staff, medical bills, patient payments, etc. We follow the
functional overlap between the two processes and evolve the design of medical
computer systems and networks.Comment: 17 page
Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions
In the past decade, Convolutional Neural Networks (CNNs) have demonstrated
state-of-the-art performance in various Artificial Intelligence tasks. To
accelerate the experimentation and development of CNNs, several software
frameworks have been released, primarily targeting power-hungry CPUs and GPUs.
In this context, reconfigurable hardware in the form of FPGAs constitutes a
potential alternative platform that can be integrated in the existing deep
learning ecosystem to provide a tunable balance between performance, power
consumption and programmability. In this paper, a survey of the existing
CNN-to-FPGA toolflows is presented, comprising a comparative study of their key
characteristics which include the supported applications, architectural
choices, design space exploration methods and achieved performance. Moreover,
major challenges and objectives introduced by the latest trends in CNN
algorithmic research are identified and presented. Finally, a uniform
evaluation methodology is proposed, aiming at the comprehensive, complete and
in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal,
201
Design and Validation of a Software Defined Radio Testbed for DVB-T Transmission
This paper describes the design and validation of a Software Defined Radio (SDR) testbed, which can be used for Digital Television transmission using the Digital Video Broadcasting - Terrestrial (DVB-T) standard. In order to generate a DVB-T-compliant signal with low computational complexity, we design an SDR architecture that uses the C/C++ language and exploits multithreading and vectorized instructions. Then, we transmit the generated DVB-T signal in real time, using a common PC equipped with multicore central processing units (CPUs) and a commercially available SDR modem board. The proposed SDR architecture has been validated using fixed TV sets, and portable receivers. Our results show that the proposed SDR architecture for DVB-T transmission is a low-cost low-complexity solution that, in the worst case, only requires less than 22% of CPU load and less than 170 MB of memory usage, on a 3.0 GHz Core i7 processor. In addition, using the same SDR modem board, we design an off-line software receiver that also performs time synchronization and carrier frequency offset estimation and compensation
DyPS: Dynamic Processor Switching for Energy-Aware Video Decoding on Multi-core SoCs
In addition to General Purpose Processors (GPP), Multicore SoCs equipping
modern mobile devices contain specialized Digital Signal Processor designed
with the aim to provide better performance and low energy consumption
properties. However, the experimental measurements we have achieved revealed
that system overhead, in case of DSP video decoding, causes drastic
performances drop and energy efficiency as compared to the GPP decoding. This
paper describes DyPS, a new approach for energy-aware processor switching (GPP
or DSP) according to the video quality . We show the pertinence of our solution
in the context of adaptive video decoding and describe an implementation on an
embedded Linux operating system with the help of the GStreamer framework. A
simple case study showed that DyPS achieves 30% energy saving while sustaining
the decoding performanc
Profile Guided Dataflow Transformation for FPGAs and CPUs
This paper proposes a new high-level approach for optimising field programmable gate array (FPGA) designs. FPGA designs are commonly implemented in low-level hardware description languages (HDLs), which lack the abstractions necessary for identifying opportunities for significant performance improvements. Using a computer vision case study, we show that modelling computation with dataflow abstractions enables substantial restructuring of FPGA designs before lowering to the HDL level, and also improve CPU performance. Using the CPU transformations, runtime is reduced by 43 %. Using the FPGA transformations, clock frequency is increased from 67MHz to 110MHz. Our results outperform commercial low-level HDL optimisations, showcasing dataflow program abstraction as an amenable computation model for highly effective FPGA optimisation
Design of multimedia processor based on metric computation
Media-processing applications, such as signal processing, 2D and 3D graphics
rendering, and image compression, are the dominant workloads in many embedded
systems today. The real-time constraints of those media applications have
taxing demands on today's processor performances with low cost, low power and
reduced design delay. To satisfy those challenges, a fast and efficient
strategy consists in upgrading a low cost general purpose processor core. This
approach is based on the personalization of a general RISC processor core
according the target multimedia application requirements. Thus, if the extra
cost is justified, the general purpose processor GPP core can be enforced with
instruction level coprocessors, coarse grain dedicated hardware, ad hoc
memories or new GPP cores. In this way the final design solution is tailored to
the application requirements. The proposed approach is based on three main
steps: the first one is the analysis of the targeted application using
efficient metrics. The second step is the selection of the appropriate
architecture template according to the first step results and recommendations.
The third step is the architecture generation. This approach is experimented
using various image and video algorithms showing its feasibility
- âŠ