Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device
Currently, most designers face the daunting task of
researching different design flows and learning the intricacies of
specific software from various manufacturers in
hardware/software co-design. Creating a
scalable hardware/software co-design platform has therefore become a key
strategic element in developing hardware/software integrated
systems. In this paper, we propose a new design flow for building
a scalable co-design platform on an FPGA-based system-on-chip.
We employ an integrated approach to implement a histogram of
oriented gradients (HOG) and a support vector machine (SVM)
classification on a programmable device for pedestrian tracking.
We report a hardware resource analysis and also analyse the
precision and success rates of pedestrian tracking on nine open
access image data sets. Finally, our proposed
design flow can be used for any real-time image-processing
product on programmable ZYNQ-based embedded
systems; it reduces design time and provides a
scalable solution for embedded image processing products.
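As a rough illustration of the HOG-plus-SVM pipeline named in this abstract, the following minimal sketch computes an unsigned gradient-orientation histogram for a single image cell and scores a feature vector with a linear decision function. It is not the paper's implementation: the function names, the 9-bin layout, the hard (non-interpolated) binning, the omission of block normalisation, and the choice of a linear SVM are all assumptions made for illustration.

```python
import math


def cell_histogram(cell, n_bins=9):
    """Unsigned-orientation gradient histogram for one grayscale cell.

    `cell` is a list of rows of pixel intensities. Hard binning is used;
    real HOG pipelines usually add interpolation and block normalisation.
    """
    rows, cols = len(cell), len(cell[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for y in range(rows):
        for x in range(cols):
            # Central differences, with indices clamped at the cell border.
            gx = cell[y][min(x + 1, cols - 1)] - cell[y][max(x - 1, 0)]
            gy = cell[min(y + 1, rows - 1)][x] - cell[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            # Fold the direction into [0, 180) degrees (unsigned gradient).
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


def svm_score(features, weights, bias):
    """Linear decision value w.x + b; a positive value means 'pedestrian'
    under the (assumed) labelling convention."""
    return sum(w * f for w, f in zip(weights, features)) + bias
```

Both functions are simple multiply-accumulate loops over fixed-size windows, which is the kind of regular arithmetic that maps naturally onto programmable logic, while control flow and I/O can stay in software.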
Design Methodology for Face Detection Acceleration
A design methodology to accelerate face
detection for embedded systems is described, starting from high
level (algorithm optimization) and ending with low level
(software and hardware codesign) by addressing the issues and
the design decisions made at each level based on the performance
measurements and system limitations. The implemented
embedded face detection system consumes very little power
compared with the traditional PC software implementations
while maintaining the same detection accuracy. The proposed
face detection acceleration methodology is suitable for real-time
applications. Ministerio español de Ciencia y Tecnología TEC2011-24319; Junta de Andalucía FEDER P08-TIC-0367
A scalable, portable, FPGA-based implementation of the Unscented Kalman Filter
Sustained technological progress has come to a point where robotic/autonomous systems may well soon become ubiquitous. In order for these systems to actually be useful, an increase in autonomous capability is necessary for aerospace, as well as other, applications. Greater aerospace autonomous capability means there is a need for high performance state estimation. However, the desire to reduce costs through simplified development processes and compact form factors can limit performance. A hardware-based approach, such as using a Field Programmable Gate Array (FPGA), is common when high performance is required, but hardware approaches tend to have a more complicated development process when compared to traditional software approaches; greater development complexity, in turn, results in higher costs. Leveraging the advantages of both hardware-based and software-based approaches, a hardware/software (HW/SW) codesign of the Unscented Kalman Filter (UKF), based on an FPGA, is presented. The UKF is split into an application-specific part, implemented in software to retain portability, and a non-application-specific part, implemented in hardware as a parameterisable IP core to increase performance. The codesign is split into three versions (Serial, Parallel and Pipeline) to provide flexibility when choosing the balance between resources and performance, allowing system designers to simplify the development process. Simulation results demonstrating two possible implementations of the design, a nanosatellite application and a Simultaneous Localisation and Mapping (SLAM) application, are presented. These results validate the performance of the HW/SW UKF and demonstrate its portability, particularly in small aerospace systems. Implementation (synthesis, timing, power) details for a variety of situations are presented and analysed to demonstrate how the HW/SW codesign can be scaled for any application
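The split between an application-specific part and a generic part can be pictured with a minimal sketch of the unscented transform at the core of the UKF. Sigma-point generation and the weighted mean and covariance are generic, while the model function `f` is supplied by the application. This uses the basic kappa-weighted sigma-point formulation and is an assumption-laden illustration, not the paper's IP core; all function names and the `kappa` default are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular L with L * L^T = a (a must be symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sigma_points(mean, cov, kappa=1.0):
    """Generic part: 2n+1 sigma points and weights (requires n + kappa > 0)."""
    n = len(mean)
    scale = n + kappa
    root = cholesky([[scale * c for c in row] for row in cov])
    points = [list(mean)]
    for j in range(n):
        column = [root[i][j] for i in range(n)]
        points.append([m + c for m, c in zip(mean, column)])
        points.append([m - c for m, c in zip(mean, column)])
    weights = [kappa / scale] + [1.0 / (2.0 * scale)] * (2 * n)
    return points, weights


def unscented_transform(points, weights, f):
    """Propagate sigma points through the application-specific model f,
    then recombine them into a mean and covariance."""
    ys = [f(p) for p in points]
    dim = len(ys[0])
    mean = [sum(w * y[i] for w, y in zip(weights, ys)) for i in range(dim)]
    cov = [[sum(w * (y[i] - mean[i]) * (y[j] - mean[j])
                for w, y in zip(weights, ys))
            for j in range(dim)] for i in range(dim)]
    return mean, cov
```

Passing the identity function as `f` should return the original mean and covariance, which is a convenient sanity check before substituting a real process or measurement model.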
Intelligent Embedded Software: New Perspectives and Challenges
Intelligent embedded systems (IES) represent a novel and promising generation of embedded systems (ES). IES can reason about their external environments and adapt their behavior accordingly. Such systems sit at the intersection of two branches: embedded computing and intelligent computing. Meanwhile, intelligent embedded software (IESo) is becoming a large part of the engineering cost of intelligent embedded systems. IESo can include artificial intelligence (AI)-based systems such as expert systems, neural networks and other sophisticated AI models to guarantee important characteristics such as self-learning, self-optimizing and self-repairing. Despite the widespread use of such systems, challenging design issues are arising. Designing software that is both resource-constrained and intelligent is not a trivial task, especially in a real-time context. To deal with this dilemma, embedded system researchers have profited from progress in semiconductor technology to develop specific hardware that supports AI models well and makes the integration of AI with the embedded world a reality.
HW-Flow: A Multi-Abstraction Level HW-CNN Codesign Pruning Methodology
Convolutional neural networks (CNNs) have produced unprecedented accuracy for many computer vision problems in the recent past. In power and compute-constrained embedded platforms, deploying modern CNNs can present many challenges. Most CNN architectures do not run in real-time due to the high number of computational operations involved during the inference phase. This emphasizes the role of CNN optimization techniques in early design space exploration. To estimate their efficacy in satisfying the target constraints, existing techniques are either hardware (HW) agnostic, pseudo-HW-aware by considering parameter and operation counts, or HW-aware through inflexible hardware-in-the-loop (HIL) setups. In this work, we introduce HW-Flow, a framework for optimizing and exploring CNN models based on three levels of hardware abstraction: Coarse, Mid and Fine. Through these levels, CNN design and optimization can be iteratively refined towards efficient execution on the target hardware platform. We present HW-Flow in the context of CNN pruning by augmenting a reinforcement learning agent with key metrics to understand the influence of its pruning actions on the inference hardware. With 2× reduction in energy and latency, we prune ResNet56, ResNet50, and DeepLabv3 with minimal accuracy degradation on the CIFAR-10, ImageNet, and CityScapes datasets, respectively