
    Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device

    Currently, most designers face the daunting task of researching different design flows and learning the intricacies of manufacturer-specific software for hardware/software co-design. Creating a scalable hardware/software co-design platform has therefore become a key strategic element in developing integrated hardware/software systems. In this paper, we propose a new design flow for building a scalable co-design platform on an FPGA-based system-on-chip. We use an integrated approach to implement histogram of oriented gradients (HOG) feature extraction and support vector machine (SVM) classification on a programmable device for pedestrian tracking. We report a hardware resource analysis and analyse the precision and success rates of pedestrian tracking on nine open-access image data sets. Finally, our proposed design flow can be applied to any real-time image-processing product on programmable ZYNQ-based embedded systems, benefiting from reduced design time and providing a scalable solution for embedded image-processing products.
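    The abstract above describes a HOG-plus-SVM pedestrian classifier mapped onto a ZYNQ device. As a purely illustrative software reference (not the paper's FPGA implementation), the sketch below shows the same HOG feature extraction and linear-SVM classification pipeline; the window size, HOG parameters, and library choices (scikit-image, scikit-learn) are assumptions.

```python
# Minimal software sketch of a HOG + linear-SVM pedestrian classifier.
# NOT the paper's ZYNQ implementation; window size, HOG parameters and
# library choices are assumptions for illustration only.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

WINDOW = (128, 64)  # (rows, cols): a common pedestrian-detection window size

def hog_descriptor(window):
    """Compute a HOG feature vector for one grayscale detection window."""
    return hog(
        window,
        orientations=9,
        pixels_per_cell=(8, 8),
        cells_per_block=(2, 2),
        block_norm="L2-Hys",
    )

def train_classifier(pos_windows, neg_windows):
    """Train a linear SVM on positive (pedestrian) and negative windows."""
    X = np.array([hog_descriptor(w) for w in pos_windows + neg_windows])
    y = np.array([1] * len(pos_windows) + [0] * len(neg_windows))
    clf = LinearSVC(C=0.01)
    clf.fit(X, y)
    return clf

def is_pedestrian(clf, window):
    """Classify a single detection window."""
    return clf.decision_function([hog_descriptor(window)])[0] > 0.0
```

    In a co-design flow such as the one proposed, a functional software reference like this would typically serve as the golden model against which the programmable-logic implementation of the HOG and SVM stages is verified.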

    Design Methodology for Face Detection Acceleration

    A design methodology for accelerating face detection on embedded systems is described, starting from the high level (algorithm optimization) and ending with the low level (software and hardware codesign), addressing the issues and design decisions made at each level based on performance measurements and system limitations. The implemented embedded face detection system consumes very little power compared with traditional PC software implementations while maintaining the same detection accuracy. The proposed face detection acceleration methodology is suitable for real-time applications.
    Funding: Ministerio español de Ciencia y Tecnología TEC2011-24319; Junta de Andalucía / FEDER P08-TIC-0367
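    The abstract does not name the underlying detection algorithm, so the sketch below is only an illustrative high-level software baseline of the kind such a codesign methodology would start from and then profile: OpenCV's classical Haar-cascade face detector. The file paths and parameters are hypothetical.

```python
# Illustrative software baseline only: the abstract does not specify the
# detection algorithm, so this sketch uses OpenCV's classical Haar-cascade
# detector as the kind of high-level starting point a HW/SW codesign flow
# would profile and then accelerate. File paths are hypothetical.
import cv2

# Cascade shipped with opencv-python (the cv2.data path is an assumption
# about the local installation).
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def detect_faces(image_bgr):
    """Return a list of (x, y, w, h) face rectangles in a BGR image."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

if __name__ == "__main__":
    frame = cv2.imread("frame.jpg")  # hypothetical input frame
    for (x, y, w, h) in detect_faces(frame):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite("detections.jpg", frame)
```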

    A scalable, portable, FPGA-based implementation of the Unscented Kalman Filter

    Sustained technological progress has come to a point where robotic/autonomous systems may well soon become ubiquitous. In order for these systems to actually be useful, an increase in autonomous capability is necessary for aerospace, as well as other, applications. Greater aerospace autonomous capability means there is a need for high-performance state estimation. However, the desire to reduce costs through simplified development processes and compact form factors can limit performance. A hardware-based approach, such as using a Field Programmable Gate Array (FPGA), is common when high performance is required, but hardware approaches tend to have a more complicated development process than traditional software approaches; greater development complexity, in turn, results in higher costs. Leveraging the advantages of both hardware-based and software-based approaches, a hardware/software (HW/SW) codesign of the Unscented Kalman Filter (UKF), based on an FPGA, is presented. The UKF is split into an application-specific part, implemented in software to retain portability, and a non-application-specific part, implemented in hardware as a parameterisable IP core to increase performance. The codesign is split into three versions (Serial, Parallel and Pipeline) to provide flexibility when choosing the balance between resources and performance, allowing system designers to simplify the development process. Simulation results demonstrating two possible implementations of the design, a nanosatellite application and a Simultaneous Localisation and Mapping (SLAM) application, are presented. These results validate the performance of the HW/SW UKF and demonstrate its portability, particularly in small aerospace systems. Implementation (synthesis, timing, power) details for a variety of situations are presented and analysed to demonstrate how the HW/SW codesign can be scaled for any application.
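    As a compact software reference for the algorithm being partitioned (not the paper's IP core), the sketch below implements a generic UKF predict/update step in NumPy; the scaling parameters and the model functions f and h are assumptions.

```python
# Minimal, dense-matrix sketch of an Unscented Kalman Filter in NumPy.
# This is a software reference for the algorithm, not the paper's IP core;
# tuning parameters (alpha, beta, kappa) and the model callables f, h are
# assumptions supplied by the application.
import numpy as np

class UKF:
    def __init__(self, n, f, h, Q, R, alpha=1e-3, beta=2.0, kappa=0.0):
        self.n, self.f, self.h, self.Q, self.R = n, f, h, Q, R
        self.lam = alpha**2 * (n + kappa) - n
        # Sigma-point weights for the mean (wm) and covariance (wc).
        self.wm = np.full(2 * n + 1, 1.0 / (2 * (n + self.lam)))
        self.wc = self.wm.copy()
        self.wm[0] = self.lam / (n + self.lam)
        self.wc[0] = self.wm[0] + (1 - alpha**2 + beta)

    def sigma_points(self, x, P):
        S = np.linalg.cholesky((self.n + self.lam) * P)
        pts = [x] + [x + S[:, i] for i in range(self.n)] \
                  + [x - S[:, i] for i in range(self.n)]
        return np.array(pts)

    def predict(self, x, P):
        X = np.array([self.f(s) for s in self.sigma_points(x, P)])
        x_pred = self.wm @ X
        d = X - x_pred
        P_pred = d.T @ (self.wc[:, None] * d) + self.Q
        return x_pred, P_pred, X

    def update(self, x_pred, P_pred, X, z):
        Z = np.array([self.h(s) for s in X])
        z_pred = self.wm @ Z
        dz, dx = Z - z_pred, X - x_pred
        S = dz.T @ (self.wc[:, None] * dz) + self.R   # innovation covariance
        Pxz = dx.T @ (self.wc[:, None] * dz)          # cross covariance
        K = Pxz @ np.linalg.inv(S)                    # Kalman gain
        return x_pred + K @ (z - z_pred), P_pred - K @ S @ K.T
```

    In a partitioning like the one the abstract describes, the application-specific process and measurement models f and h would remain in software, while the matrix-heavy, model-independent sigma-point arithmetic is the natural candidate for the parameterisable hardware core.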

    Intelligent Embedded Software: New Perspectives and Challenges

    Intelligent embedded systems (IES) represent a novel and promising generation of embedded systems (ES). IES have the capacity to reason about their external environments and adapt their behavior accordingly. Such systems sit at the intersection of two branches: embedded computing and intelligent computing. At the same time, intelligent embedded software (IESo) is becoming a large part of the engineering cost of intelligent embedded systems. IESo can include artificial intelligence (AI)-based systems such as expert systems, neural networks and other sophisticated AI models to guarantee important characteristics such as self-learning, self-optimizing and self-repairing. Despite the widespread adoption of such systems, challenging design issues are arising. Designing software that is both resource-constrained and intelligent is not a trivial task, especially in a real-time context. To deal with this dilemma, embedded-systems researchers have profited from progress in semiconductor technology to develop dedicated hardware that supports AI models well and makes the integration of AI with the embedded world a reality.

    HW-Flow: A Multi-Abstraction Level HW-CNN Codesign Pruning Methodology

    Convolutional neural networks (CNNs) have produced unprecedented accuracy for many computer vision problems in the recent past. On power- and compute-constrained embedded platforms, deploying modern CNNs can present many challenges. Most CNN architectures do not run in real-time due to the high number of computational operations involved during the inference phase. This emphasizes the role of CNN optimization techniques in early design space exploration. To estimate their efficacy in satisfying the target constraints, existing techniques are either hardware (HW) agnostic, pseudo-HW-aware by considering parameter and operation counts, or HW-aware through inflexible hardware-in-the-loop (HIL) setups. In this work, we introduce HW-Flow, a framework for optimizing and exploring CNN models based on three levels of hardware abstraction: Coarse, Mid and Fine. Through these levels, CNN design and optimization can be iteratively refined towards efficient execution on the target hardware platform. We present HW-Flow in the context of CNN pruning by augmenting a reinforcement learning agent with key metrics to understand the influence of its pruning actions on the inference hardware. With a 2× reduction in energy and latency, we prune ResNet56, ResNet50, and DeepLabv3 with minimal accuracy degradation on the CIFAR-10, ImageNet, and CityScapes datasets, respectively.
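    HW-Flow's reinforcement-learning, hardware-aware pruning loop is not reproduced here; the sketch below only illustrates the basic per-layer action such an agent selects, namely structured (channel) pruning of a convolution by filter L1 norm in PyTorch, with the per-layer keep ratio standing in for the agent's decision. The example layer and ratio are assumptions.

```python
# Illustrative primitive only: structured (channel) pruning of one Conv2d by
# filter L1 norm. The keep ratio is the knob a hardware-aware pruning agent
# would choose per layer; this is not the HW-Flow framework itself.
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float) -> nn.Conv2d:
    """Return a new Conv2d keeping the output channels with the largest L1 norm."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    # L1 norm of each output filter: shape (out_channels,)
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    keep = torch.argsort(scores, descending=True)[:n_keep]
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned

# Example: the agent decides to keep 50% of the channels in one layer.
layer = nn.Conv2d(64, 128, 3, padding=1)
smaller = prune_conv_channels(layer, keep_ratio=0.5)  # now 64 output channels
```

    In a full pipeline, the input channels of the following layer would also have to be trimmed to match, and the hardware-abstraction models described above would feed latency and energy estimates back to the agent choosing each layer's ratio; both steps are omitted here for brevity.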