47 research outputs found

    Thread-level Parallelism in Fault Simulation of Deep Neural Networks on Multi-Processor Systems

    High-performance fault simulation is an essential preliminary task in the online and offline testing of machine learning (ML) hardware. Deep neural networks (DNNs), a core component of ML programs, are widely used in critical and non-critical applications in Systems-on-Chip and ASIC designs. In fault simulation of DNNs, the simulation time increases exponentially with the number of neurons. However, the software architecture of neural networks and the lack of dependency between neurons within each inference layer provide a significant opportunity to parallelize fault simulation on a multi-processor platform. In this paper, a multi-threaded technique for hierarchical fault simulation of neural networks is proposed, targeting both permanent and transient faults. During fault simulation, the neurons of each inference layer are distributed among the executing threads. Because the faulty neuron in hierarchical fault simulation demands far more computation than the behavioural model of the non-faulty neurons, the faulty neuron is assigned to one thread while the rest of the neurons are divided among the remaining threads. Experimental results confirm the time efficiency of the proposed fault simulation technique on multi-processor architectures.
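
The thread assignment described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `partition_neurons`, the choice of thread 0 for the faulty neuron, and the round-robin split of the healthy neurons are all assumptions made for illustration.

```python
def partition_neurons(neuron_ids, faulty_id, num_threads):
    """Return one list of neuron ids per thread.

    Thread 0 gets only the faulty neuron (assumed to be the expensive
    gate-level simulation); the healthy neurons are spread round-robin
    over the remaining threads. Hypothetical helper, not from the paper.
    """
    if num_threads < 2:
        raise ValueError("need one thread for the faulty neuron and at least one more")
    if faulty_id not in neuron_ids:
        raise ValueError("faulty_id must be one of neuron_ids")

    healthy = [n for n in neuron_ids if n != faulty_id]
    workers = num_threads - 1
    # Slice with a stride so each worker receives every `workers`-th healthy neuron.
    return [[faulty_id]] + [healthy[i::workers] for i in range(workers)]


# One inference layer with 8 neurons, neuron 3 carries the injected fault.
assignment = partition_neurons(list(range(8)), faulty_id=3, num_threads=4)
print(assignment)
```

A real simulator would then run each list on its own thread or process and synchronize at the layer boundary, since the abstract states that neurons within a layer are independent of each other.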

    Comparison of Linear and Nonlinear Methods for Distributed Control of a Hierarchical Formation of UAVs

    A key problem in cooperative robotics is the maintenance of a geometric configuration during movement. As a solution, a multi-layered and distributed control system is proposed for a swarm of drones arranged in hierarchical levels based on the leader–follower approach. This reduces the complexity of developing a large system. To ensure the tracking performance and response time of the ensemble system, nonlinear and linear control designs are presented: (a) Sliding Mode Control combined with a Proportional-Derivative controller and (b) a Linear Quadratic Regulator with integral action, respectively. A safe travel distance strategy for collision avoidance is introduced and integrated into the control designs to maintain the hierarchical states in the formation. Both designs provide rapid adaptation with respect to their settling time, without introducing oscillations in the dynamic flight movement of the vehicles, in the cases of (a) nominal operation, (b) plant-model mismatch, and (c) external disturbance inputs. In addition, the nominal settling time of the swarm is improved by 44% on average when using the nonlinear method as compared to the linear method. Furthermore, the proposed methods are fully distributed, so that each UAV autonomously performs the feedback laws in order to achieve better modularity and scalability.

    FORESAIL-1 cubesat mission to measure radiation belt losses and demonstrate de-orbiting

    Today, near-Earth space is facing a paradigm change as the number of new spacecraft is literally sky-rocketing. Increasing numbers of small satellites threaten the sustainable use of space, as without removal, space debris will eventually make certain critical orbits unusable. A central factor affecting small spacecraft health and leading to debris is the radiation environment, which is unpredictable due to an incomplete understanding of the near-Earth radiation environment itself and its variability driven by the solar wind and outer magnetosphere. This paper presents the FORESAIL-1 nanosatellite mission, which has two scientific objectives and one technological objective. The first scientific objective is to measure the energy and flux of energetic particle loss to the atmosphere with a representative energy and pitch angle resolution over a wide range of magnetic local times. To pave the way to novel comparisons between models and in situ data, we also show preliminary results on precipitating electron fluxes obtained with the new global hybrid-Vlasov simulation Vlasiator. The second scientific objective of the FORESAIL-1 mission is to measure energetic neutral atoms (ENAs) of solar origin. The solar ENA flux has the potential to contribute importantly to the knowledge of solar eruption energy budget estimations. The technological objective is to demonstrate a satellite de-orbiting technology and, for the first time, make an orbit manoeuvre with a propellantless nanosatellite. FORESAIL-1 will demonstrate the potential for nanosatellites to make important scientific contributions as well as promote the sustainable utilisation of space by using a cost-efficient de-orbiting technology.

    Hierarchical Fault Simulation of Deep Neural Networks on Multi-Core Systems

    In this paper, a hierarchical fault simulation technique for neural networks is proposed, supporting both permanent and temporary faults. In the proposed technique, different levels of hierarchy are used, forming a mixed-level simulation environment. In such an environment, the pre-synthesis behavioral specification of the network and the post-synthesis gate-level model are co-simulated. To accelerate the fault simulation process, faults are injected in the gate-level specification of the selected neurons while the behavioral model in different levels of abstraction is used to simulate the remaining neurons. Further speedup is obtained through event-driven simulation and parallelization. Experimental results confirm the time efficiency of the proposed fault simulation technique

    Thermal-Cycling-aware Dynamic Reliability Management in Many-Core System-on-Chip

    Dynamic Reliability Management (DRM) is a common approach to mitigate aging and wear-out effects in multi-/many-core systems. State-of-the-art DRM approaches apply fine-grained control on resource management to increase/balance the chip reliability while considering other system constraints, e.g., performance and power budget. Such approaches, acting on various knobs such as workload mapping and scheduling, Dynamic Voltage/Frequency Scaling (DVFS), and Per-Core Power Gating (PCPG), have been demonstrated to work properly with various aging mechanisms, such as electromigration and Negative-Bias Temperature Instability (NBTI). However, we claim that they do not suffice for thermal cycling. Thus, we here propose a novel thermal-cycling-aware DRM approach for shared-memory many-core systems running multi-threaded applications. The approach applies fine-grained control capable of reducing both temperature levels and variations. The experimental evaluations demonstrate that the proposed approach is able to achieve a 39% longer lifetime than past approaches.

    Pipelined Bidirectional Bus Architecture for Embedded Multimedia SoCs


    Energy-Efficient Mobile Robot Control via Run-time Monitoring of Environmental Complexity and Computing Workload

    We propose an energy-efficient controller to minimize the energy consumption of a mobile robot by dynamically manipulating the mechanical and computational actuators of the robot. The mobile robot performs real-time vision-based applications based on an event-based camera. The actuators of the controller are CPU voltage/frequency for the computation part and motor voltage for the mechanical part. We show that independently considering speed control of the robot and voltage/frequency control of the CPU does not necessarily result in an energy-efficient solution. In fact, to obtain the highest efficiency, the computation and mechanical parts should be controlled together in synergy. We propose a fast hill-climbing optimization algorithm that allows the controller to find the best CPU/motor configuration at run-time whenever the mobile robot faces a new environment during its travel. Experimental results on a robot with Brushless DC Motors, a Jetson TX2 board as the computing unit, and a DAVIS-346 event-based camera show that the proposed control algorithm can save battery energy by an average of 50.5%, 41%, and 30% in low-complexity, medium-complexity, and high-complexity environments, respectively, compared with the baselines.
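
As a rough illustration of hill climbing over a joint CPU/motor configuration, the sketch below searches a small discrete grid. The abstract does not give the authors' algorithm or energy model, so everything here is assumed: the grid indices stand for hypothetical CPU frequency and motor voltage levels, and `toy_energy` is a made-up convex cost used only to make the example runnable.

```python
def hill_climb(energy, start, shape):
    """Greedy descent on a 2-D grid of (cpu_level, motor_level) indices.

    `energy` is any callable returning the cost of a configuration.
    The search moves to the cheapest 4-connected neighbour until no
    neighbour improves on the current configuration (a local minimum).
    """
    current = start
    best = energy(*current)
    while True:
        i, j = current
        neighbours = [
            (i + di, j + dj)
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= i + di < shape[0] and 0 <= j + dj < shape[1]
        ]
        candidate = min(neighbours, key=lambda cfg: energy(*cfg))
        candidate_energy = energy(*candidate)
        if candidate_energy >= best:
            return current, best
        current, best = candidate, candidate_energy


def toy_energy(cpu_level, motor_level):
    # Made-up convex cost with its minimum at (2, 3); not measured data.
    return (cpu_level - 2) ** 2 + (motor_level - 3) ** 2 + 1.0


best_cfg, best_energy = hill_climb(toy_energy, start=(0, 0), shape=(5, 5))
print(best_cfg, best_energy)
```

Searching both indices in one loop mirrors the abstract's point that CPU and motor settings should be tuned together. On a non-convex cost, this greedy search can stop at a local minimum.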

    2014 International Symposium on Fundamentals of Electrical Engineering (ISFEE)

    An Active Suspension System has the capacity to introduce, accumulate, and disperse energy to the system. Depending on the functional circumstances, the system may vary its parameters. This paper explains the design of an Active Suspension System for heavy vehicles in the form of a case study and focuses on three methodological approaches: Proportional Integral Derivative control, Linear Quadratic Regulator control, and chattering-free Sliding Mode Control. The findings should make an important contribution to the field of automation and control engineering. The results are also highlighted to evaluate the performance of the control designs.

    Dynamic resource-aware corner detection for bio-inspired vision sensors

    Event-based cameras are vision devices that transmit only brightness changes with low latency and ultra-low power consumption. Such characteristics make event-based cameras attractive in the field of localization and object tracking in resource-constrained systems. Since the number of events generated by such cameras is huge, selecting and filtering the incoming events is beneficial for both increasing the accuracy of the features and reducing the computational load. In this paper, we present an algorithm to detect asynchronous corners from a stream of events in real-time on embedded systems. The algorithm is called the Three Layer Filtering-Harris or TLF-Harris algorithm. The algorithm is based on an event filtering strategy whose purpose is 1) to increase the accuracy by deliberately eliminating some incoming events, i.e., noise, and 2) to improve the real-time performance of the system, i.e., preserving a constant throughput in terms of input events per second, by discarding unnecessary events with a limited accuracy loss. An approximation of the Harris algorithm, in turn, is used to exploit its high-quality detection capability with a low-complexity implementation to enable seamless real-time performance on embedded computing platforms. The proposed algorithm is capable of selecting the best corner candidate among neighbors and achieves an average execution time savings of 59% compared with the conventional Harris score. Moreover, our approach outperforms the competing methods, such as eFAST, eHarris, and FA-Harris, in terms of real-time performance, and surpasses Arc* in terms of accuracy.
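
For reference, the conventional Harris score that the abstract uses as its baseline can be sketched as follows. This is the textbook formulation, R = det(M) - k * trace(M)^2, and not the TLF-Harris approximation, whose details the abstract does not give. The 7x7 patches, the unweighted summation window, and k = 0.04 are assumptions for illustration.

```python
import numpy as np


def harris_score(patch, k=0.04):
    """Conventional Harris response of a small 2-D patch.

    Builds the structure tensor M from image gradients summed over the
    whole patch (a box window, assumed here instead of a Gaussian) and
    returns det(M) - k * trace(M)**2.
    """
    patch = np.asarray(patch, dtype=float)
    iy, ix = np.gradient(patch)  # derivatives along rows, then columns
    sxx = np.sum(ix * ix)
    syy = np.sum(iy * iy)
    sxy = np.sum(ix * iy)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace


# Synthetic patches: an L-shaped corner, a straight edge, and a flat region.
corner = np.zeros((7, 7))
corner[3:, 3:] = 1.0
edge = np.zeros((7, 7))
edge[:, 3:] = 1.0
flat = np.zeros((7, 7))

print(harris_score(corner), harris_score(edge), harris_score(flat))
```

A positive score indicates a corner, a negative score an edge, and a score near zero a flat region. In an event-based pipeline the patch would typically be built from recent events around the incoming one, which is outside the scope of this sketch.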