32 research outputs found
Real-Time Power-Efficient Integration of Multi-Sensor Occupancy Grid on Many-Core
Safe Autonomous Vehicles (AVs) will emerge when comprehensive perception systems are successfully integrated into vehicles. Advanced perception algorithms, which estimate the position and speed of every obstacle in the environment by fusing data from multiple sensors, were developed for AV prototypes. The computational requirements of such applications prevent their integration into AVs on current low-power embedded hardware. However, emerging many-core architectures offer opportunities to meet automotive market constraints and efficiently support advanced perception applications. This paper explores the integration of the occupancy grid multi-sensor fusion algorithm into low-power many-core architectures. The parallel properties of this function are used to achieve real-time performance at low power consumption. The proposed implementation achieves an execution time of 6.26 ms, 6× faster than typical sensor output rates and 9× faster than previous embedded prototypes.
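To illustrate why occupancy grid fusion parallelizes well, the sketch below fuses per-sensor grids cell by cell with the standard Bayesian log-odds update. This is a common textbook formulation, assumed here for illustration only; the abstract does not say which fusion formulation the paper uses, and the names `fuse_cell` and `fuse_grids` are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability; assumes 0 < p < 1."""
    return math.log(p / (1.0 - p))


def fuse_cell(probabilities, prior=0.5):
    """Fuse independent per-sensor occupancy estimates for one cell.

    Each sensor contributes its log-odds relative to the prior;
    the sum is mapped back to a probability.
    """
    log_odds = logit(prior) + sum(logit(p) - logit(prior) for p in probabilities)
    return 1.0 / (1.0 + math.exp(-log_odds))


def fuse_grids(grids, prior=0.5):
    """Fuse several grids of identical shape (lists of rows).

    Every cell is computed independently of its neighbours, which is
    the property a many-core target could exploit (assumption: one
    cell or one tile of cells per core).
    """
    rows, cols = len(grids[0]), len(grids[0][0])
    return [
        [fuse_cell([g[r][c] for g in grids], prior) for c in range(cols)]
        for r in range(rows)
    ]


# Two sensors observing a 1x2 grid: both see the first cell as likely
# occupied and have no information about the second.
fused = fuse_grids([[[0.7, 0.5]], [[0.7, 0.5]]])
```

Two agreeing observations reinforce each other, while an uninformative reading of 0.5 leaves the cell at the prior.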
Optimal and robust control for a small-area FLL
Fine-grain Dynamic Voltage and Frequency Scaling (DVFS) is becoming a requirement for Globally-Asynchronous Locally-Synchronous (GALS) architectures. However, the area overhead of adding voltage and frequency control engines in each voltage and frequency island must be taken into account to optimize the circuit. A small-area fast-reprogrammable Frequency-Locked Loop (FLL) engine is a suitable option, since its implementation in 32 nm occupies 0.0016 mm², 4 to 20 times smaller than classical techniques such as a Phase-Locked Loop (PLL) in the same technology. Another relevant aspect of the FLL is the control design, which must be suited to low-area hardware. In this paper, an analytical model of the system is deduced from accurate Spice simulations. The model also takes into account the delay introduced by the sensor. From this model, an optimal and robust control law with a minimum implementation area is developed. The closed-loop system stability is also ensured.
Energy Management via PI Control for Data Parallel Applications with Throughput Constraints
This paper presents a new proportional-integral (PI) controller that sets the operating point of computing tiles in a system on chip (SoC). We address data-parallel applications with throughput constraints. The controller settings are investigated for application configurations with different QoS levels and different buffer sizes. The control method is evaluated on a test chip with four tiles executing a realistic HMAX object recognition application. Experimental results suggest that the proposed controller outperforms the state of the art: it performs, on average, 25% fewer frequency switches and achieves slightly higher energy savings. The reduction in the number of frequency switches is important because it decreases the associated overhead. In addition, the PI controller meets the throughput constraint in cases where other approaches fail.
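The general idea of a PI controller that selects a discrete operating point from a throughput error can be sketched as follows. This is a minimal illustration under assumed conditions, not the paper's controller: the class name `PIFrequencyController`, the gains, the frequency levels and the linear plant model (throughput proportional to frequency) are all hypothetical.

```python
from bisect import bisect_left


class PIFrequencyController:
    """Discrete-time PI controller choosing one of a few frequency levels."""

    def __init__(self, kp, ki, levels, target):
        self.kp = kp                    # proportional gain (assumed value)
        self.ki = ki                    # integral gain (assumed value)
        self.levels = sorted(levels)    # available frequency levels
        self.target = target            # required throughput
        self.integral = 0.0

    def step(self, measured_throughput):
        """Return the frequency level to apply for the next period."""
        error = self.target - measured_throughput
        self.integral += error
        demand = self.kp * error + self.ki * self.integral
        # Pick the lowest level that satisfies the demand, saturating
        # at the highest level when the demand exceeds every level.
        index = min(bisect_left(self.levels, demand), len(self.levels) - 1)
        return self.levels[index]


# Hypothetical plant: throughput (items/ms) = frequency (MHz) / 10.
controller = PIFrequencyController(kp=2.0, ki=5.0,
                                   levels=[100, 200, 300, 400], target=30.0)
throughput = 0.0
history = []
for _ in range(20):
    frequency = controller.step(throughput)
    throughput = frequency / 10.0
    history.append(frequency)
```

In this toy setup the controller switches between 200 and 300 a few times and then settles at the lowest level that meets the target. Anti-windup for the integral term is omitted to keep the sketch short.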
Architecture and Control of a Digital Frequency-Locked Loop for Fine-Grain Dynamic Voltage and Frequency Scaling in Globally Asynchronous Locally Synchronous Structures
A small-area fast-reprogrammable Digital Frequency-Locked Loop (DFLL) engine is presented as a solution for the Dynamic Voltage and Frequency Scaling (DVFS) circuitry in Globally Asynchronous Locally Synchronous (GALS) architectures implemented in 32 nm CMOS technology. The DFLL control is designed so that the closed-loop system copes with process variability while rejecting temperature changes and slow supply-voltage variations. The DFLL is made of three main blocks: a Digitally Controlled Oscillator (DCO), a "sensor" that measures the frequency of the signal at the output of the DCO, and a controller. Strong emphasis is placed on the choice of loop filter architecture and the tuning of its parameters. An analytical model of the DCO is deduced from accurate Spice simulations. The delay introduced by the sensor is also taken into account in the design. From these models, an optimal and robust controller with a minimum implementation area is developed. Here, "optimal" means that the controller is computed by minimizing a given criterion, while "robustness" ensures that the closed-loop system tolerates process and temperature variations within a given range. The performance of the closed-loop system is therefore ensured for any system characteristics within that range.
Dynamic and Distributed Optimization of MPSoC Architectures Using Game Theory
The complexity of Systems-on-Chip (SoCs) has increased exponentially, advanced technologies no longer guarantee the stability of parameters, and application requirements demand improved architectural performance. We consider SoCs integrating several processing elements (MPSoCs). These platforms are designed to support multiple constrained applications, such as telecom and multimedia. Dynamic adaptability is therefore mandatory to optimize performance and power consumption in the face of changes in the system, for instance the switch from one application to another. Distributed optimization is required to provide scalability. The absence of dynamic and distributed adaptive techniques for MPSoCs led us to propose a model in which each processor is able to make local decisions. We use game theory to describe distributed optimization. Processors are considered as players trying to find the best configuration in a non-cooperative game, optimizing several metrics (performance, power consumption, latency, temperature). Game theory gives a set of formalisms to study the convergence of such a system towards an optimal or quasi-optimal solution. Based on these concepts, we present an innovative technique to optimize MPSoCs in a distributed and dynamic way. Our analyses confirm the adequacy of such a model for describing MPSoCs. A hardware implementation was proposed and evaluated. This implementation aims to provide a scalable distributed block with low complexity that can be integrated into future adaptive MPSoC architectures.
Method for optimizing the operation of a multiprocessor integrated circuit, and corresponding integrated circuit
The invention concerns a method for optimizing operation that applies to a multiprocessor integrated circuit chip. Each processor operates with a variable parameter, for example its clock frequency. The optimization comprises: the real-time determination of at least one characteristic datum associated with the processor (temperature, consumption, latency); the transfer of the characteristic data to the other processors; the calculation by each processor of different values of an optimization function that depends on the characteristic datum of the block, the characteristic data of the other blocks, and the variable parameter, the function being calculated for the current value of this parameter and for other possible values; then the selection, among the different parameter values, of the one that gives the best value of the optimization function; and finally the application of this frequency to the processor for the remainder of the execution of the task.
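The selection step described above (evaluate an optimization function for the current parameter value and for the other possible values, then keep the best) can be sketched as follows. This is only an illustration under stated assumptions: the function names `best_response` and `example_objective`, the use of temperature as the shared characteristic datum, and the form and constants of the objective are hypothetical and do not come from the patent.

```python
def best_response(current, candidates, own_data, others_data, objective):
    """Return the parameter value that maximizes the objective.

    The current value is evaluated first, so on a tie the processor
    keeps its present setting instead of switching.
    """
    options = [current] + [c for c in candidates if c != current]
    return max(options, key=lambda value: objective(value, own_data, others_data))


def example_objective(frequency, own_temp, other_temps):
    """Hypothetical objective: reward speed, penalize power.

    The power penalty grows with the average temperature reported by
    this processor and by the others (assumed model and constants).
    """
    avg_temp = (own_temp + sum(other_temps)) / (1 + len(other_temps))
    performance = frequency
    power_cost = 0.002 * frequency ** 2 * (1 + avg_temp / 100.0)
    return performance - power_cost


frequencies = [100, 200, 300, 400]

# A processor running at 400 re-evaluates its choice after receiving
# the temperatures of three other processors.
warm_choice = best_response(400, frequencies, 50.0, [50.0, 50.0, 50.0],
                            example_objective)
hot_choice = best_response(400, frequencies, 100.0, [100.0, 100.0, 100.0],
                           example_objective)
```

With these assumed numbers the processor settles on a lower frequency as the reported temperatures rise. Each processor would repeat this local decision whenever new characteristic data arrive.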
A lightweight Steering Algorithm for Smart Scanning
This work presents a lightweight steering algorithm specifically designed for active perception sensors (Lidars, Radars or Sonars) that support an online-configurable scanning beam. The algorithm aims to improve the sensor's frame rate and accuracy in selected regions of interest (ROIs) within a limited power budget, typically less than 2 watts.
Optimal and Robust Saturated Control for a Clock Generator
Fine-grain Dynamic Voltage and Frequency Scaling (DVFS) is becoming a requirement for Globally-Asynchronous Locally-Synchronous (GALS) architectures. However, the area overhead of adding voltage and frequency control engines in each voltage/frequency island must be taken into account to optimize the circuit. This paper focuses on the control of the frequency actuator. An optimal and robust saturated control law with a minimum hardware implementation area is proposed for a Clock Generator, taking into account the delay introduced by the sensor. The controller is designed with Lyapunov-Krasovskii theory, which ensures asymptotic stability, disturbance rejection, and robustness with respect to the delay and to parameter uncertainties. Because of the actuator saturation, the closed-loop system is stabilized only regionally. An estimate of a maximum domain of attraction is provided. The performance achieved with this controller is shown in simulation.