9 research outputs found

    Recent Advances in Embedded Computing, Intelligence and Applications

    The recent proliferation of Internet of Things deployments and edge computing, combined with artificial intelligence, has led to exciting new application scenarios in which embedded digital devices are essential enablers. Moreover, powerful and efficient new devices are appearing that can cope with workloads formerly reserved for the cloud, such as deep learning. These devices allow processing close to where data are generated, avoiding bottlenecks due to communication limitations. The efficient integration of hardware, software and artificial intelligence capabilities deployed in real sensing contexts empowers the edge intelligence paradigm, which will ultimately foster the offloading of processing functionality to the edge. In this Special Issue, researchers have contributed nine peer-reviewed papers covering a wide range of topics in the area of edge intelligence. Among them are hardware-accelerated implementations of deep neural networks, IoT platforms for extreme edge computing, neuro-evolvable and neuromorphic machine learning, and embedded recommender systems.

    Finding Unexpected Events in Staring Continuous-Dwell Sensor Data Streams Via Adaptive Prediction

    This research produced a Predictive Anomaly Detector (PAD), an adaptive prediction-based approach to detecting unexpected events in data streams drawn from staring continuous-dwell sensors. The underlying technology is spectrum independent and does not depend on correlated data (either temporal or spatial) to achieve improved detection and extraction in highly robust environments. (Here, a "robust environment" is one in which the data stream's control law is variable and the spectral content covers a wide range of wavelengths.) The approach uses a network of simple building-block equations (basis functions) to predict the non-event data and thereby present subtle sub-streams to a detection model as potential events of interest. The prediction model is created automatically from sequential observations of the data stream. Once model construction is complete, the model continues to evolve as new samples arrive. Each sample value that is sufficiently different from the model's predicted value is postulated as an unexpected event. A subsequent detection model uses a set of rules to confirm unexpected events while ignoring outliers. Intruder detection in robust video scenes is the main focus, although one demonstration achieved voice detection in a noisy audio signal. These demonstrations are coupled to a concept of operations that emphasizes the spectrum independence of this approach and its integration with other processing requirements such as target recognition and tracking. The primary benefits of this work include the ability to process large data volumes for obscured or buried information within highly active environments. The fully automated nature of the technique helps mitigate the manning shortfalls typically associated with having trained analysts sort through large volumes of surveillance data. The approach enables an organization to perform automated cueing for these analysts so that they spend less time examining data where nothing of interest exists. This maximizes the value of skilled personnel by using them to assess data with true potential. In this way, larger data volumes can be processed in a shorter period of time, leading to a higher likelihood that important events and signals will be found, analyzed, and acted upon.
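
The predict-then-compare loop described in this abstract can be illustrated with a minimal sketch. It is not the PAD implementation: an exponentially weighted moving average stands in for the basis-function prediction network, and the confirmation rule (a deviation must persist for `min_run` consecutive samples) is a hypothetical example of a rule that ignores isolated outliers. The function name and all parameters are assumptions made for illustration.

```python
def detect_events(samples, alpha=0.2, tolerance=2.0, min_run=3):
    """Return the start indices of confirmed unexpected events.

    Assumptions (not from the source): the predictor is a simple
    exponentially weighted moving average, and an event is confirmed
    only when `min_run` consecutive samples deviate from the prediction
    by more than `tolerance`.
    """
    events = []
    prediction = float(samples[0])  # model state, built from observations
    run = 0                         # length of the current deviation streak
    for i in range(1, len(samples)):
        x = samples[i]
        if abs(x - prediction) > tolerance:
            # Sample differs sufficiently from the predicted value.
            run += 1
            if run == min_run:
                events.append(i - min_run + 1)
        else:
            # Expected sample: end any streak and let the model evolve.
            run = 0
            prediction += alpha * (x - prediction)
    return events


stream = [10.0, 10.2, 9.8, 10.1, 9.9, 25.0, 10.0, 10.1, 9.9,
          20.0, 20.5, 19.5, 20.2, 10.0]
print(detect_events(stream))
```

In this sketch the lone spike at index 5 is treated as an outlier, while the sustained shift starting at index 9 is reported. The model adapts only on expected samples, so a persistent event does not get absorbed into the prediction.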

    Contributions à l'optimisation de programmes et à la synthèse de circuits haut-niveau

    Since the end of Dennard scaling, power efficiency has been the limiting factor for large-scale computing. Hardware accelerators such as reconfigurable circuits (FPGA, CGRA) or Graphics Processing Units (GPUs) were introduced to improve performance under a limited energy budget, resulting in complex heterogeneous platforms. This document presents a synthetic description of my research activities over the last decade on compilers for high-performance computing and high-level synthesis (HLS) of circuits for FPGA accelerators. Specifically, my contributions cover both theoretical and practical aspects of automatic parallelization and HLS in a general theoretical framework called the polyhedral model. A first chapter describes our contributions to loop tiling, a key program transformation for automatic parallelization which splits the computation into atomic blocks called tiles. We rephrase loop tiling in the polyhedral model to enable any polyhedral tile shape whose size depends on a single parameter (monoparametric tiling), and we present a tiling transformation for programs with reductions, that is, accumulations with respect to an associative and commutative operator. Our results open the way for semantic program transformations: program transformations which do not preserve the computation but still lead to an equivalent program. A second chapter describes our contributions to algorithm recognition. A compiler optimization will never replace a good algorithm, hence the idea of recognizing algorithm instances in a program and replacing them with a call to a performance library whenever this is possible and useful. In our PhD thesis, we addressed the recognition of templates (functions with first-order variables) in programs and its application to program optimization. We propose a complementary algorithm recognition framework which leverages our monoparametric tiling and our reduction tiling transformations. This automates semantic tiling, a new semantic program transformation which increases the grain of operators (scalar → matrix). A third chapter presents our contributions to the synthesis of communications with an off-chip memory in the context of high-level circuit synthesis. We propose an execution model based on loop tiling, a pipelined architecture and a source-level compilation algorithm which, connected to the C2H HLS tool from Altera, produces an FPGA configuration with minimized data transfers. Our compilation algorithm is optimal: the data are loaded as late as possible and stored as soon as possible, with maximal reuse and without redundancy. A fourth chapter presents our contributions to the design of a unified polyhedral compilation model for high-level circuit synthesis. We present Data-aware Process Networks (DPN), a dataflow intermediate representation which leverages the ideas developed in chapter 3 to make explicit the data transfers with an off-chip memory. We propose an algorithm to compile a DPN from a sequential program, and we present our contribution to the synthesis of a DPN into a circuit. In particular, we present our algorithms to compile the control, the channels and the synchronizations of a DPN. These results are used in the production compiler of the XtremLogic start-up.
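
For readers unfamiliar with loop tiling, the following minimal sketch shows the transformation on a square matrix product, whose inner loop is a reduction over an associative and commutative operator (`+`). It uses plain rectangular tiles and is not the polyhedral monoparametric tiling of the thesis; the single `tile` argument only echoes the idea of tile sizes that depend on one parameter. The function names are hypothetical.

```python
def matmul(a, b):
    """Reference product of two n x n matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matmul_tiled(a, b, tile):
    """Same product, computed block by block.

    Every tile dimension is scaled by the single parameter `tile`.
    The k-reduction is split across tiles and accumulated in c[i][j],
    which is valid because + is associative and commutative.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):              # loops over tile origins
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                # Loops inside one tile, clipped at the matrix border.
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        acc = c[i][j]
                        for k in range(kk, min(kk + tile, n)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c


a = [[i * 3 + j for j in range(3)] for i in range(3)]
b = [[(i + 2 * j) % 5 for j in range(3)] for i in range(3)]
print(matmul_tiled(a, b, tile=2) == matmul(a, b))
```

Integer entries are used so that reordering the accumulation gives exactly the same result; with floating-point values the tiled and untiled versions may differ by rounding, since floating-point addition is not strictly associative.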

    University of Windsor Undergraduate Calendar 2001-2002

    https://scholar.uwindsor.ca/universitywindsorundergraduatecalendars/1009/thumbnail.jp

    Clemson Graduate School Catalog, 2005-2006

    https://tigerprints.clemson.edu/grad_anncmnt/1019/thumbnail.jp

    College of Arts and Sciences

    Cornell University Courses of Study Vol. 92 2000/200