    Survey on Combinatorial Register Allocation and Instruction Scheduling

    Register allocation (mapping variables to processor registers or memory) and instruction scheduling (reordering instructions to increase instruction-level parallelism) are essential tasks for generating efficient assembly code in a compiler. In the last three decades, combinatorial optimization has emerged as an alternative to traditional, heuristic algorithms for these two tasks. Combinatorial optimization approaches can deliver optimal solutions according to a model, can precisely capture trade-offs between conflicting decisions, and are more flexible at the expense of increased compilation time. This paper provides an exhaustive literature review and a classification of combinatorial optimization approaches to register allocation and instruction scheduling, with a focus on the techniques that are most applied in this context: integer programming, constraint programming, partitioned Boolean quadratic programming, and enumeration. Researchers in compilers and combinatorial optimization can benefit from identifying developments, trends, and challenges in the area; compiler practitioners may discern opportunities and grasp the potential benefit of applying combinatorial optimization

    A 64mW DNN-based Visual Navigation Engine for Autonomous Nano-Drones

    Fully autonomous miniaturized robots (e.g., drones) with artificial intelligence (AI) based visual navigation capabilities are extremely challenging drivers of Internet-of-Things edge intelligence capabilities. Visual navigation based on AI approaches, such as deep neural networks (DNNs), is becoming pervasive for standard-size drones, but is considered out of reach for nano-drones with a size of a few cm². In this work, we present the first (to the best of our knowledge) demonstration of a navigation engine for autonomous nano-drones capable of closed-loop end-to-end DNN-based visual navigation. To achieve this goal we developed a complete methodology for parallel execution of complex DNNs directly on board resource-constrained milliwatt-scale nodes. Our system is based on GAP8, a novel parallel ultra-low-power computing platform, and a 27 g commercial, open-source CrazyFlie 2.0 nano-quadrotor. As part of our general methodology we discuss the software mapping techniques that enable the state-of-the-art deep convolutional neural network presented in [1] to be fully executed on board within a strict 6 fps real-time constraint with no compromise in terms of flight results, while all processing is done with only 64 mW on average. Our navigation engine is flexible and can be used to span a wide performance range: at its peak performance corner it achieves 18 fps while still consuming on average just 3.5% of the power envelope of the deployed nano-aircraft.
    Comment: 15 pages, 13 figures, 5 tables, 2 listings, accepted for publication in the IEEE Internet of Things Journal (IEEE IOTJ)

    A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

    Recent technological advances have greatly improved the performance and features of embedded systems. With the number of mobile devices alone now nearly equal to the population of Earth, embedded systems have truly become ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing the power consumption of embedded systems. We discuss the need for power management and classify the techniques along several important parameters to highlight their similarities and differences. This paper is intended to help researchers and application developers gain insight into the working of power management techniques and design even more efficient high-performance embedded systems of tomorrow

    Low Power Architectures for MPEG-4 AVC/H.264 Video Compression

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low-latency, high-speed, and high-dynamic-range settings. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world
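
The event stream described in this abstract (time, location and sign of each brightness change) can be illustrated with a small sketch. It is not taken from the survey: it assumes the commonly used model in which a pixel fires once its log intensity has changed by at least a contrast threshold, and it approximates the asynchronous sensor with a sequence of frames. The names `Event`, `events_from_frames`, `threshold` and `eps` are hypothetical.

```python
import math
from typing import List, NamedTuple, Sequence


class Event(NamedTuple):
    t: float       # timestamp of the brightness change (seconds)
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for brighter, -1 for darker


def events_from_frames(
    frames: Sequence[Sequence[Sequence[float]]],
    timestamps: Sequence[float],
    threshold: float = 0.2,   # assumed contrast threshold on log intensity
    eps: float = 1e-6,        # avoids log(0) for black pixels
) -> List[Event]:
    """Emit at most one event per pixel per frame when |delta log I| >= threshold."""
    # Per-pixel reference level: log intensity at the time of the last event.
    ref = [[math.log(v + eps) for v in row] for row in frames[0]]
    events: List[Event] = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        for y, row in enumerate(frame):
            for x, v in enumerate(row):
                log_v = math.log(v + eps)
                delta = log_v - ref[y][x]
                if abs(delta) >= threshold:
                    events.append(Event(t, x, y, 1 if delta > 0 else -1))
                    ref[y][x] = log_v  # reset the reference after firing
    return events


if __name__ == "__main__":
    # One row, two pixels: the left pixel brightens, then the right one darkens.
    frames = [[[1.0, 1.0]], [[2.0, 1.0]], [[2.0, 0.5]]]
    for ev in events_from_frames(frames, [0.0, 0.001, 0.002]):
        print(ev)
```

A real sensor fires each pixel independently and can emit several events between two frames; the frame-based loop here only approximates that behavior.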

    Discrete Wavelet Transforms

    The discrete wavelet transform (DWT) algorithms have a firm position in the processing of signals in several areas of research and industry. As the DWT provides both octave-scale frequency and spatial timing of the analyzed signal, it is constantly used to solve and treat more and more advanced problems. The present book, Discrete Wavelet Transforms: Algorithms and Applications, reviews the recent progress in discrete wavelet transform algorithms and applications. The book covers a wide range of methods (e.g. lifting, shift invariance, multi-scale analysis) for constructing DWTs. The book chapters are organized into four major parts. Part I describes the progress in hardware implementations of the DWT algorithms. Applications include multitone modulation for ADSL and equalization techniques, a scalable architecture for FPGA implementation, a lifting-based algorithm for VLSI implementation, a comparison between DWT- and FFT-based OFDM, and a modified SPIHT codec. Part II addresses image processing algorithms such as a multiresolution approach for edge detection, low bit rate image compression, low-complexity implementation of CQF wavelets and compression of multi-component images. Part III focuses on watermarking DWT algorithms. Finally, Part IV describes shift-invariant DWTs, the DC lossless property, DWT-based analysis and estimation of colored noise, and an application of the wavelet Galerkin method. The chapters of the present book consist of both tutorial and highly advanced material. Therefore, the book is intended to be a reference text for graduate students and researchers to obtain state-of-the-art knowledge on specific applications
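
As an illustration of the lifting method mentioned in this abstract, the following is a minimal sketch of one level of the reversible integer 5/3 transform (the filter pair used for lossless coding in JPEG2000). It is not code from the book. It assumes an even-length integer signal and whole-sample symmetric extension at the borders, and the function names `dwt53_forward` and `dwt53_inverse` are hypothetical.

```python
from typing import List, Sequence, Tuple


def dwt53_forward(x: Sequence[int]) -> Tuple[List[int], List[int]]:
    """One level of the integer 5/3 lifting transform: returns (lowpass, highpass)."""
    n = len(x)
    if n < 2 or n % 2 != 0:
        raise ValueError("this sketch assumes an even-length signal")
    even, odd = list(x[0::2]), list(x[1::2])
    half = n // 2
    # Predict step: each odd sample minus the floor-average of its even neighbors.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # mirror at right border
        d.append(odd[i] - (even[i] + right) // 2)
    # Update step: each even sample plus a rounded quarter of neighboring details.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]                # mirror at left border
        s.append(even[i] + (left + d[i] + 2) // 4)
    return s, d


def dwt53_inverse(s: Sequence[int], d: Sequence[int]) -> List[int]:
    """Undo dwt53_forward by running the lifting steps backwards."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - (left + d[i] + 2) // 4)
    x: List[int] = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + (even[i] + right) // 2)
    return x


if __name__ == "__main__":
    signal = [3, -7, 12, 0, 5, 5, -1, 9]
    low, high = dwt53_forward(signal)
    print(low, high, dwt53_inverse(low, high) == signal)
```

Each lifting step only adds to one half of the samples a function of the other half, so the inverse subtracts the same quantities in reverse order. That is why the integer rounding does not prevent exact reconstruction.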

    Low power JPEG2000 5/3 discrete wavelet transform algorithm and architecture

    Sulautettu ohjelmistototeutus reaaliaikaiseen paikannusjärjestelmään (Embedded software implementation for a real-time locating system)

    Asset tracking often requires wireless radio-frequency identification (RFID). In practice, situations often arise where plain inventory operations are not sufficient, and methods to estimate movement trajectory are needed for making reliable observations, classification and report generation. In this thesis, an embedded software application for an industrial, resource-constrained, off-the-shelf RFID reader device in the UHF frequency range is designed and implemented. The software is used to configure the reader and its air-interface operations, accumulate read reports and generate events to be reported over network connections. Integrating location estimation methods into the application makes deploying middleware RFID solutions more streamlined and robust while reducing network bandwidth requirements. The result of this thesis is a functional embedded software application running on top of an embedded Linux distribution on an ARM processor. The reader software is used commercially in industrial and logistics applications. Non-linear state estimation features are applied, and their performance is evaluated in empirical experiments