51 research outputs found

    Software Defined Multi-Spectral Imaging for Arctic Sensor Networks

    Get PDF
    Availability of off-the-shelf infrared sensors combined with high definition visible cameras has made possible the construction of a Software Defined Multi-Spectral Imager (SDMSI) combining long-wave, near-infrared and visible imaging. The SDMSI requires a real-time embedded processor to fuse images and to create real-time depth maps for opportunistic uplink in sensor networks. Researchers at Embry Riddle Aeronautical University working with University of Alaska Anchorage at the Arctic Domain Awareness Center and the University of Colorado Boulder have built several versions of a low-cost drop-in-place SDMSI to test alternatives for power efficient image fusion. The SDMSI is intended for use in field applications including marine security, search and rescue operations and environmental surveys in the Arctic region. Based on Arctic marine sensor network mission goals, the team has designed the SDMSI to include features to rank images based on saliency and to provide on camera fusion and depth mapping. A major challenge has been the design of the camera computing system to operate within a 10 to 20 Watt power budget. This paper presents a power analysis of three options: 1) multi-core, 2) field programmable gate array with multi-core, and 3) graphics processing units with multi-core. For each test, power consumed for common fusion workloads has been measured at a range of frame rates and resolutions. Detailed analyses from our power efficiency comparison for workloads specific to stereo depth mapping and sensor fusion are summarized. Preliminary mission feasibility results from testing with off-the-shelf long-wave infrared and visible cameras in Alaska and Arizona are also summarized to demonstrate the value of the SDMSI for applications such as ice tracking, ocean color, soil moisture, animal and marine vessel detection and tracking. The goal is to select the most power efficient solution for the SDMSI for use on UAVs (Unoccupied Aerial Vehicles) and other drop-in-place installations in the Arctic. The prototype selected will be field tested in Alaska in the summer of 2016

    Image-to-Geometry Registration on Mobile Devices – Concepts, Challenges and Applications

    Get PDF
    Registering natural photos to existing 3D surface models, particularly on low-power mobile devices, gathers increasing attention to a variety of application domains. The paper discusses up-to-date computation insights of the technique, condensing available literature and knowledge obtained from experiments across multiple research groups. Challenges like smartphone camera calibration or the sensor-based estimation of location and orientation are current research subjects, for which new data and experimental results are presented. Moreover, computing-related, practical challenges (e.g. device variability) are detailed to increase the technological understanding and reasoning on the limits of mobile devices. An overview of running projects utilising image-to-geometry registration methods shows the potential for mobile devices to, amongst others, improve flood hazard mitigation and hydrocarbon exploration with crowdsourced data

    Online Energy-Efficient Task-Graph Scheduling for Multicore Platforms

    Get PDF
    Numerous Directed-Acyclic Graph (DAG) schedulers have been developed to improve the energy efficiency of various multi-core platforms. However, these schedulers make a priori assumptions about the relationship between the task dependencies, and they are unable to adapt online to the characteristics of each application without offline profiling data. Therefore, we propose a novel energy-efficient online scheduling solution for the general DAG model to address the two aforementioned problems. Our proposed scheduler is able to adapt at runtime to the characteristics of each application by making smart foresighted decisions, which take into account the impact of current scheduling decisions on the present and future deadline miss rates and energy efficiency. Moreover, our scheduler is able to efficiently handle execution with very limited resources by avoiding scheduling tasks that are expected to miss their deadlines and do not have an impact on future deadlines. We validate our approach against state-of-the-art solutions. In our first set of experiments, our results with the H.264 video decoder demonstrate that the proposed low-complexity solution for the general DAG model reduces the energy consumption by up to 15% compared to an existing sophisticated and complex scheduler that was specifically built for the H.264 video decoder application. In our second set of experiments, our results with different configurations of synthetic DAGs demonstrate that our proposed solution is able to reduce the energy consumption by up to 55% and the deadline miss rates by up to 99% compared to a second existing scheduling solution. Finally, we show that our DFM and scheduler have low complexities on a real mobile platform and we show that our solution is resilient to workload prediction errors by using different estimator accuracies

    Enabling the use of embedded and mobile technologies for high-performance computing

    Get PDF
    In the late 1990s, powerful economic forces led to the adoption of commodity desktop processors in High-Performance Computing(HPC). This transformation has been so effective that the November 2016 TOP500 list is still dominated by x86 architecture. In 2016, the largest commodity market in computing is not PCs or servers, but mobile computing, comprising smartphones andtablets, most of which are built with ARM-based Systems on Chips (SoC). This suggests that once mobile SoCs deliver sufficient performance, mobile SoCs can help reduce the cost of HPC. This thesis addresses this question in detail.We analyze the trend in mobile SoC performance, comparing it with the similar trend in the 1990s. Through development of real system prototypes and their performance analysis we assess the feasibility of building an HPCsystem based on mobile SoCs. Through simulation of the future mobile SoC, we identify the missing features and suggest improvements that would enable theuse of future mobile SoCs in HPC environment. Thus, we present design guidelines for future generations mobile SoCs, and HPC systems built around them, enabling the newclass of cheap supercomputers.A finales de la década de los 90, razones económicas llevaron a la adopción de procesadores de uso general en sistemas de Computación de Altas Prestaciones (HPC). Esta transformación ha sido tan efectiva que la lista TOP500 de noviembre de 2016 sigue aun dominada por la arquitectura x86. En 2016, el mayor mercado de productos básicos en computación no son los ordenadores de sobremesa o los servidores, sino la computación móvil, que incluye teléfonos inteligentes y tabletas, la mayoría de los cuales están construidos con sistemas en chip(SoC) de arquitectura ARM. Esto sugiere que una vez que los SoC móviles ofrezcan un rendimiento suficiente, podrán utilizarse para reducir el costo desistemas HPC. Esta tesis aborda esta cuestión en detalle. Analizamos la tendencia del rendimiento de los SoC para móvil, comparándola con la tendencia similar ocurrida en los añosnoventa. A través del desarrollo de prototipos de sistemas reales y su análisis de rendimiento, evaluamos la factibilidad de construir unsistema HPC basado en SoCs móviles. A través de la simulación de SoCs móviles futuros, identificamos las características que faltan y sugerimos mejoras quepermitirían su uso en entornos HPC. Por lo tanto, presentamos directrices de diseño para futuras generaciones de SoCs móviles y sistemas HPC construidos a sualrededor, para permitir la construcción de una nueva clase de supercomputadores de coste reducido

    Deep learning in edge: evaluation of models and frameworks in ARM architecture

    Get PDF
    The boom and popularization of edge devices have molded its market due to stiff compe tition that provides better functionalities at low energy costs. The ARM architecture has been unanimously unopposed in the huge market segment of smartphones and still makes a presence beyond that: in drones, surveillance systems, cars, and robots. Also, it has been used successfully for the development of solutions for chains that supply food, fuel, and other services. Up until recently, ARM did not show much promise for high-level compu tation, i.e., thanks to its limited RISC instruction set, it was considered power efficient but weak in performance compared to x86 architecture. However, most recent advancements in ARM architecture pivoted that inflection point up thanks to the introduction of embed ded GPUs with DMA into LPDDR memory boards. Since this development in boards such as NVIDIA TK1, NVIDIA Jetson TX1, and NVIDIA TX2, perhaps it finally be came feasible to study and perform more challenging parallel and distributed workloads directly on a RISC-based architecture. On the other hand, the novelty of this technology poses a fundamental question of whether these boards are gaining a meaningful ratio be tween processing power and power consumption over conventional architectures or if they are bound to have reached their limitations. This work explores the Parallel Processing of Deep Learning on embedded GPUs of NVIDIA Jetson TX2 to evaluate the question above comprehensively. Thus, it uses 4 ARM boards, with 2 Deep Learning frameworks, 7 CNN models, and one medium-sized dataset combined into six board settings to con duct experiments. The experiments were conducted under similar environments, all built from the source. Altogether, the experiments ran for a total of 4,804 hours and revealed a slight advantage for MxNet on GPU-reliant training and a PyTorch overall advantage in total execution time and power, but especially for CPU-only executions. The experi ments also showed that the NVIDIA Jetson TX2 already makes feasible some complex workloads directly on its SoC

    Reducing Power Consumption and Latency in Mobile Devices using a Push Event Stream Model, Kernel Display Server, and GUI Scheduler

    Get PDF
    The power consumed by mobile devices can be dramatically reduced by improving how mobile operating systems handle events and display management. Currently, mobile operating systems use a pull model that employs a polling loop to constantly ask the operating system if an event exists. This constant querying prevents the CPU from entering a deep sleep, which unnecessarily consumes power. We’ve improved this process by switching to a push model which we refer to as the event stream model (ESM). This model leverages modern device interrupt controllers which automatically notify an application when events occur, thus removing the need to constantly rouse the CPU in order to poll for events. Since the CPU rests while no events are occurring, power consumption is reduced. Furthermore, an application is immediately notified when an event occurs, as opposed to waiting for a polling loop to recognize when an event has occurred. This immediate notification reduces latency, which is the elapsed time between the occurrence of an event and the beginning of its processing by an application. We further improved the benefits of the ESM by moving the display server, a central piece of the graphical user interface (GUI), into the kernel. Existing display servers duplicate some of the kernel code. They contain important information about an application that can assist the kernel with scheduling, such as whether the application is visible and able to receive events. However, they do not share such information with the kernel. Our new kernel-level display server (KDS) interacts directly with the process scheduler to determine when applications are allowed to use the CPU. For example, when an application is idle and not visible on the screen, the KDS prevents that application from using the CPU, thus conserving power. These combined improvements have reduced power consumption by up to 31.2% and latency by up to 17.1 milliseconds in our experimental applications. This improvement in power consumption roughly increases battery life by one to four hours when the device is being actively used or fifty to three-hundred hours when the device is idle
    • …
    corecore