1,421 research outputs found

    A dynamic reconfigurable architecture for hybrid spiking and convolutional FPGA-based neural network designs

    This work presents a dynamically reconfigurable architecture for Neural Network (NN) accelerators implemented on Field-Programmable Gate Arrays (FPGAs) that can be applied in a variety of application scenarios. Although Dynamic Partial Reconfiguration (DPR) is increasingly used in NN accelerators, the resulting throughput is usually lower than that of purely static designs. This work presents a dynamically reconfigurable, energy-efficient accelerator architecture that does not sacrifice throughput. The proposed accelerator comprises reconfigurable processing engines and dynamically allocates device resources according to model parameters. Using the proposed architecture with DPR, different NN types and architectures can be realized on the same FPGA. Moreover, the proposed architecture maximizes throughput through design optimizations while taking into account the resources available on the hardware platform. We evaluate our design with different NN architectures on two tasks. The first task is image classification on two distinct datasets, which requires switching between Convolutional Neural Network (CNN) architectures with different layer structures. The second task requires switching between NN architectures, namely a CNN architecture with high accuracy and throughput and a hybrid architecture that combines convolutional layers with an optimized Spiking Neural Network (SNN) architecture. We demonstrate throughput results obtained by quickly reprogramming only a small part of the FPGA hardware using DPR. Experimental results show that the implemented designs achieve a 7× faster frame rate than current FPGA accelerators while being extremely flexible and using comparable resources.

    Emerging research directions in computer science : contributions from the young informatics faculty in Karlsruhe

    In order to build better human-friendly human-computer interfaces, such interfaces need to be given the capability to perceive the user, his location, identity, activities and, in particular, his interaction with others and the machine. Only with these perception capabilities can smart systems (for example, human-friendly robots or smart environments) become possible. In my research I'm thus focusing on the development of novel techniques for the visual perception of humans and their activities, in order to facilitate perceptive multimodal interfaces, humanoid robots and smart environments. My work includes research on person tracking, person identification, recognition of pointing gestures, estimation of head orientation and focus of attention, as well as audio-visual scene and activity analysis. Application areas are human-friendly humanoid robots, smart environments, content-based image and video analysis, as well as safety- and security-related applications. This article gives a brief overview of my ongoing research activities in these areas.

    Design and evaluation of a self-configuring wireless mesh network architecture

    Wireless network connectivity plays an increasingly important role in supporting our everyday private and professional lives. For over three decades, self-organizing wireless multi-hop ad-hoc networks have been investigated as a decentralized replacement for the traditional forms of wireless networks that rely on a wired infrastructure. However, despite the tremendous efforts of the international wireless research community and the widespread availability of devices that are able to support these networks, wireless ad-hoc networks are hardly ever used. In this work, the reasons behind this discrepancy are investigated. It is found that several basic theoretical assumptions about ad-hoc networks prove to be wrong when solutions are deployed in reality, and that several basic functionalities are still missing. It is argued that a hierarchical wireless mesh network architecture, in which specialized, multi-interfaced mesh nodes form a reliable multi-hop wireless backbone for the less capable end-user clients, is an essential step in bringing the ad-hoc networking concept closer to reality. Therefore, in the second part of this work, algorithms that increase the reliability and support the deployment and management of these wireless mesh networks are developed, implemented and evaluated, while keeping the observed limitations and practical considerations in mind. Furthermore, the feasibility of the algorithms is verified by experiment. The performance analysis of these protocols and the ability to deploy the developed algorithms on current-generation off-the-shelf hardware indicate the success of the research approach followed, which combines theoretical considerations with practical implementations and observations. However, it was found that there are also many pitfalls in using real-life implementation as a research technique. Therefore, in the last part of this work, a methodology for wireless network research using real-life implementation is developed, allowing researchers to generate more reliable protocols and performance analysis results with less effort.

    Design and implementation of simulation tools, protocols and architectures to support service platforms on vehicular networks

    Thesis by compendium. Products related to Intelligent Transportation Systems (ITS) are becoming a reality on our roads. All car manufacturers are starting to include Internet access in their vehicles and to integrate smartphones directly from the dashboard, and more and more services will be introduced in the near future. Connectivity through "vehicular networks" will become a cornerstone of every new proposal, and offering an adequate quality of service is obviously desirable. However, a lot of work is needed for vehicular networks to offer performance similar to that of wired networks. Vehicular networks can be characterized by two main features: high variability, due to mobility levels that can reach up to 250 kilometers per hour, and heterogeneity, since various competing versions from different vendors have been and will be released. Therefore, to make the deployment of efficient services possible, an extensive study must be carried out and adequate tools must be proposed and developed. This PhD thesis addresses the service deployment problem in these networks at three different levels: (i) the physical and link layer, showing an exhaustive analysis of the physical channel and models; (ii) the network layer, proposing a forwarding protocol for IP packets; and (iii) the transport layer, where protocols are proposed to improve data delivery. First of all, the two main wireless technologies used in vehicular networks were studied and modeled, namely the 802.11 family of standards, particularly 802.11p, and cellular networks, focusing on LTE. Since 802.11p is a quite mature standard, we defined (i) a propagation and attenuation model capable of replicating the transmission range and the fading behavior of real 802.11p devices, both in line-of-sight conditions and when obstructed by small obstacles, and (ii) a visibility model able to deal with large obstacles, such as buildings and houses, in a realistic manner. 
Additionally, we proposed a model based on high-level performance indicators (bandwidth and delay) for LTE, which makes application validation and evaluation easier. At the network layer, a hybrid protocol called AVE is proposed for packet forwarding by switching among a set of standard routing strategies. Depending on the specific scenario, AVE selects one out of four different routing solutions, (a) two-hop direct delivery, (b) Dynamic MANET On-demand (DYMO), (c) greedy georouting, and (d) a store-carry-and-forward technique, to dynamically adapt its behavior to the specific situation. At the transport layer, we proposed a content delivery protocol for reliable and bidirectional unicast communication over lossy links that improves content delivery in situations where the wireless network is the bottleneck. It has been designed, validated and optimized, and its performance has been analyzed in terms of throughput and resource efficiency. Finally, at system level, we propose an edge-assisted computing model that reduces the response latency of several queries by placing a computing unit at the network edge (i.e., the access network). This way, traffic traversal through the Internet is avoided when not needed. This scheme could be used in both 802.11p and cellular networks, and in this thesis we decided to focus on its evaluation using LTE networks. The platform presented in this thesis combines all the individual efforts to create a single efficient platform. This new environment could be used by any provider to improve the quality of the user experience obtainable through the proposed vehicular network-based services.
Báguena Albaladejo, M. (2017). Design and implementation of simulation tools, protocols and architectures to support service platforms on vehicular networks [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/85333
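
The abstract above says that AVE switches among four routing strategies depending on the scenario, but it does not state the selection criteria. The following is only a minimal sketch of what such a selector could look like. The `Scenario` fields, the two-hop threshold, and the order of the checks are all assumptions made for illustration; they are not taken from the thesis.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Strategy(Enum):
    # The four strategies named in the abstract.
    TWO_HOP_DIRECT = "two-hop direct delivery"
    DYMO = "Dynamic MANET On-demand (DYMO)"
    GREEDY_GEOROUTING = "greedy georouting"
    STORE_CARRY_FORWARD = "store-carry-and-forward"


@dataclass
class Scenario:
    # Hypothetical inputs; the real protocol may observe different state.
    hops_to_destination: Optional[int]  # None when no route is currently known
    destination_position_known: bool
    neighbor_count: int


def select_strategy(s: Scenario) -> Strategy:
    """Pick a forwarding strategy using assumed, illustrative rules."""
    if s.neighbor_count == 0:
        # Nobody in range: hold the packet until a contact appears.
        return Strategy.STORE_CARRY_FORWARD
    if s.hops_to_destination is not None and s.hops_to_destination <= 2:
        # Destination is close enough to reach directly.
        return Strategy.TWO_HOP_DIRECT
    if s.destination_position_known:
        # Forward towards the known geographic position.
        return Strategy.GREEDY_GEOROUTING
    # Otherwise fall back to on-demand route discovery.
    return Strategy.DYMO
```

For instance, under these assumed rules, `select_strategy(Scenario(None, False, 0))` falls through to store-carry-and-forward because the node has no neighbors to forward to.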

    Optimizing energy-efficiency for multi-core packet processing systems in a compiler framework

    Network applications are becoming increasingly computation-intensive, and traffic volumes are growing at an unprecedented rate. Multi-core and multi-threaded techniques are thus widely employed in packet processing systems to meet these changing requirements. However, the processing power cannot be fully utilized without a suitable programming environment. The compilation procedure is decisive for the quality of the code. It can largely determine the overall system performance in terms of packet throughput, individual packet latency, core utilization and energy efficiency. The thesis first investigates compilation issues in the networking domain, particularly with respect to energy consumption. As a cornerstone for any compiler optimization, a code analysis module for collecting program dependencies is presented and incorporated into a compiler framework. With that dependency information, a strategy based on graph bi-partitioning and mapping is proposed to search for an optimal configuration in a parallel-pipeline fashion. The energy-aware extension is specifically effective in enhancing the energy efficiency of the whole system. Finally, a generic evaluation framework for simulating the performance and energy consumption of a packet processing system is given. It accepts flexible architectural configurations and is capable of performing arbitrary code mapping. The simulation time is extremely short compared to full-fledged simulators. A set of our optimization results is gathered using the framework.

    Tiny Machine Learning Environment: Enabling Intelligence on Constrained Devices

    Running machine learning (ML) algorithms on constrained devices at the extreme edge of the network is problematic due to the computational overhead of ML algorithms, the limited resources available on the embedded platform, and the application budget (i.e., real-time requirements, power constraints, etc.). This required the development of specific solutions and development tools for what is now referred to as TinyML. In this dissertation, we focus on improving the deployment and performance of TinyML applications, taking into consideration the aforementioned challenges, especially memory requirements. This dissertation contributed to the construction of the Edge Learning Machine environment (ELM), a platform-independent open-source framework that provides three main TinyML services, namely shallow ML, self-supervised ML, and binary deep learning on constrained devices. In this context, this work includes the following steps, which are reflected in the thesis structure. First, we present the performance analysis of state-of-the-art shallow ML algorithms, including dense neural networks, implemented on mainstream microcontrollers. The comprehensive analysis in terms of algorithms, hardware platforms, datasets, preprocessing techniques, and configurations shows performance results similar to those of a desktop machine and highlights the impact of these factors on overall performance. Second, despite the common assumption that TinyML permits only model inference because resources are scarce, we have gone a step further and enabled self-supervised on-device training on microcontrollers and tiny IoT devices by developing the Autonomous Edge Pipeline (AEP) system. AEP achieves accuracy comparable to that of the typical TinyML paradigm, i.e., models trained on resource-abundant devices and then deployed on microcontrollers. Next, we present the development of a memory allocation strategy for convolutional neural network (CNN) layers that optimizes memory requirements. This approach reduces the memory footprint without affecting accuracy or latency. Moreover, e-skin systems share the main requirements of the TinyML field: enabling intelligence with low memory, low power consumption, and low latency. Therefore, we designed an efficient Tiny CNN architecture for e-skin applications. The architecture leverages the memory allocation strategy presented earlier and provides better performance than existing solutions. A major contribution of the thesis is CBin-NN, a library of functions for implementing extremely efficient binary neural networks on constrained devices. The library outperforms state-of-the-art NN deployment solutions by drastically reducing memory footprint and inference latency. All the solutions proposed in this thesis have been implemented on representative devices and tested in relevant applications, and the results are reported and discussed. The ELM framework is open source, and this work is clearly becoming a useful, versatile toolkit for the IoT and TinyML research and development community.
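
To illustrate why binary neural networks can cut memory footprint and latency so sharply, the sketch below shows the generic XNOR/popcount formulation of a dot product between two vectors whose entries are +1 or -1. This is the textbook technique, not CBin-NN's actual API, which the abstract does not describe. The function names `pack_signs` and `binary_dot` are hypothetical. Each value is stored in a single bit instead of a 32-bit float, and the dot product reduces to counting mismatching bits.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int, one bit each (bit set means +1)."""
    bits = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("binary vectors may only contain +1 or -1")
        if v == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    Positions where the bits agree contribute +1 and positions where they
    differ contribute -1, so the result is n - 2 * (number of mismatches).
    """
    mask = (1 << n) - 1  # ignore any bits beyond the vector length
    mismatches = bin((a_bits ^ b_bits) & mask).count("1")
    return n - 2 * mismatches
```

On a microcontroller the same idea would typically be expressed with word-wide XOR and a population-count instruction, so that one machine word processes many weights at once.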

    Ad Hoc Mobility Notification in Wireless Infrastructure Networks

    Hybrid networks composed of a wireless infrastructure network providing Internet access to an underlying ad hoc network are increasingly attractive due to their low installation cost. In these all-wireless environments, performance is a key issue, as radio bandwidth is scarce. Handoff management is particularly important, as these networks are likely to be highly mobile. Mobility notification should therefore be optimized in order to limit signaling overhead while remaining responsive to terminal mobility. This article presents and studies by simulation optimizations at different levels applied to a modified Cellular IP protocol.