
    NeuroFlow: A General Purpose Spiking Neural Network Simulation Platform using Customizable Processors

    © 2016 Cheung, Schultz and Luk. NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation and deliver optimized performance, for example by adjusting the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current- or conductance-based neuronal models, such as integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and can achieve real-time performance for 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times over an 8-core processor, or 2.83 times over GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation.

    Cloudifying Desktops – A Taxonomy for Desktop Virtualization

    Compared to traditional desktops, the implementation of desktop virtualization can leverage cost reductions and enable desktop access via mobile devices. Consequently, researchers and practitioners increasingly focus on virtualized desktops and Desktop as a Service (DaaS). However, a consistent definition for these technologies and the related delivery models does not exist yet. Therefore, we conducted a literature analysis, which revealed that optimized resource allocation and performant DaaS infrastructures are the primary topics in research. Afterward, we developed a taxonomy to categorize extant virtual desktop delivery models and propose a holistic definition as a theoretical framework for DaaS.

    Real-Time Big Data: the JUNIPER Approach

    REACTION 2014. 3rd International Workshop on Real-time and Distributed Computing in Emerging Applications. Rome, Italy. December 2nd, 2014. Cloud computing offers the possibility for Cyber-Physical Systems (CPS) to offload computation and utilise large stored data sets in order to increase the overall system utility. However, for cloud platforms and applications to be effective for CPS, they need to exhibit real-time behaviour so that some level of performance can be guaranteed to the CPS. This paper considers the infrastructure developed by the EU JUNIPER project for enabling real-time big data systems to be built so that appropriate guarantees can be given to the CPS components. The technologies developed include a real-time Java programming approach, hardware acceleration to provide performance, and operating system resource management (time and disk) based upon resource reservation in order to enhance timeliness. This work is partially funded by the European Union’s Seventh Framework Programme under grant agreement FP7-ICT-611731.

    Analysis, characterization and optimization of the energy efficiency on softwarized mobile platforms

    International Mention in the doctoral degree. The upcoming 5th Generation of mobile systems (5G) is about to revolutionize the industry, bringing a new architecture oriented to new vertical markets and services. For this reason, the 5G Infrastructure Public Private Partnership (5G-PPP) has specified a list of Key Performance Indicators (KPIs) that 5G systems need to support, e.g. 1000 times higher data volume, 10 to 100 times more connected devices, or 10 times lower power consumption. To meet these requirements, current deployments are expected to expand using more Points of Attachment (PoA), increasing their density and combining multiple wireless technologies. This massive deployment strategy has a side effect on energy efficiency, though, generating a conflict with the “10 times lower power consumption” KPI. In this context, the research community has proposed novel paradigms to achieve the requirements imposed on 5G systems, materialized in technologies such as Software Defined Networking (SDN) and Network Function Virtualization (NFV). These new paradigms are the first step towards softwarizing mobile network deployments, enabling new degrees of flexibility and reconfigurability of the Radio Access Network (RAN). In this thesis, we first present a detailed analysis and characterization of softwarized mobile networking.
We consider software as the basis for the next generation of cellular networks and hence analyze and characterize its impact on the energy efficiency of these systems. The first goal of this work is to characterize the available software platforms for Software Defined Radio (SDR), focusing on the two main open source solutions: OpenAirInterface (OAI) and srsLTE. As a result, we provide a methodology to analyze and characterize the performance of these solutions in terms of CPU usage, network performance, compatibility and extensibility of the software. Having understood the performance to expect from such platforms, we study an SDR prototype built with hardware acceleration that employs an FPGA-based platform. This prototype is designed to include energy-awareness capabilities, allowing the system to be reconfigured to minimize the energy footprint when possible. In order to validate our system design, we then present an energy characterization platform that we employ to experimentally measure the energy consumption of real devices. In our approach, we perform two kinds of analysis: at short time scale and at large time scale. To validate our short-time-scale approach and the energy framework, we characterize through numerical analysis the Rate Adaptation (RA) algorithms in IEEE 802.11 and then compare our theoretical results with those obtained through experimentation. Next, we extend our analysis to the hardware-accelerated SDR prototype mentioned previously. Our experimental results show that our system can indeed reduce the energy footprint by reconfiguring the system deployment. We then move to a larger time scale and present Resource-on-Demand (RoD) schemes for ultra-dense network deployments. This strategy is based on dynamically switching on and off the elements that form the network in order to reduce the overall energy consumption.
Hence, we present an analytic model in two flavors: an exact model that accurately predicts the system behaviour but at a high computational cost, and a simplified one that is lighter in complexity while keeping the accuracy. Our results show that these schemes can effectively enhance the energy efficiency of the deployments while maintaining the Quality of Service (QoS). In order to prove the feasibility of RoD, we present a softwarized platform that follows the SDN paradigm, the OFTEN (OpenFlow framework for Traffic Engineering in mobile Networks with energy awareness) framework. Our design is based on OpenFlow with energy-awareness functionalities. Finally, a real prototype of this framework is presented, proving the feasibility of RoD in real deployments. FP7-CROWD (2013-2015) CROWD (Connectivity management for eneRgy Optimised Wireless Dense networks). H2020-Flex5GWare (2015-2017) Flex5GWare (Flexible and efficient hardware/software platforms for 5G network elements and devices). Doctoral Programme in Telematics Engineering at Universidad Carlos III de Madrid. Chair: Marco Gramaglia. Secretary: José Nuñez. Member: Fabrizio Giulian

    Reconfigurable Antenna Systems: Platform implementation and low-power matters

    Antennas are a necessary and often critical component of all wireless systems, sharing their ever-increasing complexity and the challenges of present and emerging trends. 5G, massive low-orbit satellite architectures (e.g. OneWeb), Industry 4.0, the Internet of Things (IoT), satcom on-the-move, Advanced Driver Assistance Systems (ADAS) and Autonomous Vehicles all call for highly flexible systems, and antenna reconfigurability is an enabling part of these advances. The terminal segment is particularly crucial in this sense, encompassing both very compact and low-profile antennas, all with various adaptability/reconfigurability requirements. This thesis work has dealt with hardware implementation issues of Radio Frequency (RF) antenna reconfigurability, and in particular with low-power General Purpose Platforms (GPP); the work has encompassed Software Defined Radio (SDR) implementation, as well as embedded low-power platforms (in particular the STM32 Nucleo family of microcontrollers). The hardware-software platform work has been complemented with the design and fabrication of reconfigurable antennas in standard technology, and the resulting systems have been tested. The selected antenna technology was an antenna array with a continuously steerable beam, controlled by voltage-driven phase shifting circuits. Applications notably included a Wireless Sensor Network (WSN) deployed in the Italian scientific mission in Antarctica, a traffic-monitoring case study (EU H2020 project), and an innovative Global Navigation Satellite Systems (GNSS) antenna concept (patent application submitted). The SDR implementation focused on a low-cost and low-power open-source Software Defined Radio platform with IEEE 802.11 a/g/p wireless communication capability. In a second embodiment, the flexibility of the SDR paradigm has been traded off to avoid the power consumption associated with the relevant operating system.
The application field of reconfigurable antennas is, however, not limited to better management of energy consumption. The analysis has also been extended to satellite positioning applications. A novel beamforming method has been presented, demonstrating improvements in the quality of signals received from satellites. For those who deal with positioning algorithms, this advancement helps improve the precision of the estimated position.

    Energy Aware Runtime Systems for Elastic Stream Processing Platforms

    Following a steady growth in the required computational performance of processors, the multicore revolution started around 20 years ago. This revolution was mainly an answer to power dissipation constraints restricting the increase of clock frequency in single-core processors. The multicore revolution not only brought in the challenge of parallel programming, i.e. being able to develop software exploiting the entire capabilities of manycore architectures, but also the challenge of programming heterogeneous platforms. The question “on which processing element to map a specific computational unit?” is well known in the embedded community. With the introduction of general-purpose graphics processing units (GPGPUs) and digital signal processors (DSPs) along with many-core processors on different system-on-chip platforms, heterogeneous parallel platforms are nowadays widespread over several domains, from consumer devices to media processing platforms for telecom operators. Finding a mapping together with a suitable hardware architecture is a process called design-space exploration. This process is very challenging in heterogeneous many-core architectures, which promise to offer benefits in terms of energy efficiency. The main problem is the exponential growth of the design space to explore. With the recent trend of increasing levels of heterogeneity in the chip, selecting the parameters to take into account when mapping software to hardware is still an open research topic in the embedded area. For example, the current Linux scheduler performs poorly when mapping tasks to the computing elements available in hardware. The only metric considered is CPU workload, which, as shown in recent work, does not match the true performance demands of the applications. Relying on it may produce an incorrect allocation of resources, resulting in a waste of energy.
The origin of this research work comes from the observation that these approaches do not provide full support for the dynamic behavior of stream processing applications, especially if these behaviors are established only at runtime. This research will contribute to the general goal of developing energy-efficient solutions to design streaming applications on heterogeneous and parallel hardware platforms. Streaming applications are nowadays widespread in the software domain. Their distinctive characteristic is the retrieval of multiple streams of data and the need to process them in real time. The proposed work will develop new approaches to address the challenging problem of efficient runtime coordination of dynamic applications, focusing on energy and performance management.

    Tiny Machine Learning Environment: Enabling Intelligence on Constrained Devices

    Running machine learning (ML) algorithms on constrained devices at the extreme edge of the network is problematic due to the computational overhead of ML algorithms, the limited resources available on the embedded platform, and the application budget (i.e., real-time requirements, power constraints, etc.). This has required the development of specific solutions and development tools for what is now referred to as TinyML. In this dissertation, we focus on improving the deployment and performance of TinyML applications, taking into consideration the aforementioned challenges, especially memory requirements. This dissertation contributed to the construction of the Edge Learning Machine environment (ELM), a platform-independent open-source framework that provides three main TinyML services, namely shallow ML, self-supervised ML, and binary deep learning on constrained devices. In this context, this work includes the following steps, which are reflected in the thesis structure. First, we present the performance analysis of state-of-the-art shallow ML algorithms, including dense neural networks, implemented on mainstream microcontrollers. The comprehensive analysis in terms of algorithms, hardware platforms, datasets, preprocessing techniques, and configurations shows performance results similar to those of a desktop machine and highlights the impact of these factors on overall performance. Second, despite the assumption that TinyML only permits model inference because of the scarcity of resources, we have gone a step further and enabled self-supervised on-device training on microcontrollers and tiny IoT devices by developing the Autonomous Edge Pipeline (AEP) system. AEP achieves accuracy comparable to the typical TinyML paradigm, i.e., models trained on resource-abundant devices and then deployed on microcontrollers. Next, we present the development of a memory allocation strategy for convolutional neural network (CNN) layers that optimizes memory requirements.
This approach reduces the memory footprint without affecting accuracy or latency. Moreover, e-skin systems share the main requirements of the TinyML field: enabling intelligence with low memory, low power consumption, and low latency. Therefore, we designed an efficient Tiny CNN architecture for e-skin applications. The architecture leverages the memory allocation strategy presented earlier and provides better performance than existing solutions. A major contribution of the thesis is CBin-NN, a library of functions for implementing extremely efficient binary neural networks on constrained devices. The library outperforms state-of-the-art NN deployment solutions by drastically reducing memory footprint and inference latency. All the solutions proposed in this thesis have been implemented on representative devices and tested in relevant applications, and the results are reported and discussed. The ELM framework is open source, and this work is clearly becoming a useful, versatile toolkit for the IoT and TinyML research and development community.