Search CORE

41,760 research outputs found

Benchmarking for wireless sensor networks

Author: Bouckaert Stefan
Demeester Piet
Moerman Ingrid
Vanhie-Van Gerwen Jono
Publication venue: International Academy, Research, and Industry Association (IARIA)
Publication date: 01/01/2011
Field of study

Comparing Computing Platforms for Deep Learning on a Humanoid Robot

Author: Abdul Jabbar
D Speck
HA Pierson
Sergey Levine
SK Chalup
T Houliston
Victor W. Lee
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/01/2019
Field of study

The goal of this study is to test two different computing platforms with respect to their suitability for running deep networks as part of a humanoid robot software system. One of the platforms is the CPU-centered Intel NUC7i7BNH and the other is a NVIDIA Jetson TX2 system that puts more emphasis on GPU processing. The experiments addressed a number of benchmarking tasks including pedestrian detection using deep neural networks. Some of the results were unexpected but demonstrate that platforms exhibit both advantages and disadvantages when taking computational performance and electrical power requirements of such a system into account.Comment: 12 pages, 5 figure

arXiv.org e-Print Archive

Crossref

MLPerf Inference Benchmark

Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability.Comment: ISCA 202

arXiv.org e-Print Archive

Crossref

Optimum Selection of DNN Model and Framework for Edge Inference

Author: Carmona Galán Ricardo
Fernández Berni Jorge
Rodríguez Vázquez Ángel Benito
Velasco Montero Delia
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

This paper describes a methodology to select the optimum combination of deep neuralnetwork and software framework for visual inference on embedded systems. As a first step, benchmarkingis required. In particular, we have benchmarked six popular network models running on four deep learningframeworks implemented on a low-cost embedded platform. Three key performance metrics have beenmeasured and compared with the resulting 24 combinations: accuracy, throughput, and power consumption.Then, application-level specifications come into play. We propose a figure of merit enabling the evaluationof each network/framework pair in terms of relative importance of the aforementioned metrics for a targetedapplication. We prove through numerical analysis and meaningful graphical representations that only areduced subset of the combinations must actually be considered for real deployment. Our approach can beextended to other networks, frameworks, and performance parameters, thus supporting system-level designdecisions in the ever-changing ecosystem of embedded deep learning technology.Ministerio de Economía y Competitividad (TEC2015-66878-C3-1-R)Junta de Andalucía (TIC 2338-2013)European Union Horizon 2020 (Grant 765866

idUS. Depósito de Investigación Universidad de Sevilla

Portability, compatibility and reuse of MAC protocols across different IoT radio platforms

Author: Bauwens Jan
De Poorter Eli
Giannoulis Spilios
Jabandžić Irfan
Jooris Bart
Moerman Ingrid
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019
Field of study

To cope with the diversity of Internet of Things (loT) requirements, a large number of Medium Access Control (MAC) protocols have been proposed in scientific literature, many of which are designed for specific application domains. However, for most of these MAC protocols, no multi-platform software implementation is available. In fact, the path from conceptual MAC protocol proposed in theoretical papers, towards an actual working implementation is rife with pitfalls. (i) A first problem is the timing bugs, frequently encountered in MAC implementations. (ii) Furthermore, once implemented, many MAC protocols are strongly optimized for specific hardware, thereby limiting the potential of software reuse or modifications. (iii) Finally, in real-life conditions, the performance of the MAC protocol varies strongly depending on the actual underlying radio chip. As a result, the same MAC protocol implementation acts differently per platform, resulting in unpredictable/asymmetrical behavior when multiple platforms are combined in the same network. This paper describes in detail the challenges related to multi-platform MAC development, and experimentally quantifies how the above issues impact the MAC protocol performance when running MAC protocols on multiple radio chips. Finally, an overall methodology is proposed to avoid the previously mentioned cross-platform compatibility issues. (C) 2018 Elsevier B.V. All rights reserved

Ghent University Academic Bibliography

Archivsystem Ask23

The Brain on Low Power Architectures - Efficient Simulation of Cortical Slow Waves and Asynchronous States

Author: Ammendola Roberto
Biagioni Andrea
Capuani Fabrizio
Cicero Francesca Lo
Cretaro Paolo
De Bonis Giulia
Lonardo Alessandro
Martinelli Michele
Paolucci Pier Stanislao
Pastorelli Elena
Pontisso Luca
Simula Francesco
Vicini Piero
Publication venue: 'IOS Press'
Publication date: 01/01/2018
Field of study

Efficient brain simulation is a scientific grand challenge, a parallel/distributed coding challenge and a source of requirements and suggestions for future computing architectures. Indeed, the human brain includes about 10^15 synapses and 10^11 neurons activated at a mean rate of several Hz. Full brain simulation poses Exascale challenges even if simulated at the highest abstraction level. The WaveScalES experiment in the Human Brain Project (HBP) has the goal of matching experimental measures and simulations of slow waves during deep-sleep and anesthesia and the transition to other brain states. The focus is the development of dedicated large-scale parallel/distributed simulation technologies. The ExaNeSt project designs an ARM-based, low-power HPC architecture scalable to million of cores, developing a dedicated scalable interconnect system, and SWA/AW simulations are included among the driving benchmarks. At the joint between both projects is the INFN proprietary Distributed and Plastic Spiking Neural Networks (DPSNN) simulation engine. DPSNN can be configured to stress either the networking or the computation features available on the execution platforms. The simulation stresses the networking component when the neural net - composed by a relatively low number of neurons, each one projecting thousands of synapses - is distributed over a large number of hardware cores. When growing the number of neurons per core, the computation starts to be the dominating component for short range connections. This paper reports about preliminary performance results obtained on an ARM-based HPC prototype developed in the framework of the ExaNeSt project. Furthermore, a comparison is given of instantaneous power, total energy consumption, execution time and energetic cost per synaptic event of SWA/AW DPSNN simulations when executed on either ARM- or Intel-based server platforms

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza