26,244 research outputs found
DPP-PMRF: Rethinking Optimization for a Probabilistic Graphical Model Using Data-Parallel Primitives
We present a new parallel algorithm for probabilistic graphical model
optimization. The algorithm relies on data-parallel primitives (DPPs), which
provide portable performance over hardware architecture. We evaluate results on
CPUs and GPUs for an image segmentation problem. Compared to a serial baseline,
we observe runtime speedups of up to 13X (CPU) and 44X (GPU). We also compare
our performance to a reference, OpenMP-based algorithm, and find speedups of up
to 7X (CPU).Comment: LDAV 2018, October 201
A methodology for full-system power modeling in heterogeneous data centers
The need for energy-awareness in current data centers has encouraged the use of power modeling to estimate their power consumption. However, existing models present noticeable limitations, which make them application-dependent, platform-dependent, inaccurate, or computationally complex. In this paper, we propose a platform-and application-agnostic methodology for full-system power modeling in heterogeneous data centers that overcomes those limitations. It derives a single model per platform, which works with high accuracy for heterogeneous applications with different patterns of resource usage and energy consumption, by systematically selecting a minimum set of resource usage indicators and extracting complex relations among them that capture the impact on energy consumption of all the resources in the system. We demonstrate our methodology by generating power models for heterogeneous platforms with very different power consumption profiles. Our validation experiments with real Cloud applications show that such models provide high accuracy (around 5% of average estimation error).This work is supported by the Spanish Ministry of Economy and Competitiveness under contract TIN2015-65316-P, by the Gener-
alitat de Catalunya under contract 2014-SGR-1051, and by the European Commission under FP7-SMARTCITIES-2013 contract 608679 (RenewIT) and FP7-ICT-2013-10 contracts 610874 (AS- CETiC) and 610456 (EuroServer).Peer ReviewedPostprint (author's final draft
Efficient Neural Network Implementations on Parallel Embedded Platforms Applied to Real-Time Torque-Vectoring Optimization Using Predictions for Multi-Motor Electric Vehicles
The combination of machine learning and heterogeneous embedded platforms enables new potential for developing sophisticated control concepts which are applicable to the field of vehicle dynamics and ADAS. This interdisciplinary work provides enabler solutions -ultimately implementing fast predictions using neural networks (NNs) on field programmable gate arrays (FPGAs) and graphical processing units (GPUs)- while applying them to a challenging application: Torque Vectoring on a multi-electric-motor vehicle for enhanced vehicle dynamics. The foundation motivating this work is provided by discussing multiple domains of the technological context as well as the constraints related to the automotive field, which contrast with the attractiveness of exploiting the capabilities of new embedded platforms to apply advanced control algorithms for complex control problems. In this particular case we target enhanced vehicle dynamics on a multi-motor electric vehicle benefiting from the greater degrees of freedom and controllability offered by such powertrains. Considering the constraints of the application and the implications of the selected multivariable optimization challenge, we propose a NN to provide batch predictions for real-time optimization. This leads to the major contribution of this work: efficient NN implementations on two intrinsically parallel embedded platforms, a GPU and a FPGA, following an analysis of theoretical and practical implications of their different operating paradigms, in order to efficiently harness their computing potential while gaining insight into their peculiarities. The achieved results exceed the expectations and additionally provide a representative illustration of the strengths and weaknesses of each kind of platform. Consequently, having shown the applicability of the proposed solutions, this work contributes valuable enablers also for further developments following similar fundamental principles.Some of the results presented in this work are related to activities within the 3Ccar project, which has
received funding from ECSEL Joint Undertaking under grant agreement No. 662192. This Joint Undertaking
received support from the European Union’s Horizon 2020 research and innovation programme and Germany,
Austria, Czech Republic, Romania, Belgium, United Kingdom, France, Netherlands, Latvia, Finland, Spain, Italy,
Lithuania. This work was also partly supported by the project ENABLES3, which received funding from ECSEL
Joint Undertaking under grant agreement No. 692455-2
IETF standardization in the field of the Internet of Things (IoT): a survey
Smart embedded objects will become an important part of what is called the Internet of Things. However, the integration of embedded devices into the Internet introduces several challenges, since many of the existing Internet technologies and protocols were not designed for this class of devices. In the past few years, there have been many efforts to enable the extension of Internet technologies to constrained devices. Initially, this resulted in proprietary protocols and architectures. Later, the integration of constrained devices into the Internet was embraced by IETF, moving towards standardized IP-based protocols. In this paper, we will briefly review the history of integrating constrained devices into the Internet, followed by an extensive overview of IETF standardization work in the 6LoWPAN, ROLL and CoRE working groups. This is complemented with a broad overview of related research results that illustrate how this work can be extended or used to tackle other problems and with a discussion on open issues and challenges. As such the aim of this paper is twofold: apart from giving readers solid insights in IETF standardization work on the Internet of Things, it also aims to encourage readers to further explore the world of Internet-connected objects, pointing to future research opportunities
Hardware-based Security for Virtual Trusted Platform Modules
Virtual Trusted Platform modules (TPMs) were proposed as a software-based
alternative to the hardware-based TPMs to allow the use of their cryptographic
functionalities in scenarios where multiple TPMs are required in a single
platform, such as in virtualized environments. However, virtualizing TPMs,
especially virutalizing the Platform Configuration Registers (PCRs), strikes
against one of the core principles of Trusted Computing, namely the need for a
hardware-based root of trust. In this paper we show how strength of
hardware-based security can be gained in virtual PCRs by binding them to their
corresponding hardware PCRs. We propose two approaches for such a binding. For
this purpose, the first variant uses binary hash trees, whereas the other
variant uses incremental hashing. In addition, we present an FPGA-based
implementation of both variants and evaluate their performance
- …