Estimating the Potential Speedup of Computer Vision Applications on Embedded Multiprocessors
Computer vision applications constitute one of the key drivers for embedded
multicore architectures. Although the number of available cores is increasing
in new architectures, designing an application to maximize the utilization of
the platform is still a challenge. In this sense, parallel performance
prediction tools can aid developers in understanding the characteristics of an
application and finding the most adequate parallelization strategy. In this
work, we present a method for early parallel performance estimation on embedded
multiprocessors from sequential application traces. We describe its
implementation in Parana, a fast trace-driven simulator targeting OpenMP
applications on the STMicroelectronics' STxP70 Application-Specific
Multiprocessor (ASMP). Results for the FAST key point detector application show
an error margin of less than 10% compared to the reference cycle-approximate
simulator, with lower modeling effort and up to 20x faster execution time.
Comment: Presented at the DATE Friday Workshop on Heterogeneous Architectures and Design Methods for Embedded Image Systems (HIS 2015) (arXiv:1502.07241).
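To make the idea of trace-driven speedup estimation concrete, the sketch below (in Python, not Parana itself) replays a hypothetical sequential trace of per-region cycle counts and greedily schedules the chunks of each parallel region over N simulated cores, adding an assumed fixed fork/join cost per OpenMP region. The trace contents and overhead figure are illustrative assumptions, not data from the paper.

```python
# Minimal sketch of trace-driven speedup estimation (not Parana's actual model):
# replay a sequential trace and schedule each parallel region's chunks greedily
# over N simulated cores, charging an assumed fork/join cost per region.
import heapq

def estimate_cycles(trace, cores, fork_join_overhead=500):
    """trace: list of (kind, work) records; kind is 'seq' with a cycle count,
    or 'par' with a list of per-chunk cycle counts for one parallel region."""
    total = 0
    for kind, work in trace:
        if kind == "seq":
            total += work                              # sequential region runs as-is
        else:
            finish = [0] * cores                       # per-core simulated clocks
            heapq.heapify(finish)
            for chunk in sorted(work, reverse=True):   # greedy LPT-style scheduling
                heapq.heappush(finish, heapq.heappop(finish) + chunk)
            total += max(finish) + fork_join_overhead  # region ends at the slowest core
    return total

# Illustrative trace: setup, a parallel-for over image rows, and a merge step.
trace = [("seq", 20_000),
         ("par", [1_200] * 480),
         ("seq", 5_000)]

t_seq = sum(w if k == "seq" else sum(w) for k, w in trace)   # sequential baseline
for n in (2, 4, 8):
    print(f"{n} cores: predicted speedup {t_seq / estimate_cycles(trace, n):.2f}x")
```

The greedy longest-processing-time scheduling used here is only one possible way to approximate an OpenMP dynamic schedule; a real estimator can substitute whatever scheduling policy the target runtime uses.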
RELEASE: A High-level Paradigm for Reliable Large-scale Server Software
Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project and describes the progress in the first six months. The project aim is to scale Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large-scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; and developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state-of-the-art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene.
Programming MPSoC platforms: Road works ahead
This paper summarizes a special session on multicore/multi-processor system-on-chip (MPSoC) programming challenges. The current trend towards MPSoC platforms in most computing domains not only means a radical change in computer architecture; even more important from a SW developer's viewpoint, the classical sequential von Neumann programming model needs to be overcome at the same time. Efficient utilization of MPSoC HW resources demands radically new models and corresponding SW development tools, capable of exploiting the available parallelism and guaranteeing bug-free parallel SW. While several standards are established in the high-performance computing domain (e.g. OpenMP), it is clear that more innovations are required for the successful deployment of heterogeneous embedded MPSoCs. On the other hand, at least for the coming years, the freedom for disruptive programming technologies is limited by the huge amount of certified sequential code, which demands a more pragmatic, gradual tool and code replacement strategy.
Enhancing Energy Production with Exascale HPC Methods
High Performance Computing (HPC) resources have become the key actor for achieving more ambitious challenges in many disciplines. In this step forward, an explosion in the available parallelism and the use of special-purpose processors are crucial. With such a goal, the HPC4E project applies new exascale HPC techniques to energy industry simulations, customizing them if necessary, and going beyond the state of the art in the required HPC exascale simulations for different energy sources. In this paper, a general overview of these methods is presented as well as some specific preliminary results.
The research leading to these results has received funding from the European Union's Horizon 2020 Programme (2014-2020) under the HPC4E Project (www.hpc4e.eu), grant agreement n° 689772, the Spanish Ministry of Economy and Competitiveness under the CODEC2 project (TIN2015-63562-R), and from the Brazilian Ministry of Science, Technology and Innovation through Rede Nacional de Pesquisa (RNP). Computer time on the Endeavour cluster was provided by Intel Corporation, which enabled us to obtain the presented experimental results in uncertainty quantification in seismic imaging.
Modeling the Internet of Things: a simulation perspective
This paper deals with the problem of properly simulating the Internet of
Things (IoT). Simulating an IoT allows evaluating strategies that can be
employed to deploy smart services over different kinds of territories. However,
the heterogeneity of scenarios seriously complicates this task. This imposes
the use of sophisticated modeling and simulation techniques. We discuss novel
approaches for the provision of scalable simulation scenarios, that enable the
real-time execution of massively populated IoT environments. Attention is given
to novel hybrid and multi-level simulation techniques that, when combined with
agent-based, adaptive Parallel and Distributed Simulation (PADS) approaches,
can provide means to perform highly detailed simulations on demand. To support
this claim, we detail a use case concerned with the simulation of vehicular
transportation systems.
Comment: Proceedings of the IEEE 2017 International Conference on High Performance Computing and Simulation (HPCS 2017).
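As a rough illustration of the hybrid/multi-level idea (not the simulator described above), the following Python sketch advances vehicle agents with a cheap coarse-grained flow model and switches them on demand to a more detailed car-following model while they are inside a region of interest. All parameters, thresholds and the region itself are made-up assumptions for illustration.

```python
# Toy multi-level, agent-based traffic simulation: coarse updates everywhere,
# detailed (and more expensive) updates only inside a region of interest.
import random

REGION_OF_INTEREST = (400.0, 600.0)   # hypothetical "detailed" road segment, in metres

class Vehicle:
    def __init__(self, pos):
        self.pos = pos
        self.speed = 15.0             # m/s, coarse average speed

    def step_coarse(self, dt):
        # Level 0: constant-speed flow model, one cheap update per time step.
        self.pos += self.speed * dt

    def step_detailed(self, dt, leader_pos):
        # Level 1: toy car-following behaviour, ten sub-steps per time step.
        for _ in range(10):
            gap = (leader_pos - self.pos) if leader_pos is not None else 1e9
            accel = 1.5 if gap > 30 else -2.0
            self.speed = max(0.0, self.speed + accel * dt / 10)
            self.pos += self.speed * dt / 10

def simulate(vehicles, steps, dt=1.0):
    lo, hi = REGION_OF_INTEREST
    for _ in range(steps):
        vehicles.sort(key=lambda v: v.pos)
        for i, v in enumerate(vehicles):
            if lo <= v.pos <= hi:                      # switch to detail on demand
                leader = vehicles[i + 1].pos if i + 1 < len(vehicles) else None
                v.step_detailed(dt, leader)
            else:
                v.step_coarse(dt)

vehicles = [Vehicle(random.uniform(0.0, 1000.0)) for _ in range(1000)]
simulate(vehicles, steps=60)
lo, hi = REGION_OF_INTEREST
print(sum(lo <= v.pos <= hi for v in vehicles), "vehicles currently in the detailed region")
```

In a parallel and distributed (PADS) setting, each level-of-detail region could be handled by a different logical process, which is where the adaptive partitioning discussed above comes into play.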
Diluting the Scalability Boundaries: Exploring the Use of Disaggregated Architectures for High-Level Network Data Analysis
Traditional data centers are designed with a rigid architecture of
fit-for-purpose servers that provision resources beyond the average workload in
order to deal with occasional peaks of data. Heterogeneous data centers are
pushing towards more cost-efficient architectures with better resource
provisioning. In this paper we study the feasibility of using disaggregated
architectures for intensive data applications, in contrast to the monolithic
approach of server-oriented architectures. In particular, we have tested a
proactive network analysis system in which the workload demands are highly
variable. In the context of the dReDBox disaggregated architecture, the results
show that the overhead caused by using remote memory resources is significant,
between 66% and 80%, but we have also observed that the memory usage is one
order of magnitude higher for the stress case with respect to average
workloads. Therefore, dimensioning memory for the worst case in conventional
systems will result in a notable waste of resources. Finally, we found that,
for the selected use case, parallelism is limited by memory. Therefore, using a
disaggregated architecture will allow for increased parallelism, which, at the
same time, will mitigate the overhead caused by remote memory.
Comment: 8 pages, 6 figures, 2 tables, 32 references. Preprint. The paper will be presented at the IEEE International Conference on High Performance Computing and Communications in Bangkok, Thailand, 18-20 December 2017, and published in the conference proceedings.
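To make the dimensioning argument concrete, here is a hedged back-of-envelope calculation: the roughly one-order-of-magnitude stress-to-average memory ratio and the 66-80% remote-memory overhead come from the abstract, while the fraction of time spent under stress and the way the overhead is applied only to the borrowed portion are illustrative modeling assumptions, not figures from the paper.

```python
# Back-of-envelope illustration of worst-case provisioning vs. disaggregation.
# The 10x stress/average ratio and 66-80% overhead are quoted from the abstract;
# the 5% stress-time fraction and the overhead model are assumptions.
avg_mem_gb, stress_mem_gb = 32, 320           # ~one order of magnitude apart
stress_fraction = 0.05                        # assumed share of time at peak load

# Conventional server: provision every node for the worst case.
utilisation = (stress_fraction * stress_mem_gb +
               (1 - stress_fraction) * avg_mem_gb) / stress_mem_gb
print(f"static worst-case provisioning: {utilisation:.0%} average memory utilisation")

# Disaggregated pool: borrow remote memory only during peaks, paying the
# measured access overhead on the borrowed portion of the footprint.
remote_fraction = (stress_mem_gb - avg_mem_gb) / stress_mem_gb
for overhead in (0.66, 0.80):
    stress_slowdown = 1 + overhead * remote_fraction
    print(f"remote-memory overhead {overhead:.0%}: stress phases ~{stress_slowdown:.2f}x slower, "
          f"with only {avg_mem_gb} GB provisioned locally per node")
```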