Tuning the Level of Concurrency in Software Transactional Memory: An Overview of Recent Analytical, Machine Learning and Mixed Approaches

Author: B. Haagdorens
C.C. Minh
D. Dice
D. Didona
J. Ruppert
J.C. Bezdek
M. Ansari
M.F. Spear
M.P. Herlihy
P. Felber
P. Di Sanzo
Z. He
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Synchronization transparency offered by Software Transactional Memory (STM) must not come at the expense of run-time efficiency, which requires the STM designer to include mechanisms oriented to performance and other quality indexes. In particular, one core issue in STM is exploiting parallelism while avoiding thrashing due to excessive transaction rollbacks, which are caused by excessively high contention on logical resources, namely concurrently accessed data portions. One means of addressing run-time efficiency is to dynamically determine the best-suited level of concurrency (number of threads) for running the application (or specific application phases) on top of the STM layer. If the level of concurrency is too low, parallelism is hampered. Conversely, over-dimensioning the concurrency level may give rise to the aforementioned thrashing caused by excessive data contention, which also reduces energy efficiency. In this chapter we overview a set of recent techniques aimed at building “application-specific” performance models that can be exploited to dynamically tune the level of concurrency to the best-suited value. Although they share some base concepts in modeling system performance vs. the degree of concurrency, these techniques rely on disparate methods, such as machine learning or analytic methods (or combinations of the two), and achieve different tradeoffs between the precision of the performance model and the latency of model instantiation. Implications of the different tradeoffs in real-life scenarios are also discussed.
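
The tradeoff described in this abstract (too few threads waste parallelism, too many cause rollback thrashing) can be illustrated with a minimal Python sketch. The throughput model below is a hypothetical toy, not one of the models surveyed in the chapter: it assumes a transaction commits only if it conflicts with none of the other running transactions, and the `conflict_prob` and `tx_time` parameters are invented for illustration.

```python
def toy_throughput(threads, conflict_prob=0.15, tx_time=1.0):
    """Hypothetical model (an assumption, not from the chapter).

    A transaction is assumed to commit only if it conflicts with none of
    the other (threads - 1) concurrently running transactions.
    """
    commit_prob = (1.0 - conflict_prob) ** (threads - 1)
    return threads * commit_prob / tx_time


def best_concurrency(model, max_threads):
    """Return the thread count in 1..max_threads that maximizes the model."""
    return max(range(1, max_threads + 1), key=model)


if __name__ == "__main__":
    for n in (1, 6, 16):
        print(n, round(toy_throughput(n), 3))
    print("best:", best_concurrency(toy_throughput, 16))
```

In this toy setting the predicted throughput rises, peaks, and then falls as threads are added, so a tuner that can evaluate the model simply picks the peak. The techniques in the chapter differ in how such a model is obtained, not in this final selection step.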

Archivio della ricerca- Università di Roma La Sapienza

Analytical/ML Mixed Approach for Concurrency Regulation in Software Transactional Memory

Author: Ciciani Bruno
DI SANZO Pierangelo
Quaglia Francesco
Rughetti Diego
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

In this article we exploit a combination of analytical and Machine Learning (ML) techniques to build a performance model that allows the level of concurrency of applications based on Software Transactional Memory (STM) to be tuned dynamically. Our mixed approach has the advantage of reducing the training time of pure machine learning methods and avoiding the approximation errors that typically affect pure analytical approaches. Hence it allows very fast construction of highly reliable performance models, which can be promptly and effectively exploited for optimizing actual application runs. We also present a real implementation of a concurrency regulation architecture, based on the mixed modeling approach, which has been integrated with the open source TinySTM package, together with experimental data from runs of applications taken from the STAMP benchmark suite that demonstrate the effectiveness of our proposal. © 2014 IEEE.

CiteSeerX

Archivio della ricerca- Università di Roma La Sapienza

ARM Wrestling with Big Data: A Study of Commodity ARM64 Server for Big Data Workloads

Author: Kalyanasundaram Jayanth
Simmhan Yogesh
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/09/2017

ARM processors have dominated the mobile device market in the last decade due to their favorable computing to energy ratio. In this age of Cloud data centers and Big Data analytics, the focus is increasingly on power efficient processing, rather than just high throughput computing. ARM's first commodity server-grade processor is the recent AMD A1100-series processor, based on a 64-bit ARM Cortex A57 architecture. In this paper, we study the performance and energy efficiency of a server based on this ARM64 CPU, relative to a comparable server running an AMD Opteron 3300-series x64 CPU, for Big Data workloads. Specifically, we study these for Intel's HiBench suite of web, query and machine learning benchmarks on Apache Hadoop v2.7 in a pseudo-distributed setup, for data sizes up to 20GB files, 5M web pages and 500M tuples. Our results show that the ARM64 server's runtime performance is comparable to the x64 server for integer-based workloads like Sort and Hive queries, and only lags behind for floating-point intensive benchmarks like PageRank, when they do not exploit data parallelism adequately. We also see that the ARM64 server takes one-third the energy, and has an Energy Delay Product (EDP) that is 50-71% lower than the x64 server. These results hold promise for ARM64 data centers hosting Big Data workloads to reduce their operational costs, while opening up opportunities for further analysis.

Comment: Accepted for publication in the Proceedings of the 24th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC), 201
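
For readers unfamiliar with the Energy Delay Product mentioned in this abstract, the following Python sketch shows the arithmetic: EDP is energy consumed multiplied by runtime, so a machine can be somewhat slower and still have a much lower EDP if it uses far less energy. The joule and second figures are invented for illustration and are not measurements from the paper.

```python
def edp(energy_joules, runtime_seconds):
    """Energy Delay Product: energy consumed multiplied by runtime."""
    return energy_joules * runtime_seconds


def relative_reduction(candidate, baseline):
    """Fraction by which `candidate` is lower than `baseline`."""
    return 1.0 - candidate / baseline


if __name__ == "__main__":
    # Hypothetical numbers (assumptions, not results from the paper):
    # the ARM64 run uses one-third the energy but takes 20% longer.
    x64_edp = edp(energy_joules=900.0, runtime_seconds=100.0)
    arm_edp = edp(energy_joules=300.0, runtime_seconds=120.0)
    print(f"EDP reduction: {relative_reduction(arm_edp, x64_edp):.0%}")
```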

arXiv.org e-Print Archive

Towards Power-Aware Data Pipelining on Multicores

Author: Aldinucci Marco
Daniele De Sensi
Gabriele Mencagli
Marco Danelutto
Massimo Torquati
Publication venue: 'Universidad de Valladolid'
Publication date: 01/01/2017

DR.SGX: Hardening SGX Enclaves against Cache Attacks with Data Location Randomization

Author: Aline Souza (4218802)
Angélica Oliveira Pontes (4218796)
Camila Sanchez (4218811)
Diego Riaño-Pachón (3430481)
Eliane Santana (4218793)
Guilherme Zanini (4218805)
Gustavo Borin (4218808)
Gustavo Goldman (271482)
Juliana Oliveira (3747553)
Renato Santos (4218790)
Roberta Dal’Mas (4218799)
Publication venue
Publication date: 28/09/2017

Recent research has demonstrated that Intel's SGX is vulnerable to various software-based side-channel attacks. In particular, attacks that monitor CPU caches shared between the victim enclave and untrusted software enable accurate leakage of secret enclave data. Known defenses assume developer assistance, require hardware changes, impose high overhead, or prevent only some of the known attacks. In this paper we propose data location randomization as a novel defensive approach to address the threat of side-channel attacks. Our main goal is to break the link between the cache observations by the privileged adversary and the actual data accesses by the victim. We design and implement a compiler-based tool called DR.SGX that instruments enclave code such that data locations are permuted at the granularity of cache lines. We realize the permutation with the CPU's cryptographic hardware-acceleration units providing secure randomization. To prevent correlation of repeated memory accesses we continuously re-randomize all enclave data during execution. Our solution effectively protects many (but not all) enclaves from cache attacks and provides a complementary enclave hardening technique that is especially useful against unpredictable information leakage.
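
The idea of permuting data locations at cache-line granularity can be sketched in a few lines of Python. This is only a conceptual illustration and not the DR.SGX implementation: the `LinePermuter` class is hypothetical, the 64-byte line size is an assumption, and the shuffle uses Python's non-cryptographic `random` module where the paper relies on the CPU's cryptographic hardware. The sketch also only changes the address mapping; a real system would have to move the data when re-randomizing.

```python
import random

CACHE_LINE = 64  # bytes per cache line; assumed here for illustration


class LinePermuter:
    """Hypothetical helper mapping logical addresses to permuted cache lines."""

    def __init__(self, num_lines, seed):
        self.num_lines = num_lines
        self.rerandomize(seed)

    def rerandomize(self, seed):
        # Draw a fresh permutation of line indices.
        # NOT cryptographically secure; a stand-in for the secure
        # randomization described in the abstract.
        perm = list(range(self.num_lines))
        random.Random(seed).shuffle(perm)
        self.perm = perm

    def translate(self, addr):
        # Split the address into (line index, offset within line),
        # permute the line index, and keep the offset unchanged.
        line, offset = divmod(addr, CACHE_LINE)
        return self.perm[line] * CACHE_LINE + offset


if __name__ == "__main__":
    p = LinePermuter(num_lines=8, seed=1)
    print([p.translate(i * CACHE_LINE) // CACHE_LINE for i in range(8)])
```

Because only whole lines are permuted, the offset inside a line is preserved, while an observer who sees which cache line was touched can no longer directly tell which logical line was accessed.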

arXiv.org e-Print Archive

FigShare

Characterization and Optimization

Author
Publication venue: Springer
Publication date: 01/01/2015

Springer - Publisher Connector