
    Fast, Accurate Processor Evaluation Through Heterogeneous, Sample-Based Benchmarking

    Performance evaluation is a key task in computing and communication systems. Benchmarking is one of the most common evaluation techniques: the performance of a set of representative applications is used to infer system responsiveness in a general usage scenario. Unfortunately, most benchmarking suites are limited to a small number of applications and, in some cases, rigid execution configurations. This makes it hard to extrapolate performance metrics to a general-purpose architecture, expected to have a multi-year lifecycle while running dissimilar applications concurrently. The main culprit is that current benchmark-derived metrics lack generality and statistical soundness, and fail to represent general-purpose environments. Previous attempts to overcome these limitations through random application mixes significantly increase computational cost (the workload population grows sharply), making the evaluation process barely affordable. To circumvent this problem, in this article we present a more elaborate performance evaluation methodology named BenchCast. Our proposal provides more representative performance metrics with a drastic reduction in computational cost, limiting application execution to a small, representative fraction marked through code annotation. Thanks to this labeling, and by making use of synchronization techniques, we generate heterogeneous workloads in which every application runs simultaneously inside its Region Of Interest, making a few seconds of execution highly representative of full application execution.
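    The synchronization step can be illustrated with a minimal sketch: each application is first driven to its annotated Region Of Interest (ROI), waits on a barrier until every other application has reached its own ROI, and only then does the measurement window start. This is an illustration of the general idea, not BenchCast's implementation; the helper functions (warm_up_to_roi, roi_iteration, start_counters, stop_counters) and the sampling window length are hypothetical placeholders.

```python
import multiprocessing as mp
import time

SAMPLE_SECONDS = 5  # hypothetical length of the measured window

def warm_up_to_roi(app_id):
    # Placeholder: fast-forward the application to its annotated ROI.
    time.sleep(0.1)

def roi_iteration(app_id):
    # Placeholder: one iteration of the annotated Region Of Interest.
    time.sleep(0.01)

def start_counters():
    print("measurement starts: all apps are inside their ROI")

def stop_counters():
    print("measurement ends")

def run_app(app_id, roi_barrier, stop_event):
    warm_up_to_roi(app_id)
    roi_barrier.wait()                 # block until every app reaches its ROI
    while not stop_event.is_set():
        roi_iteration(app_id)          # keep executing the ROI while sampling

if __name__ == "__main__":
    app_ids = range(4)
    roi_barrier = mp.Barrier(len(app_ids) + 1)   # +1 for the measurement controller
    stop_event = mp.Event()
    procs = [mp.Process(target=run_app, args=(a, roi_barrier, stop_event))
             for a in app_ids]
    for p in procs:
        p.start()
    roi_barrier.wait()                 # every app is now inside its ROI
    start_counters()
    time.sleep(SAMPLE_SECONDS)
    stop_counters()
    stop_event.set()
    for p in procs:
        p.join()
```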

    Perceptron Learning in Cache Management and Prediction Techniques

    Hardware prefetching is an effective technique for hiding cache miss latencies in modern processor designs. An efficient prefetcher should identify complex memory access patterns during program execution. This ability enables the prefetcher to read a block ahead of its demand access, potentially preventing a cache miss. Accurately identifying the right blocks to prefetch is essential to achieving high performance from the prefetcher. Prefetcher performance can be characterized by two main metrics that are generally at odds with one another: coverage, the fraction of baseline cache misses that the prefetcher brings into the cache; and accuracy, the fraction of prefetches that are ultimately used. An overly aggressive prefetcher may improve coverage at the cost of reduced accuracy, harming performance by wasting resources such as cache capacity and memory bandwidth. An ideal prefetcher would have both high coverage and high accuracy. In this thesis, I propose Perceptron-based Prefetch Filtering (PPF) as a way to increase the coverage of the prefetches generated by a baseline prefetcher without negatively impacting accuracy. PPF enables more aggressive tuning of a given baseline prefetcher, increasing coverage while filtering out the larger number of inaccurate prefetches that such aggressive tuning implies. I also explore a range of features for training PPF’s perceptron layer to identify inaccurate prefetches. PPF improves performance on a memory-intensive subset of the SPEC CPU 2017 benchmarks by 3.78% for a single-core configuration and by 11.4% for a 4-core configuration, compared to the baseline prefetcher alone.
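    The filtering mechanism can be sketched in a few lines: each prefetch suggested by the baseline prefetcher is described by a small set of features (for example the triggering PC, the cache-line address and the address delta), each feature indexes a table of saturating weights, and the prefetch is issued only if the sum of the selected weights clears a threshold. Weights are trained with simple increments and decrements when a prefetched block is later demanded or evicted unused. The sketch below illustrates this idea under assumed table sizes, weight ranges and threshold values; it is not the exact PPF microarchitecture.

```python
from typing import List

TABLE_SIZE = 1024                 # entries per feature table (assumption)
THRESHOLD = 0                     # issue the prefetch if the weight sum >= THRESHOLD
WEIGHT_MAX, WEIGHT_MIN = 15, -16  # saturating weight range (assumption)

class PerceptronPrefetchFilter:
    def __init__(self, num_features: int):
        # One weight table per feature; all weights start at zero.
        self.tables: List[List[int]] = [[0] * TABLE_SIZE for _ in range(num_features)]

    def _indices(self, features: List[int]) -> List[int]:
        # Hash each feature into its table (simple modulo hash for illustration).
        return [f % TABLE_SIZE for f in features]

    def allow(self, features: List[int]) -> bool:
        # Sum the weights selected by the features and compare to the threshold.
        total = sum(t[i] for t, i in zip(self.tables, self._indices(features)))
        return total >= THRESHOLD

    def train(self, features: List[int], was_useful: bool) -> None:
        # Reinforce when the prefetched block was demanded before eviction,
        # penalize when it was evicted unused (saturating counters).
        delta = 1 if was_useful else -1
        for t, i in zip(self.tables, self._indices(features)):
            t[i] = max(WEIGHT_MIN, min(WEIGHT_MAX, t[i] + delta))

# Usage sketch: features = (load PC, cache-line address, address delta).
ppf = PerceptronPrefetchFilter(num_features=3)
features = [0x400A3C, 0x7FFE12, 0x40]
if ppf.allow(features):
    pass                               # forward the candidate prefetch to memory
ppf.train(features, was_useful=True)   # feedback on demand hit or unused eviction
```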

    Memory hierarchy characterization of NoSQL applications through full-system simulation

    In this work, we conduct a detailed memory characterization of a representative set of modern data-management software (Cassandra, MongoDB, OrientDB and Redis) running an illustrative NoSQL benchmark suite (YCSB). These applications are widely popular NoSQL databases with different data models and features, such as in-memory storage. We compare how these data-serving applications behave with respect to other well-known benchmarks, such as SPEC CPU2006, PARSEC and the NAS Parallel Benchmarks. The evaluation methodology relies on state-of-the-art full-system simulation tools, such as gem5. This allows us to explore configurations unattainable with the performance monitoring units of actual hardware, enabling us to characterize memory properties in detail. The results obtained suggest that NoSQL application behavior is not dissimilar to that of conventional workloads; therefore, some of the optimizations present in state-of-the-art hardware may provide a direct benefit. Nevertheless, there are some aspects that set these applications apart from conventional benchmarks and may be relevant enough to be considered in architectural design. Strikingly, we also found that most database engines exhibit highly uniform behavior, independently of aspects such as workload or database size. Finally, we show that different database engines make highly distinctive demands on the memory hierarchy, some being more stringent than others. This work was supported in part by the Spanish Government (Secretaría de Estado de Investigación, Desarrollo e Innovación) under Grants TIN2015-66979-R and TIN2016-80512-R.
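    In practice, the characterization step reduces to post-processing the statistics dumped by the simulator. The sketch below parses a gem5 stats.txt file and derives per-cache MPKI figures; the stat names used here are assumptions (they vary across gem5 versions and cache configurations), so this is only an illustration of the kind of analysis involved, not the scripts used in the paper.

```python
import re
import sys

def load_stats(path: str) -> dict:
    """Parse 'name value [# description]' lines from a gem5 stats.txt dump."""
    stats = {}
    with open(path) as fh:
        for line in fh:
            m = re.match(r"(\S+)\s+([0-9.eE+-]+)", line)
            if m:
                try:
                    stats[m.group(1)] = float(m.group(2))
                except ValueError:
                    pass  # skip tokens that are not plain numbers
    return stats

def mpki(misses: float, insts: float) -> float:
    # Misses per kilo-instruction, a standard memory-hierarchy metric.
    return 1000.0 * misses / insts if insts else 0.0

if __name__ == "__main__":
    s = load_stats(sys.argv[1] if len(sys.argv) > 1 else "stats.txt")
    # Hypothetical stat keys for a single-core classic-cache configuration.
    insts = s.get("simInsts", s.get("sim_insts", 0.0))
    l1d_misses = s.get("system.cpu.dcache.overall_misses::total", 0.0)
    l2_misses = s.get("system.l2.overall_misses::total", 0.0)
    print(f"L1D MPKI: {mpki(l1d_misses, insts):.2f}")
    print(f"L2  MPKI: {mpki(l2_misses, insts):.2f}")
```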