Performance metrics for consolidated servers

Author: Eeckhout Lieven
Georges Andy
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010

In spite of the widespread adoption of virtualization and consolidation, there exists no consensus with respect to how to benchmark consolidated servers that run multiple guest VMs on the same physical hardware. For example, VMware proposes VMmark, which basically computes the geometric mean of normalized throughput values across the VMs; Intel uses vConsolidate, which reports a weighted arithmetic average of normalized throughput values. These benchmarking methodologies focus on total system throughput (i.e., across all VMs in the system) and do not take into account per-VM performance. We argue that a benchmarking methodology for consolidated servers should quantify both total system throughput and per-VM performance in order to provide a meaningful and precise performance characterization. We therefore present two performance metrics: Total Normalized Throughput (TNT) to characterize total system performance, and Average Normalized Reduced Throughput (ANRT) to characterize per-VM performance. We compare TNT and ANRT against VMmark using published performance numbers, and report several cases for which the VMmark score is misleading. That is, VMmark says one platform yields better performance than another; however, TNT and ANRT show that both platforms represent different trade-offs in total system throughput versus per-VM performance. Or, even worse, in a couple of cases we observe that VMmark yields conclusions opposite to those of TNT and ANRT, i.e., VMmark says one system performs better than another, which is contradicted by the TNT/ANRT performance characterization.
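The abstract's point that a single aggregate score can hide per-VM behaviour can be illustrated with a minimal sketch. It only uses the two aggregates the abstract names (geometric mean and weighted arithmetic mean of normalized throughput); it does not implement TNT or ANRT, whose definitions are not given here. The throughput values, the equal weights, and the platform names are hypothetical.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive normalized throughput values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def weighted_arithmetic_mean(values, weights):
    """Weighted arithmetic mean of normalized throughput values."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical normalized throughputs (measured / reference) for three VMs.
platform_a = [1.0, 1.0, 1.0]   # every VM runs at reference speed
platform_b = [2.0, 1.0, 0.5]   # one VM is fast, one runs at half speed
equal_weights = [1.0, 1.0, 1.0]  # assumed weights, for illustration only

gm_a = geometric_mean(platform_a)
gm_b = geometric_mean(platform_b)
wam_a = weighted_arithmetic_mean(platform_a, equal_weights)
wam_b = weighted_arithmetic_mean(platform_b, equal_weights)

# Worst-case per-VM performance, which neither aggregate reports.
worst_a = min(platform_a)
worst_b = min(platform_b)

print(f"geometric mean:   A={gm_a:.3f}  B={gm_b:.3f}")
print(f"weighted average: A={wam_a:.3f}  B={wam_b:.3f}")
print(f"slowest VM:       A={worst_a:.3f}  B={worst_b:.3f}")
```

In this toy setting the geometric mean rates both platforms the same and the arithmetic average favours platform B, even though one VM on platform B runs at half its reference throughput.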

Ghent University Academic Bibliography

True Performance Metrics in Electrochemical Energy Storage

Author: Gogotsi Yury
Simon Patrice
Publication venue: 'American Association for the Advancement of Science (AAAS)'
Publication date: 18/11/2011

A dramatic expansion of research in the area of electrochemical energy storage (EES) during the past decade has been driven by the demand for EES in handheld electronic devices, transportation, and storage of renewable energy for the power grid (1–3). However, the outstanding properties reported for new electrode materials may not necessarily be applicable to performance of electrochemical capacitors (ECs). These devices, also called supercapacitors or ultra-capacitors (4), store charge with ions from solution at charged porous electrodes. Unlike batteries, which store large amounts of energy but deliver it slowly, ECs can deliver energy faster (develop high power), but only for a short time. However, recent work has claimed energy densities for ECs approaching (5) or even exceeding that of batteries. We show that even when some metrics seem to support these claims, actual device performance may be rather mediocre. We will focus here on ECs, but these considerations also apply to lithium-ion (Li-ion) batteries.

Crossref

Open Archive Toulouse Archive Ouverte

Hal-Diderot

Exploring Symmetry of Binary Classification Performance Metrics

Author: Aguayo-González Francisco (Coordinator)
Carrasco Muñoz Alejandro
Lama-Ruiz Juan Ramón
León de Mora Carlos (Coordinator)
Luque Sendra Amalia
Martín-Gómez Alejandro Manuel
Publication venue: 'MDPI AG'
Publication date: 01/01/2019

Selecting the proper performance metric constitutes a key issue for most classification problems in the field of machine learning. Although the specialized literature has addressed several topics regarding these metrics, their symmetries have yet to be systematically studied. This research focuses on ten metrics based on a binary confusion matrix, and their symmetric behaviour is formally defined under all types of transformations. Through simulated experiments, which cover the full range of datasets and classification results, the symmetric behaviour of these metrics is explored by exposing them to hundreds of simple or combined symmetric transformations. Cross-symmetries among the metrics and statistical symmetries are also explored. The results obtained show that, in all cases, three and only three types of symmetries arise: labelling inversion (between positive and negative classes); scoring inversion (concerning good and bad classifiers); and the combination of these two inversions. Additionally, certain metrics have been shown to be independent of the imbalance in the dataset and two cross-symmetries have been identified. The results regarding their symmetries reveal a deeper insight into the behaviour of various performance metrics and offer an indicator to properly interpret their values and a guide for their selection for certain specific applications.

University of Seville (Spain) by Telefónica Chair “Intelligence in Networks
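The two inversions named in the abstract can be sketched on a binary confusion matrix. This is an illustrative reading, not the paper's formal definitions: labelling inversion is taken here to mean swapping the positive and negative classes, and scoring inversion to mean flipping every prediction. The three metrics (accuracy, F1, Matthews correlation coefficient) and the counts are chosen for illustration; the abstract does not list the ten metrics it studies.

```python
import math
from typing import NamedTuple

class Confusion(NamedTuple):
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

def accuracy(c: Confusion) -> float:
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)

def f1(c: Confusion) -> float:
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)

def mcc(c: Confusion) -> float:
    """Matthews correlation coefficient."""
    num = c.tp * c.tn - c.fp * c.fn
    den = math.sqrt((c.tp + c.fp) * (c.tp + c.fn) * (c.tn + c.fp) * (c.tn + c.fn))
    return num / den

def invert_labels(c: Confusion) -> Confusion:
    """Assumed labelling inversion: positive and negative classes swap roles."""
    return Confusion(tp=c.tn, fp=c.fn, fn=c.fp, tn=c.tp)

def invert_scores(c: Confusion) -> Confusion:
    """Assumed scoring inversion: every prediction is flipped."""
    return Confusion(tp=c.fn, fp=c.tn, fn=c.tp, tn=c.fp)

c = Confusion(tp=40, fp=10, fn=20, tn=30)  # hypothetical counts

for name, metric in [("accuracy", accuracy), ("F1", f1), ("MCC", mcc)]:
    print(
        f"{name:8s} original={metric(c):+.3f} "
        f"labels inverted={metric(invert_labels(c)):+.3f} "
        f"scores inverted={metric(invert_scores(c)):+.3f}"
    )
```

Under these assumptions, accuracy and MCC are unchanged by the label swap while F1 is not, and flipping every prediction negates MCC and maps accuracy to one minus its value.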

idUS. Depósito de Investigación Universidad de Sevilla

Public Performance Metrics: Driving Physician Motivation and Performance

Author: Barton Erik D.
Bennett Kathryn
Gaubert Ronald
Han Vy
Jen Maxwell
Rudkin Scott E.
Wong Andrew C.
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Introduction: As providers transition from “fee-for-service” to “pay-for-performance” models, focus has shifted to improving performance. This trend extends to the emergency department (ED), where visits continue to increase across the United States. Our objective was to determine whether displaying public performance metrics of physician triage data could drive intangible motivators and improve triage performance in the ED. Methods: This is a single-institution, time-series performance study on a physician-in-triage system. Individual physician baseline metrics (number of patients triaged and dispositioned per shift) were obtained and prominently displayed with identifiable labels during each quarterly physician group meeting. Physicians were informed that metrics would be collected and displayed quarterly and that there would be no bonuses, punishments, or required training; physicians were essentially free to do as they wished. It was made explicit that the goal was to increase the number triaged, and while the number dispositioned would also be displayed, it would not be a focus, thereby acting as this study’s control. At the end of one year, we analyzed metrics. Results: The group’s average numbers of patients triaged per shift were as follows: Q1: 29.2; Q2: 31.9; Q3: 34.4; Q4: 36.5 (Q1 vs Q4, p < 0.00001). The average numbers of patients dispositioned per shift were Q1: 16.4; Q2: 17.8; Q3: 16.9; Q4: 15.3 (Q1 vs Q4, p = 0.14). The top 25% of Q1 performers increased their average numbers triaged from 36.5 in Q1 to 40.3 in Q4 (ie, a statistically insignificant increase of 3.8 patients per shift [p = 0.07]). The bottom 25% of Q1 performers, on the other hand, increased their averages from 22.4 in Q1 to 34.5 in Q4 (ie, a statistically significant increase of 12.2 patients per shift [p = 0.0013]). Conclusion: Public performance metrics can drive intangible motivators (eg, purpose, mastery, and peer pressure), which can be an effective, low-cost strategy to improve individual performance, achieve institutional goals, and thrive in the pay-for-performance era.

eScholarship - University of California

Surrogate regret bounds for generalized classification performance metrics

Author: Dembczyński Krzysztof
Kotłowski Wojciech
Publication date: 07/10/2016

We consider optimization of generalized performance metrics for binary classification by means of surrogate losses. We focus on a class of metrics which are linear-fractional functions of the false positive and false negative rates (examples of which include the $F_{\beta}$-measure, Jaccard similarity coefficient, AM measure, and many others). Our analysis concerns the following two-step procedure. First, a real-valued function $f$ is learned by minimizing a surrogate loss for binary classification on the training sample. It is assumed that the surrogate loss is a strongly proper composite loss function (examples of which include logistic loss, squared-error loss, exponential loss, etc.). Then, given $f$, a threshold $\widehat{\theta}$ is tuned on a separate validation sample, by direct optimization of the target performance metric. We show that the regret of the resulting classifier (obtained from thresholding $f$ on $\widehat{\theta}$) measured with respect to the target metric is upper bounded by the regret of $f$ measured with respect to the surrogate loss. We also extend our results to cover multilabel classification and provide regret bounds for micro- and macro-averaging measures. Our findings are further analyzed in a computational study on both synthetic and real data sets. Comment: 22 pages
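The second step of the procedure, tuning a threshold on a validation sample by directly optimizing the target metric, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the real-valued function has already been learned and is represented only by its scores on the validation sample, it uses the F-beta measure as the target metric, and it searches exhaustively over the observed scores. The function names and the toy data are hypothetical.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta measure computed from confusion-matrix counts."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0

def tune_threshold(scores, labels, beta=1.0):
    """Pick the threshold that maximizes F-beta on a validation sample.

    An example is predicted positive when its score is >= the threshold.
    Candidate thresholds are the distinct validation scores.
    """
    best_theta, best_value = None, -1.0
    for theta in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= theta and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= theta and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < theta and y == 1)
        value = f_beta(tp, fp, fn, beta)
        if value > best_value:
            best_theta, best_value = theta, value
    return best_theta, best_value

# Hypothetical validation scores f(x) and binary labels.
val_scores = [-2.0, -0.5, 0.1, 0.4, 1.2, 2.5]
val_labels = [0, 0, 1, 0, 1, 1]

theta_hat, best_f1 = tune_threshold(val_scores, val_labels, beta=1.0)
print(f"tuned threshold: {theta_hat}, validation F1: {best_f1:.3f}")
```

The exhaustive search is quadratic in the validation sample size; sorting once and updating the counts incrementally would be the natural refinement for larger samples.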

arXiv.org e-Print Archive

Springer - Publisher Connector

Performance Metrics for the City of Los Angeles

Author: Brozen Madeline
Cushing Rachel
Huff Herbie
Ocanas Margot
Singh Chanda
Publication venue: eScholarship, University of California
Publication date: 01/07/2012

eScholarship - University of California

Comparing performance metrics for multi-resource systems: the case of urban metabolism

Author: Keirstead J
Ravalde T
Publication venue: 'Elsevier BV'
Publication date: 11/12/2015

We investigate different approaches to assessing the performance of multi-resource systems, i.e. networks of processes used to convert resource inputs to useful goods and services. For a given set of system outputs, alternative resource inputs are often possible, so performance measures are needed to determine the best system configuration for a given goal. We define such performance measures according to a novel framework which categorises them into two types: those that can be calculated from a system's aggregate inputs and outputs (‘black-box’ metrics, e.g. carbon footprint); and those that require knowledge of resource conversion processes within the system (‘grey-box’ metrics). Urban areas are an important example application and metrics can be calculated from urban metabolism data. We calculate eight black-box metrics for fifteen global cities and find that performance is poorly correlated between the measures. This suggests that performance assessments should adopt grey-box approaches and consider flows at the level of individual processes within a city, using methods such as exergy analysis and ecological network analysis. We are led to suggest how to: (1) improve urban metabolism accounting to assist grey-box metric calculation, by including greater detail on conversion processes and resource quality; and (2) promote these metrics amongst relevant decision makers.
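The finding that black-box metrics are poorly correlated can be made concrete with a rank-correlation check between two metrics evaluated on the same set of cities. The sketch below assumes Spearman's rank correlation as the measure, which the abstract does not specify, and uses the simple formula that is valid only when there are no tied values. The metric names and all values are hypothetical and are not taken from the paper.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(xs)
    rx, ry = ranks(xs), ranks(ys)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical values of two black-box metrics for five cities.
carbon_footprint = [1.0, 2.0, 3.0, 4.0, 5.0]
energy_intensity = [20.0, 50.0, 30.0, 10.0, 40.0]

rho = spearman_rho(carbon_footprint, energy_intensity)
print(f"Spearman rho between the two metrics: {rho:.2f}")
```

A coefficient near zero means the two metrics rank the cities in essentially unrelated orders, so the choice of metric alone can change which city appears to perform best.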

Spiral - Imperial College Digital Repository