145,428 research outputs found

    Performance metrics for consolidated servers

    In spite of the widespread adoption of virtualization and consolidation, there exists no consensus with respect to how to benchmark consolidated servers that run multiple guest VMs on the same physical hardware. For example, VMware proposes VMmark, which basically computes the geometric mean of normalized throughput values across the VMs; Intel uses vConsolidate, which reports a weighted arithmetic average of normalized throughput values. These benchmarking methodologies focus on total system throughput (i.e., across all VMs in the system), and do not take into account per-VM performance. We argue that a benchmarking methodology for consolidated servers should quantify both total system throughput and per-VM performance in order to provide a meaningful and precise performance characterization. We therefore present two performance metrics, Total Normalized Throughput (TNT) to characterize total system performance, and Average Normalized Reduced Throughput (ANRT) to characterize per-VM performance. We compare TNT and ANRT against VMmark using published performance numbers, and report several cases for which the VMmark score is misleading. That is, VMmark says one platform yields better performance than another, whereas TNT and ANRT show that the two platforms represent different trade-offs in total system throughput versus per-VM performance. Or, even worse, in a couple of cases we observe that VMmark yields conclusions opposite to those of TNT and ANRT, i.e., VMmark says one system performs better than another, which is contradicted by the TNT/ANRT performance characterization.
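
The two aggregation styles named in this abstract can be sketched as follows. This is a minimal illustration, not the actual VMmark or vConsolidate scoring procedure: the function names, the equal weights, and the throughput numbers are all hypothetical, and the abstract does not define TNT or ANRT, so they are not implemented here.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def weighted_arithmetic_mean(values, weights):
    """Weighted arithmetic mean; weights need not sum to one."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical normalized per-VM throughputs (1.0 = reference system).
platform_a = [1.0, 1.0, 1.0, 1.0]
platform_b = [2.0, 2.0, 0.5, 0.5]
equal_weights = [1.0, 1.0, 1.0, 1.0]  # assumed weights, for illustration only

for name, vms in (("A", platform_a), ("B", platform_b)):
    print(
        name,
        "geometric:", round(geometric_mean(vms), 3),
        "weighted arithmetic:", round(weighted_arithmetic_mean(vms, equal_weights), 3),
        "slowest VM:", min(vms),
    )
```

With these made-up numbers both platforms have the same geometric mean, yet two VMs on platform B run at half the reference throughput, which is the kind of per-VM effect a single aggregate score can hide.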

    True Performance Metrics in Electrochemical Energy Storage

    A dramatic expansion of research in the area of electrochemical energy storage (EES) during the past decade has been driven by the demand for EES in handheld electronic devices, transportation, and storage of renewable energy for the power grid (1–3). However, the outstanding properties reported for new electrode materials may not necessarily be applicable to the performance of electrochemical capacitors (ECs). These devices, also called supercapacitors or ultracapacitors (4), store charge with ions from solution at charged porous electrodes. Unlike batteries, which store large amounts of energy but deliver it slowly, ECs can deliver energy faster (develop high power), but only for a short time. However, recent work has claimed energy densities for ECs approaching (5) or even exceeding those of batteries. We show that even when some metrics seem to support these claims, actual device performance may be rather mediocre. We will focus here on ECs, but these considerations also apply to lithium-ion (Li-ion) batteries.

    Exploring Symmetry of Binary Classification Performance Metrics

    Selecting the proper performance metric constitutes a key issue for most classification problems in the field of machine learning. Although the specialized literature has addressed several topics regarding these metrics, their symmetries have yet to be systematically studied. This research focuses on ten metrics based on a binary confusion matrix and their symmetric behaviour is formally defined under all types of transformations. Through simulated experiments, which cover the full range of datasets and classification results, the symmetric behaviour of these metrics is explored by exposing them to hundreds of simple or combined symmetric transformations. Cross-symmetries among the metrics and statistical symmetries are also explored. The results obtained show that, in all cases, three and only three types of symmetries arise: labelling inversion (between positive and negative classes); scoring inversion (concerning good and bad classifiers); and the combination of these two inversions. Additionally, certain metrics have been shown to be independent of the imbalance in the dataset and two cross-symmetries have been identified. The results regarding their symmetries reveal a deeper insight into the behaviour of various performance metrics and offer an indicator to properly interpret their values and a guide for their selection for certain specific applications. University of Seville (Spain) by Telefónica Chair “Intelligence in Networks
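
Labelling inversion, as described in this abstract, can be illustrated on a single confusion matrix. The sketch below is an assumption-laden example rather than the paper's experimental setup: the `Confusion` class, the helper names, and the counts are hypothetical, and accuracy and precision are chosen here only as two familiar confusion-matrix metrics (the abstract does not list the ten metrics it studies).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of a binary confusion matrix."""
    tp: int
    fn: int
    fp: int
    tn: int


def invert_labels(c):
    """Swap the positive and negative classes: TP<->TN and FP<->FN."""
    return Confusion(tp=c.tn, fn=c.fp, fp=c.fn, tn=c.tp)


def accuracy(c):
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def precision(c):
    return c.tp / (c.tp + c.fp)


# Hypothetical counts, for illustration only.
original = Confusion(tp=40, fn=10, fp=5, tn=45)
inverted = invert_labels(original)

print("accuracy :", accuracy(original), "->", accuracy(inverted))
print("precision:", round(precision(original), 3), "->", round(precision(inverted), 3))
```

In this example accuracy is unchanged by the swap, while precision changes, so a metric can be symmetric or asymmetric with respect to which class is called positive.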

    Surrogate regret bounds for generalized classification performance metrics

    We consider optimization of generalized performance metrics for binary classification by means of surrogate losses. We focus on a class of metrics, which are linear-fractional functions of the false positive and false negative rates (examples of which include $F_{\beta}$-measure, Jaccard similarity coefficient, AM measure, and many others). Our analysis concerns the following two-step procedure. First, a real-valued function $f$ is learned by minimizing a surrogate loss for binary classification on the training sample. It is assumed that the surrogate loss is a strongly proper composite loss function (examples of which include logistic loss, squared-error loss, exponential loss, etc.). Then, given $f$, a threshold $\widehat{\theta}$ is tuned on a separate validation sample, by direct optimization of the target performance metric. We show that the regret of the resulting classifier (obtained from thresholding $f$ on $\widehat{\theta}$) measured with respect to the target metric is upper-bounded by the regret of $f$ measured with respect to the surrogate loss. We also extend our results to cover multilabel classification and provide regret bounds for micro- and macro-averaging measures. Our findings are further analyzed in a computational study on both synthetic and real data sets. Comment: 22 pages
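
The second step of the procedure in this abstract, tuning a threshold on validation data, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the scores stand in for the outputs of an already-trained function, the labels are made up, the target metric is taken to be the F-beta measure, and the rule "predict positive when the score is at least the threshold" together with the exhaustive search over observed scores are choices made here for simplicity.

```python
def f_beta(y_true, y_pred, beta=1.0):
    """F-beta measure from 0/1 labels and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def tune_threshold(scores, y_true, beta=1.0):
    """Return the observed score that maximizes F-beta when used as threshold."""
    best_theta, best_value = None, -1.0
    for theta in sorted(set(scores)):
        # Assumed decision rule: positive when the score reaches the threshold.
        y_pred = [1 if s >= theta else 0 for s in scores]
        value = f_beta(y_true, y_pred, beta)
        if value > best_value:  # ties keep the smallest threshold
            best_theta, best_value = theta, value
    return best_theta, best_value


# Hypothetical validation scores and labels, for illustration only.
val_scores = [-2.0, -0.5, 0.1, 0.4, 1.2, 2.5]
val_labels = [0, 0, 1, 0, 1, 1]

theta_hat, best_f1 = tune_threshold(val_scores, val_labels, beta=1.0)
print("tuned threshold:", theta_hat, "validation F1:", round(best_f1, 3))
```

Only the observed scores need to be tried as candidates, because the predictions, and hence the metric, change only when the threshold crosses one of them.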

    Comparing performance metrics for multi-resource systems: the case of urban metabolism

    We investigate different approaches to assessing the performance of multi-resource systems, i.e. networks of processes used to convert resource inputs to useful goods and services. For a given set of system outputs, alternative resource inputs are often possible, so performance measures are needed to determine the best system configuration for a given goal. We define such performance measures according to a novel framework which categorises them into two types: those that can be calculated from a system's aggregate inputs and outputs (‘black-box’ metrics, e.g. carbon footprint); and those that require knowledge of resource conversion processes within the system (‘grey-box’ metrics). Urban areas are an important example application and metrics can be calculated from urban metabolism data. We calculate eight black-box metrics for fifteen global cities and find that performance is poorly correlated between the measures. This suggests that performance assessments should adopt grey-box approaches and consider flows at the level of individual processes within a city, using methods such as exergy analysis and ecological network analysis. We are led to suggest how to: (1) improve urban metabolism accounting to assist grey-box metric calculation, by including greater detail on conversion processes and resource quality; and (2) promote these metrics amongst relevant decision makers.