
    Toward sustainable data centers: a comprehensive energy management strategy

    Data centers are major contributors to the emission of carbon dioxide to the atmosphere, and this contribution is expected to increase in the coming years. This has encouraged the development of techniques to reduce the energy consumption and the environmental footprint of data centers. Whereas some of these techniques have succeeded in reducing the energy consumption of the hardware equipment of data centers (including IT, cooling, and power supply systems), we claim that sustainable data centers will only be possible if the problem is approached holistically, combining the aforementioned techniques with intelligent and unifying solutions that enable a synergistic and energy-aware management of data centers. In this paper, we propose a comprehensive strategy to reduce the carbon footprint of data centers that uses energy as a driver of their management procedures. In addition, we present a holistic management architecture for sustainable data centers that implements this strategy, and we propose design guidelines to accomplish each step of the strategy, referring to related achievements and enumerating the main challenges that must still be solved.

    Measuring Similarity in Large-Scale Folksonomies

    Social (or folksonomic) tagging has become a very popular way to describe content within Web 2.0 websites. Unlike taxonomies, which impose a hierarchical categorisation of content, folksonomies enable end-users to freely create and choose the categories (in this case, tags) that best describe some content. However, as tags are informally defined, continually changing, and ungoverned, social tagging has often been criticised for lowering, rather than increasing, the efficiency of searching, due to synonyms, homonyms and polysemy, as well as the heterogeneity of users and the noise they introduce. To address this issue, a variety of approaches have been proposed that recommend to users which tags to use, both when labelling and when looking for resources. As we illustrate in this paper, real-world folksonomies are characterized by power-law distributions of tags, over which commonly used similarity metrics, including the Jaccard coefficient and the cosine similarity, fail to compute. We thus propose a novel metric, specifically developed to capture similarity in large-scale folksonomies, that is based on a mutual reinforcement principle: two tags are deemed similar if they have been associated with similar resources and, vice versa, two resources are deemed similar if they have been labelled with similar tags. We offer an efficient realisation of this similarity metric and assess its quality experimentally by comparing it against cosine similarity on three large-scale datasets, namely Bibsonomy, MovieLens and CiteULike.
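
    For reference, the two baseline metrics named in this abstract can be sketched as follows. This is only an illustration of the Jaccard coefficient and cosine similarity between tags, not the metric the paper proposes; the toy data and the names `TAGGINGS`, `jaccard` and `cosine` are assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy folksonomy: tag -> {resource: times the tag was applied}.
TAGGINGS = {
    "python": {"r1": 3, "r2": 1},
    "programming": {"r1": 2, "r2": 2, "r3": 1},
    "cooking": {"r4": 5},
}


def jaccard(a, b):
    """Jaccard coefficient of the resource sets of two tags."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity of two tags seen as resource-count vectors."""
    dot = sum(a[r] * b[r] for r in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    print(jaccard(TAGGINGS["python"], TAGGINGS["programming"]))
    print(cosine(TAGGINGS["python"], TAGGINGS["programming"]))
    print(cosine(TAGGINGS["python"], TAGGINGS["cooking"]))
```

    Both functions look only at resources the two tags share directly, so tags with no common resource always score zero.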

    ClassTR: Classifying Within-Host Heterogeneity Based on Tandem Repeats with Application to Mycobacterium tuberculosis Infections.

    Genomic tools have revealed genetically diverse pathogens within some hosts. Within-host pathogen diversity, which we refer to as "complex infection", is increasingly recognized as a determinant of treatment outcome for infections like tuberculosis. Complex infection arises through two mechanisms: within-host mutation (which results in clonal heterogeneity) and reinfection (which results in mixed infections). Estimates of the frequency of within-host mutation and reinfection in populations are critical for understanding the natural history of disease. These estimates influence projections of disease trends and effects of interventions. The genotyping technique MLVA (multiple loci variable-number tandem repeats analysis) can identify complex infections, but the current method to distinguish clonal heterogeneity from mixed infections is based on a rather simple rule. Here we describe ClassTR, a method which leverages MLVA information from isolates collected in a population to distinguish mixed infections from clonal heterogeneity. We formulate the resolution of complex infections into their constituent strains as an optimization problem, and show its NP-completeness. We solve it efficiently by using mixed integer linear programming and graph decomposition. Once the complex infections are resolved into their constituent strains, ClassTR probabilistically classifies isolates as clonally heterogeneous or mixed by using a model of tandem repeat evolution. We first compare ClassTR with the standard rule-based classification on 100 simulated datasets. ClassTR outperforms the standard method, improving classification accuracy from 48% to 80%. We then apply ClassTR to a sample of 436 strains collected from tuberculosis patients in a South African community, of which 92 had complex infections. We find that ClassTR assigns an alternate classification to 18 of the 92 complex infections, suggesting important differences in practice. 
    By explicitly modeling tandem repeat evolution, ClassTR helps to improve our understanding of the mechanisms driving within-host diversity of pathogens like Mycobacterium tuberculosis.

    Advances in forecasting with neural networks? Empirical evidence from the NN3 competition on time series prediction

    This paper reports the results of the NN3 competition, which is a replication of the M3 competition with an extension of the competition towards neural network (NN) and computational intelligence (CI) methods, in order to assess what progress has been made in the 10 years since the M3 competition. Two masked subsets of the M3 monthly industry data, containing 111 and 11 empirical time series respectively, were chosen, controlling for multiple data conditions of time series length (short/long), data patterns (seasonal/non-seasonal) and forecasting horizons (short/medium/long). The relative forecasting accuracy was assessed using the metrics from the M3, together with later extensions of scaled measures, and non-parametric statistical tests. The NN3 competition attracted 59 submissions from NN, CI and statistics, making it the largest CI competition on time series data. Its main findings include: (a) only one NN outperformed the damped trend using the sMAPE, but more contenders outperformed the AutomatANN of the M3; (b) ensembles of CI approaches performed very well, better than combinations of statistical methods; (c) a novel, complex statistical method outperformed all statistical and CI benchmarks; and (d) for the most difficult subset of short and seasonal series, a methodology employing echo state neural networks outperformed all others. The NN3 results highlight the ability of NN to handle complex data, including short and seasonal time series, beyond prior expectations, and thus identify multiple avenues for future research. (C) 2011 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
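
    The sMAPE mentioned in finding (a) is the symmetric mean absolute percentage error. A minimal sketch follows, assuming the commonly used definition in which each absolute error is divided by the mean of the absolute actual and forecast values; the function name `smape` and the handling of a zero denominator are choices made for this example, not details taken from the competition.

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent, assuming the common definition:
    mean over t of |y_t - f_t| / ((|y_t| + |f_t|) / 2), times 100."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = 0.0
    for y, f in zip(actual, forecast):
        denom = (abs(y) + abs(f)) / 2
        # Assumption: a point where both values are zero contributes no error.
        total += abs(y - f) / denom if denom else 0.0
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    print(smape([100, 200], [110, 180]))
```

    Because the denominator uses both the actual and the forecast value, the measure is bounded at 200 percent per observation.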