292 research outputs found

    On the suitability of combining feature selection and resampling to manage data complexity

    Get PDF
    The effectiveness of a learning task depends on data com- plexity (class overlap, class imbalance, irrelevant features, etc.). When more than one complexity factor appears, two or more preprocessing techniques should be applied. Nevertheless, no much effort has been de- voted to investigate the importance of the order in which they can be used. This paper focuses on the joint use of feature reduction and bal- ancing techniques, and studies which could be the application order that leads to the best classification results. This analysis was made on a spe- cific problem whose aim was to identify the melodic track given a MIDI file. Several experiments were performed from different imbalanced 38- dimensional training sets with many more accompaniment tracks than melodic tracks, and where features were aggregated without any correla- tion study. Results showed that the most effective combination was the ordered use of resampling and feature reduction techniques

    Assessing the reliability of ensemble forecasting systems under serial dependence

    Get PDF
    The problem of testing the reliability of ensemble forecasting systems is revisited. A popular tool to assess the reliability of ensemble forecasting systems (for scalar verifications) is the rank histogram; this histogram is expected to be more or less flat, since for a reliable ensemble, the ranks are uniformly distributed among their possible outcomes. Quantitative tests for flatness (e.g. Pearson's goodness–of–fit test) have been suggested; without exception though, these tests assume the ranks to be a sequence of independent random variables, which is not the case in general as can be demonstrated with simple toy examples. In this paper, tests are developed that take the temporal correlations between the ranks into account. A refined analysis exploiting the reliability property shows that the ranks still exhibit strong decay of correlations. This property is key to the analysis, and the proposed tests are valid for general ensemble forecasting systems with minimal extraneous assumptions

    Electron-hadron shower discrimination in a liquid argon time projection chamber

    Get PDF
    By exploiting structural differences between electromagnetic and hadronic showers in a multivariate analysis we present an efficient Electron-Hadron discrimination algorithm for liquid argon time projection chambers, validated using Geant4 simulated data
    corecore