22,251 research outputs found

    Attribute Interactions in Medical Data Analysis

    Get PDF
    There is much empirical evidence about the success of naive Bayesian classification (NBC) in medical applications of attribute-based machine learning. NBC assumes conditional independence between attributes. In classification, such classifiers sum up the pieces of class-related evidence from individual attributes, independently of other attributes. The performance, however, deteriorates significantly when the ā€œinteractionsā€ between attributes become critical. We propose an approach to handling attribute interactions within the framework of ā€œvotingā€ classifiers, such as NBC. We propose an operational test for detecting interactions in learning data and a procedure that takes the detected interactions into account while learning. This approach induces a structuring of the domain of attributes, it may lead to improved classifierā€™s performance and may provide useful novel information for the domain expert when interpreting the results of learning. We report on its application in data analysis and model construction for the prediction of clinical outcome in hip arthroplasty

    Optimal model-free prediction from multivariate time series

    Get PDF
    Forecasting a time series from multivariate predictors constitutes a challenging problem, especially using model-free approaches. Most techniques, such as nearest-neighbor prediction, quickly suffer from the curse of dimensionality and overfitting for more than a few predictors which has limited their application mostly to the univariate case. Therefore, selection strategies are needed that harness the available information as efficiently as possible. Since often the right combination of predictors matters, ideally all subsets of possible predictors should be tested for their predictive power, but the exponentially growing number of combinations makes such an approach computationally prohibitive. Here a prediction scheme that overcomes this strong limitation is introduced utilizing a causal pre-selection step which drastically reduces the number of possible predictors to the most predictive set of causal drivers making a globally optimal search scheme tractable. The information-theoretic optimality is derived and practical selection criteria are discussed. As demonstrated for multivariate nonlinear stochastic delay processes, the optimal scheme can even be less computationally expensive than commonly used sub-optimal schemes like forward selection. The method suggests a general framework to apply the optimal model-free approach to select variables and subsequently fit a model to further improve a prediction or learn statistical dependencies. The performance of this framework is illustrated on a climatological index of El Ni\~no Southern Oscillation.Comment: 14 pages, 9 figure

    Exploiting road traffic data for very short term load forecasting in smart grids

    Get PDF
    If accurate short term prediction of electricity consumption is available, the Smart Grid infrastructure can rapidly and reliably react to changing conditions. The economic importance of accurate predictions justifies research for more complex forecasting algorithms. This paper proposes road traffic data as a new input dimension that can help improve very short term load forecasting. We explore the dependencies between power demand and road traffic data and evaluate the predictive power of the added dimension compared with other common features, such as historical load and temperature profiles
    • ā€¦
    corecore