5,051 research outputs found

    Efficient estimation of AUC in a sliding window

    Full text link
    In many applications, monitoring area under the ROC curve (AUC) in a sliding window over a data stream is a natural way of detecting changes in the system. The drawback is that computing AUC in a sliding window is expensive, especially if the window size is large and the data flow is significant. In this paper we propose a scheme for maintaining an approximate AUC in a sliding window of length kk. More specifically, we propose an algorithm that, given ϵ\epsilon, estimates AUC within ϵ/2\epsilon / 2, and can maintain this estimate in O((logk)/ϵ)O((\log k) / \epsilon) time, per update, as the window slides. This provides a speed-up over the exact computation of AUC, which requires O(k)O(k) time, per update. The speed-up becomes more significant as the size of the window increases. Our estimate is based on grouping the data points together, and using these groups to calculate AUC. The grouping is designed carefully such that (ii) the groups are small enough, so that the error stays small, (iiii) the number of groups is small, so that enumerating them is not expensive, and (iiiiii) the definition is flexible enough so that we can maintain the groups efficiently. Our experimental evaluation demonstrates that the average approximation error in practice is much smaller than the approximation guarantee ϵ/2\epsilon / 2, and that we can achieve significant speed-ups with only a modest sacrifice in accuracy

    Learning from medical data streams: an introduction

    Get PDF
    Clinical practice and research are facing a new challenge created by the rapid growth of health information science and technology, and the complexity and volume of biomedical data. Machine learning from medical data streams is a recent area of research that aims to provide better knowledge extraction and evidence-based clinical decision support in scenarios where data are produced as a continuous flow. This year's edition of AIME, the Conference on Artificial Intelligence in Medicine, enabled the sound discussion of this area of research, mainly by the inclusion of a dedicated workshop. This paper is an introduction to LEMEDS, the Learning from Medical Data Streams workshop, which highlights the contributed papers, the invited talk and expert panel discussion, as well as related papers accepted to the main conference