
    Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees

    Deep Reinforcement Learning (DRL) has achieved impressive success in many applications. A key component of many DRL models is a neural network representing a Q function, to estimate the expected cumulative reward following a state-action pair. The Q function neural network contains a lot of implicit knowledge about the RL problems, but often remains unexamined and uninterpreted. To our knowledge, this work develops the first mimic learning framework for Q functions in DRL. We introduce Linear Model U-trees (LMUTs) to approximate neural network predictions. An LMUT is learned using a novel on-line algorithm that is well-suited for an active play setting, where the mimic learner observes an ongoing interaction between the neural net and the environment. Empirical evaluation shows that an LMUT mimics a Q function substantially better than five baseline methods. The transparent tree structure of an LMUT facilitates understanding the network's learned knowledge by analyzing feature influence, extracting rules, and highlighting the super-pixels in image inputs. Comment: This paper is accepted by ECML-PKDD 201

    Interval forecasts based on regression trees for streaming data

    In forecasting, we often require interval forecasts instead of just a specific point forecast. To track streaming data effectively, this interval forecast should reliably cover the observed data and yet be as narrow as possible. To achieve this, we propose two methods based on regression trees: one ensemble method and one method based on a single tree. For the ensemble method, we use weighted results from the most recent models, and for the single-tree method, we retain one model until it becomes necessary to train a new model. We propose a novel method to update the interval forecast adaptively using root mean square prediction errors calculated from the latest data batch. We use wavelet-transformed data to capture long time variable information and conditional inference trees for the underlying regression tree model. Results show that both methods perform well, having good coverage without the intervals being excessively wide. When the underlying data generation mechanism changes, their performance is initially affected but can recover relatively quickly as time proceeds. The single-tree method requires less computational (CPU) time than the ensemble method. When compared to ARIMA and GARCH modelling, our methods achieve better or similar coverage and width but require considerably less CPU time.
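As a rough illustration of the idea of widening or narrowing an interval from the latest batch's root mean square prediction error, the following is a minimal sketch. It is not the authors' method: the symmetric form `point ± width_factor * RMSE`, the name `width_factor`, its default of 2.0, and the sample numbers are all assumptions made for the example.

```python
import math


def rmse(actuals, predictions):
    """Root mean square prediction error over one batch."""
    if len(actuals) != len(predictions) or not actuals:
        raise ValueError("batches must be non-empty and the same length")
    squared = [(a - p) ** 2 for a, p in zip(actuals, predictions)]
    return math.sqrt(sum(squared) / len(squared))


def interval_forecast(point_forecast, batch_rmse, width_factor=2.0):
    """Symmetric interval around the point forecast.

    Assumed form: point_forecast +/- width_factor * batch_rmse.
    The abstract does not specify the actual update rule.
    """
    half_width = width_factor * batch_rmse
    return point_forecast - half_width, point_forecast + half_width


# Hypothetical latest batch: observed values and the model's point forecasts.
latest_actuals = [10.0, 12.0, 11.0, 13.0]
latest_predictions = [11.0, 11.0, 12.0, 12.0]

# Recompute the error on the newest batch, then rebuild the interval.
batch_rmse = rmse(latest_actuals, latest_predictions)
lower, upper = interval_forecast(12.5, batch_rmse)
```

Under this sketch, a batch with larger errors yields a wider interval for the next forecast, and a batch with smaller errors yields a narrower one.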

    Automated Adaptation Strategies for Stream Learning

    Automation of machine learning model development is increasingly becoming an established research area. While automated model selection and automated data pre-processing have been studied in depth, there is a gap concerning automated model adaptation strategies when multiple strategies are available. Manually developing an adaptation strategy can be time-consuming and costly. In this paper we address this issue by proposing the use of flexible adaptive mechanism deployment for automated development of adaptation strategies. Experimental results after using the proposed strategies with five adaptive algorithms on 36 datasets confirm their viability. These strategies achieve better or comparable performance to the custom adaptation strategies and the repeated deployment of any single adaptive mechanism.

    Adaptive Windowing for Online Learning from Multiple Inter-related Data Streams


    Predictive regional trees to supplement geo-physical random fields

    Nowadays ubiquitous sensor stations are deployed to measure geophysical fields for several ecological and environmental processes. Although these fields are measured at the specific locations of stations, geo-statistical problems demand inference processes to supplement, smooth and standardize recorded data. We study how predictive regional trees can supplement data sampled periodically in a ubiquitous sensing scenario. Data records that are similar to one another are clustered according to a rectangular decomposition of the region of analysis; a predictive model is associated with the region covered by each cluster. The cluster model depicts the spatial variation of data over a map; the predictive model supplements any unknown record that is recognized as belonging to a cluster region. We illustrate an incremental algorithm to yield time-evolving predictive regional trees that account for the fact that the statistical properties of the recorded data may change over time. This algorithm is evaluated with spatio-temporal data collections. © Springer-Verlag Berlin Heidelberg 2013