8,987 research outputs found

    Time series quantile regression using random forests

    Full text link
    We discuss an application of Generalized Random Forests (GRF) proposed by Athey et al.(2019) to quantile regression for time series data. We extracted the theoretical results of the GRF consistency for i.i.d. data to time series data. In particular, in the main theorem, based only on the general assumptions for time series data in Davis and Nielsen (2020), and trees in Athey et al.(2019), we show that the tsQRF (time series Quantile Regression Forests) estimator is consistent. Davis and Nielsen (2020) also discussed the estimation problem using Random Forests (RF) for time series data, but the construction procedure of the RF treated by the GRF is essentially different, and different ideas are used throughout the theoretical proof. In addition, a simulation and real data analysis were conducted.In the simulation, the accuracy of the conditional quantile estimation was evaluated under time series models. In the real data using the Nikkei Stock Average, our estimator is demonstrated to be more sensitive than the others in terms of volatility, thus preventing underestimation of risk

    Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees

    Full text link
    Tree-based ensemble methods, as Random Forests and Gradient Boosted Trees, have been successfully used for regression in many applications and research studies. Furthermore, these methods have been extended in order to deal with uncertainty in the output variable, using for example a quantile loss in Random Forests (Meinshausen, 2006). To the best of our knowledge, no extension has been provided yet for dealing with uncertainties in the input variables, even though such uncertainties are common in practical situations. We propose here such an extension by showing how standard regression trees optimizing a quadratic loss can be adapted and learned while taking into account the uncertainties in the inputs. By doing so, one no longer assumes that an observation lies into a single region of the regression tree, but rather that it belongs to each region with a certain probability. Experiments conducted on several data sets illustrate the good behavior of the proposed extension.Comment: 9 page

    Explainable Contextual Anomaly Detection using Quantile Regression Forests

    Get PDF
    Traditional anomaly detection methods aim to identify objects that deviate from most other objects by treating all features equally. In contrast, contextual anomaly detection methods aim to detect objects that deviate from other objects within a context of similar objects by dividing the features into contextual features and behavioral features. In this paper, we develop connections between dependency-based traditional anomaly detection methods and contextual anomaly detection methods. Based on resulting insights, we propose a novel approach to inherently interpretable contextual anomaly detection that uses Quantile Regression Forests to model dependencies between features. Extensive experiments on various synthetic and real-world datasets demonstrate that our method outperforms state-of-the-art anomaly detection methods in identifying contextual anomalies in terms of accuracy and interpretability.Comment: Manuscript submitted to Data Mining and Knowledge Discovery in October 2022 for possible publication. This is the revised version submitted in April 202
    corecore