Time series quantile regression using random forests
We discuss an application of Generalized Random Forests (GRF), proposed by
Athey et al. (2019), to quantile regression for time series data. We extend
the theoretical consistency results of the GRF from i.i.d. data to time series
data. In particular, in the main theorem, based only on the general assumptions
for time series data in Davis and Nielsen (2020) and for trees in Athey et
al. (2019), we show that the tsQRF (time series Quantile Regression Forests)
estimator is consistent. Davis and Nielsen (2020) also discussed the estimation
problem using Random Forests (RF) for time series data, but the construction
procedure of the RF treated by the GRF is essentially different, and different
ideas are used throughout the theoretical proof. In addition, a simulation and
a real data analysis were conducted. In the simulation, the accuracy of the
conditional quantile estimation was evaluated under time series models. In the
real data analysis, using the Nikkei Stock Average, our estimator is demonstrated
to be more sensitive than the others to volatility, thus preventing
underestimation of risk.
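The estimator described above builds on Quantile Regression Forests, where conditional quantiles are read off the empirical distribution of training responses that share leaves with the query point (Meinshausen, 2006). A minimal sketch of this idea applied to a time series, assuming a simulated AR(1) series, one lagged feature, and illustrative forest settings (this is not the paper's tsQRF construction):

```python
# Minimal sketch: Meinshausen-style quantile estimation with a random forest,
# applied to a time series via a lagged feature. The AR(1) data, lag order,
# and forest hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Simulate an AR(1) series and build lagged features X_t = (y_{t-1},), target y_t.
n = 500
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] + rng.normal(scale=1.0)
X, target = y[:-1].reshape(-1, 1), y[1:]

forest = RandomForestRegressor(n_estimators=100, min_samples_leaf=10, random_state=0)
forest.fit(X, target)

def forest_quantile(forest, X_train, y_train, x_query, alpha):
    """Estimate the conditional alpha-quantile at x_query from leaf co-membership."""
    weights = np.zeros(len(y_train))
    train_leaves = forest.apply(X_train)                 # shape (n_samples, n_trees)
    query_leaves = forest.apply(x_query.reshape(1, -1))[0]
    for j, leaf in enumerate(query_leaves):
        in_leaf = train_leaves[:, j] == leaf
        weights[in_leaf] += 1.0 / in_leaf.sum()          # equal weight within each leaf
    weights /= forest.n_estimators                        # weights now sum to 1
    order = np.argsort(y_train)
    cum = np.cumsum(weights[order])
    idx = min(np.searchsorted(cum, alpha), len(cum) - 1)  # guard against float rounding
    return y_train[order][idx]

# Conditional 95% quantile of y_t given y_{t-1} = 1.5.
q95 = forest_quantile(forest, X, target, np.array([1.5]), 0.95)
```

The leaf-weighting step is what distinguishes this from an ordinary forest prediction: the same weights can be reused to read off any quantile level, which is what makes volatility-sensitive risk estimates possible.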
Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees
Tree-based ensemble methods, such as Random Forests and Gradient Boosted Trees,
have been successfully used for regression in many applications and research
studies. Furthermore, these methods have been extended to deal with
uncertainty in the output variable, for example using a quantile loss in Random
Forests (Meinshausen, 2006). To the best of our knowledge, no extension has
yet been provided for dealing with uncertainties in the input variables, even
though such uncertainties are common in practical situations. We propose here
such an extension by showing how standard regression trees optimizing a
quadratic loss can be adapted and learned while taking the uncertainties in
the inputs into account. By doing so, one no longer assumes that an
observation lies in a single region of the regression tree, but rather that
it belongs to each region with a certain probability. Experiments conducted on
several data sets illustrate the good behavior of the proposed extension.
Comment: 9 pages
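The core idea of probabilistic region membership can be illustrated with a hand-rolled sketch (not the paper's learning algorithm): given Gaussian input noise, the probability mass falling on each side of every split is propagated down a fixed tree, and the prediction is the probability-weighted mean over leaves. The tree structure and noise level below are illustrative assumptions:

```python
# Sketch: propagating an uncertain input through a fixed regression tree.
# Instead of routing x to a single leaf, compute the probability of taking
# each branch under assumed Gaussian input noise N(mean, std^2).
import math

# A tiny tree on one feature: split at 0.0, left leaf mean 1.0, right leaf mean 3.0.
tree = {"split": 0.0, "left": {"value": 1.0}, "right": {"value": 3.0}}

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def soft_predict(node, mean, std):
    """Expected leaf value when the input is distributed as N(mean, std**2)."""
    if "value" in node:
        return node["value"]
    p_left = std_normal_cdf((node["split"] - mean) / std)  # P(x <= split)
    return (p_left * soft_predict(node["left"], mean, std)
            + (1.0 - p_left) * soft_predict(node["right"], mean, std))

# An input centred exactly on the split belongs to each region with probability 0.5,
# so the prediction is the average of the two leaf values.
pred = soft_predict(tree, mean=0.0, std=1.0)  # → 2.0
```

A hard tree would snap this input to one leaf and predict 1.0 or 3.0; the soft traversal instead blends the regions in proportion to how likely the true input is to fall in each.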
Explainable Contextual Anomaly Detection using Quantile Regression Forests
Traditional anomaly detection methods aim to identify objects that deviate
from most other objects, treating all features equally. In contrast,
contextual anomaly detection methods aim to detect objects that deviate from
other objects within a context of similar objects, dividing the features into
contextual features and behavioral features. In this paper, we develop
connections between dependency-based traditional anomaly detection methods and
contextual anomaly detection methods. Based on the resulting insights, we
propose a novel approach to inherently interpretable contextual anomaly
detection that uses Quantile Regression Forests to model dependencies between
features. Extensive experiments on various synthetic and real-world datasets
demonstrate that our method outperforms state-of-the-art anomaly detection
methods in identifying contextual anomalies, in terms of both accuracy and
interpretability.
Comment: Manuscript submitted to Data Mining and Knowledge Discovery in
October 2022 for possible publication. This is the revised version submitted
in April 202
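The contextual-anomaly idea can be sketched as: predict a quantile interval for a behavioral feature from the contextual features, and flag points falling outside it. In the sketch below, gradient-boosted quantile regression stands in for Quantile Regression Forests, and the data and the 5%/95% bounds are illustrative assumptions, not the paper's method:

```python
# Sketch: contextual anomaly = behavioural value outside the conditional
# quantile interval predicted from the context. Gradient-boosted quantile
# regression is used here as a stand-in for Quantile Regression Forests.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
context = rng.uniform(0, 10, size=(400, 1))                 # contextual feature
behaviour = 2.0 * context.ravel() + rng.normal(scale=1.0, size=400)

# Fit lower and upper conditional quantile models (5% and 95%).
lo = GradientBoostingRegressor(loss="quantile", alpha=0.05).fit(context, behaviour)
hi = GradientBoostingRegressor(loss="quantile", alpha=0.95).fit(context, behaviour)

def is_contextual_anomaly(ctx, value):
    """Flag value as anomalous if it falls outside the predicted interval for ctx."""
    c = np.array([[ctx]])
    return not (lo.predict(c)[0] <= value <= hi.predict(c)[0])

# A behaviour value of 15 is unremarkable globally (typical near ctx ≈ 7),
# but anomalous in the context ctx = 1, where behaviour ≈ 2 is expected.
flag = is_contextual_anomaly(1.0, 15.0)
```

This is what separates contextual from traditional detection: the same behavioral value can be normal in one context and anomalous in another, and the width of the predicted interval gives a directly interpretable explanation of why a point was flagged.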