Mondrian Forests for Large-Scale Regression when Uncertainty Matters
Many real-world regression problems demand a measure of the uncertainty
associated with each prediction. Standard decision forests deliver efficient
state-of-the-art predictive performance, but high-quality uncertainty estimates
are lacking. Gaussian processes (GPs) deliver uncertainty estimates, but
scaling GPs to large-scale data sets comes at the cost of approximating the
uncertainty estimates. We extend Mondrian forests, first proposed by
Lakshminarayanan et al. (2014) for classification problems, to the large-scale
non-parametric regression setting. Using a novel hierarchical Gaussian prior
that dovetails with the Mondrian forest framework, we obtain principled
uncertainty estimates, while still retaining the computational advantages of
decision forests. Through a combination of illustrative examples, real-world
large-scale datasets, and Bayesian optimization benchmarks, we demonstrate that
Mondrian forests outperform approximate GPs on large-scale regression tasks and
deliver better-calibrated uncertainty assessments than decision-forest-based
methods.
Comment: Proceedings of the 19th International Conference on Artificial
Intelligence and Statistics (AISTATS) 2016, Cadiz, Spain. JMLR: W&CP volume
5
Isolation Mondrian Forest for Batch and Online Anomaly Detection
We propose a new method, named isolation Mondrian forest (iMondrian forest),
for batch and online anomaly detection. The proposed method is a novel hybrid
of isolation forest and Mondrian forest, which are existing methods for batch
anomaly detection and online random forests, respectively. iMondrian forest
takes the idea of isolation, using the depth of a node in a tree, and
implements it in the Mondrian forest structure. The result is a new data
structure which can accept streaming data in an online manner while being used
for anomaly detection. Our experiments show that iMondrian forest mostly
performs better than isolation forest in batch settings and performs better than
or comparably to other batch and online anomaly detection
methods.
Comment: Accepted for presentation at the IEEE International Conference on
Systems, Man, and Cybernetics (SMC) 2020. The first three authors contributed
equally to this wor