10 research outputs found

    Mondrian Forests for Large-Scale Regression when Uncertainty Matters

    Many real-world regression problems demand a measure of the uncertainty associated with each prediction. Standard decision forests deliver efficient state-of-the-art predictive performance, but high-quality uncertainty estimates are lacking. Gaussian processes (GPs) deliver uncertainty estimates, but scaling GPs to large-scale data sets comes at the cost of approximating the uncertainty estimates. We extend Mondrian forests, first proposed by Lakshminarayanan et al. (2014) for classification problems, to the large-scale non-parametric regression setting. Using a novel hierarchical Gaussian prior that dovetails with the Mondrian forest framework, we obtain principled uncertainty estimates, while still retaining the computational advantages of decision forests. Through a combination of illustrative examples, real-world large-scale datasets, and Bayesian optimization benchmarks, we demonstrate that Mondrian forests outperform approximate GPs on large-scale regression tasks and deliver better-calibrated uncertainty assessments than decision-forest-based methods. Comment: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS) 2016, Cadiz, Spain. JMLR: W&CP volume 5

    Isolation Mondrian Forest for Batch and Online Anomaly Detection

    We propose a new method, named isolation Mondrian forest (iMondrian forest), for batch and online anomaly detection. The proposed method is a novel hybrid of isolation forest and Mondrian forest, which are existing methods for batch anomaly detection and online random forest, respectively. iMondrian forest takes the idea of isolation, using the depth of a node in a tree, and implements it in the Mondrian forest structure. The result is a new data structure which can accept streaming data in an online manner while being used for anomaly detection. Our experiments show that iMondrian forest mostly performs better than isolation forest in batch settings and has better or comparable performance against other batch and online anomaly detection methods. Comment: Accepted for presentation at the IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2020. The first three authors contributed equally to this work.