Random Tessellation Forests
Space partitioning methods such as random forests and the Mondrian process
are powerful machine learning methods for multi-dimensional and relational
data, and are based on recursively cutting a domain. The flexibility of these
methods is often limited by the requirement that the cuts be axis-aligned. The
Ostomachion process and the self-consistent binary space partitioning-tree
process were recently introduced as generalizations of the Mondrian process for
space partitioning with non-axis-aligned cuts in the two-dimensional plane.
Motivated by the need for a multi-dimensional partitioning tree with
non-axis-aligned cuts, we propose the Random Tessellation Process (RTP), a framework
that includes the Mondrian process and the binary space partitioning-tree
process as special cases. We derive a sequential Monte Carlo algorithm for
inference, and provide random forest methods. Our process is self-consistent
and can relax axis-aligned constraints, allowing complex inter-dimensional
dependence to be captured. We present a simulation study, and analyse gene
expression data of brain tissue, showing improved accuracies over other
methods.
Comment: 11 pages, 4 figures
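The Mondrian process that the RTP generalizes can be sketched in a few lines: a box is recursively split by axis-aligned cuts, where the waiting time to the next cut is exponential in the box's total side length, and the process stops when a lifetime budget is exhausted. The function below is an illustrative sketch of that standard construction, not the RTP itself or code from the paper; the names `mondrian` and `budget` are our own.

```python
import random

def mondrian(bounds, budget, rng=random.Random(0)):
    """Sample a Mondrian partition of an axis-aligned box.

    bounds: list of (lo, hi) intervals, one per dimension.
    budget: remaining lifetime; cuts whose exponential cost
            exceeds it are rejected, terminating recursion.
    Returns a nested dict representing the partition tree.
    """
    lengths = [hi - lo for lo, hi in bounds]
    # Time to the next cut ~ Exponential(sum of side lengths).
    cost = rng.expovariate(sum(lengths))
    if cost > budget:
        return {"leaf": True, "bounds": bounds}
    # Pick the cut dimension with probability proportional to its length,
    # then a cut position uniformly within that dimension.
    d = rng.choices(range(len(bounds)), weights=lengths)[0]
    lo, hi = bounds[d]
    x = rng.uniform(lo, hi)
    left, right = bounds.copy(), bounds.copy()
    left[d], right[d] = (lo, x), (x, hi)
    return {"leaf": False, "dim": d, "cut": x,
            "left": mondrian(left, budget - cost, rng),
            "right": mondrian(right, budget - cost, rng)}
```

The exponential clock tied to box size is what makes the process self-consistent: restricting a Mondrian sample to a sub-box yields a Mondrian sample on that sub-box. The RTP keeps this self-consistency while allowing the cutting hyperplanes to be oblique rather than axis-aligned.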
Mondrian Forests for Large-Scale Regression when Uncertainty Matters
Many real-world regression problems demand a measure of the uncertainty
associated with each prediction. Standard decision forests deliver efficient
state-of-the-art predictive performance, but high-quality uncertainty estimates
are lacking. Gaussian processes (GPs) deliver uncertainty estimates, but
scaling GPs to large-scale data sets comes at the cost of approximating the
uncertainty estimates. We extend Mondrian forests, first proposed by
Lakshminarayanan et al. (2014) for classification problems, to the large-scale
non-parametric regression setting. Using a novel hierarchical Gaussian prior
that dovetails with the Mondrian forest framework, we obtain principled
uncertainty estimates, while still retaining the computational advantages of
decision forests. Through a combination of illustrative examples, real-world
large-scale datasets, and Bayesian optimization benchmarks, we demonstrate that
Mondrian forests outperform approximate GPs on large-scale regression tasks and
deliver better-calibrated uncertainty assessments than decision-forest-based
methods.
Comment: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS) 2016, Cadiz, Spain. JMLR: W&CP volume 5
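At its simplest, a decision-forest uncertainty estimate comes from disagreement across the trees: each tree produces a predictive mean, and the cross-tree variance serves as an uncertainty proxy. The sketch below shows only that baseline ensemble heuristic, not the paper's hierarchical Gaussian prior; the function name `forest_predict` is our own.

```python
import numpy as np

def forest_predict(tree_preds):
    """Combine per-tree predictions into an ensemble estimate.

    tree_preds: array-like of shape (n_trees, n_points), each row
                holding one tree's predictive means.
    Returns (mean, variance) per test point, where the variance is
    simply the spread of the trees' predictions.
    """
    preds = np.asarray(tree_preds, dtype=float)
    mean = preds.mean(axis=0)   # ensemble point prediction
    var = preds.var(axis=0)     # disagreement across trees
    return mean, var
```

The Mondrian-forest approach in the paper goes further: the hierarchical Gaussian prior over leaf parameters yields principled posterior predictive variances, rather than relying on raw cross-tree spread, while keeping the cheap tree-based computation.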