Mondrian Forests for Large-Scale Regression when Uncertainty Matters
Many real-world regression problems demand a measure of the uncertainty
associated with each prediction. Standard decision forests deliver efficient
state-of-the-art predictive performance, but high-quality uncertainty estimates
are lacking. Gaussian processes (GPs) deliver uncertainty estimates, but
scaling GPs to large-scale data sets comes at the cost of approximating the
uncertainty estimates. We extend Mondrian forests, first proposed by
Lakshminarayanan et al. (2014) for classification problems, to the large-scale
non-parametric regression setting. Using a novel hierarchical Gaussian prior
that dovetails with the Mondrian forest framework, we obtain principled
uncertainty estimates, while still retaining the computational advantages of
decision forests. Through a combination of illustrative examples, real-world
large-scale datasets, and Bayesian optimization benchmarks, we demonstrate that
Mondrian forests outperform approximate GPs on large-scale regression tasks and
deliver better-calibrated uncertainty assessments than decision-forest-based
methods.
Comment: Proceedings of the 19th International Conference on Artificial
Intelligence and Statistics (AISTATS) 2016, Cadiz, Spain. JMLR: W&CP volume
5
Efficient Benchmarking of Algorithm Configuration Procedures via Model-Based Surrogates
The optimization of algorithm (hyper-)parameters is crucial for achieving
peak performance across a wide range of domains, ranging from deep neural
networks to solvers for hard combinatorial problems. The resulting algorithm
configuration (AC) problem has attracted much attention from the machine
learning community. However, the proper evaluation of new AC procedures is
hindered by two key hurdles. First, AC benchmarks are hard to set up. Second
and even more significantly, they are computationally expensive: a single run
of an AC procedure involves many costly runs of the target algorithm whose
performance is to be optimized in a given AC benchmark scenario. One common
workaround is to optimize cheap-to-evaluate artificial benchmark functions
(e.g., Branin) instead of actual algorithms; however, these have different
properties than realistic AC problems. Here, we propose an alternative
benchmarking approach that is similarly cheap to evaluate but much closer to
the original AC problem: replacing expensive benchmarks by surrogate benchmarks
constructed from AC benchmarks. These surrogate benchmarks approximate the
response surface corresponding to true target algorithm performance using a
regression model, and the original and surrogate benchmark share the same
(hyper-)parameter space. In our experiments, we construct and evaluate
surrogate benchmarks for hyperparameter optimization as well as for AC problems
that involve performance optimization of solvers for hard combinatorial
problems, drawing training data from the runs of existing AC procedures. We
show that our surrogate benchmarks capture the important overall characteristics of
the AC scenarios from which they were derived, such as high- and low-performing
regions, while being much easier to use and orders of magnitude cheaper to
evaluate.
Distributional Regression for Data Analysis
Flexible modeling of how an entire distribution changes with covariates is an
important yet challenging generalization of mean-based regression that has seen
growing interest over the past decades in both the statistics and machine
learning literature. This review outlines selected state-of-the-art statistical
approaches to distributional regression, complemented with alternatives from
machine learning. Topics covered include the similarities and differences
between these approaches, extensions, properties and limitations, estimation
procedures, and the availability of software. In view of the increasing
complexity and availability of large-scale data, this review also discusses the
scalability of traditional estimation methods, current trends, and open
challenges. Illustrations are provided using data on childhood malnutrition in
Nigeria and Australian electricity prices.
Comment: Accepted for publication in Annual Review of Statistics and its
Applicatio
Entity Personalized Talent Search Models with Tree Interaction Features
Talent Search systems aim to recommend potential candidates who are a good
match to the hiring needs of a recruiter expressed in terms of the recruiter's
search query or job posting. Past work in this domain has focused on linear and
nonlinear models that lack preference personalization at the user level because
they are trained only on globally collected recruiter activity data. In this
paper, we propose an entity-personalized Talent Search model which utilizes a
combination of generalized linear mixed (GLMix) models and gradient boosted
decision tree (GBDT) models, and provides personalized talent recommendations
using nonlinear tree interaction features generated by the GBDT. We also
present the offline and online system architecture for the productionization of
this hybrid model approach in our Talent Search systems. Finally, we provide
offline and online experiment results benchmarking our entity-personalized
model with tree interaction features, which demonstrate significant
improvements in our precision metrics compared to globally trained
non-personalized models.
Comment: This paper has been accepted for publication at ACM WWW 201
Towards Efficient and Scalable Acceleration of Online Decision Tree Learning on FPGA
Decision trees are machine learning models commonly used in various
application scenarios. In the era of big data, traditional decision tree
induction algorithms are not suitable for learning from large-scale datasets due to
their stringent data storage requirements. Online decision tree learning
algorithms have been devised to tackle this problem by concurrently training
with incoming samples and providing inference results. However, even the most
up-to-date online tree learning algorithms still suffer from either high memory
usage or high computational intensity with dependency and long latency, making
them challenging to implement in hardware. To overcome these difficulties, we
introduce a new quantile-based algorithm to improve the induction of the
Hoeffding tree, one of the state-of-the-art online learning models. The
proposed algorithm is light-weight in terms of both memory and computational
demand, while still maintaining high generalization ability. A series of
optimization techniques dedicated to the proposed algorithm have been
investigated from the hardware perspective, including coarse-grained and
fine-grained parallelism, dynamic and memory-based resource sharing, and pipelining
with data forwarding. We further present a high-performance, hardware-efficient
and scalable online decision tree learning system on a field-programmable gate
array (FPGA) with system-level optimization techniques. Experimental results
show that our proposed algorithm outperforms the state-of-the-art Hoeffding
tree learning method, yielding a 0.05% to 12.3% improvement in inference
accuracy. A real implementation of the complete learning system on the FPGA
demonstrates a 384x to 1581x speedup in execution time over the
state-of-the-art design.
Comment: appear as a conference paper in FCCM 201