Breiman's "Two Cultures" Revisited and Reconciled
In a landmark paper published in 2001, Leo Breiman described the tense
standoff between two cultures of data modeling: parametric statistical and
algorithmic machine learning. The cultural division between these two
statistical learning frameworks has been growing at a steady pace in recent
years. What is the way forward? It has become increasingly clear that this
widening gap between "the two cultures" cannot be bridged unless we find a way
to blend them into a coherent whole. This article presents a solution by
establishing a link between the two cultures. Through examples, we describe the
challenges and potential gains of this new integrated statistical thinking.
Comment: This paper celebrates the 70th anniversary of Statistical Machine
Learning: how far we've come, and how far we have to go. Keywords:
Integrated statistical learning theory, Exploratory machine learning,
Uncertainty prediction machine, ML-powered modern applied statistics,
Information theory
Optimizing for Generalization in Machine Learning with Cross-Validation Gradients
Cross-validation is the workhorse of modern applied statistics and machine
learning, as it provides a principled framework for selecting the model that
maximizes generalization performance. In this paper, we show that the
cross-validation risk is differentiable with respect to the hyperparameters and
training data for many common machine learning algorithms, including logistic
regression, elastic-net regression, and support vector machines. Leveraging
this property of differentiability, we propose a cross-validation gradient
method (CVGM) for hyperparameter optimization. Our method enables efficient
optimization in high-dimensional hyperparameter spaces of the cross-validation
risk, the best surrogate of the true generalization ability of our learning
algorithm.
Comment: 11 pages
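To make the differentiability claim concrete, here is a minimal sketch in plain
NumPy (not the paper's CVGM implementation): the held-out risk of a ridge
regression is differentiated with respect to the penalty via implicit
differentiation of the closed-form ridge solution, and gradient descent is run
on the log of the penalty. The single train/validation split, step size, and
function names are illustrative assumptions; the paper's method covers full
cross-validation and models such as logistic regression, elastic net, and SVMs.

    # Hedged sketch: gradient-based tuning of a ridge penalty by differentiating
    # a held-out validation risk (illustrative, not the paper's CVGM).
    import numpy as np

    def val_risk_and_grad(X_tr, y_tr, X_va, y_va, lam):
        """Held-out risk of the ridge fit and its derivative with respect to lam."""
        d = X_tr.shape[1]
        A = X_tr.T @ X_tr + lam * np.eye(d)
        w = np.linalg.solve(A, X_tr.T @ y_tr)          # closed-form ridge solution w(lam)
        resid = X_va @ w - y_va
        risk = 0.5 * np.mean(resid ** 2)
        dw_dlam = -np.linalg.solve(A, w)               # implicit differentiation of A(lam) w = X'y
        grad = (X_va.T @ resid) @ dw_dlam / len(y_va)  # chain rule through w(lam)
        return risk, grad

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=200)
    X_tr, y_tr, X_va, y_va = X[:150], y[:150], X[150:], y[150:]

    log_lam = 0.0                                      # optimize log(lam) so the penalty stays positive
    for _ in range(200):
        risk, grad = val_risk_and_grad(X_tr, y_tr, X_va, y_va, np.exp(log_lam))
        log_lam -= 0.5 * grad * np.exp(log_lam)        # d risk / d log(lam) = lam * d risk / d lam
    print(np.exp(log_lam), risk)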
Minimal Achievable Sufficient Statistic Learning
We introduce Minimal Achievable Sufficient Statistic (MASS) Learning, a machine learning training objective for which the minima are minimal sufficient statistics with respect to a class of functions being optimized over (e.g., deep networks). In deriving MASS Learning, we also introduce Conserved Differential Information (CDI), an information-theoretic quantity that, unlike standard mutual information, can be usefully applied to deterministically-dependent continuous random variables like the input and output of a deep network. In a series of experiments, we show that deep networks trained with MASS Learning achieve competitive performance on supervised learning, regularization, and uncertainty quantification benchmarks.
Machine Learning Methods Economists Should Know About
We discuss the relevance of the recent Machine Learning (ML) literature for
economics and econometrics. First we discuss the differences in goals, methods
and settings between the ML literature and the traditional econometrics and
statistics literatures. Then we discuss some specific methods from the machine
learning literature that we view as important for empirical researchers in
economics. These include supervised learning methods for regression and
classification, unsupervised learning methods, as well as matrix completion
methods. Finally, we highlight newly developed methods at the intersection of
ML and econometrics, methods that typically perform better than either
off-the-shelf ML or more traditional econometric methods when applied to
particular classes of problems, problems that include causal inference for
average treatment effects, optimal policy estimation, and estimation of the
counterfactual effect of price changes in consumer choice models.
A review of homomorphic encryption and software tools for encrypted statistical machine learning
Recent advances in cryptography promise to enable secure statistical
computation on encrypted data, whereby a limited set of operations can be
carried out without the need to first decrypt. We review these homomorphic
encryption schemes in a manner accessible to statisticians and machine
learners, focusing on pertinent limitations inherent in the current state of
the art. These limitations restrict the kind of statistics and machine learning
algorithms which can be implemented and we review those which have been
successfully applied in the literature. Finally, we document a high-performance
R package implementing a recent homomorphic scheme in a general framework.
Comment: 21 pages, technical report
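As a toy illustration of the homomorphic property under review (not the R
package documented in the paper, and with deliberately insecure parameters),
the following pure-Python sketch implements the Paillier cryptosystem, which is
additively homomorphic: multiplying ciphertexts adds the underlying plaintexts
without any decryption.

    # Toy Paillier cryptosystem: additively homomorphic encryption (illustrative
    # only; real deployments use ~1024-bit primes and a vetted library).
    import math
    import random

    p, q = 293, 433                          # small toy primes
    n, n2 = p * q, (p * q) ** 2
    g = n + 1                                # standard generator choice
    lam = math.lcm(p - 1, q - 1)             # Python 3.9+

    def L(x):
        return (x - 1) // n

    mu = pow(L(pow(g, lam, n2)), -1, n)      # modular inverse used in decryption (Python 3.8+)

    def encrypt(m):
        r = random.randrange(2, n)
        while math.gcd(r, n) != 1:           # r must be invertible modulo n
            r = random.randrange(2, n)
        return (pow(g, m, n2) * pow(r, n, n2)) % n2

    def decrypt(c):
        return (L(pow(c, lam, n2)) * mu) % n

    m1, m2 = 42, 17
    c1, c2 = encrypt(m1), encrypt(m2)
    print(decrypt((c1 * c2) % n2))           # 59: multiplying ciphertexts adds plaintexts
    print(decrypt(pow(c1, 5, n2)))           # 210: exponentiation scales a plaintext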
MONEYBaRL: Exploiting pitcher decision-making using Reinforcement Learning
This manuscript uses machine learning techniques to exploit baseball
pitchers' decision making, so-called "Baseball IQ," by modeling the at-bat
information (pitch selection and counts) as a Markov Decision Process (MDP).
Each state of the MDP models the pitcher's current pitch selection in a
Markovian fashion, conditional on the information immediately prior to making
the current pitch. This includes the count prior to the previous pitch, his
ensuing pitch selection, the batter's ensuing action and the result of the
pitch.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics
(http://www.imstat.org) at http://dx.doi.org/10.1214/13-AOAS712
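A minimal sketch of the modeling idea, with made-up outcome probabilities
rather than the pitch-level model fitted in the paper: the ball-strike count
serves as the MDP state, pitch types are actions, the pitcher earns +1 for an
out and -1 for a walk or hit, and value iteration recovers the value of each
count together with the preferred pitch.

    # Toy count-based MDP for pitch selection (assumed probabilities; fouls and
    # other real-game events are ignored for brevity).
    OUTCOME_P = {  # per pitch type: (strike, ball, ball-in-play out, ball-in-play hit)
        "fastball":  (0.45, 0.35, 0.12, 0.08),
        "curveball": (0.40, 0.40, 0.13, 0.07),
    }

    def value_iteration(tol=1e-9):
        V = {(b, s): 0.0 for b in range(4) for s in range(3)}  # states are (balls, strikes)
        policy = {}
        while True:
            delta = 0.0
            for (b, s) in V:
                action_values = {}
                for a, (p_k, p_b, p_out, p_hit) in OUTCOME_P.items():
                    v_strike = 1.0 if s == 2 else V[(b, s + 1)]   # third strike ends the at-bat
                    v_ball = -1.0 if b == 3 else V[(b + 1, s)]    # fourth ball is a walk
                    action_values[a] = (p_k * v_strike + p_b * v_ball
                                        + p_out * 1.0 + p_hit * (-1.0))
                best_a = max(action_values, key=action_values.get)
                delta = max(delta, abs(action_values[best_a] - V[(b, s)]))
                V[(b, s)], policy[(b, s)] = action_values[best_a], best_a
            if delta < tol:
                return V, policy

    V, policy = value_iteration()
    print(V[(0, 0)], policy[(0, 0)])   # value and preferred pitch in a fresh 0-0 count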
A Bayesian Model of node interaction in networks
We are concerned with modeling the strength of links in networks by taking
into account how often those links are used. Link usage is a strong indicator
of how closely two nodes are related, but existing network models in Bayesian
Statistics and Machine Learning are able to predict only whether a link exists
at all. As priors for latent attributes of network nodes we explore the Chinese
Restaurant Process (CRP) and a multivariate Gaussian with fixed dimensionality.
The model is applied to a social network dataset and a word co-occurrence
dataset.
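An illustrative generative sketch under assumed distributions (not necessarily
the paper's exact model): each node carries a latent Gaussian attribute vector
of fixed dimensionality, and the usage count on each link is Poisson with a
rate that grows with the similarity of the endpoints' attributes. The log joint
density below is what a posterior sampler or MAP optimizer over the latent
attributes would target; the CRP prior mentioned above would instead place
nodes into a random number of latent clusters.

    # Sketch of a latent-attribute model for link-usage counts (assumed form).
    import numpy as np

    rng = np.random.default_rng(0)
    n_nodes, dim = 20, 3

    Z = rng.normal(scale=0.5, size=(n_nodes, dim))   # latent Gaussian attributes per node
    rates = np.exp(Z @ Z.T)                          # interaction rate grows with attribute similarity
    counts = rng.poisson(rates)                      # observed link-usage counts
    np.fill_diagonal(counts, 0)                      # ignore self-links

    def log_joint(Z, counts, prior_scale=0.5):
        """Unnormalized log posterior: Gaussian prior on Z plus Poisson likelihood (up to a constant)."""
        log_rates = Z @ Z.T
        mask = ~np.eye(len(counts), dtype=bool)
        log_lik = np.sum((counts * log_rates - np.exp(log_rates))[mask])
        log_prior = -0.5 * np.sum((Z / prior_scale) ** 2)
        return log_lik + log_prior

    print(log_joint(Z, counts))                      # target density for MCMC or MAP inference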
Semi-Stochastic Frank-Wolfe Algorithms with Away-Steps for Block-Coordinate Structure Problems
We propose a semi-stochastic Frank-Wolfe algorithm with away-steps for
regularized empirical risk minimization and extend it to problems with
block-coordinate structure. Our algorithms use an adaptive step size, and we show
that they converge linearly in expectation. The proposed algorithms can be
applied to many important problems in statistics and machine learning including
regularized generalized linear models, support vector machines and many others.
In preliminary numerical tests on structural SVM and graph-guided fused LASSO,
our algorithms outperform other competing algorithms in both iteration cost and
total number of data passes.
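For intuition, the sketch below implements the classical deterministic
Frank-Wolfe method with away-steps on least squares over an l1-ball, whose
vertices are the signed, scaled coordinate directions. It is not the
semi-stochastic block-coordinate variant proposed in the paper; the function
name and line-search constants are illustrative.

    # Away-step Frank-Wolfe for min 0.5*||A x - b||^2 over the l1-ball of radius tau.
    import numpy as np

    def away_step_fw(A, b, tau, n_iters=500):
        dim = A.shape[1]
        x = np.zeros(dim); x[0] = tau                 # start at the vertex +tau * e_0
        active = {(0, 1.0): 1.0}                      # active vertices -> convex weights

        for _ in range(n_iters):
            grad = A.T @ (A @ x - b)

            # Frank-Wolfe vertex: the l1-ball vertex minimizing <grad, v>
            i = int(np.argmax(np.abs(grad)))
            s_sign = -1.0 if grad[i] > 0 else 1.0
            s = np.zeros(dim); s[i] = s_sign * tau
            d_fw = s - x

            # away vertex: the active vertex maximizing <grad, v>
            (j, a_sign), alpha = max(active.items(),
                                     key=lambda kv: kv[0][1] * tau * grad[kv[0][0]])
            v = np.zeros(dim); v[j] = a_sign * tau
            d_away = x - v

            if -grad @ d_fw >= -grad @ d_away:        # ordinary Frank-Wolfe step
                direction, gamma_max, vertex, is_fw = d_fw, 1.0, (i, s_sign), True
            else:                                     # away step: shift mass off the worst vertex
                direction, gamma_max, vertex, is_fw = (d_away, alpha / (1.0 - alpha + 1e-12),
                                                       (j, a_sign), False)

            Ad = A @ direction                        # exact line search for the quadratic objective
            gamma = float(np.clip(-(grad @ direction) / (Ad @ Ad + 1e-12), 0.0, gamma_max))
            x = x + gamma * direction

            if is_fw:                                 # maintain the convex-combination weights
                active = {k: (1.0 - gamma) * w for k, w in active.items()}
                active[vertex] = active.get(vertex, 0.0) + gamma
            else:
                active = {k: (1.0 + gamma) * w for k, w in active.items()}
                active[vertex] -= gamma
            active = {k: w for k, w in active.items() if w > 1e-12}
        return x

    rng = np.random.default_rng(0)
    A = rng.normal(size=(100, 30))
    b = A @ np.concatenate([rng.normal(size=5), np.zeros(25)])
    x_hat = away_step_fw(A, b, tau=10.0)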
Optimal Margin Distribution Machine
Support vector machine (SVM) has been one of the most popular learning
algorithms, with the central idea of maximizing the minimum margin, i.e., the
smallest distance from the instances to the classification boundary. Recent
theoretical results, however, disclosed that maximizing the minimum margin does
not necessarily lead to better generalization performances, and instead, the
margin distribution has been proven to be more crucial. Based on this idea, we
propose a new method, named Optimal margin Distribution Machine (ODM), which
tries to achieve a better generalization performance by optimizing the margin
distribution. We characterize the margin distribution by the first- and
second-order statistics, i.e., the margin mean and variance. The proposed
method is a general learning approach which can be used wherever SVM can be
applied, and its superiority is verified both theoretically and empirically in
this paper.
Comment: arXiv admin note: substantial text overlap with arXiv:1311.098
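As a simplified illustration of working with those first- and second-order
margin statistics (a hedged sketch, not the exact ODM optimization problem),
the code below trains a linear classifier by gradient descent on a surrogate
that rewards a large margin mean, penalizes margin variance, and adds a norm
penalty; the objective, step size, and function name are assumptions made for
the sketch.

    # Gradient descent on a margin-distribution surrogate:
    #   0.5 * lam * ||w||^2 - mean(margins) + var(margins), with margins_i = y_i * <w, x_i>.
    import numpy as np

    def fit_margin_distribution(X, y, lam=1.0, eta=0.01, n_iters=2000):
        """X: (n, d) features; y: (n,) labels in {-1, +1}."""
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(n_iters):
            margins = y * (X @ w)                          # signed margins
            mean = margins.mean()
            grad_mean = (y[:, None] * X).mean(axis=0)      # gradient of the margin mean
            dev = margins - mean
            grad_var = (2.0 / n) * (dev[:, None] * (y[:, None] * X - grad_mean)).sum(axis=0)
            w -= eta * (lam * w - grad_mean + grad_var)
        return w

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 5))
    y = np.sign(X[:, 0] + 0.3 * rng.normal(size=300))
    w = fit_margin_distribution(X, y)
    print((np.sign(X @ w) == y).mean())                    # training accuracy of the toy classifier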
How Developers Iterate on Machine Learning Workflows -- A Survey of the Applied Machine Learning Literature
Machine learning workflow development is anecdotally regarded as an iterative,
trial-and-error process with humans in the loop. However, we are
not aware of quantitative evidence corroborating this popular belief. A
quantitative characterization of iteration can serve as a benchmark for machine
learning workflow development in practice, and can aid the development of
human-in-the-loop machine learning systems. To this end, we conduct a
small-scale survey of the applied machine learning literature from five
distinct application domains. We collect and distill statistics on the role of
iteration within machine learning workflow development, and report preliminary
trends and insights from our investigation, as a starting point towards this
benchmark. Based on our findings, we finally describe desiderata for effective
and versatile human-in-the-loop machine learning systems that can cater to
users in diverse domains.