Orthogonal Statistical Learning
We provide non-asymptotic excess risk guarantees for statistical learning in
a setting where the population risk with respect to which we evaluate the
target parameter depends on an unknown nuisance parameter that must be
estimated from data. We analyze a two-stage sample splitting meta-algorithm
that takes as input two arbitrary estimation algorithms: one for the target
parameter and one for the nuisance parameter. We show that if the population
risk satisfies a condition called Neyman orthogonality, the impact of the
nuisance estimation error on the excess risk bound achieved by the
meta-algorithm is of second order. Our theorem is agnostic to the particular
algorithms used for the target and nuisance and only makes an assumption on
their individual performance. This enables the use of a plethora of existing
results from statistical learning and machine learning to give new guarantees
for learning with a nuisance component. Moreover, by focusing on excess risk
rather than parameter estimation, we can give guarantees under weaker
assumptions than in previous works and accommodate settings in which the target
parameter belongs to a complex nonparametric class. We provide conditions on
the metric entropy of the nuisance and target classes such that oracle
rates---rates of the same order as if we knew the nuisance parameter---are
achieved. We also derive new rates for specific estimation algorithms such as
variance-penalized empirical risk minimization, neural network estimation and
sparse high-dimensional linear model estimation. We highlight the applicability
of our results in four settings of central importance: 1) heterogeneous
treatment effect estimation, 2) offline policy optimization, 3) domain
adaptation, and 4) learning with missing data.
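The two-stage sample-splitting recipe described above can be sketched with a minimal numpy example in a partially linear treatment-effect setting (related to application 1), here simplified to a constant effect. The data-generating process, the Nadaraya-Watson nuisance learner, and all names are illustrative assumptions, not the paper's construction; the point is only the structure: nuisances fit on one fold, target fit on the other via a Neyman-orthogonal (residual-on-residual) criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic partially linear model (illustrative): Y = theta*T + g(X) + noise
n = 2000
theta_true = 1.5
X = rng.normal(size=n)
T = 0.5 * X + rng.normal(size=n)                  # treatment depends on X
Y = theta_true * T + np.sin(X) + rng.normal(scale=0.1, size=n)

# Sample splitting: fold A estimates nuisances, fold B estimates the target
idx = rng.permutation(n)
A, B = idx[: n // 2], idx[n // 2 :]

def fit_smoother(x, y, bw=0.2):
    """Nadaraya-Watson regression; stands in for an arbitrary nuisance learner."""
    def predict(x_new):
        w = np.exp(-((x_new[:, None] - x[None, :]) ** 2) / (2 * bw**2))
        return (w @ y) / w.sum(axis=1)
    return predict

# Nuisances E[Y|X] and E[T|X], fit on fold A only
m_hat = fit_smoother(X[A], Y[A])
e_hat = fit_smoother(X[A], T[A])

# Target on fold B: residual-on-residual least squares, a Neyman-orthogonal
# criterion, so first-order nuisance errors do not propagate to theta_hat
Yr = Y[B] - m_hat(X[B])
Tr = T[B] - e_hat(X[B])
theta_hat = (Tr @ Yr) / (Tr @ Tr)                 # should be close to theta_true
```

Because the criterion is orthogonal, moderate smoothing error in `m_hat` and `e_hat` affects `theta_hat` only at second order, which is the phenomenon the excess-risk bounds formalize.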
Distributed Kernel Regression: An Algorithm for Training Collaboratively
This paper addresses the problem of distributed learning under communication
constraints, motivated by distributed signal processing in wireless sensor
networks and data mining with distributed databases. After formalizing a
general model for distributed learning, an algorithm for collaboratively
training regularized kernel least-squares regression estimators is derived.
Noting that the algorithm can be viewed as an application of successive
orthogonal projection algorithms, its convergence properties are investigated
and the statistical behavior of the estimator is discussed in a simplified
theoretical setting.

Comment: To be presented at the 2006 IEEE Information Theory Workshop, Punta
del Este, Uruguay, March 13-17, 2006.
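The local subproblem each node solves, regularized kernel least-squares regression, can be sketched as follows; the Gaussian kernel choice, the synthetic data, and the helper names are assumptions for illustration, and the collaborative successive-projection exchange between sensors is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

def gaussian_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel matrix between two 1-D sample vectors."""
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

# Local training data held by one sensor (synthetic)
x = rng.uniform(-3, 3, size=50)
y = np.sin(x) + rng.normal(scale=0.1, size=50)

# Regularized kernel least squares: alpha = (K + lam*I)^{-1} y
lam = 1e-2
K = gaussian_kernel(x, x)
alpha = np.linalg.solve(K + lam * np.eye(len(x)), y)

def predict(x_new):
    return gaussian_kernel(x_new, x) @ alpha

x_test = np.linspace(-3, 3, 100)
err = np.mean((predict(x_test) - np.sin(x_test)) ** 2)
```

In the distributed setting of the paper, each node's feasible set of estimators induces one such subproblem, and the successive orthogonal projection view alternates among these sets rather than solving a single centralized system.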
Identifying structural changes with unsupervised machine learning methods
Unsupervised machine learning methods are used to identify structural changes
using the melting point transition in classical molecular dynamics simulations
as an example application of the approach. Dimensionality reduction and
clustering methods are applied to instantaneous radial distributions of atomic
configurations from classical molecular dynamics simulations of metallic
systems over a large temperature range. Principal component analysis is used to
sharply reduce the dimensionality of the feature space across the samples via
an orthogonal linear transformation onto linearly independent directions that
preserve most of the statistical variance of the data. From there, k-means
clustering is used to partition the samples
into solid and liquid phases through a criterion motivated by the geometry of
the reduced feature space of the samples, allowing for an estimation of the
melting point transition. This pattern criterion is conceptually similar to how
humans interpret the data but with far greater throughput, as the shapes of the
radial distributions are different for each phase and easily distinguishable by
humans. The transition temperature estimates derived from this machine learning
approach produce comparable results to other methods on similarly small system
sizes. These results show that machine learning approaches can be applied to
structural changes in physical systems.
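The PCA-then-k-means pipeline can be sketched with a numpy-only example; the synthetic stand-ins for radial distribution functions (sharp peaks for "solid" samples, broad peaks for "liquid"), the peak positions, and the two-component projection are all illustrative assumptions, not the paper's simulation data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-ins for radial distribution functions: solid-like samples
# have sharp peaks, liquid-like samples broad ones (illustrative, not MD data)
r = np.linspace(0.5, 5.0, 80)

def rdf(width):
    g = sum(np.exp(-((r - mu) ** 2) / (2 * width**2)) for mu in (1.0, 2.0, 3.0))
    return g + rng.normal(scale=0.02, size=r.size)

solid = np.stack([rdf(0.08) for _ in range(30)])
liquid = np.stack([rdf(0.35) for _ in range(30)])
Xf = np.vstack([solid, liquid])

# PCA via SVD: orthogonal linear projection onto the leading variance directions
Xc = Xf - Xf.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                     # 2-D reduced feature space

# Plain k-means (k=2) in the reduced space; init one center in each group
centers = Z[[0, -1]].copy()
for _ in range(50):
    labels = np.argmin(((Z[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.stack([Z[labels == k].mean(axis=0) for k in range(2)])
```

Sweeping such samples across temperature and locating where the cluster assignment flips gives a melting-point estimate in the spirit of the abstract.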
How does predicate invention affect human comprehensibility?
During the 1980s Michie defined Machine Learning in terms of two orthogonal axes of performance: predictive accuracy and comprehensibility of generated hypotheses. Since predictive accuracy was readily measurable and comprehensibility not so, later definitions in the 1990s, such as that of Mitchell, tended to use a one-dimensional approach to Machine Learning based solely on predictive accuracy, ultimately favouring statistical over symbolic Machine Learning approaches. In this paper we provide a definition of comprehensibility of hypotheses which can be estimated using human participant trials. We present the results of experiments testing human comprehensibility of logic programs learned with and without predicate invention. Results indicate that comprehensibility is affected not only by the complexity of the presented program but also by the existence of anonymous predicate symbols.