Learning to select data for transfer learning with Bayesian Optimization
Domain similarity measures can be used to gauge adaptability and select
suitable data for transfer learning, but existing approaches define ad hoc
measures that are deemed suitable for respective tasks. Inspired by work on
curriculum learning, we propose to learn data selection measures using
Bayesian Optimization and evaluate them across models, domains and tasks. Our
learned measures outperform existing domain similarity measures significantly
on three tasks: sentiment analysis, part-of-speech tagging, and parsing. We
show the importance of complementing similarity with diversity, and that
learned measures are -- to some degree -- transferable across models, domains,
and even tasks.
Comment: EMNLP 2017. Code available at:
https://github.com/sebastianruder/learn-to-select-dat
Gaussian process regression for virtual metrology of microchip quality and the resulting strategic sampling scheme
Manufacturing of integrated circuits involves many sequential processes, often executed to nanoscale tolerances, and the yield depends on the often unmeasured quality of intermediate steps. In the high-throughput industry of fabricating microelectronics on semi-conducting wafers, scheduling measurements of product quality before the electrical test of the complete IC can be expensive. We therefore seek to predict metrics of product quality based on sensor readings describing the environment within the relevant tool during the processing of each wafer, or to apply the concept of virtual metrology (VM) to monitor these intermediate steps. We model the data using Gaussian process regression (GPR), adapted to simultaneously learn the nonlinear dynamics that govern the quality characteristic, as well as their operating space, expressed by a linear embedding of the sensor traces’ features. Such Bayesian models predict a distribution for the target metric, such as a critical dimension, so one may assess the model’s credibility through its predictive uncertainty. Assuming measurements of the quality characteristic of interest are budgeted, we seek to hasten convergence of the GPR model to a credible form through an active sampling scheme, whereby the predictive uncertainty informs which wafer’s quality to measure next. We evaluate this convergence when predicting and updating online, as if in a factory, using a large dataset for plasma-enhanced chemical vapor deposition (PECVD), with measured thicknesses for ~32,000 wafers. By approximately optimizing the information extracted from this seemingly repetitive data describing a tightly controlled process, GPR achieves ~10% greater accuracy on average than a baseline linear model based on partial least squares (PLS). In a derivative study, we seek to discern the degree of drift in the process over the several months the data spans.
We express this drift by how unusual the relevant features, as embedded by the GPR model, appear as the inputs compensate for degrading conditions. This method detects the onset of consistently unusual behavior that extends to a bimodal thickness fault, anticipating its flagging by as much as two days.
Mechanical Engineerin
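The active sampling idea in the abstract above (measure next the wafer whose predicted quality is most uncertain) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a zero-mean GP with a fixed squared-exponential kernel and omits the learned linear embedding the abstract describes. The function names (`rbf_kernel`, `gp_posterior`, `pick_next_wafer`) and the hyperparameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at the query points."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    chol = np.linalg.cholesky(k_tt)
    # alpha = K^{-1} y, computed via two solves with the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_tq.T @ alpha
    v = np.linalg.solve(chol, k_tq)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)


def pick_next_wafer(x_measured, y_measured, x_candidates):
    """Index of the unmeasured candidate with the largest predictive variance."""
    _, var = gp_posterior(x_measured, y_measured, x_candidates)
    return int(np.argmax(var))


# Toy data (assumed): one sensor feature per wafer, three measured thicknesses.
x_measured = np.array([[0.0], [0.1], [0.2]])
y_measured = np.array([1.00, 1.02, 0.98])
x_candidates = np.array([[0.05], [0.15], [3.0]])
next_index = pick_next_wafer(x_measured, y_measured, x_candidates)
```

In this toy setup the selected candidate is the one farthest from the already measured wafers in feature space, since that is where the model's predictive variance is largest. In an online setting, the chosen wafer would be measured, appended to the training set, and the selection repeated until the measurement budget is spent.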
Sample Efficient Optimization for Learning Controllers for Bipedal Locomotion
Learning policies for bipedal locomotion can be difficult, as experiments are
expensive and simulation does not usually transfer well to hardware. To counter
this, we need algorithms that are sample efficient and inherently safe.
Bayesian Optimization is a powerful sample-efficient tool for optimizing
non-convex black-box functions. However, its performance can degrade in higher
dimensions. We develop a distance metric for bipedal locomotion that enhances
the sample-efficiency of Bayesian Optimization and use it to train a
16-dimensional neuromuscular model for planar walking. This distance metric
reflects some basic gait features of healthy walking and helps us quickly
eliminate a majority of unstable controllers. With our approach we can learn
policies for walking in fewer than 100 trials for a range of challenging
settings. In simulation, we show results on two different costs and on various
terrains including rough ground and ramps, sloping upwards and downwards. We
also perturb our models with unknown inertial disturbances analogous to
differences between simulation and hardware. These results are promising, as
they indicate that this method can potentially be used to learn control
policies on hardware.
Comment: To appear in International Conference on Humanoid Robots (Humanoids
'2016), IEEE-RAS. (Rika Antonova and Akshara Rai contributed equally
- …