713 research outputs found
Incremental online learning in high dimensions
this article, however, is problematic, as it requires a careful selection of initial ridge regression parameters to stabilize the highly rank-deficient full covariance matrix of the input data, and it is easy to create too much bias or too little numerical stabilization initially, which can trap the local distance metric adaptation in local minima. While LWPR takes only about 10 times longer on the 20D experiment than on the 2D experiment, RFWR requires a 1000-fold increase in computation time, rendering that algorithm unsuitable for high-dimensional regression. To compare LWPR's results to other popular regression methods, we evaluated the 2D, 10D, and 20D cross data sets with Gaussian process (GP) regression and support vector machine (SVM) regression in addition to our LWPR method. It should be noted that neither SVM nor GP is an incremental method, although both can be considered state-of-the-art for batch regression with relatively small numbers of training data and reasonable input dimensionality; their computational complexity is prohibitively high for real-time applications. The GP algorithm (Gibbs & MacKay, 1997) used a generic covariance function and optimized over the hyperparameters. The SVM regression was performed using a standard available package (Saunders et al., 1998) and optimized for kernel choices. Figure 6 compares the performance of LWPR and Gaussian processes on the above-mentioned data sets using 100, 300, and 500 training data points.
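The receptive-field idea underlying LWPR can be illustrated with a minimal locally weighted regression sketch. This is an illustration of the general principle only, not the LWPR algorithm itself; the bandwidth, function name, and data are hypothetical:

```python
import numpy as np

def lwr_predict(X, y, query, bandwidth=0.5):
    """Locally weighted linear regression at a single query point.

    A Gaussian kernel weights training points by closeness to the
    query -- the receptive-field principle behind methods like LWPR.
    (Hypothetical bandwidth; not the LWPR algorithm itself.)
    """
    w = np.exp(-np.sum((X - query) ** 2, axis=1) / (2 * bandwidth ** 2))
    Xb = np.hstack([X, np.ones((len(X), 1))])            # bias column
    W = np.diag(w)
    # Weighted least squares: beta = (Xb^T W Xb)^-1 Xb^T W y
    beta = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)
    return np.append(query, 1.0) @ beta
```

Fitting a separate local model per query is what lets such methods scale incrementally, at the cost of repeating a small solve for every prediction.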
Deferring the learning for better generalization in radial basis neural networks
Proceedings of: International Conference on Artificial Neural Networks (ICANN 2001), Vienna, Austria, August 21–25, 2001. The level of generalization of neural networks is heavily dependent on the quality of the training data: some of the training patterns can be redundant or irrelevant. It has been shown that careful dynamic selection of training patterns can yield better generalization performance. Nevertheless, generalization is carried out independently of the novel patterns to be approximated. In this paper, we present a learning method that automatically selects the training patterns most appropriate to the new sample to be predicted. The proposed method has been applied to Radial Basis Neural Networks, whose generalization capability is usually very poor. The learning strategy slows down the response of the network in the generalization phase. However, this does not introduce a significant limitation in the application of the method, because of the fast training of Radial Basis Neural Networks.
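The deferred-selection idea can be sketched as a lazy RBF predictor that waits for the query and then fits only the training patterns nearest to it. This is a simplified illustration; `k`, `gamma`, and the ridge term are hypothetical choices, not the paper's actual selection criterion:

```python
import numpy as np

def deferred_rbf_predict(X, y, query, k=10, gamma=500.0):
    """Deferred ("lazy") RBF prediction: learning is postponed until the
    query arrives, then an RBF interpolant is fit to only the k training
    patterns closest to the query.  Sketch of the pattern-selection
    idea; k and gamma are hypothetical settings.
    """
    d = np.sum((X - query) ** 2, axis=1)
    idx = np.argsort(d)[:k]                   # most relevant patterns
    Xs, ys = X[idx], y[idx]
    # Gaussian RBF design matrix, centers at the selected patterns
    G = np.exp(-gamma * ((Xs[:, None, :] - Xs[None, :, :]) ** 2).sum(-1))
    w = np.linalg.solve(G + 1e-8 * np.eye(k), ys)  # tiny ridge for stability
    g = np.exp(-gamma * ((query - Xs) ** 2).sum(-1))
    return g @ w
```

The per-query solve is what "slows down the response of the network in the generalization phase," but only a small k-by-k system must be solved, matching the fast-training property of RBF networks.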
Efficient Model Learning for Human-Robot Collaborative Tasks
We present a framework for learning human user models from joint-action
demonstrations that enables the robot to compute a robust policy for a
collaborative task with a human. The learning takes place completely
automatically, without any human intervention. First, we describe the
clustering of demonstrated action sequences into different human types using an
unsupervised learning algorithm. These demonstrated sequences are also used by
the robot to learn a reward function that is representative for each type,
through the employment of an inverse reinforcement learning algorithm. The
learned model is then used as part of a Mixed Observability Markov Decision
Process formulation, wherein the human type is a partially observable variable.
With this framework, we can infer, either offline or online, the human type of
a new user that was not included in the training set, and can compute a policy
for the robot that will be aligned to the preference of this new user and will
be robust to deviations of the human actions from prior demonstrations. Finally,
we validate the approach using data collected in human subject experiments, and
conduct proof-of-concept demonstrations in which a person performs a
collaborative task with a small industrial robot.
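The online inference over human types can be illustrated as a Bayesian belief update over a discrete, partially observable type variable. This is a sketch: in the framework above, the per-type action likelihoods would be induced by the reward functions learned via inverse reinforcement learning, whereas here they are supplied directly as hypothetical values:

```python
import numpy as np

def update_type_belief(belief, action, action_likelihoods):
    """One Bayesian filtering step over partially observable human types.

    action_likelihoods[t][a] = P(action a | type t).  Supplied directly
    here for illustration; in the framework described above they would
    be induced by the reward function learned for each type.
    """
    posterior = belief * np.array([lik[action] for lik in action_likelihoods])
    return posterior / posterior.sum()
```

Repeating this update as the new user acts concentrates the belief on the type whose learned reward function best explains the observed actions, which is what lets the robot adapt its policy online.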
Linear Bellman combination for control of character animation
Controllers are necessary for physically based synthesis of character animation. However, creating controllers requires either manual tuning or expensive computer optimization. We introduce linear Bellman combination as a method for reusing existing controllers. Given a set of controllers for related tasks, this combination creates a controller that performs a new task. It naturally weights the contribution of each component controller by its relevance to the current state and goal of the system. We demonstrate that linear Bellman combination outperforms naive combination, often succeeding where naive combination fails. Furthermore, this combination is provably optimal for a new task if the component controllers are also optimal for related tasks. We demonstrate the applicability of linear Bellman combination to interactive character control of stepping motions and acrobatic maneuvers. Funding: Singapore-MIT GAMBIT Game Lab; National Science Foundation (U.S.) (Grants 2007043041 and CCF-0810888); Adobe Systems; Pixar.
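Under the linearly solvable control assumptions this line of work builds on, exponentiated value ("desirability") functions combine linearly, and the blended control is a relevance-weighted mixture of the component controls. A minimal sketch, in which the specific z and u functions and the weights are hypothetical:

```python
import numpy as np

def combine_controls(x, zs, us, weights):
    """Linear Bellman combination (sketch): blend component controllers,
    each weighted by its task-weighted desirability z_i(x) at the
    current state, so the most relevant controller dominates.

    zs: callables z_i(x), exponentiated negative value functions
    us: callables u_i(x), the component control policies
    weights: task weights w_i encoding the new terminal condition
    """
    z = np.array([w * zf(x) for w, zf in zip(weights, zs)])
    alpha = z / z.sum()          # relevance of each controller at x
    return sum(a * uf(x) for a, uf in zip(alpha, us))
```

With two regulators steering toward different goal states, the blend automatically defers to whichever controller is more "desirable" from the current state, which is what distinguishes it from naive fixed-weight averaging.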
Searching High Redshift Large-Scale Structures: Photometry of Four Fields Around Quasar Pairs at z~1
We have studied the photometric properties of four fields around the
high-redshift quasar pairs QP1310+0007, QP1355-0032, QP0110-0219, and
QP0114-3140 at z ~ 1, with the aim of identifying large-scale structures (galaxy
clusters or groups) around them. This sample was observed with GMOS on the Gemini
North and South telescopes in the g', r', i', and z' bands, and our photometry
is complete to a limiting magnitude of i' ~ 24 mag (corresponding to ~ M*_i' +
2 at the redshift of the pairs). Our analysis reveals that QP0110-0219 shows
very strong and QP1310+0007 and QP1355-0032 show some evidence for the presence
of rich galaxy clusters in direct vicinity of the pairs. On the other hand,
QP0114-3140 could be an isolated pair in a poor environment. This work suggests
that z ~ 1 quasar pairs are excellent tracers of high-density environments and
that this same technique may be useful for finding clusters at higher redshifts.
Comment: 29 pages, 7 figures; accepted by ApJ. Added one figure and 3 references;
some paragraphs in sections 1, 3, 5, and 6 were rewritten as suggested by the referee.
Optimization And Learning For Rough Terrain Legged Locomotion
We present a novel approach to legged locomotion over rough terrain that is thoroughly rooted in optimization. This approach relies on a hierarchy of fast, anytime algorithms to plan a set of footholds, along with the dynamic body motions required to execute them. Components within the planning framework coordinate to exchange plans, cost-to-go estimates, and 'certificates' that ensure the output of an abstract high-level planner can be realized by lower layers of the hierarchy. The burden of careful engineering of cost functions to achieve desired performance is substantially mitigated by a simple inverse optimal control technique. Robustness is achieved by real-time re-planning of the full trajectory, augmented by reflexes and feedback control. We demonstrate the successful application of our approach in guiding the LittleDog quadruped robot over a variety of rough terrain types. Other novel aspects of our past research efforts include a variety of pioneering inverse optimal control techniques as well as a system for planning using arbitrary pre-recorded robot behavior.
Predicting human interruptibility with sensors
A person seeking someone else’s attention is normally able to quickly assess how interruptible they are. This assessment allows for behavior we perceive as natural, socially appropriate, or simply polite. On the other hand, today’s computer systems are almost entirely oblivious to the human world they operate in, and typically have no way to take into account the interruptibility of the user. This paper presents a Wizard of Oz study exploring whether, and how, robust sensor-based predictions of interruptibility might be constructed, which sensors might be most useful to such predictions, and how simple such sensors might be. The study simulates a range of possible sensors through human coding of audio and video recordings. Experience sampling is used to simultaneously collect randomly distributed self-reports of interruptibility. Based on these simulated sensors, we construct statistical models predicting human interruptibility and compare their predictions with the collected self-report data. The results of these models, although covering a demographically limited sample, are very promising, with the overall accuracy of several models reaching about 78%. Additionally, a model tuned to avoiding unwanted interruptions does so for 90% of its predictions, while retaining 75% overall accuracy.
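The statistical-model step can be sketched as a simple logistic-regression classifier over binary sensor features. This is entirely illustrative: the features, data, and hyperparameters below are hypothetical, not the study's actual sensors or models:

```python
import numpy as np

def train_interruptibility_model(features, labels, lr=0.1, steps=2000):
    """Fit a logistic-regression predictor of interruptibility from
    binary sensor features by gradient descent.  Hypothetical sketch;
    not the models or sensors used in the study.
    """
    X = np.hstack([features, np.ones((len(features), 1))])   # bias term
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))         # predicted P(interruptible)
        w -= lr * X.T @ (p - labels) / len(labels)  # mean log-loss gradient
    return w

def is_interruptible(w, feats):
    """Threshold the fitted model's probability at 0.5."""
    x = np.append(feats, 1.0)
    return 1.0 / (1.0 + np.exp(-x @ w)) > 0.5
```

Tuning the 0.5 decision threshold upward is one simple way to trade overall accuracy for fewer unwanted interruptions, the kind of trade-off the tuned model in the study makes.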