68 research outputs found
Successor features for transfer in reinforcement learning
Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. Our focus is on transfer where the reward functions vary across tasks while the environment's dynamics remain the same. The method we propose rests on two key ideas: "successor features," a value function representation that decouples the dynamics of the environment from the rewards, and "generalized policy improvement," a generalization of dynamic programming's policy improvement step that considers a set of policies rather than a single one. Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows transfer to take place between tasks without any restriction. The proposed method also provides performance guarantees for the transferred policy even before any learning has taken place. We derive two theorems that set our approach on firm theoretical ground and present experiments that show that it successfully promotes transfer in practice.
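The two ideas in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes rewards are linear in some features, so that the action value of a stored policy i on a task with reward weights w is Q_i(s, a) = psi_i(s, a) · w, where psi_i denotes that policy's successor features. Generalized policy improvement then acts greedily with respect to the maximum of these values over all stored policies. The names `q_values`, `gpi_action`, `psi_1`, `psi_2` and the numbers are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]  # one row of successor features per action


def q_values(psi: Matrix, w: Sequence[float]) -> List[float]:
    """Q(s, a) = psi(s, a) . w for every action a in a single state s."""
    return [sum(p * wi for p, wi in zip(psi_a, w)) for psi_a in psi]


def gpi_action(psi_per_policy: Sequence[Matrix], w: Sequence[float]) -> int:
    """Generalized policy improvement: argmax_a max_i Q_i(s, a)."""
    q_per_policy = [q_values(psi, w) for psi in psi_per_policy]
    n_actions = len(q_per_policy[0])
    # For each action, keep the best value promised by any stored policy.
    best = [max(q[a] for q in q_per_policy) for a in range(n_actions)]
    return max(range(n_actions), key=lambda a: best[a])


# Hypothetical successor features of two stored policies in one state
# (2 actions, 2 reward features). The dynamics are shared; only w changes.
psi_1 = [[1.0, 0.0], [0.5, 0.5]]
psi_2 = [[0.0, 0.2], [0.1, 2.0]]

action_task_a = gpi_action([psi_1, psi_2], [0.0, 1.0])  # task rewarding feature 2
action_task_b = gpi_action([psi_1, psi_2], [1.0, 0.0])  # task rewarding feature 1
```

Because the successor features do not depend on the reward weights, evaluating the stored policies on a new task only requires a new `w`, with no further learning.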
General video game AI: Competition, challenges, and opportunities
The General Video Game AI framework and competition pose the problem of creating artificial intelligence that can play a wide, and in principle unlimited, range of games. Concretely, it tackles the problem of devising an algorithm that is able to play any game it is given, even if the game is not known a priori. This area of study can be seen as an approximation of General Artificial Intelligence, with very little room for game-dependent heuristics. This short paper summarizes the motivation, infrastructure, results and future plans of General Video Game AI, stressing the findings and first conclusions drawn after two editions of our competition, and outlining our future plans.
Using humanoid robots to study human behavior
Our understanding of human behavior advances as our humanoid robotics work progresses, and vice versa. This team's work focuses on trajectory formation and planning, learning from demonstration, oculomotor control and interactive behaviors. They are programming robotic behavior based on how we humans "program" behavior in, or train, each other.
Quantification of depth of anesthesia by nonlinear time series analysis of brain electrical activity
We investigate several quantifiers of the electroencephalogram (EEG) signal
with respect to their ability to indicate depth of anesthesia. For 17 patients
anesthetized with Sevoflurane, three established measures (two spectral and one
based on the bispectrum), as well as a phase space based nonlinear correlation
index were computed from consecutive EEG epochs. In the absence of an independent
way to determine anesthesia depth, the standard was derived from measured blood
plasma concentrations of the anesthetic via a pharmacokinetic/pharmacodynamic
model for the estimated effective brain concentration of Sevoflurane. In most
patients, the highest correlation is observed for the nonlinear correlation
index D*. In contrast to spectral measures, D* is found to decrease
monotonically with increasing (estimated) depth of anesthesia, even when a
"burst-suppression" pattern occurs in the EEG. The findings show the potential
for applications of concepts derived from the theory of nonlinear dynamics,
even if little can be assumed about the process under investigation.
Comment: 7 pages, 5 figures
C-tests revisited: back and forth with complexity
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-21365-1_28
We explore the aggregation of tasks by weighting them using a difficulty
function that depends on the complexity of the (acceptable) policy for the task (instead
of a universal distribution over tasks or an adaptive test). The resulting aggregations
and decompositions are (now retrospectively) seen as the natural (and trivial) interactive
generalisation of the C-tests.
This work has been partially supported by the EU (FEDER) and the Spanish MINECO under grants TIN 2010-21062-C02-02, PCIN-2013-037 and TIN 2013-45732-C4-1-P, and by Generalitat Valenciana PROMETEOII 2015/013.
Hernández Orallo, J. (2015). C-tests revisited: back and forth with complexity. In Artificial General Intelligence, 8th International Conference, AGI 2015, Berlin, Germany, July 22-25, 2015, Proceedings. Springer International Publishing. 272-282. https://doi.org/10.1007/978-3-319-21365-1_28
Neural Network Fusion of Color, Depth and Location for Object Instance Recognition on a Mobile Robot
The development of mobile robots for domestic assistance requires solving problems that integrate ideas from different fields of research, such as computer vision, robotic manipulation, localization and mapping. Semantic mapping, that is, the enrichment of a map with high-level information like room and object identities, is an example of such a complex robotic task. Solving this task requires taking into account hard software and hardware constraints brought by the context of autonomous mobile robots, where short processing times and low energy consumption are mandatory. We present a light-weight scene segmentation and object instance recognition algorithm using an RGB-D camera and demonstrate it in a semantic mapping experiment. Our method uses a feed-forward neural network to fuse texture, color and depth information. Running at 3 Hz on a single laptop computer, our algorithm achieves a recognition rate of 97% in a controlled environment, and 87% in the adversarial conditions of a real robotic task. Our results demonstrate that state-of-the-art recognition rates on a database do not guarantee performance in a real-world experiment. We also show the benefit in these conditions of fusing several recognition decisions and data from different sources. The database we compiled for the purpose of this study is publicly available.
Neural network generated parametrizations of deeply virtual Compton form factors
We have generated a parametrization of the Compton form factor (CFF) H based
on data from deeply virtual Compton scattering (DVCS) using neural networks.
This approach offers an essentially model-independent fitting procedure, which
provides realistic uncertainties. Furthermore, it facilitates propagation of
uncertainties from experimental data to CFFs. We assumed dominance of the CFF H
and used HERMES data on DVCS off unpolarized protons. We predict the beam
charge-spin asymmetry for a proton at the kinematics of the COMPASS II
experiment.
Comment: 16 pages, 5 figures
Markov chain Monte Carlo with Gaussian processes for fast parameter estimation and uncertainty quantification in a 1D fluid-dynamics model of the pulmonary circulation
The past few decades have witnessed an explosive synergy between physics and the life sciences. In particular, physical modelling in medicine and physiology is a topical research area. The present work focuses on parameter inference and uncertainty quantification in a 1D fluid-dynamics model for quantitative physiology: the pulmonary blood circulation. The practical challenge is the estimation of the patient-specific biophysical model parameters, which cannot be measured directly. In principle this can be achieved based on a comparison between measured and predicted data. However, predicting data requires solving a system of partial differential equations (PDEs), which usually have no closed-form solution, and repeated numerical integrations as part of an adaptive estimation procedure are computationally expensive. In the present article, we demonstrate how fast parameter estimation combined with sound uncertainty quantification can be achieved by a combination of statistical emulation and Markov chain Monte Carlo (MCMC) sampling. We compare a range of state-of-the-art MCMC algorithms and emulation strategies, and assess their performance in terms of their accuracy and computational efficiency. The long-term goal is to develop a method for reliable disease prognostication in real time, and our work is an important step towards an automatic clinical decision support system.
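The combination of emulation and MCMC described in the abstract above can be sketched on a toy problem. This is an illustration under stated assumptions, not the paper's method or model: the "expensive" log-likelihood is a hypothetical one-parameter quadratic standing in for a PDE solve, the emulator is the posterior mean of a Gaussian process with a squared-exponential kernel fitted at a few design points, the prior is assumed uniform over the design range, and the sampler is plain random-walk Metropolis. All function names and constants are invented for this example.

```python
import math
import random


def expensive_log_likelihood(theta: float) -> float:
    """Toy stand-in for a PDE solve plus data misfit (an assumption)."""
    return -0.5 * ((theta - 1.5) / 0.5) ** 2


def rbf(a: float, b: float, length: float = 1.0) -> float:
    """Squared-exponential covariance between two parameter values."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_emulator(design, jitter: float = 1e-6):
    """Run the expensive model once per design point; return the GP mean."""
    y = [expensive_log_likelihood(t) for t in design]
    mean_y = sum(y) / len(y)
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(design)]
         for i, a in enumerate(design)]
    alpha = solve(K, [v - mean_y for v in y])

    def emulator(theta: float) -> float:
        return mean_y + sum(a * rbf(theta, t) for a, t in zip(alpha, design))

    return emulator


def metropolis(log_post, start, n_steps, step, lo, hi, rng):
    """Random-walk Metropolis with a uniform prior on [lo, hi]."""
    theta, lp = start, log_post(start)
    samples = []
    for _ in range(n_steps):
        prop = theta + rng.gauss(0.0, step)
        if lo <= prop <= hi:  # outside the design range the emulator is not trusted
            lp_prop = log_post(prop)
            if rng.random() < math.exp(min(0.0, lp_prop - lp)):
                theta, lp = prop, lp_prop
        samples.append(theta)
    return samples


design = [-1.0 + 0.5 * i for i in range(11)]  # 11 expensive evaluations in total
emulator = fit_emulator(design)
samples = metropolis(emulator, 0.0, 5000, 0.5, -1.0, 4.0, random.Random(0))
kept = samples[500:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

In this sketch the expensive function is evaluated only at the design points, while the sampler queries the cheap emulator thousands of times. Restricting the chain to the design range matters because a GP mean reverts towards its prior mean away from the training data.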
- …