Successor features for transfer in reinforcement learning

Author: Barreto A
Munos R
Schaul T
Silver D
Publication venue: Center for Open Science
Publication date: 16/06/2016

Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. Our focus is on transfer where the reward functions vary across tasks while the environment's dynamics remain the same. The method we propose rests on two key ideas: "successor features," a value function representation that decouples the dynamics of the environment from the rewards, and "generalized policy improvement," a generalization of dynamic programming's policy improvement step that considers a set of policies rather than a single one. Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows transfer to take place between tasks without any restriction. The proposed method also provides performance guarantees for the transferred policy even before any learning has taken place. We derive two theorems that set our approach on firm theoretical ground and present experiments that show that it successfully promotes transfer in practice.
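
The two ideas named in the abstract can be illustrated with a minimal sketch. It assumes rewards are linear in some features, so that a stored policy's action value is the dot product of its successor features with a task's reward weights; action selection then maximizes over the stored policies as well as over actions. The function names, the list layout of `psi`, and the toy numbers are hypothetical and are not taken from the paper.

```python
def q_value(psi_sa, w):
    """Action value as a dot product of successor features and reward weights."""
    return sum(p * wi for p, wi in zip(psi_sa, w))


def gpi_action(psi, w):
    """Generalized policy improvement at a single state.

    psi[i][a] holds the (assumed, already learned) successor features of
    stored policy i for action a at the current state; w holds the reward
    weights of the new task.
    """
    n_actions = len(psi[0])
    # For each action, take the best value promised by any stored policy.
    best = [max(q_value(psi_i[a], w) for psi_i in psi) for a in range(n_actions)]
    # Act greedily with respect to that maximum.
    return max(range(n_actions), key=best.__getitem__)


# Toy data: two stored policies, two actions, two reward features.
psi = [
    [[1.0, 0.0], [0.0, 1.0]],  # policy 0
    [[0.5, 0.5], [0.2, 2.0]],  # policy 1
]
action = gpi_action(psi, [1.0, 0.0])
```

Changing only the weight vector re-evaluates every stored policy on a new task without relearning `psi`, which is the sense in which the representation separates dynamics from rewards.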

UCL Discovery

General video game AI: Competition, challenges, and opportunities

Author: Lucas SM
Perez-Liebana D
Samothrakis S
Schaul T
Togelius J
Publication venue: AAAI Press
Publication date: 01/01/2016

The General Video Game AI framework and competition pose the problem of creating artificial intelligence that can play a wide, and in principle unlimited, range of games. Concretely, it tackles the problem of devising an algorithm that is able to play any game it is given, even if the game is not known a priori. This area of study can be seen as an approximation of General Artificial Intelligence, with very little room for game-dependent heuristics. This short paper summarizes the motivation, infrastructure, results and future plans of General Video Game AI, stressing the findings and first conclusions drawn after two editions of our competition, and outlining our future plans.

University of Essex Research Repository

Association for the Advancement of Artificial Intelligence: AAAI Publications

Using humanoid robots to study human behavior

Author: Atkeson C.G.
Hale J.G.
Kawato E.
Kawato M.
Kotosaka S.
Pollick F.E.
Riley M.
Schaul S.
Shibata T.
Tevatia G.
Ude A.
Vijayakumar S.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2000

Our understanding of human behavior advances as our humanoid robotics work progresses, and vice versa. This team's work focuses on trajectory formation and planning, learning from demonstration, oculomotor control and interactive behaviors. They are programming robotic behavior based on how we humans “program” behavior in (or train) each other.

CiteSeerX

Crossref

Enlighten

Quantification of depth of anesthesia by nonlinear time series analysis of brain electrical activity

Author: A. Hoeft
B. Rehberg
C. E. Elger
D. Ruelle
D.J. Sheskin
G. Widman
H. Kantz
H. Schwilden
I.J. Rampil
J. Muthuswamy
K. Lehnertz
L.D. Iasemidis
M. Steriade
N. Schaul
P. Grassberger
P.S. Sebel
T. Katoh
T. Schreiber
W.J. Levy
Publication venue: American Physical Society (APS)
Publication date: 01/01/2000

We investigate several quantifiers of the electroencephalogram (EEG) signal with respect to their ability to indicate depth of anesthesia. For 17 patients anesthetized with Sevoflurane, three established measures (two spectral and one based on the bispectrum), as well as a phase-space-based nonlinear correlation index, were computed from consecutive EEG epochs. In the absence of an independent way to determine anesthesia depth, the standard was derived from measured blood plasma concentrations of the anesthetic via a pharmacokinetic/pharmacodynamic model for the estimated effective brain concentration of Sevoflurane. In most patients, the highest correlation is observed for the nonlinear correlation index D*. In contrast to spectral measures, D* is found to decrease monotonically with increasing (estimated) depth of anesthesia, even when a "burst-suppression" pattern occurs in the EEG. The findings show the potential for applications of concepts derived from the theory of nonlinear dynamics, even if little can be assumed about the process under investigation. Comment: 7 pages, 5 figures

arXiv.org e-Print Archive

Crossref

MPG.PuRe

C-tests revisited: back and forth with complexity

Author: B Hibbard
J Hernández-Orallo
MG Bellemare
RJ Solomonoff
S Legg
T Schaul
Publication venue: Springer Science and Business Media LLC
Publication date: 15/07/2015

We explore the aggregation of tasks by weighting them using a difficulty function that depends on the complexity of the (acceptable) policy for the task (instead of a universal distribution over tasks or an adaptive test). The resulting aggregations and decompositions are (now retrospectively) seen as the natural (and trivial) interactive generalisation of the C-tests.

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-21365-1_28

This work has been partially supported by the EU (FEDER) and the Spanish MINECO under grants TIN 2010-21062-C02-02, PCIN-2013-037 and TIN 2013-45732-C4-1-P, and by Generalitat Valenciana PROMETEOII 2015/013.

Hernández Orallo, J. (2015). C-tests revisited: back and forth with complexity. In Artificial General Intelligence, 8th International Conference, AGI 2015, Berlin, Germany, July 22-25, 2015, Proceedings. Springer International Publishing, pp. 272-282. https://doi.org/10.1007/978-3-319-21365-1_28

Crossref

RiuNet

Neural Network Fusion of Color, Depth and Location for Object Instance Recognition on a Mobile Robot

Author: A Anand
ART Gepperth
H Ali
MA Fischler
OJ Woodford
PA Viola
PF Felzenszwalb
RJ Campbell
T Schaul
Z Zhang
Publication venue: HAL CCSD
Publication date: 12/09/2014

The development of mobile robots for domestic assistance requires solving problems integrating ideas from different fields of research like computer vision, robotic manipulation, localization and mapping. Semantic mapping, that is, the enrichment of a map with high-level information like room and object identities, is an example of such a complex robotic task. Solving this task requires taking into account hard software and hardware constraints brought by the context of autonomous mobile robots, where short processing times and low energy consumption are mandatory. We present a light-weight scene segmentation and object instance recognition algorithm using an RGB-D camera and demonstrate it in a semantic mapping experiment. Our method uses a feed-forward neural network to fuse texture, color and depth information. Running at 3 Hz on a single laptop computer, our algorithm achieves a recognition rate of 97% in a controlled environment, and 87% in the adversarial conditions of a real robotic task. Our results demonstrate that state-of-the-art recognition rates on a database do not guarantee performance in a real-world experiment. We also show the benefit in these conditions of fusing several recognition decisions and data from different sources. The database we compiled for the purpose of this study is publicly available.

Crossref

INRIA a CCSD electronic archive server

Neural network generated parametrizations of deeply virtual Compton form factors

Author: A Airapetian
Andreas Schäfer
AV Belitsky
AV Radyushkin
B Blok
D Müller
Dieter Müller
DS Hwang
F James
H Honkanen
H Moutarde
J Pumplin
J Rojo
JD Bratt
K Kumerički
KM Graczyk
Krešimir Kumerički
L Debbio Del
L Frankfurt
M Diehl
M Guidal
M Riedmiller
RD Ball
S Forte
S Haykin
SV Goloskokov
T Schaul
X-D Ji
Publication venue: Springer Science and Business Media LLC
Publication date: 14/06/2011

We have generated a parametrization of the Compton form factor (CFF) H based on data from deeply virtual Compton scattering (DVCS) using neural networks. This approach offers an essentially model-independent fitting procedure, which provides realistic uncertainties. Furthermore, it facilitates propagation of uncertainties from experimental data to CFFs. We assumed dominance of the CFF H and used HERMES data on DVCS off unpolarized protons. We predict the beam charge-spin asymmetry for a proton at the kinematics of the COMPASS II experiment. Comment: 16 pages, 5 figures

arXiv.org e-Print Archive

Crossref

Markov chain Monte Carlo with Gaussian processes for fast parameter estimation and uncertainty quantification in a 1D fluid‐dynamics model of the pulmonary circulation

Author: Alden K
Bastos LS
Betancourt M
Bowman AW
Costabal FS
Cui T
Dondelinger F
Fielding M
Haario H
Hawkins A
Higdon D
Hoffman M
Kass R
Kennedy M
Krenz G
Lei CL
Macdonald B
McClarren RG
McKay MD
Mihaela Paun L
Mockus J
Nemeth C
Nguyen TV
Pruett W
Qureshi M
Rasmussen C
Schaul T
Snelson E
Taflanidis AA
Vanhatalo J
Vatsa V
Wang J‐X
Wilkinson R
Wu K
Publication venue: Wiley
Publication date: 01/02/2021

The past few decades have witnessed an explosive synergy between physics and the life sciences. In particular, physical modelling in medicine and physiology is a topical research area. The present work focuses on parameter inference and uncertainty quantification in a 1D fluid‐dynamics model for quantitative physiology: the pulmonary blood circulation. The practical challenge is the estimation of the patient‐specific biophysical model parameters, which cannot be measured directly. In principle this can be achieved based on a comparison between measured and predicted data. However, predicting data requires solving a system of partial differential equations (PDEs), which usually have no closed‐form solution, and repeated numerical integrations as part of an adaptive estimation procedure are computationally expensive. In the present article, we demonstrate how fast parameter estimation combined with sound uncertainty quantification can be achieved by a combination of statistical emulation and Markov chain Monte Carlo (MCMC) sampling. We compare a range of state‐of‐the‐art MCMC algorithms and emulation strategies, and assess their performance in terms of their accuracy and computational efficiency. The long‐term goal is to develop a method for reliable disease prognostication in real time, and our work is an important step towards an automatic clinical decision support system.

Crossref

Enlighten

The effects of functional electrical stimulation on interhemispheric cortical asymmetry (Os efeitos da estimulação elétrica funcional na assimetria cortical inter-hemisférica)

Author: Andressa Pitanga S. da Silva
Chiarello C
De Toffol B
Erickson KI
Ferrari M
Fingelkurts AA
Gevins A
Ginter J Jr
Gross J
Heloisa Veiga
Jasper H
Kandel E
Letícia Ecard
Marijose Peçanha Neto
Maurício Cagy
Meinzer M
Miller A
Nelles G
Nudo RJ
Oldfield R
Oliviero A
Pedro Ribeiro
Penolazzi B
Pfurtscheller G
Plautz EJ
Roberto Piedade
Rossini PM
Schallert T
Schaul N
Smit DJ
Tecchio F
Veiga H
Vitenzon AS
Ward NS
Weeks DL
Wiedemann G
Publication venue: FapUNIFESP (SciELO)

Crossref

Improving adaptive honeypot functionality with efficient reinforcement learning parameters for automated malware

Author: Pa YMP
Schaul T
Watson D
Publication venue: Informa UK Limited

Crossref