66,212 research outputs found

    Evaluating a reinforcement learning algorithm with a general intelligence test

    Full text link
    In this paper we apply the recent notion of anytime universal intelligence tests to the evaluation of a popular reinforcement learning algorithm, Q-learning. We show that a general approach to the intelligence evaluation of AI algorithms is feasible. This top-down (theory-derived) approach is based on the generation of environments under a Solomonoff universal distribution instead of using a pre-defined set of specific tasks, such as mazes, problem repositories, etc. This first application of a general intelligence test to a reinforcement learning algorithm brings us to the issue of task-specific vs. general AI agents. This, in turn, suggests new avenues for AI agent evaluation and AI competitions, and also conveys some further insights about the performance of specific algorithms. © 2011 Springer-Verlag. We are grateful for the funding from the Spanish MEC and MICINN for projects TIN2009-06078-E/TIN, Consolider-Ingenio CSD2007-00022 and TIN2010-21062-C02, for MEC FPU grant AP2006-02323, and to the Generalitat Valenciana for Prometeo/2008/051.
    Insa Cabrera, J.; Dowe, D.L.; Hernández-Orallo, J. (2011). Evaluating a reinforcement learning algorithm with a general intelligence test. In: Advances in Artificial Intelligence. Springer Verlag (Germany). 7023:1-11. https://doi.org/10.1007/978-3-642-25274-7_1
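    The evaluation protocol the abstract describes scores an agent over environments generated under a Solomonoff universal distribution rather than over hand-picked tasks. As a rough, purely illustrative sketch of that idea (not the paper's actual test), the following Python runs tabular Q-learning on a family of toy random MDPs indexed by a crude "description length" k and aggregates the per-environment average reward with weights 2^-k as a computable stand-in for a universal distribution. The environment generator, the weighting scheme, and all names below are assumptions made for this example.

```python
import random
from collections import defaultdict

def make_random_env(num_states, num_actions, seed):
    """Toy stand-in for a complexity-graded environment generator: a random MDP
    whose transition and reward tables are derived from `seed`. (Purely
    illustrative; the actual test uses a specific environment class.)"""
    rng = random.Random(seed)
    transitions = {(s, a): rng.randrange(num_states)
                   for s in range(num_states) for a in range(num_actions)}
    rewards = {(s, a): rng.choice([-1.0, 0.0, 1.0])
               for s in range(num_states) for a in range(num_actions)}
    return transitions, rewards

def q_learning_avg_reward(env, num_actions, steps=4000,
                          alpha=0.1, gamma=0.9, epsilon=0.1):
    """Run tabular Q-learning for a fixed interaction budget and return the
    average reward per step (the kind of score an anytime test aggregates)."""
    transitions, rewards = env
    Q = defaultdict(float)
    state, total = 0, 0.0
    for _ in range(steps):
        if random.random() < epsilon:
            action = random.randrange(num_actions)
        else:
            action = max(range(num_actions), key=lambda a: Q[(state, a)])
        r = rewards[(state, action)]
        nxt = transitions[(state, action)]
        best_next = max(Q[(nxt, a)] for a in range(num_actions))
        Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])
        total += r
        state = nxt
    return total / steps

# Aggregate score over environments indexed by k, each weighted 2**-k as a
# crude, computable proxy for a Solomonoff-style universal distribution.
score, norm = 0.0, 0.0
for k in range(1, 9):
    env = make_random_env(num_states=k + 2, num_actions=2, seed=k)
    weight = 2.0 ** -k
    score += weight * q_learning_avg_reward(env, num_actions=2)
    norm += weight
print("complexity-weighted Q-learning score:", score / norm)
```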

    Comparing humans and AI agents

    Full text link
    Comparing humans and machines is one important source of information about both machine and human strengths and limitations. Most of these comparisons and competitions are performed on rather specific tasks such as calculus, speech recognition, translation, games, etc. The information conveyed by these experiments is limited, since it only shows that machines are much better than humans in some domains and worse in others; indeed, CAPTCHAs exploit this difference. However, there have only been a few proposals of general intelligence tests in the last two decades, and, to our knowledge, just a couple of implementations and evaluations. In this paper, we implement one of the most recent test proposals, devise an interface for humans and use it to compare the intelligence of humans and Q-learning, a popular reinforcement learning algorithm. The results are highly informative in many ways, raising many questions on the use of a (universal) distribution of environments, on the role of measuring knowledge acquisition, and on other issues such as speed, duration of the test, scalability, etc. We thank the anonymous reviewers for their helpful comments. We also thank José Antonio Martín H. for helping us with several issues about the RL competition, RL-Glue and reinforcement learning in general. We are also grateful to all the subjects who took the test. We also thank the funding from the Spanish MEC and MICINN for projects TIN2009-06078-E/TIN, Consolider-Ingenio CSD2007-00022 and TIN2010-21062-C02, for MEC FPU grant AP2006-02323, and the Generalitat Valenciana for Prometeo/2008/051.
    Insa Cabrera, J.; Dowe, D.L.; España-Cubillo, S.; Hernández-Lloreda, M.V.; Hernández-Orallo, J. (2011). Comparing humans and AI agents. In: Artificial General Intelligence. Springer Verlag (Germany). 6830:122-132. https://doi.org/10.1007/978-3-642-22887-2_13
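    The "(universal) distribution of environments" the abstract refers to is usually formalised, in the line of work this paper builds on, as Legg and Hutter's universal intelligence measure: an agent's score is its expected value across environments weighted by their Kolmogorov complexity. The standard definition is reproduced below for orientation; since K is uncomputable, practical tests such as the one implemented here replace it with a computable surrogate (e.g. a Levin-style Kt complexity) over a restricted environment class.

```latex
% Legg-Hutter universal intelligence of an agent \pi (standard formulation):
% expected value across all environments, weighted by their complexity.
\[
  \Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
\]
% E: a class of computable, reward-bounded environments;
% K(\mu): Kolmogorov complexity of environment \mu;
% V^{\pi}_{\mu}: expected total reward of agent \pi interacting with \mu.
```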

    Data complexity in machine learning

    Get PDF
    We investigate the role of data complexity in the context of binary classification problems. The universal data complexity is defined for a data set as the Kolmogorov complexity of the mapping enforced by the data set. It is closely related to several existing principles used in machine learning, such as Occam's razor, the minimum description length, and the Bayesian approach. The data complexity can also be defined based on a learning model, which is more realistic for applications. We demonstrate the application of data complexity in two learning problems, data decomposition and data pruning. In data decomposition, we illustrate that a data set is best approximated by its principal subsets, which are Pareto optimal with respect to the complexity and the set size. In data pruning, we show that outliers usually have high complexity contributions, and propose methods for estimating the complexity contribution. Since in practice we have to approximate the ideal data complexity measures, we also discuss the impact of such approximations.
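    Kolmogorov complexity is uncomputable, so any use of the data complexity idea goes through an approximation. The sketch below is a deliberately crude "model plus exceptions" proxy: a leave-one-out 1-nearest-neighbour rule plays the role of the learning model, its misclassified points are coded as exceptions, and the complexity contribution of a point is the increase in that cost when the point is included. This is an illustration of the outlier intuition in the abstract, not the thesis's actual measures; the function names, the per-exception cost, and the toy data are assumptions.

```python
import math

def loo_1nn_errors(points, labels):
    """Leave-one-out 1-NN errors: indices the nearest-neighbour rule gets wrong.
    Points mislabelled relative to their neighbourhood behave as 'exceptions'."""
    errors = []
    for i, (x, y) in enumerate(zip(points, labels)):
        nearest = min((j for j in range(len(points)) if j != i),
                      key=lambda j: sum((a - b) ** 2 for a, b in zip(x, points[j])))
        if labels[nearest] != y:
            errors.append(i)
    return errors

def data_complexity(points, labels):
    """Crude 'model + exceptions' proxy for the complexity of a binary data set:
    the cost, in bits, of listing which examples the 1-NN rule fails on."""
    n = len(points)
    if n < 2:
        return 0.0
    bits_per_exception = math.log2(n) + 1  # index of the exception + its flipped label
    return len(loo_1nn_errors(points, labels)) * bits_per_exception

def complexity_contribution(points, labels, i):
    """Increase in the proxy complexity caused by including example i;
    an outlier should show a comparatively large contribution."""
    rest_p = points[:i] + points[i + 1:]
    rest_y = labels[:i] + labels[i + 1:]
    return data_complexity(points, labels) - data_complexity(rest_p, rest_y)

# Two well-separated clusters plus one mislabelled point acting as an outlier.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9), (0.15, 0.15)]
labels = [0, 0, 0, 1, 1, 1]  # the last point sits inside cluster 0 but is labelled 1
for i in range(len(points)):
    print(i, round(complexity_contribution(points, labels, i), 2))
```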

    Emerging Consciousness as a Result of Complex-Dynamical Interaction Process

    Get PDF
    A quite general interaction process within a multi-component system is analysed by the extended effective potential method, liberated from the usual limitations of perturbation theory or integrable models. The obtained causally complete solution of the many-body problem reveals the phenomenon of dynamic multivaluedness, or redundance, of emerging, incompatible system realisations and the dynamic entanglement of system components within each realisation. The ensuing concept of dynamic complexity (and related intrinsic chaoticity) is absolutely universal and can be applied to the problem of consciousness, which emerges now as a high enough, properly specified level of unreduced complexity of a suitable interaction process. This complexity level can be identified with the appearance of bound, permanently localised states in the multivalued brain dynamics from strongly chaotic states of unconscious intelligence, by analogy with classical behaviour emergence from quantum states at much lower levels of world dynamics. We show that the main properties of this dynamically emerging consciousness (and intelligence, at the preceding complexity level) correspond to empirically derived properties of their natural versions, and obtain causally substantiated conclusions about their artificial realisation, including the fundamentally justified paradigm of genuine machine consciousness. This rigorously defined machine consciousness is different from both natural consciousness and any mechanistic, dynamically single-valued imitation of the latter. We then use the same, truly universal concept of complexity to derive equally rigorous conclusions about the mental and social implications of the machine consciousness paradigm, demonstrating its indispensable role in the next stage of civilisation development.