    Behaviour design in microrobots: hierarchical reinforcement learning under resource constraints

    To verify models of collective animal behavior, robots can be programmed to implement a model and interact with real animals in a mixed society. This thesis describes the design of the behavioral hierarchy of a miniature robot that is able to interact with cockroaches and participate in their collective decision making. The robots are controlled by a hierarchical behavior-based controller in which more complex behaviors are built by combining simpler behaviors through fusion and arbitration mechanisms. Experiments in the mixed society confirm the similarity between the collective patterns of the mixed society and those of the real society. Moreover, the robots are able to induce new collective patterns by modulating some behavioral parameters. The difficulty of extracting the behavioral hierarchy manually, and the inability to revise it, lead us to machine learning techniques for devising the composition hierarchy and coordination automatically. We derive a Compact Q-Learning method for micro-robots with processing and memory constraints, and try to learn behavior coordination with it; behavior composition is still done manually. However, the curse of dimensionality makes such flat learning techniques unsuitable. Even though optimizing them can temporarily speed up learning and widen their range of applications, their scalability to real-world applications remains in question. In the next steps, we apply hierarchical learning techniques to automate both behavior coordination and composition. In some situations, many features of the state space may be irrelevant to what the robot is currently learning. Abstracting these features and discovering the hierarchy among them can help the robot learn the behavioral hierarchy faster.
We formalize the automatic state abstraction problem with different heuristics and derive three new splitting criteria that adapt decision tree learning techniques to state abstraction. The performance claims are supported by strong evidence from simulation results in deterministic and non-deterministic environments. The simulation results show encouraging improvements in the required number of learning trials, the robot's performance, the size of the learned abstraction trees, and the computation time of the algorithms. On the other hand, learning in a group provides free sources of knowledge that, if communicated, can broaden the scale of learning both temporally and spatially. We present two approaches to combining the output or the structure of abstraction trees. The trees may be stored in different RL robots in a multi-robot system, or may be learned by the same robot using different methods. Simulation results in a non-deterministic football learning task provide strong evidence of improved convergence rate and policy performance, especially in heterogeneous cooperation.
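As a rough illustration of why memory constraints matter for flat Q-learning on a micro-robot, the sketch below stores a tabular Q-function as signed 8-bit fixed-point values, one byte per state-action pair. It is not the Compact Q-Learning method of the thesis, whose details the abstract does not give. The class name `CompactQTable`, the scale factor `SCALE`, and the default learning parameters are all assumptions made for this example.

```python
from array import array

SCALE = 16  # assumed fixed-point scale: a stored step of 1 equals 1/16 of a Q unit


class CompactQTable:
    """Tabular Q-learning with one signed byte per state-action pair (illustrative)."""

    def __init__(self, n_states, n_actions, alpha=0.25, gamma=0.9):
        self.n_actions = n_actions
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # Type code "b" is a signed 8-bit integer, so the table costs
        # n_states * n_actions bytes.
        self.q = array("b", [0] * (n_states * n_actions))

    def value(self, state, action):
        """Decode the stored byte back into a real-valued Q estimate."""
        return self.q[state * self.n_actions + action] / SCALE

    def best_action(self, state):
        """Greedy action for a state (ties go to the lowest action index)."""
        return max(range(self.n_actions), key=lambda a: self.value(state, a))

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update, then quantize and clamp to one byte."""
        best_next = max(self.value(next_state, a) for a in range(self.n_actions))
        target = reward + self.gamma * best_next
        old = self.value(state, action)
        new = old + self.alpha * (target - old)
        stored = max(-128, min(127, round(new * SCALE)))
        self.q[state * self.n_actions + action] = stored


table = CompactQTable(n_states=2, n_actions=2)
table.update(state=0, action=1, reward=1.0, next_state=1)
print(table.value(0, 1), table.best_action(0))
```

With this encoding the table grows as the product of the number of states and actions, which is the curse of dimensionality the abstract refers to. The quantization also has a cost: an update smaller than half a stored step rounds away, so very small learning rates can stall learning.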

    Scaling-up reinforcement learning using parallelization and symbolic planning

    Towards a Universal Test of Social Intelligence

    From the viewpoint of artificial intelligence, an intelligent agent is an autonomous entity that interacts with an environment through observations and actions, trying to achieve one or more goals with the aid of signals called rewards. The creation of intelligent agents has proliferated in recent decades, and evaluating their intelligence is fundamental to understanding, building and improving them. Social intelligence has recently received special attention in the creation of intelligent agents because of the current view of human intelligence as highly social. Social intelligence in natural and artificial systems is usually measured by evaluating associated traits or tasks that are deemed to represent some facets of social behaviour. The amalgamation of these traits or tasks is then used to configure an operative notion of social intelligence. However, this operative notion does not truly represent what social intelligence is, and a definition following this principle will not be precise. Instead, in this thesis we investigate the evaluation of social intelligence in a more formal and general way, by actually considering the evaluee's interaction with other agents. We analyse the implications of evaluating social intelligence using a test that evaluates general intelligence. For this purpose, we introduce other agents into an initially single-agent environment to identify the issues that appear when evaluating an agent in the context of other agents. From this analysis we obtain useful information for the evaluation of social intelligence. From the lessons learned, we identify the components that should be considered in order to measure social intelligence, and we provide a formal and parametrised definition of social intelligence.
This definition calculates an agent's social intelligence as its expected performance in a set of environments with a set of other agents arranged in teams and participating in line-ups, with rewards being reinterpreted appropriately. This is conceived as a tool for defining social intelligence testbeds in which we can generate several degrees of competitive and cooperative behaviour. We test this definition by experimentally analysing the influence of teams and agent line-ups in several multi-agent systems with variants of Q-learning agents. However, not all testbeds are appropriate for the evaluation of social intelligence. To facilitate the analysis of a social intelligence testbed, we provide some formal property models of social intelligence in order to characterise the testbed and thus assess its suitability. Finally, we use the presented properties to characterise some social games and multi-agent environments, compare them, and discuss their strengths and weaknesses for evaluating social intelligence.
Insa Cabrera, J. (2016). Towards a Universal Test of Social Intelligence [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/66080
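The English abstract describes social intelligence as an expected performance over environments and line-ups of other agents. A minimal sketch of that idea, under stated assumptions, is a weighted average of the evaluee's score over every environment and line-up pair. The function `social_intelligence`, the `run_episode` callback, the weighting functions, and the toy scores are all hypothetical and chosen only for illustration; the formal parametrised definition in the thesis may differ.

```python
def social_intelligence(evaluee, environments, lineups, run_episode,
                        env_weight=lambda env: 1.0,
                        lineup_weight=lambda lineup: 1.0):
    """Weighted mean of the evaluee's score over all (environment, line-up) pairs.

    run_episode(env, evaluee, lineup) is assumed to return the evaluee's
    reward, already reinterpreted for its team, as a single number.
    """
    total, norm = 0.0, 0.0
    for env in environments:
        for lineup in lineups:
            weight = env_weight(env) * lineup_weight(lineup)
            total += weight * run_episode(env, evaluee, lineup)
            norm += weight
    if norm == 0:
        raise ValueError("the weights must not all be zero")
    return total / norm


# Toy data (invented): scores of one evaluee against two line-ups in two environments.
TOY_SCORES = {
    ("foraging", ("teammate",)): 1.0,
    ("foraging", ("opponent",)): 0.0,
    ("pursuit", ("teammate",)): 0.5,
    ("pursuit", ("opponent",)): 0.5,
}


def toy_episode(env, evaluee, lineup):
    """Look up a fixed score instead of simulating a real multi-agent episode."""
    return TOY_SCORES[(env, lineup)]


score = social_intelligence(
    "q_learner",
    ["foraging", "pursuit"],
    [("teammate",), ("opponent",)],
    toy_episode,
)
print(score)
```

Changing the weighting functions changes which line-ups count most, which is one simple way to shift a testbed towards more cooperative or more competitive situations.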