10 research outputs found

    Robot Awareness in Cooperative Mobile Robot Learning

    Most of the straightforward learning approaches in cooperative robotics imply, for each learning robot, a state space that grows exponentially with the number of team members. To remedy the exponentially large state space, we propose to investigate a less demanding cooperation mechanism, i.e., various levels of awareness, instead of communication. We define awareness as the perception of other robots' locations and actions. We recognize four different levels (or degrees) of awareness, which imply different amounts of additional information and therefore have different impacts on the search space size (Θ(0), Θ(1), Θ(N), o(N), where N is the number of robots in the team). There are trivial arguments in favor of not tying the increase of the search space size to the number of team members. We advocate that, by studying the maximum number of neighbor robots in the application context, it is possible to tune the parameters associated with a Θ(1) increase of the search space size and allow good learning performance. We use the cooperative multi-robot observation of multiple moving targets (CMOMMT) application to illustrate our method. We verify that awareness allows cooperation, that cooperation performs better than a purely collective behavior, and that learned cooperation gives better results than learned collective behavior.

    Modeling and Simulation of Elementary Robot Behaviors using Associative Memories

    Today, several drawbacks impede the much-needed use of robot learning techniques in real applications. First, the time needed to synthesize any behavior is prohibitive. Second, the robot behavior during the learning phase is, by definition, bad; it may even be dangerous. Third, except within the lazy learning approach, a new behavior implies a new learning phase. We propose in this paper to use associative memories (self-organizing maps) to encode the non-explicit model of the robot-world interaction sampled by the lazy memory, and then to generate a robot behavior by means of situations to be achieved, i.e., points on the self-organizing maps. Any behavior can be synthesized instantaneously by defining a goal situation. Its performance will be minimal (not necessarily bad) and will improve by the mere repetition of the behavior.

    Reinforcement in Cooperative Games

    National Technical University of Athens. Master's Thesis. Interdisciplinary-Interdepartmental Postgraduate Studies Program (Δ.Π.Μ.Σ.) "Data Science and Machine Learning"

    A Comprehensive Survey of Multiagent Reinforcement Learning


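    Análisis de arquitecturas existentes para robótica colectiva y desarrollo de nuevas soluciones que mejoren las identificadas (Analysis of existing architectures for collective robotics and development of new solutions that improve on those identified)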

    Official Doctoral Program in Information and Communications Technologies. 5032V01. This thesis addresses multi-robot system architectures in general, with no restrictions on the size of the robot collective or on the application domain. It first identifies and documents the attributes that describe multi-robot systems and the architectures that support them. These attributes are used to generate metrics for the relevant aspects of an architecture, which in turn quantify its capabilities. The metrics make up an evaluation system that relies on tests and simulations of aspects such as coordination, adaptation, and interoperability. The results of the evaluation system allow architectures to be compared and the most suitable one selected for the needs of a new multi-robot system, and they allow modifications to an architecture to be evaluated quickly. The thesis also proposes a new architecture that tries to cover all useful possibilities in multi-robot systems by allowing any system component to be reused. This architecture facilitates interoperability with other systems and defines a basic message protocol (a kind of language between the robots) containing the minimal set of operations that the members of the collective must implement in order to carry out any task. Applying the evaluation system to the proposed architecture yields better results in most of the metrics than the other architectures evaluated.

    Collective Machine Learning: Team Learning and Classification in Multi-Agent Systems

    This dissertation focuses on multiple heterogeneous, intelligent agents (hardware or software) that collaborate to learn a task and are capable of sharing knowledge. Collaborative learning in multi-agent and multi-robot systems is largely understudied, and further research is needed to gain a deeper understanding of team learning. This work presents experimental results that illustrate the importance of heterogeneous teams of collaborative learning agents, and it outlines heuristics that govern the successful construction of teams of classifiers. A number of application domains are studied in this dissertation. One approach focuses on the effects of knowledge sharing and collaboration among multiple heterogeneous, intelligent agents (hardware or software) that work together to learn a task. As each agent employs a different machine learning technique, the system consists of multiple knowledge sources and their respective heterogeneous knowledge representations. Collaboration between agents involves sharing knowledge both to speed up team learning and to refine the team's overall performance and group behavior. Experiments have been performed that vary the team composition in terms of machine learning algorithms, the learning strategies employed by the agents, and the sharing frequency for a predator-prey cooperative pursuit task. For lifelong learning, heterogeneous learning teams were more successful than their homogeneous counterparts. Interestingly, sharing increased the learning rate, but sharing with higher frequency showed diminishing results. Lastly, knowledge conflicts are reduced over time as more sharing takes place. These results support further investigation of the merits of heterogeneous learning. 
This dissertation also focuses on discovering heuristics for constructing successful teams of heterogeneous classifiers, including many aspects of team learning and collaboration. In one application, multi-agent machine learning and classifier combination are utilized to learn rock facies sequences from wireline well log data. Gas and oil reservoirs have been the focus of modeling efforts for many years as an attempt to locate zones with high volumes. Certain subsurface layers and layer sequences, such as those containing shale, are known to be impermeable to gas and/or liquid. Oil and natural gas then become trapped by these layers, making it possible to drill wells to reach the supply, and extract for use. The drilling of these wells, however, is costly. Here, the focus is on how to construct a successful set of classifiers, which periodically collaborate, to increase the classification accuracy. Utilizing multiple, heterogeneous collaborative learning agents is shown to be successful for this classification problem. We were able to obtain 84.5% absolute accuracy using the Multi-Agent Collaborative Learning Architecture, an improvement of about 6.5% over the best results achieved by Kansas Geological Survey with the same data set. Several heuristics are presented for constructing teams of multiple collaborative classifiers for predicting rock facies. Another application utilizes multi-agent machine learning and classifier combination to learn water presence using airborne polar radar data acquired from Greenland in 1999 and 2007. Ground and airborne depth-soundings of the Greenland and Antarctic ice sheets have been used for many years to determine characteristics such as ice thickness, subglacial topography, and mass balance of large bodies of ice. Ice coring efforts have supported these radar data to provide ground truth for validation of the state (wet or frozen) of the interface between the bottom of the ice sheet and the underlying bedrock. 
Subglacial state governs the friction, flow speed, transport of material, and overall change of the ice sheet. In this dissertation, we focus on how to construct a successful set of classifiers which periodically collaborate to increase classification accuracy. The underlying method results in radar independence, allowing model transfer from 1999 to 2007 to produce water presence maps of the Greenland ice sheet with differing radars. We were able to obtain 86% accuracy using the Multi-Agent Collaborative Learning Architecture with this data set. Utilizing multiple, heterogeneous collaborative learning agents is shown to be successful for this classification problem as well. Several heuristics, some of which agree with those found in the other applications, are presented for constructing teams of multiple collaborative classifiers for predicting subglacial water presence. General findings from these different experiments suggest that constructing a team of classifiers using a heterogeneous mixture of homogeneous teams is preferred. Larger teams generally perform better, as decisions from multiple learners can be combined to arrive at a consensus decision. Employing heterogeneous learning algorithms integrates different error models to arrive at higher accuracy classification from complementary knowledge bases. Collaboration, although not found to be universally useful, offers certain team configurations an advantage. Collaboration with low to medium frequency was found to be beneficial, while high frequency collaboration was found to be detrimental to team classification accuracy. Full mode learning, where each learner receives the entire training set for the learning phase, consistently outperforms independent mode learning, where the training set is distributed to all learners in a team in a non-overlapping fashion. 
Results presented in this dissertation support the application of multi-agent machine learning and collaboration to current challenging, real-world classification problems.

    Modèles de la rationalité des acteurs sociaux (Models of the rationality of social actors)

    The work presented in this thesis is part of the SocLab project, which proposes a formalization of the sociology of organized action (Crozier and Friedberg). This formalization is based on a meta-model of the structure of social organizations, which makes it possible to describe the structure of a particular organization, to develop an analytical study of its properties and, above all, to compute by simulation the behaviors that the actors of the organization are likely to adopt toward one another. Under this approach, an organization is viewed as a system that, depending on the behavior of the actors toward one another, gives each of them a certain capacity for action to achieve their objectives, without distinguishing those related to their role within the organization from those that are their own. These behaviors are relatively stable, an essential condition for the coordination of the actors in accomplishing, at least partially, what constitutes the raison d'être of the organization, and therefore for the very existence of the organization. These behaviors also prove to be generally cooperative, facilitating the achievement of the objectives of each actor as well as those of the collective as a whole. This thesis focuses on modeling the rationality that leads a social actor to adopt such behavior in the "social game" constituted by a context of organizational interaction. According to the sociology of organized action, this rationality is strategic, guided by the pursuit of self-interest, and it is exercised within the framework of a (very) bounded rationality. The proposed model seeks to be plausible from both the social and the psycho-cognitive points of view, and it fits into the paradigm of reinforcement learning. Insofar as the structure of the organization allows it, the simulations converge toward configurations that can be described as equitable Pareto optima. We also study variants of this algorithm corresponding to rationalities that drive an organization to regulate itself toward configurations that are elitist, protective or egalitarian, or toward a Nash equilibrium.

    Modèles de la rationalité des acteurs sociaux (Models of the rationality of social actors)

    This thesis is part of the SocLab project, which proposes a formalization of the sociology of organized action of Crozier and Friedberg. This formalization is based on a meta-model of the structure of social organizations, and more generally of collective action systems, which makes it possible to describe the structure of a particular organization, to develop an analytical study of its properties and, above all, to compute by simulation the behaviors that the actors of the organization are likely to adopt toward one another. Under this approach, an organization is viewed as a system that, depending on the behaviors of the actors toward one another, gives each of them a certain capacity for action to achieve their objectives, without distinguishing those related to their organizational role from those that are their own. These behaviors are relatively stable, which is essential for the coordination of the actors in accomplishing, at least partially, what constitutes the raison d'être of the organization, and therefore for the very existence of the organization. Although they deviate from what is prescribed, these behaviors also prove to be generally cooperative, which is necessary for the proper functioning of the organization. The characteristics of these behaviors are a phenomenon that emerges from the interactions between the rationalities exercised by the actors in the social game constituted by a context of organizational interaction. According to the sociology of organized action, this rationality is strategic, guided by the pursuit of self-interest, and it is exercised within the framework of a (very) bounded rationality. The model of this rationality proposed in the thesis seeks to be plausible from both the social and the psycho-cognitive points of view, and it fits into the paradigm of reinforcement learning. Insofar as the structure of the organization allows it, the simulations converge toward configurations that can be described as equitable Pareto optima. We also study variants of this algorithm corresponding to rationalities that drive an organization to regulate itself toward a Nash equilibrium or toward well-typed social configurations: social optimum, (anti-)elitist, (anti-)protective or (anti-)egalitarian.