
    Towards Social Identity in Socio-Cognitive Agents

    Current architectures for social agents are designed around specific units of social behaviour that address particular challenges. Although their performance might be adequate for controlled environments, deploying these agents in the wild is difficult. Moreover, the increasing demand for autonomous agents capable of living alongside humans calls for the design of more robust social agents that can cope with diverse social situations. We believe that to design such agents, their sociality and cognition should be conceived as one. This includes creating mechanisms for constructing social reality as an interpretation of the physical world with social meanings, and for the selective deployment of cognitive resources adequate to the situation. We identify several design principles that should be considered when designing agent architectures for socio-cognitive systems. Taking these remarks into account, we propose a socio-cognitive agent model based on the concept of Cognitive Social Frames, which allow an agent's cognition to adapt based on its interpretation of its surroundings, its Social Context. Our approach supports an agent's reasoning about other social actors and its relationship with them. Cognitive Social Frames can be built around social groups, and they form the basis for social group dynamics mechanisms and the construction of Social Identity.
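    The frame-selection mechanism lends itself to a compact illustration. Below is a minimal sketch, assuming a hypothetical salience measure over perceived cues; all names (CognitiveSocialFrame, SocioCognitiveAgent, the 0.5 activation threshold) are illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class CognitiveSocialFrame:
    """Hypothetical frame: a social situation the agent can recognize."""
    name: str
    cues: set[str]                                   # cues that activate the frame
    norms: list[str] = field(default_factory=list)   # behaviours adequate in the frame

    def salience(self, context: set[str]) -> float:
        """Fraction of this frame's cues present in the perceived Social Context."""
        if not self.cues:
            return 0.0
        return len(self.cues & context) / len(self.cues)

class SocioCognitiveAgent:
    def __init__(self, frames: list[CognitiveSocialFrame]):
        self.frames = frames

    def interpret(self, context: set[str]) -> CognitiveSocialFrame | None:
        """Adopt the most salient frame for the current Social Context."""
        best = max(self.frames, key=lambda f: f.salience(context), default=None)
        return best if best and best.salience(context) > 0.5 else None

# Example: the same agent adapts its cognition in a classroom vs. a party.
agent = SocioCognitiveAgent([
    CognitiveSocialFrame("classroom", {"teacher", "desks", "students"}, ["be quiet"]),
    CognitiveSocialFrame("party", {"music", "friends", "food"}, ["socialize"]),
])
frame = agent.interpret({"teacher", "desks", "whiteboard", "students"})
print(frame.name if frame else "no frame adopted")  # -> classroom
```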

    Proceedings of DELbA 2020 - Workshop on Designing and Facilitating Educational Location-based Applications co-located with the Fifteenth European Conference on Technology Enhanced Learning (EC-TEL 2020)

    There are several implicit benefits to formal school education. One of the most important is the learning of social skills, i.e., how to behave, interact meaningfully, and form social bonds with other people. However, there are multiple situations in which the learning of social skills can be disrupted, e.g., bullying, or the recent COVID-19 pandemic, which forced schools to transition to distance education. In this work, we investigate the potential of pervasive games (PGs) to teach social skills and help players acquire social capital. Using the affordance lens and Pierre Bourdieu's theory of capital as theoretical viewpoints, we argue that PGs are able to create meaningful activities that not only help players learn social skills but can also scaffold social bonding and increase social capital. We identify six social affordances in the PG Pokémon GO and show that the game teaches a wide variety of social skills, ranging from negotiation and bartering to group interaction. Our findings have implications for the design of educational pervasive games that teach social skills and accrue social capital. Keywords: pervasive games, implicit learning, social capital, social capital theory, social skills.

    COSMO: Contextualized Scene Modeling with Boltzmann Machines

    Scene modeling is crucial for robots that need to perceive, reason about, and manipulate the objects in their environments. In this paper, we adapt and extend Boltzmann Machines (BMs) for contextualized scene modeling. Although there are many models on the subject, ours is the first to bring together objects, relations, and affordances in a highly capable generative model. To this end, we introduce a hybrid version of BMs in which relations and affordances are introduced into the model with shared, tri-way connections. Moreover, we contribute a dataset for relation estimation and modeling studies. We evaluate our method against several baselines on object estimation, out-of-context object detection, relation estimation, and affordance estimation tasks. Moreover, to illustrate the generative capability of the model, we show several example scenes that the model is able to generate. Comment: 40 pages, 15 figures, 9 tables; accepted to the Robotics and Autonomous Systems (RAS) special issue on Semantic Policy and Action Representations for Autonomous Robots (SPAR).
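    To make the tri-way coupling concrete, here is a minimal sketch of a BM-style energy function that adds a shared three-way object/relation/affordance term to the usual pairwise visible-hidden terms; the shapes, names, and exact form of the energy are assumptions for illustration, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obj, n_rel, n_aff, n_hid = 5, 4, 3, 8

W_o = rng.normal(scale=0.1, size=(n_obj, n_hid))       # object-hidden weights
W_r = rng.normal(scale=0.1, size=(n_rel, n_hid))       # relation-hidden weights
W_a = rng.normal(scale=0.1, size=(n_aff, n_hid))       # affordance-hidden weights
T = rng.normal(scale=0.1, size=(n_obj, n_rel, n_aff))  # shared tri-way tensor

def energy(o, r, a, h):
    """E = -(pairwise visible-hidden terms) - (tri-way obj/rel/aff term)."""
    pairwise = o @ W_o @ h + r @ W_r @ h + a @ W_a @ h
    triway = np.einsum("i,j,k,ijk->", o, r, a, T)  # sum_{ijk} o_i r_j a_k T_ijk
    return -(pairwise + triway)

# Binary unit states; low-energy configurations are the scenes the model favours.
o = rng.integers(0, 2, n_obj)
r = rng.integers(0, 2, n_rel)
a = rng.integers(0, 2, n_aff)
h = rng.integers(0, 2, n_hid)
print(energy(o, r, a, h))
```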

    Action-oriented Scene Understanding

    In order to act autonomously, robots must not only describe their environment accurately but also identify how to interact with their surroundings. While we have witnessed tremendous progress in descriptive computer vision, approaches that explicitly target action remain scarce. This cumulative dissertation approaches the goal of interpreting visual scenes “in the wild” with respect to the actions implied by the scene. We call this approach action-oriented scene understanding. It involves identifying and judging opportunities for interaction with constituents of the scene (e.g. objects and their parts) as well as understanding object functions and how interactions will impact the future. All of these aspects are addressed on three levels of abstraction: elements, perception and reasoning. On the elementary level, we investigate semantic and functional grouping of objects by analyzing annotated natural image scenes. We compare object label-based and visual context definitions with respect to their suitability for generating meaningful object class representations. Our findings suggest that representations generated from visual context are on par in terms of semantic quality with those generated from large quantities of text. The perceptive level concerns action identification. We propose a system to identify possible interactions for robots and humans with the environment (affordances) on a pixel level using state-of-the-art machine learning methods. Pixel-wise part annotations of images are transformed into 12 affordance maps. Using these maps, a convolutional neural network is trained to densely predict affordance maps from unknown RGB images. In contrast to previous work, this approach operates exclusively on RGB images during both training and testing, and yet achieves state-of-the-art performance. At the reasoning level, we extend the question from asking what actions are possible to what actions are plausible. For this, we gathered a dataset of household images associated with human ratings of the likelihoods of eight different actions. Based on the judgement provided by the human raters, we train convolutional neural networks to generate plausibility scores from unseen images. Furthermore, having considered only static scenes previously in this thesis, we propose a system that takes video input and predicts plausible future actions. Since this requires careful identification of relevant features in the video sequence, we analyze this particular aspect in detail using a synthetic dataset for several state-of-the-art video models. We identify feature learning as a major obstacle for anticipation in natural video data. The presented projects analyze the role of action in scene understanding from various angles and in multiple settings while highlighting the advantages of assuming an action-oriented perspective. We conclude that action-oriented scene understanding can augment classic computer vision in many real-life applications, in particular robotics.
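    The dense affordance-prediction step described above can be sketched as a small fully convolutional encoder-decoder that maps an RGB image to 12 per-pixel affordance maps; the architecture below is an illustrative assumption, not the dissertation's exact model.

```python
import torch
import torch.nn as nn

N_AFFORDANCES = 12  # one output map per affordance class

class AffordanceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),   # H -> H/2
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # H/2 -> H/4
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),    # H/4 -> H/2
            nn.ConvTranspose2d(32, N_AFFORDANCES, 2, stride=2),    # H/2 -> H
        )

    def forward(self, rgb):
        # rgb: (B, 3, H, W) -> logits: (B, 12, H, W), one dense map per affordance
        return self.decoder(self.encoder(rgb))

model = AffordanceNet()
x = torch.randn(1, 3, 64, 64)             # a dummy RGB image
logits = model(x)
# Sigmoid rather than softmax: a pixel can afford several actions at once.
probs = torch.sigmoid(logits)
print(probs.shape)  # torch.Size([1, 12, 64, 64])
```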