186 research outputs found

    Interactive generation and learning of semantic-driven robot behaviors

    The generation of adaptive and reflexive behavior is a challenging task in artificial intelligence and robotics. In this thesis, we develop a framework for knowledge representation, acquisition, and behavior generation that explicitly incorporates semantics, adaptive reasoning and knowledge revision. By using our model, semantic information can be exploited by traditional planning and decision making frameworks to generate empirically effective and adaptive robot behaviors, as well as to enable complex but natural human-robot interactions. In our work, we introduce a model of semantic mapping, we connect it with the notion of affordances, and we use those concepts to develop semantic-driven algorithms for knowledge acquisition, update, learning and robot behavior generation. In particular, we apply such models within existing planning and decision making frameworks to achieve semantic-driven and adaptive robot behaviors in a generic environment. On the one hand, this work generalizes existing semantic mapping models and extends them to include the notion of affordances. On the other hand, this work integrates semantic information within well-defined long-term planning and situated action frameworks to effectively generate adaptive robot behaviors. We validate our approach by evaluating it on a number of problems and robot tasks. In particular, we consider service robots deployed in interactive and social domains, such as offices and domestic environments. To this end, we also develop prototype applications that are useful for evaluation purposes.

    Action-oriented Scene Understanding

    In order to allow robots to act autonomously, it is crucial that they not only describe their environment accurately but also identify how to interact with their surroundings. While we witnessed tremendous progress in descriptive computer vision, approaches that explicitly target action are scarcer. This cumulative dissertation approaches the goal of interpreting visual scenes “in the wild” with respect to actions implied by the scene. We call this approach action-oriented scene understanding. It involves identifying and judging opportunities for interaction with constituents of the scene (e.g. objects and their parts) as well as understanding object functions and how interactions will impact the future. All of these aspects are addressed on three levels of abstraction: elements, perception and reasoning. On the elementary level, we investigate semantic and functional grouping of objects by analyzing annotated natural image scenes. We compare object label-based and visual context definitions with respect to their suitability for generating meaningful object class representations. Our findings suggest that representations generated from visual context are on par in terms of semantic quality with those generated from large quantities of text. The perceptive level concerns action identification. We propose a system to identify possible interactions for robots and humans with the environment (affordances) on a pixel level using state-of-the-art machine learning methods. Pixel-wise part annotations of images are transformed into 12 affordance maps. Using these maps, a convolutional neural network is trained to densely predict affordance maps from unknown RGB images. In contrast to previous work, this approach operates exclusively on RGB images during both training and testing, and yet achieves state-of-the-art performance. At the reasoning level, we extend the question from asking what actions are possible to what actions are plausible.
For this, we gathered a dataset of household images associated with human ratings of the likelihoods of eight different actions. Based on the judgement provided by the human raters, we train convolutional neural networks to generate plausibility scores from unseen images. Furthermore, having considered only static scenes previously in this thesis, we propose a system that takes video input and predicts plausible future actions. Since this requires careful identification of relevant features in the video sequence, we analyze this particular aspect in detail using a synthetic dataset for several state-of-the-art video models. We identify feature learning as a major obstacle for anticipation in natural video data. The presented projects analyze the role of action in scene understanding from various angles and in multiple settings while highlighting the advantages of assuming an action-oriented perspective. We conclude that action-oriented scene understanding can augment classic computer vision in many real-life applications, in particular robotics.

    The affordance-based concept

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. Includes bibliographical references (p. 89-95). Natural language use relies on situational context. The meaning of words and utterances depends on the physical environment and the goals and plans of communication partners. These facts should be central to theories of language and automatic language understanding systems. Instead, they are often ignored, leading to partial theories and systems that cannot fully interpret linguistic meaning. I introduce a new computational theory of conceptual structure that has as its core claim that concepts are neither internal nor external to the language user, but instead span the objective-subjective boundary. This theory proposes interaction and prediction as a central theme, rather than solely emphasizing deducing, sensing or acting. To capture the possible interactions between subject and object, the theory relies on the notion of perceived affordances: structured units of interaction that can be used for prediction at certain levels of abstraction. By using perceived affordances as a basis for language understanding, the theory accounts for many aspects of the situated nature of human language use. It provides a unified solution to a number of other demands on a theory of language understanding including conceptual combination, prototypicality effects, and the generative nature of lexical items. To support the theory, I describe an implementation that relies on probabilistic hierarchical plan recognition to predict possible interactions. The elements of a recognized plan provide an instance of perceived affordances which are used by a linguistic parser to ground the meaning of words and grammatical constituents.
Evaluations performed in a multiuser role playing game environment show that this implementation captures the meaning of free-form spontaneous directive speech acts that cannot be understood without taking into account the intentional and physical situation of speaker and listener. By Peter John Gorniak. Ph.D.

    Grounding the Linking Competence in Culture and Nature. How Action and Perception Shape the Syntax-Semantics Relationship

    Part I of the book presents my basic assumptions about the syntax-semantics relationship as a competence of language users and compares them with those of the two paradigms that presently account for most theoretical linguistic projects, studies, and publications. I refer to them as Chomskyan Linguistics and Cognitive-Functional Linguistics. I will show that these approaches do not provide the means to accommodate the sociocultural origins of the “linking” competence, creating the need for an alternative approach. While considering these two approaches (sections 2.1 and 2.3), an alternative proposal will be sketched in section 2.2, using the notion of “research programme”. Thus, part I deals mainly with questions of the philosophy of science. Nevertheless, the model underlying the research programme gives structure to the procedure followed throughout the rest of the book, since it identifies the undertaking as multidisciplinary, following from the central roles of perception and action/attribution. This means that approaching the competence of relating form to content as characterized above requires looking into these sub-competences first, since the former draws upon the latter. Part I concludes with the formulation of an action-theoretic vocabulary and taxonomy (section 2.4). This vocabulary serves as the guideline for how to talk about the subject-matter of each of these disciplines. Part II and chapter 3 then deal with the sub-competences that have been identified as underlying linguistic competence. They concern the use of perception, identification/categorization, conceptualization, action, attribution, and the use of linguistic symbols. Section 3.1 in part II deals with perception. 
In particular, two crucial properties of perception will be discussed: that it consists of a bottom-up part and a top-down part, and that the output of perception is underspecified in the sense that what we perceive is not informative with respect to actional, i.e., socially relevant matters. The sections on perception to some degree anticipate the characterization of conceptualization in section 3.2 because the latter will be reconstructed as simulated perception. The property of underspecification is thus sustained in conceptualization, too. If utterances encode concepts and concepts are underspecified with respect to those matters that are most important for everyday interaction, one wonders how verbal interaction can (actually) be successful. Here is where action competence and attribution come into play (the non-conceptual contents referred to above). I will show that native speakers act and cognize according to particular socio-cognitive parameters, on the basis of which they make socially relevant attributions. These in turn specify what was underspecified about concepts beforehand. In other words, actional knowledge including attribution must complement concepts in order to count as the semantics underlying linguistic utterances. Sections 3.3 and 3.4 develop a descriptive means for semantic contents. I present the inherent structural organization of concepts and demonstrate how the spatial and temporal aspects of conceptualization can be systematically related to the syntactic structures underlying utterances. In particular, I will argue that conceptualization is organized by means of trajector-landmark configurations which can quite regularly be related to parts of speech in syntactic constructions using the notion of diagrammatic iconicity. Given a diagrammatic mapping and conceptualization as simulated perception the utterance thus becomes something like an instruction to simulate a perception. 
In part III, section 4.1 deals with the question of what the formal constituents of utterances/constructions contribute to the building of a concept from an utterance. In this context a theory of the German dative is presented, based on the theoretical notions developed throughout this work. Section 4.2 sketches the non-formal properties that reduce the remaining underspecification. In this context one of the most fundamental cognitive properties of language users is uncovered, namely their need to find the cause of any event they are cognizing about. I will then outline the consequences of this property for language production and comprehension. Section 4.3 lists the most important linking schemas for German on the basis of the most important constructions, i.e., motivated conceptualization-syntactic construction mappings, and then describes in a step-by-step manner how – from the utterance-as-instruction-for-conceptualization perspective – such an instruction is obeyed, and how such an instruction is built up from the perception of an event, respectively. The last section, 4.4, is dedicated to a discussion of some of the most famous and most puzzling linguistic phenomena which theoretical linguists traditionally deal with. In discussing the formal aspects of the linguistic competence, examples from German are used.

    Concepts, Frames and Cascades in Semantics, Cognition and Ontology

    This open access book presents novel theoretical, empirical and experimental work exploring the nature of mental representations that support natural language production and understanding, and other manifestations of cognition. One fundamental question raised in the text is whether requisite knowledge structures can be adequately modeled by means of a uniform representational format, and if so, what exactly is its nature. Frames, a key topic covered, have had a strong impact on the exploration of knowledge representations in artificial intelligence, psychology and linguistics; cascades are a novel development in frame theory. Other key subject areas explored are: concepts and categorization, the experimental investigation of mental representation, as well as cognitive analysis in semantics. This book is of interest to students, researchers, and professionals working on cognition in the fields of linguistics, philosophy, and psychology.

    Applying the Free-Energy Principle to Complex Adaptive Systems

    The free energy principle is a mathematical theory of the behaviour of self-organising systems that originally gained prominence as a unified model of the brain. Since then, the theory has been applied to a plethora of biological phenomena, extending from single-celled and multicellular organisms through to niche construction and human culture, and even the emergence of life itself. The free energy principle tells us that perception and action operate synergistically to minimize an organism’s exposure to surprising biological states, which are more likely to lead to decay. A key corollary of this hypothesis is active inference: the idea that all behavior involves the selective sampling of sensory data so that we experience what we expect to (in order to avoid surprises). Simply put, we act upon the world to fulfill our expectations. It is now widely recognized that the implications of the free energy principle for our understanding of the human mind and behavior are far-reaching and profound. To date, however, its capacity to extend beyond our brain, to more generally explain living and other complex adaptive systems, has only just begun to be explored. The aim of this collection is to showcase the breadth of the free energy principle as a unified theory of complex adaptive systems, whether conscious, social, living, or not.