460 research outputs found

    Action-oriented Scene Understanding

    In order to allow robots to act autonomously, it is crucial that they not only describe their environment accurately but also identify how to interact with their surroundings. While we have witnessed tremendous progress in descriptive computer vision, approaches that explicitly target action are scarcer. This cumulative dissertation approaches the goal of interpreting visual scenes “in the wild” with respect to the actions implied by the scene. We call this approach action-oriented scene understanding. It involves identifying and judging opportunities for interaction with constituents of the scene (e.g. objects and their parts) as well as understanding object functions and how interactions will impact the future. All of these aspects are addressed on three levels of abstraction: elements, perception and reasoning. On the elementary level, we investigate semantic and functional grouping of objects by analyzing annotated natural image scenes. We compare object label-based and visual context definitions with respect to their suitability for generating meaningful object class representations. Our findings suggest that representations generated from visual context are on par in terms of semantic quality with those generated from large quantities of text. The perceptive level concerns action identification. We propose a system that identifies possible interactions (affordances) for robots and humans with the environment on a pixel level using state-of-the-art machine learning methods. Pixel-wise part annotations of images are transformed into 12 affordance maps. Using these maps, a convolutional neural network is trained to densely predict affordance maps from unknown RGB images. In contrast to previous work, this approach operates exclusively on RGB images during both training and testing, and yet achieves state-of-the-art performance. At the reasoning level, we extend the question from asking what actions are possible to what actions are plausible. For this, we gathered a dataset of household images associated with human ratings of the likelihoods of eight different actions. Based on the judgements provided by the human raters, we train convolutional neural networks to generate plausibility scores for unseen images. Furthermore, having considered only static scenes up to this point, we propose a system that takes video input and predicts plausible future actions. Since this requires careful identification of relevant features in the video sequence, we analyze this particular aspect in detail on a synthetic dataset for several state-of-the-art video models. We identify feature learning as a major obstacle for anticipation in natural video data. The presented projects analyze the role of action in scene understanding from various angles and in multiple settings while highlighting the advantages of adopting an action-oriented perspective. We conclude that action-oriented scene understanding can augment classic computer vision in many real-life applications, in particular robotics.
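    To make the pixel-level affordance prediction concrete, the following is a minimal sketch of the kind of fully convolutional setup the abstract describes; the layer sizes, the use of PyTorch, and the treatment of the 12 affordance maps as sigmoid output channels are illustrative assumptions, not the dissertation's actual architecture.

```python
# Minimal sketch (illustration only): a fully convolutional network that maps
# an RGB image to 12 per-pixel affordance maps of the same spatial size.
import torch
import torch.nn as nn

NUM_AFFORDANCES = 12  # one output channel per affordance class (assumption)

class AffordanceNet(nn.Module):
    def __init__(self, num_affordances: int = NUM_AFFORDANCES):
        super().__init__()
        # Downsample twice, then upsample back to the input resolution.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, num_affordances, kernel_size=2, stride=2),
        )

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        # rgb: (batch, 3, H, W) -> (batch, NUM_AFFORDANCES, H, W) logits
        return self.decoder(self.encoder(rgb))

model = AffordanceNet()
# A pixel may afford several interactions at once, so per-channel sigmoids
# with binary cross-entropy are a natural training objective here.
criterion = nn.BCEWithLogitsLoss()
logits = model(torch.randn(1, 3, 128, 128))
loss = criterion(logits, torch.zeros(1, NUM_AFFORDANCES, 128, 128))
```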

    A review and comparison of ontology-based approaches to robot autonomy

    Within the next decades, robots will need to be able to execute a large variety of tasks autonomously in a large variety of environments. To reduce the resulting programming effort, a knowledge-enabled approach to robot programming can be adopted that organizes information into re-usable knowledge pieces. However, for ease of reuse, there needs to be an agreement on the meaning of terms. A common approach is to represent these terms using ontology languages that conceptualize the respective domain. In this work, we review projects that use ontologies to support robot autonomy. We systematically search for projects that fulfill a set of inclusion criteria and compare them with each other with respect to the scope of their ontology, the types of cognitive capabilities supported by the use of ontologies, and their application domain.
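    As a minimal illustration of what representing shared terms in an ontology language looks like, the sketch below builds a tiny RDFS fragment with the rdflib Python library; the namespace, class names and property are hypothetical and are not taken from any of the reviewed projects.

```python
# Toy domain conceptualization (hypothetical, for illustration only).
from rdflib import Graph, Namespace, Literal, RDF, RDFS

KB = Namespace("http://example.org/robot-kb#")  # hypothetical namespace
g = Graph()
g.bind("kb", KB)

# A gripper is a kind of manipulation device, and grasping is an action it enables.
g.add((KB.ManipulationDevice, RDF.type, RDFS.Class))
g.add((KB.Gripper, RDFS.subClassOf, KB.ManipulationDevice))
g.add((KB.Grasping, RDF.type, RDFS.Class))
g.add((KB.enablesAction, RDF.type, RDF.Property))
g.add((KB.Gripper, KB.enablesAction, KB.Grasping))
g.add((KB.Gripper, RDFS.comment, Literal("End effector used to grasp objects")))

print(g.serialize(format="turtle"))
```

    Agreeing on such shared vocabularies is what allows knowledge pieces authored for one robot system to be reused by another.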

    Merging multi-modal information and cross-modal learning in artificial cognitive systems

    Cross-modal binding is the ability to merge two or more modal representations of the same entity into a single shared representation. This ability is one of the fundamental properties of any cognitive system operating in a complex environment. In order to adapt successfully to changes in a dynamic environment, the binding mechanism has to be supplemented with cross-modal learning. Perhaps the most difficult task, however, is the integration of both mechanisms into a cognitive system. Their role in such a system is two-fold: to bridge the semantic gap between modalities, and to mediate between the lower-level mechanisms for processing sensory data and the higher-level cognitive processes, such as motivation and planning. In this master's thesis, we present an approach to probabilistic merging of multi-modal information in cognitive systems. Using this approach, we formulate a model of binding and cross-modal learning in Markov logic networks and describe the principles of its integration into a cognitive architecture. We implement a prototype of the model and evaluate it with off-line experiments that simulate a cognitive architecture with three modalities. Based on our approach, we design, implement and integrate the belief layer, a subsystem that bridges the semantic gap in a prototype cognitive system named George. George is an intelligent robot that is able to detect and recognise objects in its surroundings and to learn about their properties in a situated dialogue with a human tutor. Its main purpose is to validate various paradigms of interactive learning. To this end, we have developed and performed on-line experiments that evaluate the mechanisms of the robot's behaviour. With these experiments, we were also able to test and evaluate our approach to merging multi-modal information as part of a functional cognitive system.
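    The following toy sketch illustrates the general Markov-logic idea behind such a binding model: candidate bindings between modal percepts are scored in proportion to the exponentiated sum of the weights of the satisfied formulas. The percepts, formulas and weights are invented for illustration and do not reproduce the thesis's actual model.

```python
# Toy Markov-logic-style scoring of cross-modal binding hypotheses
# (illustration only). P(binding) is proportional to exp(sum of weights
# of the weighted formulas the binding satisfies).
import math
from itertools import product

visual_percepts = [{"id": "v1", "colour": "red"}, {"id": "v2", "colour": "blue"}]
verbal_percepts = [{"id": "u1", "colour": "red"}]

formulas = [
    (2.0, lambda v, u: v["colour"] == u["colour"]),   # matching colour supports binding
    (-1.5, lambda v, u: v["colour"] != u["colour"]),  # a mismatch penalizes it
]

def score(v, u):
    return math.exp(sum(w for w, f in formulas if f(v, u)))

candidates = list(product(visual_percepts, verbal_percepts))
z = sum(score(v, u) for v, u in candidates)  # normalizing constant
for v, u in candidates:
    print(f"bind({v['id']}, {u['id']}): P = {score(v, u) / z:.2f}")
```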

    Artificial Cognition for Social Human-Robot Interaction: An Implementation

    Human–Robot Interaction challenges Artificial Intelligence in many regards: dynamic, partially unknown environments that were not originally designed for robots; a broad variety of situations with rich semantics to understand and interpret; physical interactions with humans that require fine, low-latency yet socially acceptable control strategies; natural and multi-modal communication, which mandates common-sense knowledge and the representation of possibly divergent mental models. This article is an attempt to characterise these challenges and to exhibit a set of key decisional issues that need to be addressed for a cognitive robot to successfully share space and tasks with a human. We first identify the needed individual and collaborative cognitive skills: geometric reasoning and situation assessment based on perspective-taking and affordance analysis; acquisition and representation of knowledge models for multiple agents (humans and robots, with their specificities); situated, natural and multi-modal dialogue; human-aware task planning; and human–robot joint task achievement. The article discusses each of these abilities, presents working implementations, and shows how they combine in a coherent and original deliberative architecture for human–robot interaction. Supported by experimental results, we finally show how explicit knowledge management, both symbolic and geometric, proves to be instrumental to richer and more natural human–robot interactions by pushing for pervasive, human-level semantics within the robot's deliberative system.
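    As a small illustration of keeping knowledge models for multiple agents (one ingredient of the architecture described above), the sketch below stores symbolic facts per agent so that divergent mental models can be detected; the data structures and fact names are hypothetical and not the article's actual knowledge base.

```python
# Illustration only: each agent, human or robot, gets its own belief store,
# so the system can represent divergent mental models and reason from
# another agent's perspective.
from dataclasses import dataclass, field

@dataclass
class AgentModel:
    name: str
    beliefs: set = field(default_factory=set)  # set of (subject, predicate, object) triples

    def assert_fact(self, subj: str, pred: str, obj: str) -> None:
        self.beliefs.add((subj, pred, obj))

    def believes(self, subj: str, pred: str, obj: str) -> bool:
        return (subj, pred, obj) in self.beliefs

robot = AgentModel("robot")
human = AgentModel("human")

# The robot saw the mug being moved; the human did not.
robot.assert_fact("mug", "isOn", "shelf")
human.assert_fact("mug", "isOn", "table")

# Divergent beliefs can trigger clarification dialogue or re-planning.
if robot.believes("mug", "isOn", "shelf") and not human.believes("mug", "isOn", "shelf"):
    print("Mental models diverge: inform the human where the mug is.")
```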