74 research outputs found

    Learning Topometric Semantic Maps from Occupancy Grids

    Today's mobile robots are expected to operate in complex environments they share with humans. To allow intuitive human-robot collaboration, robots require a human-like understanding of their surroundings in terms of semantically classified instances. In this paper, we propose a new approach for deriving such instance-based semantic maps purely from occupancy grids. We employ a combination of deep learning techniques to detect, segment and extract door hypotheses from a map of arbitrary size. The extraction is followed by a post-processing chain to further increase the accuracy of our approach, as well as place categorization for the three classes room, door and corridor. All detected and classified entities are described as instances specified in a common coordinate system, while a topological map is derived to capture their spatial links. To train the two neural networks used for detection and map segmentation, we contribute a simulator that automatically creates and annotates the required training data. We further provide insight into which features are learned to detect doorways, and how the simulated training data can be augmented to train networks for direct application on real-world grid maps. We evaluate our approach on several publicly available real-world data sets. Even though the networks are trained solely on simulated data, our approach demonstrates high robustness and effectiveness in various real-world indoor environments. Comment: Presented at the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

    Automated generation of geometrically-precise and semantically-informed virtual geographic environments populated with spatially-reasoning agents

    Multi-Agent Geo-Simulation (MAGS) is a paradigm for modeling and simulating dynamic phenomena in a variety of application domains such as transportation, telecommunications and the environment. MAGS is used to study and analyze phenomena that involve a large number of simulated actors (implemented as agents) that evolve and interact with an explicit representation of space called a Virtual Geographic Environment (VGE). To interact with its geographic environment, which may be dynamic, complex and large-scale, an agent must first have a detailed representation of it. Classical VGEs are generally limited to a geometric representation of the real world and leave aside the topological and semantic information that characterizes it. As a consequence, they produce implausible multi-agent simulations on the one hand and, on the other, reduce the spatial reasoning capabilities of situated agents. Path planning is a typical example of the spatial reasoning an agent might need in a MAGS. Classical path planning approaches are limited to computing an obstacle-free path that links two positions in space. These approaches take into account neither the characteristics of the environment (topological and semantic) nor those of the agents (types and capabilities). Situated agents therefore lack the means to acquire the knowledge about the virtual environment that they need to make an informed spatial decision.
To address these limitations, we propose a new approach for automatically generating Informed Virtual Geographic Environments (IVGE) using data provided by Geographic Information Systems (GIS) enriched with semantic information, in order to produce accurate and more realistic MAGS. In addition, we present a hierarchical path planning algorithm that takes advantage of the enriched and optimized description of the IVGE to provide agents with a path that accounts for both the characteristics of their virtual environment and their own types and capabilities. Finally, we propose an approach for managing knowledge about the virtual environment that aims to support informed decision-making and spatial reasoning by situated agents
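
The idea of a path that depends on both the semantics of the environment and the type of agent can be illustrated with a small sketch. This is not the hierarchical algorithm of the thesis, which the abstract does not detail; it is a flat Dijkstra search under stated assumptions. The `COST` table, the terrain labels and the agent types are all hypothetical, chosen only for illustration.

```python
import heapq

# Hypothetical cost multipliers per terrain label for each agent type.
# None marks terrain that the agent type cannot traverse at all.
COST = {
    "pedestrian": {"sidewalk": 1.0, "road": 3.0, "stairs": 1.5},
    "wheelchair": {"sidewalk": 1.0, "road": 3.0, "stairs": None},
}


def plan_path(graph, labels, agent_type, start, goal):
    """Dijkstra search over graph = {node: {neighbor: length}}.

    The cost of moving into a node is the edge length multiplied by the
    agent-specific factor of that node's semantic label.
    Returns (total_cost, path) or None if the goal is unreachable.
    """
    table = COST[agent_type]
    best = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            # Walk the predecessor links back to the start.
            path = [goal]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return cost, path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, length in graph[node].items():
            factor = table.get(labels[nxt])
            if factor is None:
                continue  # this agent type cannot enter the node
            new_cost = cost + length * factor
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    return None


# Two routes from A to D: a short one over stairs (B), a longer one on sidewalk (C).
graph = {
    "A": {"B": 1.0, "C": 2.0},
    "B": {"A": 1.0, "D": 1.0},
    "C": {"A": 2.0, "D": 2.0},
    "D": {"B": 1.0, "C": 2.0},
}
labels = {"A": "sidewalk", "B": "stairs", "C": "sidewalk", "D": "sidewalk"}

print(plan_path(graph, labels, "pedestrian", "A", "D"))  # (2.5, ['A', 'B', 'D'])
print(plan_path(graph, labels, "wheelchair", "A", "D"))  # (4.0, ['A', 'C', 'D'])
```

The same environment yields different paths for the two agent types, which is the behaviour the abstract says purely geometric planners cannot provide.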

    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling

    Cognitive Mapping for Object Searching in Indoor Scenes

    Visual navigation is a multi-disciplinary field spanning computer vision, machine learning and robotics. It is of great significance in both research and industrial applications. An intelligent agent with visual navigation ability is capable of performing the following tasks: actively exploring environments, distinguishing and localizing a requested target, and approaching the target following acquired strategies. Despite a variety of advances in mobile robotics, enabling an autonomous agent with the above-mentioned abilities is still a challenging and complex task. However, a solution to the task is very likely to accelerate the deployment of assistive robots. Reinforcement learning is a method that trains an autonomous robot by rewarding desired behaviors, helping it obtain an action policy that maximizes rewards while the robot interacts with the environment. Through trial and error, an agent learns sophisticated and skillful strategies to handle complex tasks in the environment. When navigating through environments, humans reason extensively about accessible spaces and the geometry of the environment from a first-person view, figure out the destination and then move toward it. Inspired by this procedure, this work develops a model that maps from pixels to actions and inherently estimates the target as well as the free-space map. The model has three major constituents: (i) a cognitive mapper that maps the topological free-space map from first-person view images, (ii) a target recognition network that locates a desired object and (iii) an action policy deep reinforcement learning network. Further, a planner model with a cascade architecture based on multi-scale semantic top-down occupancy map input is proposed. Dissertation/Thesis: Masters Thesis, Computer Engineering, 201

    Behaviour Based Simulated Low-Cost Multi-Robot Exploration

    Institute of Perception, Action and Behaviour. The use of multiple robots for exploration holds the promise of improved performance over single robot systems. To exploit effectively the advantage of having several robots, the robots must be co-ordinated, which requires communication. Previous research relies on a fixed communication network topology, a single lead explorer, and flat communication. This thesis presents a novel architecture to keep a group of robots as a single connected and adaptable communication network to explore and map the environment. This architecture, BERODE (BEhavioural ROle DEcentralized), aims to be robust, efficient and scalable to large numbers of robots. The network is adaptable, the number of explorers variable, and communications hierarchical (local/global). The network is kept connected by an MST (Minimum Spanning Tree) control network, a subnetwork containing only the minimum links necessary to keep the network connected. As the robots explore, the MST control network is updated either partially (local network) or globally to improve signal quality. The local network for a robot is formed by the robots that are within a certain retransmission distance in the MST control network. BERODE implements a hierarchical approach to distributing information to improve scalability with respect to the number of robots. The robots share information at two levels: frequently within their local network and less frequently to the entire robot network. The robots coordinate by assuming behaviours depending on their connections in the MST control network. The behavioural roles balance between the tasks of exploration and network maintenance, where the Explorer role is the most focused on the exploration task. This improves efficiency by allowing a varying number of robots to take the Explorer role depending on circumstances. The roles generate reactive plans that ensure the connectivity of the network. 
These plans are based on the imposition of heterogeneous virtual spring forces. Our simulations show that BERODE is more efficient, scalable and robust with respect to communications than the previous approaches that rely on fixed control networks. BERODE is more efficient because it requires less time to build a complete map of the environment than the fixed control networks. BERODE is more scalable because it keeps the robots as a single connected network for more time than the fixed control networks. BERODE is more robust because it has a better success rate at finishing the exploration
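
The MST control network and the local network described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the BERODE implementation. Euclidean distance between robots stands in for link quality, the "retransmission distance" is interpreted as a hop count in the tree, and the function names `minimum_spanning_tree` and `local_network` are hypothetical.

```python
import math
from itertools import combinations


def minimum_spanning_tree(positions):
    """Kruskal's algorithm over all robot pairs.

    positions: list of (x, y) robot coordinates.
    Returns a list of MST edges (i, j, distance).
    Assumption: Euclidean distance is used as the link cost.
    """
    parent = list(range(len(positions)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(positions[i], positions[j]), i, j)
        for i, j in combinations(range(len(positions)), 2)
    )
    tree = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two separate components
            parent[root_i] = root_j
            tree.append((i, j, dist))
    return tree


def local_network(tree, robot, max_hops):
    """Robots reachable from `robot` within max_hops links of the MST.

    Assumption: "retransmission distance" is counted in hops.
    """
    neighbors = {}
    for i, j, _ in tree:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    seen = {robot}
    frontier = {robot}
    for _ in range(max_hops):
        frontier = {n for f in frontier for n in neighbors.get(f, ())} - seen
        seen |= frontier
    return seen - {robot}


positions = [(0, 0), (1, 0), (2, 0), (10, 0)]
tree = minimum_spanning_tree(positions)
print(tree)                       # [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 8.0)]
print(local_network(tree, 0, 2))  # {1, 2}
```

With four robots the tree has three links, the minimum needed to keep them connected. Robot 3 falls outside the two-hop local network of robot 0 and would only receive the less frequent global updates.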

    Behavior-based Control for Service Robots inspired by Human Motion Patterns : a Robotic Shopping Assistant

    Using human-like motion patterns and a behavior-based approach, a control system for mobile service robots was developed that covers task planning, global and local navigation in dynamic environments, and joint task execution with a user. The behavior network consists of modules with mutually independent tasks. The complex overall behavior of the system arises from the combination of the individual behaviors ('emergence')

    Inductive Pattern Formation

    With the extended computational limits of algorithmic recursion, scientific investigation is transitioning away from computationally decidable problems and beginning to address computationally undecidable complexity. The analysis of deductive inference in structure-property models is yielding to the synthesis of inductive inference in process-structure simulations. Process-structure modeling has examined external order parameters of inductive pattern formation, but investigation of the internal order parameters of self-organization has been hampered by the lack of a mathematical formalism with the ability to quantitatively define a specific configuration of points. This investigation addressed this issue of quantitative synthesis. Local space was developed by the Poincare inflation of a set of points to construct neighborhood intersections, defining topological distance and introducing situated Boolean topology as a local replacement for point-set topology. Parallel development of the local semi-metric topological space, the local semi-metric probability space, and the local metric space of a set of points provides a triangulation of connectivity measures to define the quantitative architectural identity of a configuration and structure-independent axes of a structural configuration space. The recursive sequence of intersections constructs a probabilistic discrete spacetime model of interacting fields to define the internal order parameters of self-organization, with order parameters external to the configuration modeled by adjusting the morphological parameters of individual neighborhoods and the interplay of excitatory and inhibitory point sets. The evolutionary trajectory of a configuration maps the development of specific hierarchical structure that is emergent from a specific set of initial conditions, with nested boundaries signaling the nonlinear properties of local causative configurations. 
This exploration of architectural configuration space concluded with initial process-structure-property models of deductive and inductive inference spaces. In the computationally undecidable problem of human niche construction, an adaptive-inductive pattern formation model with predictive control organized the bipartite recursion between an information structure and its physical expression as hierarchical ensembles of artificial neural network-like structures. The union of architectural identity and bipartite recursion generates a predictive structural model of an evolutionary design process, offering an alternative to the limitations of cognitive descriptive modeling. The low computational complexity of these models enables them to be embedded in physical constructions to create the artificial life forms of a real-time autonomously adaptive human habitat

    Semantic Localization and Mapping in Robot Vision

    Integration of human semantics plays an increasing role in robotics tasks such as mapping, localization and detection. Increased use of semantics serves multiple purposes, including giving computers the ability to process and present data containing human-meaningful concepts, and allowing computers to employ human reasoning to accomplish tasks. This dissertation presents three solutions which incorporate semantics into visual data in order to address these problems. The first addresses the problem of constructing topological maps from a sequence of images. The proposed solution includes a novel image similarity score which uses dynamic programming to match images using both appearance and relative positions of local features simultaneously. An MRF is constructed to model the probability of loop-closures and a locally optimal labeling is found using Loopy-BP. The recovered loop closures are then used to generate a topological map. Results are presented on four urban sequences and one indoor sequence. The second system uses video and annotated maps to solve localization. Data association is achieved through detection of object classes, annotated in prior maps, rather than through detection of visual features. To avoid the caveats of object recognition, a new representation of query images is introduced consisting of a vector of detection scores for each object class. Using soft object detections, hypotheses about pose are refined through particle filtering. Experiments include both small office spaces and a large open urban rail station with semantically ambiguous places. This approach showcases a representation that is both robust and can exploit the plethora of existing prior maps for GPS-denied environments while avoiding the data association problems encountered when matching point clouds or visual features. Finally, a purely vision-based approach is presented for constructing semantic maps given camera pose and simple object exemplar images. 
Object response heatmaps are combined with known pose to back-project detection information onto the world. These update the world model, integrating information over time as the camera moves. The approach avoids making hard decisions on object recognition, and aggregates evidence about objects in the world coordinate system. These solutions simultaneously showcase the contribution of semantics in robotics and provide state-of-the-art solutions to these fundamental problems
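
Aggregating soft detections over time, instead of thresholding each frame, can be illustrated with a small world-grid sketch. The abstract does not state the update rule used in the dissertation, so the log-odds accumulation below is an assumption, a standard choice borrowed from occupancy grid mapping. The `EvidenceGrid` class is hypothetical, and the back-projection from image to grid cell is taken as already done.

```python
import math


class EvidenceGrid:
    """World-frame grid that accumulates soft object evidence per cell.

    Assumption: evidence is fused as a sum of log-odds, treating the
    per-frame detection scores as independent probabilities.
    """

    def __init__(self, width, height):
        # Log-odds 0.0 corresponds to probability 0.5 (no information).
        self.log_odds = [[0.0] * width for _ in range(height)]

    def update(self, observations):
        """observations: iterable of (row, col, score), score in (0, 1).

        Each tuple is a detection score already back-projected to a cell.
        """
        for row, col, score in observations:
            # Clamp to avoid infinite log-odds from scores of exactly 0 or 1.
            score = min(max(score, 1e-6), 1.0 - 1e-6)
            self.log_odds[row][col] += math.log(score / (1.0 - score))

    def probability(self, row, col):
        """Current belief that the object occupies the cell."""
        return 1.0 / (1.0 + math.exp(-self.log_odds[row][col]))


grid = EvidenceGrid(width=3, height=2)
grid.update([(0, 1, 0.7)])               # frame 1: weak detection
grid.update([(0, 1, 0.7), (1, 2, 0.5)])  # frame 2: same cell again, plus a neutral score
print(round(grid.probability(0, 1), 4))  # 0.8448
print(grid.probability(1, 2))            # 0.5
```

Two weak detections of 0.7 in the same cell combine to a belief of about 0.84, while a neutral score of 0.5 leaves the cell unchanged. No single frame has to commit to a hard decision.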

    Spatial representation for planning and executing robot behaviors in complex environments

    Robots are already improving our well-being and productivity in different applications such as industry, health-care and indoor service applications. However, we are still far from developing (and releasing) a fully functional robotic agent that can autonomously survive in tasks that require human-level cognitive capabilities. Robotic systems on the market, in fact, are designed to address specific applications, and can only run pre-defined behaviors to robustly repeat a few tasks (e.g., assembling object parts, vacuum cleaning). Their internal representation of the world is usually constrained to the task they are performing, and does not allow for generalization to other scenarios. Unfortunately, such a paradigm only applies to a very limited set of domains, where the environment can be assumed to be static, and its dynamics can be handled before deployment. Additionally, robots configured in this way will eventually fail if their "handcrafted" representation of the environment does not match the external world. Hence, to enable more sophisticated cognitive skills, we investigate how to design robots to properly represent the environment and behave accordingly. To this end, we formalize a representation of the environment that enhances the robot's spatial knowledge to explicitly include a representation of its own actions. Spatial knowledge constitutes the core of the robot's understanding of the environment; however, it is not sufficient to represent what the robot is capable of doing in it. To overcome such a limitation, we formalize SK4R, a spatial knowledge representation for robots which enhances spatial knowledge with a novel and "functional" point of view that explicitly models robot actions. To this end, we exploit the concept of affordances, introduced to express opportunities (actions) that objects offer to an agent. 
To encode affordances within SK4R, we define the "affordance semantics" of actions, which is used to annotate an environment and to represent to what extent robot actions support goal-oriented behaviors. We demonstrate the benefits of a functional representation of the environment in multiple robotic scenarios that traverse and contribute to different research topics relating to robot knowledge representations, social robotics, multi-robot systems, and robot learning and planning. We show how a domain-specific representation that explicitly encodes affordance semantics provides the robot with a more concrete understanding of the environment and of the effects that its actions have on it. The goal of our work is to design an agent that will no longer execute an action because of mere pre-defined routine; rather, it will execute an action because it "knows" that the resulting state leads one step closer to success in its task