387 research outputs found

    Attention and Anticipation in Fast Visual-Inertial Navigation

    Get PDF
    We study a Visual-Inertial Navigation (VIN) problem in which a robot needs to estimate its state using an on-board camera and an inertial sensor, without any prior knowledge of the external environment. We consider the case in which the robot can allocate limited resources to VIN, due to tight computational constraints. Therefore, we answer the following question: under limited resources, what are the most relevant visual cues to maximize the performance of visual-inertial navigation? Our approach has four key ingredients. First, it is task-driven, in that the selection of the visual cues is guided by a metric quantifying the VIN performance. Second, it exploits the notion of anticipation, since it uses a simplified model for forward-simulation of robot dynamics, predicting the utility of a set of visual cues over a future time horizon. Third, it is efficient and easy to implement, since it leads to a greedy algorithm for the selection of the most relevant visual cues. Fourth, it provides formal performance guarantees: we leverage submodularity to prove that the greedy selection cannot be far from the optimal (combinatorial) selection. Simulations and real experiments on agile drones show that our approach ensures state-of-the-art VIN performance while maintaining a lean processing time. In the easy scenarios, our approach outperforms appearance-based feature selection in terms of localization errors. In the most challenging scenarios, it enables accurate visual-inertial navigation while appearance-based feature selection fails to track robot's motion during aggressive maneuvers.Comment: 20 pages, 7 figures, 2 table

    小型探査ローバにおけるリスクを考慮したロバスト画像処理

    Get PDF
    【学位授与の要件】中央大学学位規則第4条第1項【論文審査委員主査】橋本 秀紀 (中央大学理工学部教授)【論文審査委員副査】國井 康晴(中央大学理工学部教授)、中村 太郎(中央大学理工学部教授)、久保田 孝(宇宙航空研究開発機構教授)博士(工学)中央大

    Natural landmark detection for visually-guided robot navigation

    Get PDF
    The main difficulty to attain fully autonomous robot navigation outdoors is the fast detection of reliable visual references, and their subsequent characterization as landmarks for immediate and unambiguous recognition. Aimed at speed, our strategy has been to track salient regions along image streams by just performing on-line pixel sampling. Persistent regions are considered good candidates for landmarks, which are then characterized by a set of subregions with given color and normalized shape. They are stored in a database for posterior recognition during the navigation process. Some experimental results showing landmark-based navigation of the legged robot Lauron III in an outdoor setting are provided.Peer Reviewe

    Outdoor view recognition based on landmark grouping and logistic regression

    Get PDF
    Vision-based robot localization outdoors has remained more elusive than its indoors counterpart. Drastic illumination changes and the scarceness of suitable landmarks are the main difficulties. This paper attempts to surmount them by deviating from the main trend of using local features. Instead, a global descriptor called landmark-view is defined, which aggregates the most visually-salient landmarks present in each scene. Thus, landmark co-occurrence and spatial and saliency relationships between them are added to the single landmark characterization, based on saliency and color distribution. A suitable framework to compare landmark-views is developed, and it is shown how this remarkably enhances the recognition performance, compared against single landmark recognition. A view-matching model is constructed using logistic regression. Experimentation using 45 views, acquired outdoors, containing 273 landmarks, yielded good recognition results. The overall percentage of correct view classification obtained was 80.6%, indicating the adequacy of the approach.Peer ReviewedPostprint (author’s final draft

    Visual Attention Mechanism for a Social Robot

    Get PDF
    This paper describes a visual perception system for a social robot. The central part of this system is an artificial attention mechanism that discriminates the most relevant information from all the visual information perceived by the robot. It is composed by three stages. At the preattentive stage, the concept of saliency is implemented based on ‘proto-objects’ [37]. From these objects, different saliency maps are generated. Then, the semiattentive stage identifies and tracks significant items according to the tasks to accomplish. This tracking process allows to implement the ‘inhibition of return’. Finally, the attentive stage fixes the field of attention to the most relevant object depending on the behaviours to carry out. Three behaviours have been implemented and tested which allow the robot to detect visual landmarks in an initially unknown environment, and to recognize and capture the upper-body motion of people interested in interact with it

    Is bottom-up attention useful for object recognition?

    Get PDF
    A key problem in learning multiple objects from unlabeled images is that it is a priori impossible to tell which part of the image corresponds to each individual object, and which part is irrelevant clutter which is not associated to the objects. We investigate empirically to what extent pure bottom-up attention can extract useful information about the location, size and shape of objects from images and demonstrate how this information can be utilized to enable unsupervised learning of objects from unlabeled images. Our experiments demonstrate that the proposed approach to using bottom-up attention is indeed useful for a variety of applications

    On the assessment of landmark salience for human navigation

    Get PDF
    In this paper, we propose a conceptual framework for assessing the salience of landmarks for navigation. Landmark salience is derived as a result of the observer's point of view, both physical and cognitive, the surrounding environment, and the objects contained therein. This is in contrast to the currently held view that salience is an inherent property of some spatial feature. Salience, in our approach, is expressed as a three-valued Saliency Vector. The components that determine this vector are Perceptual Salience, which defines the exogenous (or passive) potential of an object or region for acquisition of visual attention, Cognitive Salience, which is an endogenous (or active) mode of orienting attention, triggered by informative cues providing advance information about the target location, and Contextual Salience, which is tightly coupled to modality and task to be performed. This separation between voluntary and involuntary direction of visual attention in dependence of the context allows defining a framework that accounts for the interaction between observer, environment, and landmark. We identify the low-level factors that contribute to each type of salience and suggest a probabilistic approach for their integration. Finally, we discuss the implications, consider restrictions, and explore the scope of the framewor

    Robot Navigation in Human Environments

    Get PDF
    For the near future, we envision service robots that will help us with everyday chores in home, office, and urban environments. These robots need to work in environments that were designed for humans and they have to collaborate with humans to fulfill their tasks. In this thesis, we propose new methods for communicating, transferring knowledge, and collaborating between humans and robots in four different navigation tasks. In the first application, we investigate how automated services for giving wayfinding directions can be improved to better address the needs of the human recipients. We propose a novel method based on inverse reinforcement learning that learns from a corpus of human-written route descriptions what amount and type of information a route description should contain. By imitating the human teachers' description style, our algorithm produces new route descriptions that sound similarly natural and convey similar information content, as we show in a user study. In the second application, we investigate how robots can leverage background information provided by humans for exploring an unknown environment more efficiently. We propose an algorithm for exploiting user-provided information such as sketches or floor plans by combining a global exploration strategy based on the solution of a traveling salesman problem with a local nearest-frontier-first exploration scheme. Our experiments show that the exploration tours are significantly shorter and that our system allows the user to effectively select the areas that the robot should explore. In the second part of this thesis, we focus on humanoid robots in home and office environments. The human-like body plan allows humanoid robots to navigate in environments and operate tools that were designed for humans, making humanoid robots suitable for a wide range of applications. As localization and mapping are prerequisites for all navigation tasks, we first introduce a novel feature descriptor for RGB-D sensor data and integrate this building block into an appearance-based simultaneous localization and mapping system that we adapt and optimize for the usage on humanoid robots. Our optimized system is able to track a real Nao humanoid robot more accurately and more robustly than existing approaches. As the third application, we investigate how humanoid robots can cover known environments efficiently with their camera, for example for inspection or search tasks. We extend an existing next-best-view approach by integrating inverse reachability maps, allowing us to efficiently sample and check collision-free full-body poses. Our approach enables the robot to inspect as much of the environment as possible. In our fourth application, we extend the coverage scenario to environments that also include articulated objects that the robot has to actively manipulate to uncover obstructed regions. We introduce algorithms for navigation subtasks that run highly parallelized on graphics processing units for embedded devices. Together with a novel heuristic for estimating utility maps, our system allows to find high-utility camera poses for efficiently covering environments with articulated objects. All techniques presented in this thesis were implemented in software and thoroughly evaluated in user studies, simulations, and experiments in both artificial and real-world environments. Our approaches advance the state of the art towards universally usable robots in everyday environments.Roboternavigation in menschlichen Umgebungen In naher Zukunft erwarten wir Serviceroboter, die uns im Haushalt, im Büro und in der Stadt alltägliche Arbeiten abnehmen. Diese Roboter müssen in für Menschen gebauten Umgebungen zurechtkommen und sie müssen mit Menschen zusammenarbeiten um ihre Aufgaben zu erledigen. In dieser Arbeit schlagen wir neue Methoden für die Kommunikation, Wissenstransfer und Zusammenarbeit zwischen Menschen und Robotern bei Navigationsaufgaben in vier Anwendungen vor. In der ersten Anwendung untersuchen wir, wie automatisierte Dienste zur Generierung von Wegbeschreibungen verbessert werden können, um die Beschreibungen besser an die Bedürfnisse der Empfänger anzupassen. Wir schlagen eine neue Methode vor, die inverses bestärkendes Lernen nutzt, um aus einem Korpus von von Menschen geschriebenen Wegbeschreibungen zu lernen, wie viel und welche Art von Information eine Wegbeschreibung enthalten sollte. Indem unser Algorithmus den Stil der Wegbeschreibungen der menschlichen Lehrer imitiert, kann der Algorithmus neue Wegbeschreibungen erzeugen, die sich ähnlich natürlich anhören und einen ähnlichen Informationsgehalt vermitteln, was wir in einer Benutzerstudie zeigen. In der zweiten Anwendung untersuchen wir, wie Roboter von Menschen bereitgestellte Hintergrundinformationen nutzen können, um eine bisher unbekannte Umgebung schneller zu erkunden. Wir schlagen einen Algorithmus vor, der Hintergrundinformationen wie Gebäudegrundrisse oder Skizzen nutzt, indem er eine globale Explorationsstrategie basierend auf der Lösung eines Problems des Handlungsreisenden kombiniert mit einer lokalen Explorationsstrategie. Unsere Experimente zeigen, dass die Erkundungstouren signifikant kürzer werden und dass der Benutzer mit unserem System effektiv die zu erkundenden Regionen spezifizieren kann. Der zweite Teil dieser Arbeit konzentriert sich auf humanoide Roboter in Umgebungen zu Hause und im Büro. Der menschenähnliche Körperbau ermöglicht es humanoiden Robotern, in Umgebungen zu navigieren und Werkzeuge zu benutzen, die für Menschen gebaut wurden, wodurch humanoide Roboter für vielfältige Aufgaben einsetzbar sind. Da Lokalisierung und Kartierung Grundvoraussetzungen für alle Navigationsaufgaben sind, führen wir zunächst einen neuen Merkmalsdeskriptor für RGB-D-Sensordaten ein und integrieren diesen Baustein in ein erscheinungsbasiertes simultanes Lokalisierungs- und Kartierungsverfahren, das wir an die Besonderheiten von humanoiden Robotern anpassen und optimieren. Unser System kann die Position eines realen humanoiden Roboters genauer und robuster verfolgen, als es mit existierenden Ansätzen möglich ist. Als dritte Anwendung untersuchen wir, wie humanoide Roboter bekannte Umgebungen effizient mit ihrer Kamera abdecken können, beispielsweise zu Inspektionszwecken oder zum Suchen eines Gegenstands. Wir erweitern ein bestehendes Verfahren, das die nächstbeste Beobachtungsposition berechnet, durch inverse Erreichbarkeitskarten, wodurch wir kollisionsfreie Ganzkörperposen effizient generieren und prüfen können. Unser Ansatz ermöglicht es dem Roboter, so viel wie möglich von der Umgebung zu untersuchen. In unserer vierten Anwendung erweitern wir dieses Szenario um Umgebungen, die auch bewegbare Gegenstände enthalten, die der Roboter aktiv bewegen muss um verdeckte Regionen zu sehen. Wir führen Algorithmen für Teilprobleme ein, die hoch parallelisiert auf Grafikkarten von eingebetteten Systemen ausgeführt werden. Zusammen mit einer neuen Heuristik zur Schätzung von Nutzenkarten ermöglicht dies unserem System Beobachtungspunkte mit hohem Nutzen zu finden, um Umgebungen mit bewegbaren Objekten effizient zu inspizieren. Alle vorgestellten Techniken wurden in Software implementiert und sorgfältig evaluiert in Benutzerstudien, Simulationen und Experimenten in künstlichen und realen Umgebungen. Unsere Verfahren bringen den Stand der Forschung voran in Richtung universell einsetzbarer Roboter in alltäglichen Umgebungen

    Visual attention and swarm cognition for off-road robots

    Get PDF
    Tese de doutoramento, Informática (Engenharia Informática), Universidade de Lisboa, Faculdade de Ciências, 2011Esta tese aborda o problema da modelação de atenção visual no contexto de robôs autónomos todo-o-terreno. O objectivo de utilizar mecanismos de atenção visual é o de focar a percepção nos aspectos do ambiente mais relevantes à tarefa do robô. Esta tese mostra que, na detecção de obstáculos e de trilhos, esta capacidade promove robustez e parcimónia computacional. Estas são características chave para a rapidez e eficiência dos robôs todo-o-terreno. Um dos maiores desafios na modelação de atenção visual advém da necessidade de gerir o compromisso velocidade-precisão na presença de variações de contexto ou de tarefa. Esta tese mostra que este compromisso é resolvido se o processo de atenção visual for modelado como um processo auto-organizado, cuja operação é modulada pelo módulo de selecção de acção, responsável pelo controlo do robô. Ao fechar a malha entre o processo de selecção de acção e o de percepção, o último é capaz de operar apenas onde é necessário, antecipando as acções do robô. Para fornecer atenção visual com propriedades auto-organizadas, este trabalho obtém inspiração da Natureza. Concretamente, os mecanismos responsáveis pela capacidade que as formigas guerreiras têm de procurar alimento de forma auto-organizada, são usados como metáfora na resolução da tarefa de procurar, também de forma auto-organizada, obstáculos e trilhos no campo visual do robô. A solução proposta nesta tese é a de colocar vários focos de atenção encoberta a operar como um enxame, através de interacções baseadas em feromona. Este trabalho representa a primeira realização corporizada de cognição de enxame. Este é um novo campo de investigação que procura descobrir os princípios básicos da cognição, inspeccionando as propriedades auto-organizadas da inteligência colectiva exibida pelos insectos sociais. Logo, esta tese contribui para a robótica como disciplina de engenharia e para a robótica como disciplina de modelação, capaz de suportar o estudo do comportamento adaptável.Esta tese aborda o problema da modelação de atenção visual no contexto de robôs autónomos todo-o-terreno. O objectivo de utilizar mecanismos de atenção visual é o de focar a percepção nos aspectos do ambiente mais relevantes à tarefa do robô. Esta tese mostra que, na detecção de obstáculos e de trilhos, esta capacidade promove robustez e parcimónia computacional. Estas são características chave para a rapidez e eficiência dos robôs todo-o-terreno. Um dos maiores desafios na modelação de atenção visual advém da necessidade de gerir o compromisso velocidade-precisão na presença de variações de contexto ou de tarefa. Esta tese mostra que este compromisso é resolvido se o processo de atenção visual for modelado como um processo auto-organizado, cuja operação é modulada pelo módulo de selecção de acção, responsável pelo controlo do robô. Ao fechar a malha entre o processo de selecção de acção e o de percepção, o último é capaz de operar apenas onde é necessário, antecipando as acções do robô. Para fornecer atenção visual com propriedades auto-organizadas, este trabalho obtém inspi- ração da Natureza. Concretamente, os mecanismos responsáveis pela capacidade que as formi- gas guerreiras têm de procurar alimento de forma auto-organizada, são usados como metáfora na resolução da tarefa de procurar, também de forma auto-organizada, obstáculos e trilhos no campo visual do robô. A solução proposta nesta tese é a de colocar vários focos de atenção encoberta a operar como um enxame, através de interacções baseadas em feromona. Este trabalho representa a primeira realização corporizada de cognição de enxame. Este é um novo campo de investigação que procura descobrir os princípios básicos da cognição, ins- peccionando as propriedades auto-organizadas da inteligência colectiva exibida pelos insectos sociais. Logo, esta tese contribui para a robótica como disciplina de engenharia e para a robótica como disciplina de modelação, capaz de suportar o estudo do comportamento adaptável.Fundação para a Ciência e a Tecnologia (FCT,SFRH/BD/27305/2006); Laboratory of Agent Modelling (LabMag
    corecore