151 research outputs found

    Robot orientation with histograms on MSL

    One of the most important tasks in robot soccer is localization: the robots must self-localize on the 18 x 12 meter soccer field. In recent years the field has been enlarged and the corner posts removed, which increased the complexity of the localization task. One important aspect of proper localization is determining the robot's orientation. This paper proposes a new technique for calculating it, based on a histogram of white-green transitions (used to detect the lines on the field). The technique requires little computational time and proves to be very reliable.
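
    As a rough illustration of the histogram idea (a minimal sketch with hypothetical names and bin width, not the paper's exact construction): because the field lines form a rectangular grid, the angles of detected white-green transitions cluster at the robot's heading modulo 90 degrees, so the dominant histogram bin yields the orientation offset.

```python
import numpy as np

def estimate_orientation(transition_angles_deg, bin_width=1.0):
    """Estimate the robot's orientation offset from the angles of
    detected white-green line transitions (hypothetical sketch)."""
    # Fold all angles into [0, 90): lines along either field axis
    # collapse onto the same peak.
    folded = np.mod(transition_angles_deg, 90.0)
    bins = np.arange(0.0, 90.0 + bin_width, bin_width)
    hist, edges = np.histogram(folded, bins=bins)
    # The dominant bin gives the heading offset w.r.t. the line grid.
    peak = np.argmax(hist)
    return 0.5 * (edges[peak] + edges[peak + 1])

# Example: noisy transitions from lines at 0/90 degrees, seen by a
# robot rotated 25 degrees.
rng = np.random.default_rng(0)
angles = np.concatenate([rng.normal(25.0, 2.0, 200),
                         rng.normal(115.0, 2.0, 200)])
print(f"estimated offset: {estimate_orientation(angles):.1f} deg")
```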

    Fast computational processing for mobile robots' self-localization

    This paper presents a different approach, developed by MINHO team researchers, to the self-localization problem in a RoboCup Middle Size League game. The method uses the white field markings as key points and computes the position with least error: the detected (line) points are compared with a precomputed set of values for each candidate position, producing an error surface whose minimum corresponds to the real position. This approach allows very fast local and global localization, so global localization can be run more often, driving the estimate toward its true value. Unlike the majority of other teams in this league, the goal was to devise a new and improved method for the traditionally slow self-localization problem. This work was developed at the Laboratório de Automação e Robótica by the MINHO team's research and development group, at University of Minho, under the supervision of Professor A. Fernando Ribeiro and A. Gil Lopes. The knowledge exchange within the RoboCup MSL community contributed greatly to this work, which has been supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2013.
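
    A hypothetical sketch of the error-minimum idea follows: observed line points are projected into the field frame at each candidate pose and scored against a precomputed distance map, and the pose with the least summed error wins. The names, toy field model, and grid resolution are illustrative, not the team's code.

```python
import numpy as np

# Toy field model: a single line along x = 9 m on an 18 x 12 m field,
# rasterized at 0.1 m per cell into a precomputed distance map.
res = 0.1
xs = np.arange(0.0, 18.0, res)
ys = np.arange(0.0, 12.0, res)
X, Y = np.meshgrid(xs, ys)
dist_map = np.abs(X - 9.0)   # distance from each cell to the line

def pose_error(xy, theta, line_pts_robot):
    """Sum of precomputed line distances for observed line points
    (robot frame) projected into the field frame at a candidate pose."""
    c, s = np.cos(theta), np.sin(theta)
    pts = line_pts_robot @ np.array([[c, s], [-s, c]]) + xy
    j = np.clip(np.round(pts[:, 0] / res).astype(int), 0, dist_map.shape[1] - 1)
    i = np.clip(np.round(pts[:, 1] / res).astype(int), 0, dist_map.shape[0] - 1)
    return dist_map[i, j].sum()

# Robot actually at (7, 6), seeing the line 2 m ahead (heading +x).
obs = np.array([[2.0, y] for y in np.linspace(-1, 1, 11)])
candidates = [(x, 6.0) for x in np.arange(5.0, 9.0, 0.25)]
best = min(candidates, key=lambda c: pose_error(np.array(c), 0.0, obs))
print("estimated position:", best)
```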

    Vision based referee sign language recognition system for the RoboCup MSL league

    In the RoboCup Middle Size League (MSL), the main referee uses assisting technology, controlled by a second referee, to convey referee decisions to the robot players over a wireless communication system. In this paper a vision-based system is introduced that can interpret the referee's dynamic and static gestures, thus eliminating the need for a second referee. The gestures are interpreted by the system and sent directly to the Referee Box, which sends the proper commands to the robots. The system comprises four modules: real-time hand tracking and feature extraction, an SVM (Support Vector Machine) for static hand posture identification, an HMM (Hidden Markov Model) for dynamic unistroke hand gesture recognition, and an FSM (Finite State Machine) to control the transitions between system states. Experimental results showed that the system works very reliably, recognizing combinations of gestures and hand postures in real time. For hand posture recognition, the SVM model trained with the selected features achieved an accuracy of 98.2%. The system also has several advantages over the currently implemented one, such as removing the need for a second referee and working in noisy environments and in wireless-jammed situations. It is easy to implement and train, and may be an inexpensive solution.
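
    The sketch below illustrates how such a pipeline can gate state transitions on confident SVM posture detections. It is a minimal sketch on synthetic features; the paper's feature extraction, HMM gesture model, and actual FSM are not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical per-frame feature vectors (the paper extracts its own
# hand-contour features; synthetic class-dependent data stands in here).
rng = np.random.default_rng(1)
means = rng.normal(size=(5, 16)) * 4.0        # 5 posture classes
y_train = np.repeat(np.arange(5), 60)
X_train = means[y_train] + rng.normal(size=(300, 16))

clf = SVC(kernel="rbf", C=10.0, gamma="scale", probability=True)
clf.fit(X_train, y_train)

def classify_posture(f, min_conf=0.6):
    """Posture label, or None when the SVM is not confident enough;
    the FSM below only advances on confident detections."""
    p = clf.predict_proba(f.reshape(1, -1))[0]
    k = int(np.argmax(p))
    return k if p[k] >= min_conf else None

# Minimal FSM standing in for the abstract's state-transition module:
# a command fires once the same posture is seen on consecutive frames.
state, last = "IDLE", None
stream = means[[2, 2, 2, 0, 4, 4]] + rng.normal(size=(6, 16))
for frame in stream:
    label = classify_posture(frame)
    if state == "IDLE" and label is not None:
        state, last = "ARMED", label
    elif state == "ARMED":
        if label == last:
            print("command for posture", last)
        state, last = "IDLE", None
```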

    Visual Odometry Based on Structural Matching of Local Invariant Features Using Stereo Camera Sensor

    This paper describes a novel sensor system to estimate the motion of a stereo camera. Local invariant image features are matched between pairs of frames and linked into image trajectories at video rate, providing the so-called visual odometry, i.e., motion estimates from visual input alone. Our proposal conducts two matching sessions: the first between the sets of features associated with the images of a stereo pair, and the second between the sets of features associated with consecutive frames. With respect to previously proposed approaches, the main novelty is that both matching stages are carried out by a fast matching algorithm that combines absolute and relative feature constraints. Finding the largest-valued set of mutually consistent matches is equivalent to finding the maximum-weighted clique on a graph. The stereo matching allows the scene view to be represented as a graph that emerges from the features of the accepted clique, while the frame-to-frame matching defines a graph whose vertices are features in 3D space. The efficiency of the approach is increased by minimizing the geometric and algebraic errors to estimate the final displacement of the stereo camera between consecutive acquired frames. The proposed approach has been tested for mobile robotics navigation purposes in real environments and using different features. Experimental results demonstrate the performance of the proposal, which could be applied in both industrial and service robot fields.
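
    The following is a minimal sketch of the mutual-consistency idea behind the clique formulation, using a greedy approximation rather than the paper's exact maximum-weighted clique search: two candidate matches are deemed compatible when the inter-feature distance is preserved across the two views.

```python
import numpy as np

def consistent_matches(pA, pB, cand, tol=0.2):
    """Keep a mutually consistent subset of candidate matches via a
    greedy clique approximation (a sketch of the idea only).
    Matches (i, j) and (k, l) are compatible when the distance between
    features i and k in view A is preserved between j and l in B."""
    n = len(cand)
    compat = np.zeros((n, n), dtype=bool)
    for a in range(n):
        for b in range(a + 1, n):
            (i, j), (k, l) = cand[a], cand[b]
            dA = np.linalg.norm(pA[i] - pA[k])
            dB = np.linalg.norm(pB[j] - pB[l])
            compat[a, b] = compat[b, a] = abs(dA - dB) < tol
    # Greedily grow a clique, starting from the best-connected match.
    clique = []
    for a in np.argsort(-compat.sum(axis=1)):
        if all(compat[a, b] for b in clique):
            clique.append(a)
    return [cand[a] for a in clique]

# Rigidly shifted points plus one bad candidate match.
pA = np.array([[0.0, 0], [1, 0], [0, 1], [1, 1]])
pB = pA + np.array([5.0, 3.0])
cand = [(0, 0), (1, 1), (2, 2), (3, 3), (0, 3)]   # last one is wrong
print(consistent_matches(pA, pB, cand))           # drops (0, 3)
```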

    Geological Object Recognition in Extraterrestrial Environments

    On July 4, 1997, the landing of NASA's Pathfinder probe and its rover Sojourner marked the beginning of a new era in space exploration; robots with the ability to move have made up the vanguard of human extraterrestrial exploration ever since. With Sojourner's landing, for the first time, a ground-traversing robot was at a distance too far from Earth to make direct human control practical. This has given rise to the development of autonomous systems to improve the efficiency of these robots, in both their ability to move and their ability to make decisions regarding their environment. Computer vision comprises a large part of these autonomous systems, and in the course of performing these tasks a large number of images are taken for the purpose of navigation. The limited nature of the current Deep Space Network means that a majority of these images are never seen by human eyes. This work explores the possibility of using these images to target certain features, using a combination of three AdaBoost algorithms and established image feature approaches to help prioritize interesting subjects from an ever-growing set of imaging data.
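
    A minimal sketch of boosted prioritization on synthetic stand-in descriptors follows (the thesis combines three AdaBoost variants with established image features; the names, data, and labels here are illustrative):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Synthetic stand-in for per-window image descriptors.
rng = np.random.default_rng(2)
X = rng.normal(size=(600, 32))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # 1 = candidate feature

clf = AdaBoostClassifier(n_estimators=100, learning_rate=0.5)
clf.fit(X[:500], y[:500])

# Rank unseen windows by boosted score so the most "interesting"
# regions can be prioritized for downlink.
scores = clf.predict_proba(X[500:])[:, 1]
print("held-out accuracy:", clf.score(X[500:], y[500:]))
print("top-5 windows:", np.argsort(-scores)[:5])
```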

    Machine Learning for Multi-Robot Semantic Simultaneous Localization and Mapping

    Automation and robotics are becoming more and more common in our daily lives, with many possible applications. Deploying robots in the world can extend what humans are capable of doing, and can save us from dangerous and strenuous tasks. For robots to be safely sent out into the real world, and into new, unknown environments, one key capability they need is to perceive their environment, and particularly to localize themselves with respect to their surroundings. To truly be deployable anywhere, robots should be able to do so relying only on their own sensors, the most commonly used being cameras.
One way to generate such an estimate is a simultaneous localization and mapping (SLAM) algorithm, in which the robot concurrently builds a map of its environment and estimates its state within it. Single-robot SLAM has been extensively researched and is now considered a mature field. However, using a team of robots can provide several benefits in terms of robustness, efficiency, and performance for many tasks. In this case, multi-robot SLAM algorithms are required to allow each robot to benefit from the whole team's experience. Multi-robot SLAM can build on top of single-robot SLAM solutions, but requires adaptations and faces additional computation and communication constraints. One particular challenge that arises in multi-robot SLAM is the need for robots to find inter-robot loop closures: relationships between the trajectories of different robots that can be found when they visit the same place. Two categories of approaches exist for detecting inter-robot loop closures. In indirect methods, robots communicate to find out whether they have mapped the same area, and then attempt to find loop closures using the data each robot gathered in the jointly visited place. In direct methods, robots rely directly on their sensor data to estimate the loop closures. Each approach has its own benefits and challenges, with indirect methods being more popular in recent works. This thesis builds on recent computer vision advancements to present contributions to each category of approaches for inter-robot loop closure detection. A first approach is presented for indirect loop closure detection in a team of fully connected robots. It relies on constellations, a compact semantic representation of the environment based on the objects in it. Descriptors and comparison methods for constellations are designed to robustly recognize places based on their constellation with minimal data exchange. These are used in a decentralized place recognition mechanism that scales as the size of the team increases. The proposed method performs comparably to state-of-the-art solutions in terms of performance and data exchange required, while being more meaningful and interpretable.
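
    As a rough illustration of the constellation idea (a hypothetical descriptor and similarity measure, not the thesis's actual design), a place can be summarized by per-class histograms of pairwise object distances, which are cheap to exchange between robots and invariant to viewpoint shifts:

```python
import numpy as np

def constellation_descriptor(classes, positions, n_bins=8, max_d=20.0):
    """Compact per-class histograms of pairwise object distances --
    a hypothetical sketch of a constellation descriptor."""
    desc = {}
    positions = np.asarray(positions, dtype=float)
    for c in set(classes):
        pts = positions[[i for i, k in enumerate(classes) if k == c]]
        if len(pts) < 2:
            continue
        d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
        d = d[np.triu_indices(len(pts), k=1)]
        h, _ = np.histogram(d, bins=n_bins, range=(0.0, max_d))
        desc[c] = h / max(h.sum(), 1)
    return desc

def similarity(d1, d2):
    """Histogram intersection over shared classes; each descriptor is
    a few floats per class, keeping inter-robot data exchange small."""
    shared = set(d1) & set(d2)
    if not shared:
        return 0.0
    return sum(np.minimum(d1[c], d2[c]).sum() for c in shared) / len(shared)

# Two robots observe the same place with a small rigid offset.
classes = ["chair", "chair", "table", "chair"]
posA = [[0, 0], [2, 0], [1, 1], [0, 3]]
posB = (np.asarray(posA) + 0.1).tolist()
print(similarity(constellation_descriptor(classes, posA),
                 constellation_descriptor(classes, posB)))
```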

    Percepción basada en visión estereoscópica, planificación de trayectorias y estrategias de navegación para exploración robótica autónoma

    Unpublished doctoral thesis, Universidad Complutense de Madrid, Facultad de Informática, Departamento de Ingeniería del Software e Inteligencia Artificial, defended on 13-05-2015. This thesis addresses the development of an autonomous navigation strategy based on computer vision for autonomous robotic exploration of planetary surfaces. A series of subsystems, modules, and specific software were developed for this research, since most existing tools in this domain are the property of national space agencies and not accessible to the scientific community. A modular, multi-layer software architecture with several hierarchical levels was designed to host the set of algorithms implementing the autonomous navigation strategy and to guarantee the portability, reuse, and hardware independence of the software. Also included is the design of a working environment to support the development of the navigation strategies. It is partially based on open-source tools available to any researcher or institution, with the necessary adaptations and extensions, and includes 3D simulation capabilities, models of robotic vehicles, sensors, and operational environments emulating planetary surfaces such as Mars, for the functional analysis and validation of the navigation strategies developed. This environment also offers debugging and monitoring capabilities. The thesis consists of two main parts. The first addresses the design and development of the high-level autonomy capabilities of a rover, focusing on autonomous navigation, supported by the simulation and monitoring capabilities of the aforementioned working environment. A set of field experiments was carried out with a real robot and hardware, detailing results, algorithm processing times, and the behavior and performance of the system in general. As a result, the perception system was identified as a crucial component of the navigation strategy and, therefore, the main focus for potential optimizations and improvements. Consequently, the second part of this work tackles the stereo image correspondence problem and the 3D reconstruction of unstructured natural environments. A series of correspondence algorithms, image processes, and filters were analyzed. It is generally assumed that the intensities of corresponding points in the images of a stereo pair are the same; however, this assumption was found to be frequently false, even though both images are acquired with a vision system composed of two identical cameras. Consequently, an expert system is proposed for the automatic correction of intensities in stereo image pairs and 3D reconstruction of the environment, based on image processes not previously applied in the field of stereo vision: homomorphic filtering and histogram matching, designed to correct intensities in a coordinated manner, adjusting one image as a function of the other. The results were further optimized by designing a grouping process based on the principle of spatial continuity to eliminate false positives and erroneous correspondences.
The effects of applying these filters at stages before and after the correspondence process were studied, and their efficiency verified favorably. Their application yielded a larger number of valid correspondences than the results obtained without them, achieving significant improvements in the disparity maps and, therefore, in the overall perception and 3D reconstruction processes.
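
    Of the two image corrections mentioned, histogram matching has a standard formulation; a minimal sketch follows (the homomorphic filtering stage and the thesis's coordinated correction scheme are not reproduced here):

```python
import numpy as np

def match_histograms(src, ref):
    """Remap `src` intensities so its histogram matches `ref` --
    standard histogram specification, one of the two corrections
    described in the abstract."""
    s_vals, s_idx, s_cnt = np.unique(src.ravel(), return_inverse=True,
                                     return_counts=True)
    r_vals, r_cnt = np.unique(ref.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_cnt) / src.size
    r_cdf = np.cumsum(r_cnt) / ref.size
    # For each source intensity, pick the reference intensity with the
    # closest cumulative frequency.
    return np.interp(s_cdf, r_cdf, r_vals)[s_idx].reshape(src.shape)

# Left image globally darker than the right: matching equalizes them.
rng = np.random.default_rng(3)
right = rng.integers(0, 256, size=(120, 160)).astype(float)
left = 0.7 * right + 10.0
corrected = match_histograms(left, right)
print(float(left.mean()), float(right.mean()), float(corrected.mean()))
```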

    Hand gesture recognition system based in computer vision and machine learning: Applications on human-machine interaction

    Doctoral thesis in Electronics and Computer Engineering. Hand gesture recognition is a natural form of human-computer interaction and an area of very active research in computer vision and machine learning. It has many possible applications, giving users a simpler and more natural way to communicate with robots and computer-based systems, without the need for extra devices. The primary goal of gesture recognition research applied to Human-Computer Interaction (HCI) is therefore to create systems that can identify specific human gestures and use them to convey information or control devices. For that, vision-based hand gesture interfaces require fast and extremely robust hand detection, and gesture recognition in real time. Nowadays, vision-based gesture recognition systems tend to be specific solutions, built to solve one particular problem and configured to work in a particular manner. This research project studied and implemented solutions generic enough, with the help of machine learning algorithms, to be applied in a wide range of human-computer interfaces for real-time gesture recognition. The proposed solution, the Gesture Learning Module Architecture (GeLMA), allows a set of commands to be defined in a simple way, based on static and dynamic gestures, and to be easily integrated and configured for use in a number of applications. It is easy to train and use, and since it is mainly built with open-source libraries it is also an inexpensive solution. Experiments showed that the system achieved an accuracy of 99.2% for static hand posture recognition and an average accuracy of 93.72% for dynamic gesture recognition. To validate the proposed framework, two complete systems were implemented. The first is an online system able to help a robotic soccer referee judge a game in real time. It combines a vision-based hand gesture recognition system with a formal language definition, the Referee CommLang, into what is called the Referee Command Language Interface System (ReCLIS). The system builds a command from system-interpreted static and dynamic referee gestures and sends it to a computer interface, which then transmits the proper commands to the robots. The second is an online system able to interpret a subset of Portuguese Sign Language. Experiments showed that the system was able to reliably recognize the vowels in real time. Although the implemented solution was only trained to recognize the five vowels, it is easily extensible to the rest of the alphabet. These experiments also showed that the core of vision-based interaction systems can be the same for all applications, which facilitates their implementation. The proposed framework has the advantage of being generic enough, and a solid foundation, for the development of hand gesture recognition systems that can be integrated into any human-computer interface application: the interface language can be redefined, and the system can easily be configured and trained with a different set of gestures to be integrated into the final solution.
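
    As a toy illustration of mapping recognized gesture tokens to referee-box commands (the actual Referee CommLang grammar is defined in the thesis; the tokens and command strings here are hypothetical):

```python
# Hypothetical token-sequence-to-command table; real MSL referee-box
# commands and the CommLang grammar are not reproduced here.
COMMANDS = {
    ("STOP",): "stop",
    ("START",): "start",
    ("KICKOFF", "TEAM_CYAN"): "kickoff cyan",
    ("KICKOFF", "TEAM_MAGENTA"): "kickoff magenta",
}

def build_command(tokens):
    """Map a sequence of recognized static/dynamic gesture tokens to a
    command string, or None when the sequence is not a valid command."""
    return COMMANDS.get(tuple(tokens))

print(build_command(["KICKOFF", "TEAM_CYAN"]))   # -> kickoff cyan
```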