15 research outputs found

    Object Search and Localization for an Indoor Mobile Robot

    Get PDF
    In this paper we present a method for the search and localization of objects with a mobile robot using a monocular camera with zoom capabilities. We show how to overcome the limitations of low-resolution images in object recognition by using a combination of an attention mechanism and zooming as the first steps of the recognition process. The attention mechanism is based on receptive field co-occurrence histograms, and the object recognition is based on SIFT feature matching. We present two methods for estimating the distance to the objects, which serves both as input to the zoom control and as the final object localization. Through extensive experiments in a realistic environment, we highlight the strengths and weaknesses of both methods. To evaluate the usefulness of the method, we also present results from experiments with an integrated system in which a global sensing plan is generated based on view planning, letting the camera cover the space on a per-room basis.
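
    The abstract above names SIFT feature matching as the recognition step. The following is a minimal, hypothetical sketch of that step using OpenCV, assuming the attention and zoom stages have already produced a close-up crop of the candidate region; the function name and ratio threshold are illustrative, not taken from the paper.

    # Hypothetical sketch of the SIFT-matching recognition step described above.
    # Assumes the attention + zoom stages already produced a zoomed-in crop.
    import cv2

    def count_sift_matches(template_gray, crop_gray, ratio=0.75):
        """Return the number of SIFT correspondences that pass Lowe's ratio test."""
        sift = cv2.SIFT_create()
        _, desc_t = sift.detectAndCompute(template_gray, None)
        _, desc_c = sift.detectAndCompute(crop_gray, None)
        if desc_t is None or desc_c is None:
            return 0
        matcher = cv2.BFMatcher(cv2.NORM_L2)
        good = 0
        for pair in matcher.knnMatch(desc_t, desc_c, k=2):
            if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
                good += 1
        return good

    A recognition decision could then simply threshold the returned match count.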

    A Survey of Modern Computer Vision Systems and Algorithms Used on Unmanned Aerial Vehicles and Ground Mobile Robots

    Get PDF
    This paper surveys state-of-the-art computer vision systems and algorithms, describes their strengths and weaknesses as well as the problems that have and have not yet been solved, and outlines directions for future research in this field.

    Author Index

    Get PDF
    Author Index: CIT Vol. 17 (2009), No 1–

    OMap: An assistive solution for identifying and localizing objects in a semi-structured environment

    Get PDF
    A system capable of detecting and localizing objects of interest in a semi-structured environment will enhance the quality of life of people who are blind or visually impaired. Towards building such a system, this thesis presents a personalized real-time system called O'Map that finds misplaced or moved personal items and localizes them with respect to known landmarks. First, we adopted a participatory design approach to identify users' needs and the required functionality of the system. Second, we used concepts from systems thinking and design thinking to develop a real-time object recognition engine optimized to run on low-form-factor devices. The object recognition engine finds robust correspondences between the query image and item templates using a K-D tree of invariant feature descriptors, a two-nearest-neighbor search, and a ratio test. Quantitative evaluation demonstrates that O'Map identifies objects of interest with an average F-measure of 0.9650.
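
    The matching step described above (a K-D tree over feature descriptors, a two-nearest-neighbor query, and a ratio test) can be sketched as follows. This is an illustrative approximation using SciPy, not O'Map's actual code, and the 0.7 threshold is an assumption.

    # Sketch of K-D tree matching with Lowe's ratio test, as described above.
    import numpy as np
    from scipy.spatial import cKDTree

    def ratio_test_matches(template_desc, query_desc, ratio=0.7):
        """Return (query_row, template_row) index pairs that pass the ratio test."""
        tree = cKDTree(template_desc)               # K-D tree over item template descriptors
        dists, idxs = tree.query(query_desc, k=2)   # two nearest neighbors per query descriptor
        keep = dists[:, 0] < ratio * dists[:, 1]    # best match clearly closer than second best
        return [(int(q), int(idxs[q, 0])) for q in np.flatnonzero(keep)]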

    Manipulation-based active search for occluded objects

    Get PDF
    Object search is an integral part of daily life, and in the quest for competent mobile manipulation robots it is an unavoidable problem. Previous approaches focus on cases where objects are in unknown rooms but lying out in the open, which transforms object search into active visual search. However, in real life, objects may be at the back of cupboards, occluded by other objects, rather than conveniently on a table by themselves. Extending search to occluded objects requires a more precise model and tighter integration with manipulation. We present a novel generative model for representing container contents using object co-occurrence information and spatial constraints. Given a target object, a planner uses the model to guide an agent to explore containers where the target is likely, potentially needing to move occluding objects to enable further perception. We demonstrate the model on simulated domains and a detailed simulation involving a PR2 robot.
    National Science Foundation (U.S.) (Grant 1117325); United States. Office of Naval Research. Multidisciplinary University Research Initiative (Grant N00014-09-1-1051); United States. Air Force Office of Scientific Research (Grant FA2386-10-1-4135)
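
    As a rough illustration of ranking containers by object co-occurrence (a hypothetical sketch, not the paper's generative model), one could score each container by how strongly its already-observed contents co-occur with the target object; the co-occurrence table and uniform prior below are made-up values.

    # Illustrative container ranking from pairwise co-occurrence statistics.
    def rank_containers(target, observed_by_container, cooccurrence, prior=1.0):
        """observed_by_container: {container: [object, ...]};
        cooccurrence[(a, b)]: how often a and b are stored together."""
        scores = {}
        for container, seen in observed_by_container.items():
            score = prior
            for obj in seen:
                score *= 1.0 + cooccurrence.get((target, obj), 0.0)
            scores[container] = score
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}  # normalized belief

    belief = rank_containers(
        "mug",
        {"cupboard": ["plate", "bowl"], "drawer": ["fork", "knife"]},
        {("mug", "plate"): 3, ("mug", "bowl"): 2, ("mug", "fork"): 0},
    )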

    Design of Self-Balancing Tracing Bicycle for Smart Car Competition Case Under Engineering Education

    Get PDF
    The Smart Car competition is an academic contest that has been held in China for 16 years to cultivate college students' engineering ability. To improve the performance of smart cars, this study integrates engineering education topics by introducing a smart car system, covering the selection of key components, the design of hardware and circuit boards, the processing of sensor signals, as well as assembly, algorithms, and control. After completing this engineering education, students could achieve better results in the competition. Following the K model rules of the 16th smart car competition, a self-balancing autonomous tracking bicycle based on steering-gear control is designed and developed. A gyroscope is used to detect the posture of the bicycle, the track centerline is sensed inductively, and a PID control algorithm then realizes autonomous tracking. The whole process, from mechanical structure optimization and electronic circuit design to algorithm design, debugging, and competition, follows the CDIO approach of engineering education, cultivating compound engineering and innovation abilities.
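
    A minimal sketch of the PID loop mentioned above, steering toward the sensed track centerline; the gains, sample time, and error signal are illustrative assumptions rather than values from the study.

    # Simple PID controller; error = lateral offset from the track centerline,
    # output = steering command sent to the steering gear.
    class PID:
        def __init__(self, kp, ki, kd, dt):
            self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
            self.integral = 0.0
            self.prev_error = 0.0

        def update(self, error):
            self.integral += error * self.dt
            derivative = (error - self.prev_error) / self.dt
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * derivative

    steering_pid = PID(kp=2.0, ki=0.1, kd=0.5, dt=0.01)  # illustrative gains
    steer_cmd = steering_pid.update(error=1.5)           # e.g. offset in cm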

    Place and Object Recognition for Real-time Visual Mapping

    Get PDF
    This work addresses two of the main difficulties in current Simultaneous Localization And Mapping (SLAM) systems: recognizing previously visited places in order to close loops in the trajectory and build accurate maps, and recognizing objects in order to enrich the maps with high-level structure and improve interaction between robots and people. In visual SLAM, the features extracted from the images of a video sequence accumulate over time, which makes two aspects of loop detection more demanding: rejecting the incorrect loops detected between places of very similar appearance, and keeping the execution time low and feasible on long trajectories. In this work we propose a technique based on visual vocabularies and bags of words to detect loops robustly and efficiently, focusing on two main ideas: 1) exploiting the sequential nature of video images, and 2) making the whole process run at video rate. To benefit from the sequential origin of the images, we present a normalized similarity metric that measures the resemblance between images and increases the distinctiveness of correct detections. We also group the matches between candidate loop images so that they do not compete with one another when they were actually taken from the same place, and we add a temporal constraint to check the consistency of consecutive detections. Efficiency is achieved with inverted and direct indices and binary features: an inverted index speeds up the comparison between place images, and a direct index speeds up the computation of point correspondences between them. For the first time, binary features are used here to detect loops, yielding a solution that remains viable even for tens of thousands of images. Loops are verified by checking the geometric consistency of the matched scenes, using several robust methods that work with either one or multiple cameras. We present competitive results with no false positives on different sequences, with images acquired at both high and low frequency, with front-facing and lateral cameras, and using the same vocabulary and configuration throughout. With binary descriptors, the complete system requires 22 milliseconds per image on a sequence of 26,300 images, an order of magnitude faster than other current techniques.
    An algorithm similar to the place recognition one can be used to solve object recognition in visual SLAM. Detecting objects in this context is particularly difficult because the locations, poses, and sizes at which an object can appear in an image are potentially infinite, so objects are often hard to distinguish, and this complexity multiplies when the comparison has to be made against several 3D objects. Our effort in this work is directed at: 1) building the first visual SLAM system that can place real 3D objects in the map, and 2) addressing the scalability problems that arise when dealing with multiple objects and multiple views of them.
    We present the first monocular SLAM system that recognizes 3D objects, inserts them into the map, and refines their position in 3D space as the map is built, even when the objects leave the camera's field of view. This is achieved in real time with object models composed of three-dimensional information and multiple images representing several viewpoints of the object. We then focus on the scalability of the 3D object recognition stage. We present a fast technique for segmenting images into regions of interest in order to detect small or distant objects, and we propose replacing the object model of independent views with a single bag of words of binary features associated with 3D points. We also build a database with inverted and direct indices to exploit their advantages for quickly retrieving both candidate objects and point correspondences, as in the loop detection case. Experimental results show that our system runs in real time in a desktop setting with a hand-held camera and in a room with a camera mounted on an autonomous robot. The improvements to the recognition process yield satisfactory results, with no erroneous detections and an average execution time of 28 milliseconds per image on a database of 20 3D objects.
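
    The inverted-index retrieval described above can be illustrated with a small sketch (not the thesis implementation): each image is reduced to a set of visual-word IDs, the index maps each word to the images containing it, and loop-closure candidates are the images sharing the most words with the query. Vocabulary training, the normalized similarity score, and the direct index are omitted.

    # Toy inverted index for bag-of-words loop-closure candidate retrieval.
    from collections import defaultdict

    class InvertedIndex:
        def __init__(self):
            self.word_to_images = defaultdict(set)

        def add_image(self, image_id, words):
            for w in set(words):
                self.word_to_images[w].add(image_id)

        def query(self, words, top_k=5):
            votes = defaultdict(int)
            for w in set(words):
                for image_id in self.word_to_images[w]:
                    votes[image_id] += 1
            return sorted(votes.items(), key=lambda kv: -kv[1])[:top_k]

    db = InvertedIndex()
    db.add_image("frame_0001", [12, 87, 301, 999])
    db.add_image("frame_0002", [12, 87, 555])
    print(db.query([12, 87, 301]))   # frame_0001 should rank first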

    Semantic Robot Programming for Taskable Goal-Directed Manipulation

    Full text link
    Autonomous robots have the potential to help people be more productive in factories, homes, hospitals, and similar environments. Unlike traditional industrial robots that are pre-programmed for particular tasks in controlled environments, modern autonomous robots should be able to perform arbitrary user-desired tasks. Thus, it is beneficial to provide pathways that enable users to program an arbitrary robot to perform an arbitrary task in an arbitrary world. Advances in robot Programming by Demonstration (PbD) have made it possible for end-users to program robot behavior for performing desired tasks through demonstrations. However, it remains a challenge for users to program robot behavior in a generalizable, performant, scalable, and intuitive manner. In this dissertation, we address the problem of robot programming by demonstration in a declarative manner by introducing the concept of Semantic Robot Programming (SRP). In SRP, we focus on addressing the following challenges for robot PbD: 1) generalization across robots, tasks, and worlds, 2) robustness under partial observations of cluttered scenes, 3) efficiency in task performance as the workspace scales up, and 4) feasible and intuitive modalities of interaction for end-users to demonstrate tasks to robots. Through SRP, our objective is to enable an end-user to intuitively program a mobile manipulator by providing a workspace demonstration of the desired goal scene. We use a scene graph to semantically represent conditions on the current and goal states of the world. To estimate the scene graph given raw sensor observations, we bring together discriminative object detection and generative state estimation for the inference of object classes and poses. The proposed scene estimation method outperformed the state of the art in cluttered scenes. With SRP, we successfully enabled users to program a Fetch robot to set up a kitchen tray on a cluttered tabletop in 10 different start and goal settings. In order to scale up SRP from the tabletop to large scale, we propose Contextual-Temporal Mapping (CT-Map) for semantic mapping of large-scale scenes given streaming sensor observations. We model the semantic mapping problem via a Conditional Random Field (CRF), which accounts for spatial dependencies between objects. Over time, object poses and inter-object spatial relations can vary due to human activities. To deal with such dynamics, CT-Map maintains the belief over object classes and poses across an observed environment. We present CT-Map semantically mapping cluttered rooms with robustness to perceptual ambiguities, demonstrating higher accuracy on object detection and 6 DoF pose estimation compared to state-of-the-art neural network-based object detectors and commonly adopted 3D registration methods. Towards SRP at the building scale, we explore notions of Generalized Object Permanence (GOP) for robots to search for objects efficiently. We state the GOP problem as the prediction of where an object can be located when it is not being directly observed by a robot. We model object permanence via a factor graph inference model, with factors representing long-term memory, short-term memory, and common-sense knowledge over inter-object spatial relations. We propose the Semantic Linking Maps (SLiM) model to maintain the belief over object locations while accounting for object permanence through a CRF.
    Based on the belief maintained by SLiM, we present a hybrid object search strategy that enables the Fetch robot to actively search for objects on a large scale, with a higher search success rate and less search time compared to state-of-the-art search methods.
    PhD, Electrical and Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/155073/1/zengzhen_1.pd
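
    As a rough illustration of the scene-graph goal representation mentioned above (a hypothetical sketch, not the dissertation's code), a goal can be stored as a set of spatial-relation triples and checked against the estimated current scene; the relation names below are assumptions.

    # Toy scene graph: nodes are object instances, edges are spatial relations,
    # and a goal is satisfied when every goal relation also holds in the
    # estimated current scene.
    class SceneGraph:
        def __init__(self):
            self.relations = set()          # (subject, relation, object) triples

        def add(self, subj, rel, obj):
            self.relations.add((subj, rel, obj))

        def satisfies(self, goal):
            """True if every relation required by the goal graph holds here."""
            return goal.relations <= self.relations

    goal = SceneGraph()
    goal.add("mug", "on", "tray")
    goal.add("spoon", "right_of", "mug")

    current = SceneGraph()
    current.add("mug", "on", "tray")
    current.add("spoon", "right_of", "mug")
    current.add("plate", "on", "table")

    print(current.satisfies(goal))   # True: the demonstrated goal scene is reached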

    Navigation and Geolocation Within Urban and Semi-Urban Environments Using Low-Rate Wireless Personal Area Networks

    Get PDF
    IEEE 802.15.4 defines networks and hardware capable of low-power, low-data-rate transmissions. The use of these networks for the “Internet of Things”, machine-to-machine communications, energy metering, control, and automation is increasing. In an urban environment, these networks may well soon become so popular and widely deployed that their discoverability and coverage density are sufficient for aiding geolocation, in the same way that IEEE 802.11 WiFi networks are used today. This research shows that, although geolocation with these networks is possible, there are some inherent weaknesses in the use of IEEE 802.15.4 networks for location purposes, particularly with respect to multilateration.
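
    As an illustration of the multilateration step the abstract refers to (a generic least-squares sketch, not the thesis method), a receiver position can be estimated from anchor node positions and range estimates; the coordinates and ranges below are made-up values.

    # Linearized least-squares multilateration from anchor positions and ranges
    # (ranges might come from RSSI in an 802.15.4 deployment).
    import numpy as np

    def multilaterate(anchors, ranges):
        """anchors: (n, 2) array of known node positions; ranges: (n,) distances."""
        x0, y0 = anchors[0]
        r0 = ranges[0]
        A, b = [], []
        for (xi, yi), ri in zip(anchors[1:], ranges[1:]):
            A.append([2 * (xi - x0), 2 * (yi - y0)])
            b.append(r0**2 - ri**2 + xi**2 - x0**2 + yi**2 - y0**2)
        sol, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
        return sol  # estimated (x, y)

    anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
    ranges = np.array([7.07, 7.07, 7.07, 7.07])   # a point near the centre
    print(multilaterate(anchors, ranges))          # approximately [5.0, 5.0]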