
    Indoor Localization System based on Artificial Landmarks and Monocular Vision

    This paper presents a visual localization approach suitable for domestic and industrial environments, as it enables accurate, reliable and robust pose estimation. The mobile robot is equipped with a single camera and updates its pose whenever a landmark is visible in the camera's field of view. The innovation of this research lies in the artificial landmark system, which can detect the presence of the robot: both entities communicate with each other using a frequency-modulated infrared signal protocol. Besides this communication capability, each landmark carries several high-intensity light-emitting diodes (LEDs) that light up only at specific instants dictated by the communication, which makes it possible to synchronize the camera shutter with the blinking of the LEDs. This synchronization increases the system's tolerance to changes in ambient brightness over time, independently of the landmarks' locations. The environment's ceiling is populated with several landmarks, and an Extended Kalman Filter combines the dead-reckoning and landmark information; this increases the flexibility of the system by reducing the number of landmarks required. The experimental evaluation was conducted in a real indoor environment with an autonomous wheelchair prototype.
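
    The EKF fusion step described above can be sketched compactly. The following is a minimal illustration, not the paper's implementation: it assumes a planar pose state (x, y, θ), wheel-odometry input (distance, heading change) for the dead-reckoning prediction, and a range/bearing measurement to a ceiling landmark at a known position; all function and variable names are hypothetical.

```python
import numpy as np

def ekf_predict(x, P, u, Q):
    """Dead-reckoning prediction: u = (dist, dtheta) from wheel odometry."""
    d, dth = u
    th = x[2]
    # Motion model: move d along the current heading, then rotate by dth.
    x_pred = np.array([x[0] + d * np.cos(th),
                       x[1] + d * np.sin(th),
                       th + dth])
    F = np.array([[1, 0, -d * np.sin(th)],
                  [0, 1,  d * np.cos(th)],
                  [0, 0, 1]])
    return x_pred, F @ P @ F.T + Q

def ekf_update(x, P, z, landmark, R):
    """Correct the pose with a measured (range, bearing) to a known landmark."""
    dx, dy = landmark[0] - x[0], landmark[1] - x[1]
    q = dx**2 + dy**2
    z_hat = np.array([np.sqrt(q), np.arctan2(dy, dx) - x[2]])
    H = np.array([[-dx / np.sqrt(q), -dy / np.sqrt(q),  0],
                  [ dy / q,          -dx / q,          -1]])
    y = z - z_hat
    y[1] = (y[1] + np.pi) % (2 * np.pi) - np.pi   # wrap the bearing residual
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    return x + K @ y, (np.eye(3) - K @ H) @ P
```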

    Novel robust computer vision algorithms for micro autonomous systems

    People detection and tracking are essential components of many autonomous platforms, interactive systems and intelligent vehicles used in search and rescue operations and similar humanitarian applications. Researchers currently focus on vision sensors such as cameras because of their advantages over other sensor types: cameras are information-rich, relatively inexpensive and easily available. 3D information can be obtained from stereo vision, or by triangulating over several frames in monocular configurations. Another way to obtain 3D data is to use RGB-D sensors (e.g. Kinect) that provide both image and depth data; this approach has become more attractive over the past few years due to its affordable price and availability to researchers. The aim of this research was to find robust multi-target detection and tracking algorithms for Micro Autonomous Systems (MAS) that incorporate the RGB-D sensor. The contributions are novel, robust computer vision algorithms. A new framework for human body detection from video, adapted from the Viola and Jones framework, detects a single person. The 2D Multi-Target Detection and Tracking (MTDT) algorithm applies a Gaussian Mixture Model (GMM) to reduce noise in the pre-processing stage; blob analysis detects targets, and a Kalman filter tracks them. The 3D MTDT extends beyond 2D by using depth data from the RGB-D sensor in the pre-processing stage. A Bayesian model supplies multiple cues, including detection of the upper body, face, skin colour, motion and shape, while a Kalman filter provides fast and robust track management. Simultaneous Localisation and Mapping (SLAM) fused with 3D information was also investigated: the new framework introduces front-end and back-end processing, where the front end consists of localisation steps, pose refinement and a loop-closing system, and the back end focuses on pose-graph optimisation to eliminate errors. The proposed computer vision algorithms achieved better speed and robustness, and the frameworks produced impressive results. The new algorithms can improve performance in real-time applications on MAS, including surveillance, vision-based navigation, environmental perception and vision-based control systems.
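
    As a rough illustration of the 2D MTDT pipeline (GMM background model, blob analysis, per-target Kalman tracking), the sketch below wires together OpenCV's stock components. The video path, area threshold, and the naive in-order detection-to-track association are assumptions for illustration, not the thesis' algorithm.

```python
import cv2
import numpy as np

# GMM background subtraction, as in the 2D MTDT pre-processing stage.
bg = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)

def make_track(cx, cy):
    """Constant-velocity Kalman filter: state (x, y, vx, vy), measured (x, y)."""
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.statePost = np.array([[cx], [cy], [0], [0]], np.float32)
    return kf

cap = cv2.VideoCapture("input.mp4")   # hypothetical video file
tracks = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = bg.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    # Blob analysis: connected components above a minimum area become detections.
    n, _, stats, centroids = cv2.connectedComponentsWithStats(mask)
    dets = [c for c, s in zip(centroids[1:], stats[1:]) if s[4] > 500]
    for kf, det in zip(tracks, dets):          # naive in-order association
        kf.predict()
        kf.correct(np.array(det, np.float32).reshape(2, 1))
    tracks += [make_track(*d) for d in dets[len(tracks):]]   # spawn new tracks
```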

    RGB-D And Thermal Sensor Fusion: A Systematic Literature Review

    In the last decade, the computer vision field has seen significant progress in multimodal data fusion and learning, where multiple sensors, including depth, infrared, and visual, are used to capture the environment across diverse spectral ranges. Despite these advancements, there has been no systematic and comprehensive evaluation of fusing RGB-D and thermal modalities to date. While autonomous driving using LiDAR, radar, RGB, and other sensors has garnered substantial research interest, along with the fusion of RGB and depth modalities, the integration of thermal cameras and, specifically, the fusion of RGB-D and thermal data has received comparatively little attention. This might be partly due to the limited number of publicly available datasets for such applications. This paper provides a comprehensive review of both state-of-the-art and traditional methods used in fusing RGB-D and thermal camera data for various applications, such as site inspection, human tracking, fault detection, and others. The reviewed literature has been categorised into technical areas such as 3D reconstruction, segmentation, object detection, available datasets, and other related topics. Following a brief introduction and an overview of the methodology, the study delves into calibration and registration techniques, then examines thermal visualisation and 3D reconstruction, before discussing the application of classic feature-based techniques as well as modern deep learning approaches. The paper concludes with a discourse on current limitations and potential future research directions. It is hoped that this survey will serve as a valuable reference for researchers looking to familiarise themselves with the latest advancements and to contribute to the RGB-DT research field.
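
    A recurring building block in the calibration and registration literature the survey covers is reprojecting depth pixels into the thermal image once both cameras have been jointly calibrated. A minimal sketch, assuming known intrinsics and a depth-to-thermal extrinsic transform; the function and parameter names are hypothetical:

```python
import numpy as np

def register_depth_to_thermal(depth, K_d, K_t, T_dt):
    """Reproject each depth pixel into the thermal image plane.

    depth : HxW depth map in metres from the RGB-D sensor
    K_d, K_t : 3x3 intrinsics of the depth and thermal cameras
    T_dt : 4x4 extrinsic transform from the depth to the thermal frame
           (both assumed known from a prior joint calibration)
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    # Back-project depth pixels to 3D points in the depth-camera frame.
    pts = np.linalg.inv(K_d) @ np.vstack([u.ravel() * z, v.ravel() * z, z])
    # Move into the thermal camera frame and project with its intrinsics.
    pts_t = T_dt[:3, :3] @ pts + T_dt[:3, 3:4]
    proj = K_t @ pts_t
    uv_t = proj[:2] / np.clip(proj[2], 1e-6, None)
    return uv_t.reshape(2, h, w)   # thermal pixel coordinates per depth pixel
```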

    RGB-D SLAM

    This project implements a SLAM technique called GraphSLAM. This technique applies graph theory to create an online optimization system that allows robots to map the scenario and locate themselves, using a time-of-flight camera as the input source. To do this, an RGB-D system was calibrated and used to create coloured 3D point clouds. With this information, the feature detector module estimates, as a first approximation, the pose of the camera. An ICP pose refinement then completes the graph structure. Finally, a HogMAN graph optimizer closes the loop on each iteration using hierarchical manifold optimization. As a result, 3D colour maps are created that contain, at the same time, the exact position of the robot on the map.
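
    The ICP pose refinement stage can be illustrated with a standard point-to-point ICP built on the closed-form Kabsch alignment. This is a generic sketch under the assumption of a reasonable initial alignment (here provided by the feature-based first approximation), not the project's exact implementation:

```python
import numpy as np

def icp_refine(src, dst, iters=20):
    """Point-to-point ICP: refine the rigid transform aligning src onto dst.

    src, dst : Nx3 point clouds; src is assumed to be roughly pre-aligned.
    """
    T = np.eye(4)
    for _ in range(iters):
        # Nearest-neighbour correspondences (brute force, for the sketch).
        d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=2)
        matched = dst[d.argmin(axis=1)]
        # Closed-form rigid alignment via SVD (Kabsch algorithm).
        mu_s, mu_d = src.mean(0), matched.mean(0)
        U, _, Vt = np.linalg.svd((src - mu_s).T @ (matched - mu_d))
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:       # guard against reflections
            Vt[2] *= -1
            R = Vt.T @ U.T
        t = mu_d - R @ mu_s
        src = src @ R.T + t
        step = np.eye(4); step[:3, :3] = R; step[:3, 3] = t
        T = step @ T
    return T   # usable as an edge constraint in the pose graph
```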

    Real-time object detection using monocular vision for low-cost automotive sensing systems

    This work addresses the problem of real-time object detection in automotive environments using monocular vision. The focus is on real-time feature detection, tracking, depth estimation using monocular vision and, finally, object detection by fusing visual saliency and depth information. Firstly, a novel feature detection approach is proposed for extracting stable and dense features even in images with very low signal-to-noise ratio. This methodology is based on image gradients, which are redefined to take account of noise as part of their mathematical model. Each gradient is based on a vector connecting a negative to a positive intensity centroid, where both centroids are symmetric about the centre of the area for which the gradient is calculated. Multiple gradient vectors define a feature, with its strength being proportional to the underlying gradient vector magnitude. The evaluation of the Dense Gradient Features (DeGraF) shows superior performance over other contemporary detectors in terms of keypoint density, tracking accuracy, illumination invariance, rotation invariance, noise resistance and detection time. The DeGraF features form the basis for two new approaches that perform dense 3D reconstruction from a single vehicle-mounted camera. The first approach tracks DeGraF features in real time while performing image stabilisation with minimal computational cost. This means that, despite camera vibration, the algorithm can accurately predict the real-world coordinates of each image pixel in real time by comparing each motion vector to the ego-motion vector of the vehicle. The performance of this approach has been compared to different 3D reconstruction methods in order to determine their accuracy, depth-map density, noise resistance and computational complexity. The second approach proposes the use of local frequency analysis of gradient features for estimating relative depth. This novel method is based on the fact that DeGraF gradients can accurately measure local image variance with sub-pixel accuracy. It is shown that the local frequency at which the centroid oscillates around the gradient window centre is proportional to the depth of each gradient centroid in the real world. The lower computational complexity of this methodology comes at the expense of depth-map accuracy as the camera velocity increases, but it is at least five times faster than the other evaluated approaches. This work also proposes a novel technique for deriving visual saliency maps using Division of Gaussians (DIVoG). In this context, saliency maps express how different each image pixel is from its surrounding pixels across multiple pyramid levels. This approach is shown to be both fast and accurate when evaluated against other state-of-the-art approaches. Subsequently, the saliency information is combined with depth information to identify salient regions close to the host vehicle. The fused map allows faster detection of high-risk areas where obstacles are likely to exist. As a result, existing object detection algorithms, such as the Histogram of Oriented Gradients (HOG), can execute at least five times faster. In conclusion, through a step-wise approach, computationally expensive algorithms have been optimised or replaced by novel methodologies to produce a fast object detection system that is aligned with the requirements of the automotive domain.
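
    The centroid-based gradient construction and the DIVoG saliency idea can be loosely sketched as follows. Both functions are illustrative readings of the abstract's description (a gradient derived from intensity centroids within a window, and saliency from per-pixel ratios against progressively blurred copies), not the thesis' exact definitions; all names and parameters are assumptions.

```python
import cv2
import numpy as np

def centroid_gradient(img, cx, cy, r):
    """Gradient of a (2r+1)x(2r+1) window as the offset of its
    intensity-weighted centroid from the window centre (a loose
    reading of the DeGraF construction)."""
    win = img[cy - r:cy + r + 1, cx - r:cx + r + 1].astype(np.float64)
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    w = win.sum()
    if w == 0:
        return np.zeros(2)
    return np.array([(xs * win).sum() / w, (ys * win).sum() / w])

def degraf_like_features(img, r=4, stride=8, thresh=0.2):
    """Dense grid of gradient vectors; feature strength = vector magnitude."""
    feats = []
    for cy in range(r, img.shape[0] - r, stride):
        for cx in range(r, img.shape[1] - r, stride):
            g = centroid_gradient(img, cx, cy, r)
            if np.linalg.norm(g) > thresh:
                feats.append((cx, cy, g))
    return feats

def divog_saliency(img, levels=4):
    """Division-of-Gaussians saliency: each pixel is compared, by ratio,
    to progressively blurred copies across pyramid levels."""
    g = img.astype(np.float32) + 1.0          # avoid division by zero
    sal, blur = np.zeros_like(g), g
    for _ in range(levels):
        blur = cv2.GaussianBlur(blur, (0, 0), 2.0)
        sal += np.abs(g / blur - 1.0)
    return sal / levels
```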

    3D Sensor Placement and Embedded Processing for People Detection in an Industrial Environment

    Papers I, II and III are extracted from the dissertation and uploaded as separate documents to meet post-publication requirements for self-archiving of IEEE conference papers. At a time when autonomy is being introduced in more and more areas, computer vision plays a very important role. In an industrial environment, the ability to create a real-time virtual version of a volume of interest provides a broad range of possibilities, including safety-related systems such as vision-based anti-collision and personnel tracking. In an offshore environment, where such systems are not common, the task is challenging due to rough weather and environmental conditions, but the result of introducing such safety systems could potentially be lifesaving, as personnel work close to heavy, huge, and often poorly instrumented moving machinery and equipment. This thesis presents research on important topics related to enabling computer vision systems in industrial and offshore environments, including a review of the most important technologies and methods. A prototype 3D sensor package is developed, consisting of different sensors and a powerful embedded computer. This, together with a novel, highly scalable point cloud compression and sensor fusion scheme, allows a real-time 3D map of an industrial area to be created. The question of where to place the sensor packages in an environment where occlusions are present is also investigated. The result is a set of algorithms for automatic sensor placement optimisation, where the goal is to place sensors so as to maximise the coverage of the volume of interest with as few occluded zones as possible. The method also includes redundancy constraints, where important sub-volumes can be defined that must be viewed by more than one sensor. Lastly, a people detection scheme is developed that uses a merged point cloud from six different sensor packages as input. Using a combination of point cloud clustering, flattening and convolutional neural networks, the system successfully detects multiple people in an outdoor industrial environment, providing real-time 3D positions. The sensor packages and methods were tested and verified at the Industrial Robotics Lab at the University of Agder, and the people detection method was also tested at a relevant outdoor industrial testing facility. The experiments and results are presented in the papers attached to this thesis.
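
    The sensor placement optimisation lends itself to a greedy maximum-coverage baseline, sketched below under the assumption that visibility (with occlusions) has already been precomputed per candidate pose, e.g. by ray casting; the thesis' actual algorithm and redundancy handling may differ.

```python
def greedy_placement(candidates, coverage, n_sensors, redundancy=None):
    """Greedy maximum-coverage sensor placement (a common baseline).

    candidates : list of candidate sensor poses
    coverage   : coverage[i] = set of voxel ids that candidate i sees,
                 with occlusions already accounted for
    redundancy : optional set of voxel ids that should be seen by >= 2 sensors
    """
    chosen, seen = [], {}
    for _ in range(n_sensors):
        def gain(i):
            # Newly covered voxels ...
            g = sum(1 for v in coverage[i] if seen.get(v, 0) == 0)
            # ... plus extra credit for double-covering critical voxels.
            if redundancy:
                g += sum(1 for v in coverage[i]
                         if v in redundancy and seen.get(v, 0) == 1)
            return g
        best = max((i for i in range(len(candidates)) if i not in chosen),
                   key=gain)
        chosen.append(best)
        for v in coverage[best]:
            seen[v] = seen.get(v, 0) + 1
    return [candidates[i] for i in chosen]
```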

    Neural Network based Robot 3D Mapping and Navigation using Depth Image Camera

    Robotics research has developed rapidly over the past decade. However, bringing robots into household or office environments so that they cooperate well with humans still requires more research. One of the main problems is robot localization and navigation: to accomplish its missions, a mobile robot needs to localize itself in the environment, find the best path and navigate to the goal. Navigation methods can be categorized into map-based navigation and map-less navigation. In this research we propose a method based on neural networks, using a depth-image camera, to solve the robot navigation problem. By using a depth-image camera, the surrounding environment can be recognized regardless of lighting conditions, and a neural-network-based approach is fast enough for robot navigation in real time, which is important for developing fully autonomous robots. In our method, the robot maps and annotates the surrounding environment using a feed-forward neural network and a CNN. The 3D map contains not only the geometric information of the environment but also its semantic content, which is important for robots to accomplish their tasks. For instance, consider the task "Go to the cabinet to take a medicine": the robot needs to know the positions of the cabinet and the medicine, which a purely geometric map does not supply. The feed-forward neural network is trained to convert the depth information from depth images into 3D points in real-world coordinates, and the CNN is trained to segment the image into classes. By combining the two networks, the objects in the environment are segmented and their positions determined. We implemented the proposed method on a mobile humanoid robot. Initially, the robot moves through the environment and builds the 3D map with objects placed at their positions; it then uses the resulting 3D map for goal-directed navigation. The experimental results show good performance in terms of 3D map accuracy and robot navigation. Most of the objects in the working environments are classified by the trained CNN, and unrecognized objects are classified by the feed-forward neural network. As a result, the generated maps exactly reflect the working environments and can be used by robots to navigate safely within them. The 3D geometric maps can be generated regardless of lighting conditions, and the proposed localization method is robust even in texture-less environments, which are the toughest environments in the field of vision-based localization.
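
    The mapping the feed-forward network is trained to learn, from depth pixels to real-world 3D points, is in essence the pinhole back-projection. A minimal analytic sketch of that mapping, with a hypothetical helper that attaches per-pixel CNN labels to the resulting points; all names are assumptions:

```python
import numpy as np

def depth_to_points(depth, K):
    """Pinhole back-projection from a depth image (HxW, metres) to 3D
    camera coordinates; this is the mapping the feed-forward network
    in the abstract is trained to approximate."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    v, u = np.mgrid[0:depth.shape[0], 0:depth.shape[1]]
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)          # HxWx3 point map

def semantic_cloud(depth, labels, K):
    """Attach a class label to every valid 3D point; 'labels' is a
    hypothetical HxW per-pixel segmentation output from the CNN."""
    pts = depth_to_points(depth, K)
    valid = depth > 0
    return pts[valid], labels[valid]
```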