13 research outputs found

    Robust human detection with occlusion handling by fusion of thermal and depth images from mobile robot

    Get PDF
    In this paper, a robust surveillance system that enables robots to detect humans in indoor environments is proposed. The method fuses information from thermal and depth images, which allows humans to be detected even under occlusion. It consists of three stages: pre-processing, ROI generation, and object classification. A new dataset was developed to evaluate the performance of the proposed method. The experimental results show that it is able to detect multiple humans under occlusions and illumination variations.
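
    A minimal sketch of the three-stage structure described above, assuming aligned thermal and depth frames; all function names, depth ranges, and thresholds below are illustrative assumptions, not the paper's implementation:

```python
# Sketch of the pipeline: pre-processing -> ROI generation -> classification.
import cv2
import numpy as np

def preprocess(thermal, depth):
    """Stage 1: normalize the thermal frame and denoise the depth frame."""
    thermal = cv2.normalize(thermal, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    depth = cv2.medianBlur(depth.astype(np.float32), 5)
    return thermal, depth

def generate_rois(depth, min_area=500):
    """Stage 2: segment depth into connected blobs as candidate human regions.
    The 0.5-5 m working range (in mm) is an illustrative assumption."""
    mask = ((depth > 500) & (depth < 5000)).astype(np.uint8) * 255
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    return [stats[i, :4] for i in range(1, n) if stats[i, cv2.CC_STAT_AREA] >= min_area]

def classify(thermal, roi, temp_thresh=160):
    """Stage 3 placeholder: warm blobs are labeled human; a trained
    classifier would replace this heuristic in a real system."""
    x, y, w, h = roi
    return thermal[y:y + h, x:x + w].mean() > temp_thresh

def detect_humans(thermal, depth):
    thermal, depth = preprocess(thermal, depth)
    return [roi for roi in generate_rois(depth) if classify(thermal, roi)]
```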

    Deep learning fusion of RGB and depth images for pedestrian detection

    Get PDF
    In this paper, we propose an effective method based on the Faster-RCNN structure to combine RGB and depth images for pedestrian detection. During the training stage, we generate a semantic segmentation map from the depth image and use it to refine the convolutional features extracted from the RGB images. In addition, we acquire more accurate region proposals by exploring the perspective projection with the help of depth information. Experimental results demonstrate that our proposed method achieves the state-of-the-art RGBD pedestrian detection performance on the KITTI [12] dataset.
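
    A sketch of the depth-guided feature refinement idea: a segmentation map derived from depth gates the RGB convolutional features. The module name, tensor shapes, and the residual gating form are assumptions for illustration; the paper's exact architecture may differ.

```python
import torch
import torch.nn.functional as F

def refine_features(rgb_feats, depth_seg):
    """rgb_feats: (N, C, H, W) backbone features from the RGB branch.
    depth_seg: (N, 1, H0, W0) soft person/background map from depth."""
    # Resize the segmentation map to the feature resolution.
    seg = F.interpolate(depth_seg, size=rgb_feats.shape[-2:],
                        mode='bilinear', align_corners=False)
    # Residual gating keeps the original features but emphasizes person regions.
    return rgb_feats * (1.0 + seg)

feats = torch.randn(2, 256, 38, 50)   # e.g. conv features of a 600x800 image
seg = torch.rand(2, 1, 600, 800)      # depth-derived semantic map
refined = refine_features(feats, seg)
print(refined.shape)                  # torch.Size([2, 256, 38, 50])
```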

    Growing Neural Gas with Different Topologies for 3D Space Perception

    Get PDF
    Three-dimensional space perception is one of the most important capabilities for an autonomous mobile robot operating in an unknown environment, since the robot must detect a target object and estimate its 3D pose to perform given tasks efficiently. After a 3D point cloud is measured by an RGB-D camera, the robot needs to reconstruct a structure from the point cloud with color information according to the given tasks, since the point cloud is unstructured data. For reconstructing the unstructured point cloud, growing neural gas (GNG) based methods have been utilized in many studies, since GNG can learn the data distribution of the point cloud appropriately. However, conventional GNG based methods leave problems of scalability and multi-viewpoint clustering unsolved. In this paper, we therefore propose growing neural gas with different topologies (GNG-DT) as a new topological structure learning method for solving these problems. GNG-DT maintains a separate topology for each property, while the conventional GNG method has a single topology over the whole input vector. In addition, the distance measure for winner node selection uses only the position information, preserving the spatial structure of the point cloud. We show several experimental results for the proposed method using simulation and RGB-D datasets measured by Kinect, verifying that it outperforms the other methods in quantization and clustering errors in almost all cases. Finally, we summarize the proposed method and discuss future directions of this research.
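
    A sketch of the position-only winner-selection rule described above: each node stores several properties (here position and color), but the two nearest nodes are chosen by position distance alone, so the spatial structure of the point cloud is preserved. The data layout and learning rate are assumptions, not the paper's code.

```python
import numpy as np

def select_winners(nodes_pos, x_pos):
    """Return indices of the first and second winner for input position x_pos."""
    d = np.linalg.norm(nodes_pos - x_pos, axis=1)  # position-only distance
    s1, s2 = np.argsort(d)[:2]
    return s1, s2

def adapt(nodes_pos, nodes_col, x_pos, x_col, eps_w=0.1):
    """Move the winner toward the input in every property space; in GNG-DT a
    separate edge set (topology) per property would also be updated here."""
    s1, _ = select_winners(nodes_pos, x_pos)
    nodes_pos[s1] += eps_w * (x_pos - nodes_pos[s1])
    nodes_col[s1] += eps_w * (x_col - nodes_col[s1])
    return s1

rng = np.random.default_rng(0)
pos, col = rng.random((50, 3)), rng.random((50, 3))
adapt(pos, col, rng.random(3), rng.random(3))
```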

    Person tracking in robotized industrial environments with RGB-D images

    Get PDF
    Object detection is one of the most common applications of computer vision: images are processed to obtain relations that describe the objects and their type, called classes. Detection in video is similar; however, following an object over time is an even more complex task. With technological progress, images and video are no longer only 2D and have acquired three-dimensionality, with a new channel that estimates the depth of objects. This work analyzes existing methods for people tracking in depth images and evaluates their performance in robotized industrial environments.
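
    One way such an evaluation could score tracker output against per-frame ground truth is sketched below; the IoU threshold and greedy matching are generic assumptions, not the thesis's protocol.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def frame_score(gt_boxes, trk_boxes, thr=0.5):
    """Greedy matching; returns (matches, misses, false_positives)."""
    unmatched = list(range(len(trk_boxes)))
    matches = 0
    for g in gt_boxes:
        best = max(unmatched, key=lambda j: iou(g, trk_boxes[j]), default=None)
        if best is not None and iou(g, trk_boxes[best]) >= thr:
            matches += 1
            unmatched.remove(best)
    return matches, len(gt_boxes) - matches, len(unmatched)

print(frame_score([(0, 0, 10, 20)], [(1, 1, 10, 19), (50, 50, 60, 70)]))  # (1, 0, 1)
```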

    Bio-inspired predictive orientation decomposition of skeleton trajectories for real-time human activity prediction

    Full text link

    Selective joint motion recognition using multi sensor for salat learning

    Get PDF
    Over the past few years, significant attention has been given to motion recognition in computer vision, as it has a wide range of potential applications. Hence, a wide variety of algorithms and techniques have been proposed to develop human motion recognition systems. Salat, an essential ritual in Muslim daily life, is not solely a spiritual act; it also involves physical movements that must be performed according to its code of conduct. Existing motion recognition methods proposed for computing applications are unsuitable for salat, since its movements must be performed in accordance with the stipulated rules and procedures, including their accuracy and sequence. In addition, tracking all skeleton joints is computationally intensive, and not all joints contribute equally to activity recognition. Current salat recognition focuses on the main movements and does not cover the whole cycle of the salat activity. Moreover, using a wearable sensor is unnatural when performing salat, since the user needs to give absolute concentration during the activity. The research conducted here sits at the intersection of technological development and Muslim spiritual practice. The proposed system uses dual-sensor cameras and a special sensor prayer mat that cooperate in recognizing salat movements and identifying errors in them. With current depth cameras and software development kits, human joint information is available for locating joint positions. Only the important joints with significant movement were selected for tracking, enabling real-time motion recognition. This selective joint algorithm is computationally efficient and offers good recognition accuracy in real time. Once the features had been constructed, a Hidden Markov Model classifier was used to train and test the algorithm, which was evaluated on a purpose-built dataset of depth videos recorded with a Kinect camera. The system was designed to recognize the user's movements and error rate during salat, which were then compared with a traditional tutor-based methodology. Subsequently, an evaluation comprising 25 participants was conducted using usability testing methods, measuring the success score of the user's salat movement recognition and the error rate; user experience and subjective satisfaction toward the proposed system were also considered to evaluate user acceptance. The results indicated a significant difference (p < 0.05) in success score and error rate between the proposed system and the traditional tutor-based methodology. The study showed that the proposed motion recognition system successfully recognized salat movements and evaluated user errors, offering an alternative salat learning methodology. Such a motion recognition system could offer an alternative learning process in a variety of study domains, not just salat movement activity.
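
    A sketch of the selective-joint recognition idea: keep only the joints with significant movement, then score each candidate movement class with a per-class Hidden Markov Model. The joint indices, feature layout, and use of hmmlearn are assumptions for illustration, not the study's implementation.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

SELECTED = [3, 4, 7, 8, 11]  # hypothetical indices of high-movement joints

def features(skeleton_seq):
    """skeleton_seq: (T, n_joints, 3) -> (T, len(SELECTED) * 3)."""
    return skeleton_seq[:, SELECTED, :].reshape(len(skeleton_seq), -1)

def train(class_sequences, n_states=5):
    """Fit one HMM per movement class from its training sequences."""
    models = {}
    for label, seqs in class_sequences.items():
        X = np.vstack([features(s) for s in seqs])
        lengths = [len(s) for s in seqs]
        models[label] = GaussianHMM(n_components=n_states).fit(X, lengths)
    return models

def recognize(models, skeleton_seq):
    """Pick the class whose HMM gives the highest log-likelihood."""
    X = features(skeleton_seq)
    return max(models, key=lambda lbl: models[lbl].score(X))
```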

    Behavioral pedestrian tracking using a camera and lidar sensors on a moving vehicle

    Get PDF
    In this paper, we present a novel 2D–3D pedestrian tracker designed for applications in autonomous vehicles. The system operates on a tracking-by-detection principle and can track multiple pedestrians in complex urban traffic situations. By using a behavioral motion model and a non-parametric distribution as the state model, we are able to accurately track unpredictable pedestrian motion in the presence of heavy occlusion. Tracking is performed independently on the image and ground plane, in global, motion-compensated coordinates. We employ camera and LiDAR data fusion to solve the association problem, where the optimal solution is found by matching 2D and 3D detections to tracks using a joint log-likelihood observation model. Each 2D–3D particle filter then updates its state from associated observations and a behavioral motion model. Each particle moves independently, following pedestrian motion parameters which we learned offline from an annotated training dataset. Temporal stability of the state variables is achieved by modeling each track as a Markov Decision Process with probabilistic state transition properties. A novel track management system then handles high-level actions such as track creation, deletion, and interaction. Using a probabilistic track score, the track manager can cull false and ambiguous detections while updating tracks with detections from actual pedestrians. Our system is implemented on a GPU and exploits the massively parallelizable nature of particle filters. Due to the Markovian nature of our track representation, the system achieves real-time performance with a minimal memory footprint. Exhaustive and independent evaluation of our tracker was performed by the KITTI benchmark server, where it was tested against a wide variety of unknown pedestrian tracking situations. On this realistic benchmark, we outperform all published pedestrian trackers on a multitude of tracking metrics.
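
    A sketch of the association step described above: camera (2D) and LiDAR (3D) detections are matched to tracks by maximizing a joint log-likelihood. The Gaussian observation models and their variances are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def log_gauss(x, mu, sigma):
    """Isotropic Gaussian log-density up to an additive constant."""
    return -0.5 * np.sum((x - mu) ** 2) / sigma**2

def associate(tracks_2d, tracks_3d, det_2d, det_3d, s2=20.0, s3=0.5):
    """tracks_*: (T, d) predicted states; det_*: (D, d) detections.
    Returns (track_idx, det_idx) pairs maximizing the joint log-likelihood."""
    T, D = len(tracks_2d), len(det_2d)
    ll = np.empty((T, D))
    for i in range(T):
        for j in range(D):
            ll[i, j] = (log_gauss(det_2d[j], tracks_2d[i], s2)     # image plane
                        + log_gauss(det_3d[j], tracks_3d[i], s3))  # ground plane
    rows, cols = linear_sum_assignment(-ll)  # Hungarian method on negated scores
    return list(zip(rows, cols))
```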