    Person Detection, Tracking and Identification by Mobile Robots Using RGB-D Images

    This dissertation addresses the use of RGB-D images for six important tasks of mobile robots: face detection, face tracking, face pose estimation, face recognition, person detection and person tracking. These topics have been widely researched in recent years because they provide mobile robots with abilities necessary to communicate with humans in natural ways. The RGB-D images from a Microsoft Kinect camera are expected to play an important role in improving both the accuracy and the computational cost of the proposed algorithms for mobile robots. We contribute several applications of the Microsoft Kinect camera for mobile robots and show their effectiveness in realistic experiments on our mobile robots.

    An important component for mobile robots to interact with humans in a natural way is real-time multiple face detection. Various face detection algorithms for mobile robots have been proposed; however, almost none of them have met the requirements of accuracy and speed needed to run in real time on a robot platform. Within the scope of our research, we have developed a method that combines the color and depth images provided by a Kinect camera with navigation information for face detection on mobile robots. We demonstrate several experiments on challenging datasets. Our results show that this method improves accuracy, reduces computational cost, and runs in real time in indoor environments.

    Tracking faces in uncontrolled environments remains a challenging task because both the face and the background change quickly over time, and the face often moves through different illumination conditions. RGB-D images are beneficial for this task because the mobile robot can easily estimate the face size and improve the performance of face tracking over varying distances between the mobile robot and the human. In this dissertation, we present a real-time algorithm for mobile robots to track human faces accurately even when humans move freely, move far away from the camera, or pass through different illumination conditions in uncontrolled environments. We combine an adaptive correlation filter (David S. Bolme and Lui (2010)) with Viola-Jones object detection (Viola and Jones (2001b)) to track the face. Furthermore, we introduce a new technique for face pose estimation, which is applied after the face has been tracked. On the tracked face, the same combination of an adaptive correlation filter and Viola-Jones object detection is applied to reliably track facial features, namely the two external eye corners and the nose. These facial features provide geometric cues for estimating the face pose robustly. We carefully analyze the accuracy of these approaches on different datasets and show how they can run robustly on a mobile robot in uncontrolled environments. Both face tracking and face pose estimation play key roles as essential preprocessing steps for robust face recognition on mobile robots.

    The ability to recognize faces is a crucial element for human-robot interaction. Therefore, we pursue an approach for mobile robots to detect, track and recognize human faces accurately, even as they pass through different illumination conditions. For improved accuracy, the tracked face is recognized by an algorithm that combines local ternary patterns and collaborative representation based classification.
    This approach inherits the advantages of both collaborative representation based classification, which is fast and relatively accurate, and local ternary patterns, which are robust to face misalignment and complex illumination conditions. The combination enhances the efficiency of face recognition under varying illumination and noisy conditions. Our method achieves high recognition rates on challenging face databases and can run in real time on mobile robots.

    An important application field of RGB-D images is person detection and tracking by mobile robots. Compared to classical RGB images, RGB-D images provide additional depth information for locating humans more precisely and reliably. For this purpose, the mobile robot moves around in its environment and continuously detects and tracks people, even when they frequently change pose in a wide variety of ways and are often occluded. We have improved the performance of face and upper-body detection to make person detection more robust to partial occlusions and changes in human pose. To handle more complex pose changes and occlusions, we concurrently use a fast compressive tracker and a Kalman filter to track the detected humans. Experimental results on a challenging database show that our method achieves high performance and can run in real time on mobile robots.
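    To make the detect-then-track pattern described above concrete, the following is a minimal Python/OpenCV sketch that seeds a MOSSE adaptive correlation filter tracker from Viola-Jones (Haar cascade) detections and reads a rough face distance from an aligned Kinect depth image. It relies on OpenCV's stock frontal-face cascade and the cv2.legacy.TrackerMOSSE_create tracker from opencv-contrib-python; the frame-handling details are assumptions, and this is not the dissertation's own implementation.

# A minimal detect-then-track sketch (not the dissertation's implementation):
# Viola-Jones detection via OpenCV's Haar cascade seeds a MOSSE adaptive
# correlation filter tracker, and an aligned Kinect depth image gives a rough
# face distance. Requires opencv-contrib-python for the legacy MOSSE tracker.
import cv2
import numpy as np

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
tracker = None  # created once a face has been detected


def face_distance_m(box, depth_mm):
    """Median depth inside the face box, in metres (0.0 if no valid depth)."""
    x, y, w, h = (int(v) for v in box)
    patch = depth_mm[y:y + h, x:x + w]
    valid = patch[patch > 0]
    return float(np.median(valid)) / 1000.0 if valid.size else 0.0


def process_frame(bgr, depth_mm):
    """Return (face box, distance in metres) or (None, None)."""
    global tracker
    if tracker is not None:
        ok, box = tracker.update(bgr)           # fast correlation-filter update
        if ok:
            return tuple(int(v) for v in box), face_distance_m(box, depth_mm)
        tracker = None                          # tracking lost: re-detect

    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5)
    if len(faces) == 0:
        return None, None
    box = tuple(int(v) for v in faces[0])       # (x, y, w, h)
    tracker = cv2.legacy.TrackerMOSSE_create()  # adaptive correlation filter
    tracker.init(bgr, box)
    return box, face_distance_m(box, depth_mm)

    In this arrangement the cheap correlation-filter update carries the per-frame load, and the more expensive Viola-Jones detector is invoked only when tracking is lost, which is what makes real-time operation on a robot platform plausible.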

    Human robot interaction in a crowded environment

    Human Robot Interaction (HRI) is the primary means of establishing natural and affective communication between humans and robots. HRI enables robots to act in a way similar to humans in order to assist in activities that are considered to be laborious, unsafe, or repetitive. Vision-based human robot interaction is a major component of HRI, in which visual information is used to interpret how human interaction takes place. Common tasks of HRI include finding pre-trained static or dynamic gestures in an image, which involves localising different key parts of the human body, such as the face and hands. This information is subsequently used to extract different gestures. After the initial detection process, the robot is required to comprehend the underlying meaning of these gestures [3]. Thus far, most gesture recognition systems can only detect gestures and identify a person in relatively static environments. This is not realistic for practical applications, as difficulties may arise from people's movements and changing illumination conditions. Another issue to consider is that of identifying the commanding person in a crowded scene, which is important for interpreting navigation commands. To this end, it is necessary to associate a gesture with the correct person, and automatic reasoning is required to extract the most probable location of the person who initiated the gesture.

    In this thesis, we have proposed a practical framework for addressing the above issues. It attempts to achieve a coarse-level understanding of a given environment before engaging in active communication. This includes recognizing the situations in which a person has the intention to communicate with the robot. In this regard, it is necessary to differentiate whether people present are engaged with each other or with their surrounding environment. The basic task is to detect and reason about the environmental context and the different interactions so as to respond accordingly. For example, if individuals are engaged in conversation, the robot should realize it is best not to disturb them; if an individual is receptive to the robot's interaction, it may approach the person; and if the user is moving in the environment, it can analyse the situation further to understand whether any assistance can be offered.

    The method proposed in this thesis combines multiple visual cues in a Bayesian framework to identify people in a scene and determine their potential intentions. To improve system performance, contextual feedback is used, which allows the Bayesian network to evolve and adjust itself according to the surrounding environment. The results achieved demonstrate the effectiveness of the technique in dealing with human-robot interaction in a relatively crowded environment [7].
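    As a toy illustration of combining multiple visual cues in a Bayesian manner, the sketch below fuses a few binary cues (facing the robot, a waving gesture, approaching motion) with a naive-Bayes update to obtain the probability that a person intends to interact. The cue set and the likelihood values are invented for illustration; the thesis itself uses a richer Bayesian network with contextual feedback rather than this simplified form.

# Toy naive-Bayes fusion of visual cues: several weak binary observations are
# combined into a posterior probability that a person wants to interact.
# Cue names and likelihood numbers are assumptions made up for illustration.

# P(cue observed | engaged), P(cue observed | not engaged)
LIKELIHOODS = {
    "facing_robot":   (0.85, 0.30),
    "waving_gesture": (0.60, 0.05),
    "approaching":    (0.70, 0.25),
}

def engagement_posterior(observed_cues, prior=0.2):
    """observed_cues: dict cue_name -> bool. Returns P(engaged | cues)."""
    p_eng, p_not = prior, 1.0 - prior
    for cue, present in observed_cues.items():
        p_true_eng, p_true_not = LIKELIHOODS[cue]
        p_eng *= p_true_eng if present else (1.0 - p_true_eng)
        p_not *= p_true_not if present else (1.0 - p_true_not)
    return p_eng / (p_eng + p_not)

if __name__ == "__main__":
    cues = {"facing_robot": True, "waving_gesture": True, "approaching": False}
    print(f"P(engaged | cues) = {engagement_posterior(cues):.2f}")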

    Computational intelligence approaches to robotics, automation, and control [Volume guest editors]


    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as those demanding low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available, and the tasks they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
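    As a small illustration of the event data described above, the sketch below stores events as (timestamp, x, y, polarity) records and sums their polarities over a short time window into an "event frame" that conventional frame-based vision code can consume. The structured-array layout and field names are assumptions for illustration, not a standard sensor format.

# Minimal sketch of handling an event stream: each event is
# (timestamp, x, y, polarity). Accumulating signed events over a short time
# window yields a simple "event frame". The record layout is an assumption.
import numpy as np

EVENT_DTYPE = np.dtype([("t", np.float64),   # timestamp in seconds
                        ("x", np.uint16),    # pixel column
                        ("y", np.uint16),    # pixel row
                        ("p", np.int8)])     # polarity: +1 or -1

def accumulate_events(events, height, width, t_start, t_end):
    """Sum event polarities per pixel within the window [t_start, t_end)."""
    frame = np.zeros((height, width), dtype=np.int32)
    window = events[(events["t"] >= t_start) & (events["t"] < t_end)]
    np.add.at(frame, (window["y"], window["x"]), window["p"])
    return frame

# Example: three synthetic events accumulated over a 1 ms window.
events = np.array([(0.0001, 10, 5, 1), (0.0004, 10, 5, 1), (0.0007, 3, 2, -1)],
                  dtype=EVENT_DTYPE)
print(accumulate_events(events, height=8, width=16, t_start=0.0, t_end=0.001))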

    Adaptive Real-Time Image Processing for Human-Computer Interaction


    Unobtrusive and pervasive video-based eye-gaze tracking

    Eye-gaze tracking has long been considered a desktop technology that finds its use inside the traditional office setting, where the operating conditions may be controlled. Nonetheless, recent advancements in mobile technology and a growing interest in capturing natural human behaviour have motivated an emerging interest in tracking eye movements within unconstrained real-life conditions, referred to as pervasive eye-gaze tracking. This critical review focuses on emerging passive and unobtrusive video-based eye-gaze tracking methods in recent literature, with the aim of identifying the different research avenues that are being followed in response to the challenges of pervasive eye-gaze tracking. Different eye-gaze tracking approaches are discussed in order to bring out their strengths and weaknesses, and to identify any limitations, within the context of pervasive eye-gaze tracking, that have yet to be considered by the computer vision community.

    Mobile Robots in Human Environments: towards safe, comfortable and natural navigation
