
    Fast and robust image feature matching methods for computer vision applications

Service robotic systems are designed to solve tasks such as recognizing and manipulating objects, understanding natural scenes, and navigating in dynamic and populated environments. Such tasks cannot be modeled in all necessary detail as easily as industrial robot tasks can; a service robotic system therefore has to be able to sense and interact with the surrounding physical environment through a multitude of sensors and actuators. Environment sensing is one of the core problems limiting the deployment of mobile service robots, since existing sensing systems are either too slow or too expensive. Visual sensing is the most promising way to provide a cost-effective solution to the mobile robot sensing problem. It is usually achieved using one or several digital cameras placed on the robot or distributed in its environment. Digital cameras are information-rich, relatively inexpensive sensors that can be used to solve a number of key problems for robotics and other autonomous intelligent systems, such as visual servoing, robot navigation, object recognition, and pose estimation. The key challenge in taking advantage of this powerful and inexpensive sensor is to devise algorithms that can reliably and quickly extract and match the visual information needed to interpret the environment automatically in real time. Although considerable research has been conducted in recent years on algorithms for computer and robot vision problems, open research challenges remain with respect to reliability, accuracy and processing time. The Scale Invariant Feature Transform (SIFT) is one of the most widely used methods and has attracted much attention in the computer vision community because SIFT features are highly distinctive and invariant to scale, rotation and illumination changes. In addition, SIFT features are relatively easy to extract and to match against a large database of local features. The SIFT algorithm has two main drawbacks, however. First, its computational complexity increases rapidly with the number of keypoints, especially at the matching step, owing to the high dimensionality of the SIFT feature descriptor. Second, SIFT features are not robust to large viewpoint changes. These drawbacks limit the use of SIFT for robot vision applications, which often require real-time performance and must cope with large viewpoint changes. This dissertation proposes three new approaches to address these constraints: speeded-up SIFT feature matching, robust SIFT feature matching, and the inclusion of a closed-loop control structure in object recognition and pose estimation systems. The proposed methods are implemented and tested on the FRIEND II/III service robotic system. The achieved results are valuable for adapting the SIFT algorithm to robot vision applications.
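To make the matching bottleneck concrete, the following is a minimal OpenCV sketch of the standard SIFT extract-and-match pipeline (with Lowe's ratio test), not the speeded-up or robust variants the dissertation proposes; the image file names are placeholders:

```python
import cv2

# Load two grayscale views of the same scene (paths are placeholders).
img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matching on the 128-D descriptors. The cost of this step
# grows with the number of keypoints, which is the complexity bottleneck
# described in the abstract.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des1, des2, k=2)

# Lowe's ratio test: keep a match only if it is clearly better than
# the second-best candidate.
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} matches after ratio test")
```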

    Face Recognition using Segmental Euclidean Distance

In this paper an attempt is made to detect the face using a combination of the integral image and a cascaded classifier built with the AdaBoost learning algorithm. The detected faces are then passed through a filtering process to discard non-face regions. Each face is split into five segments: forehead, eyes, nose, mouth and chin. Each segment is treated as a separate image, and Eigenface, also called principal component analysis (PCA), features are computed for each segment. Faces with a slight pose are also aligned for proper segmentation. The test image is segmented in the same way and its PCA features are found. A segmental Euclidean distance classifier is used to match the test image against the stored ones. The success rate is 88 per cent on the CG (full) database created from the databases of the California Institute and the Georgia Institute. However, the performance of this approach on the ORL (full) database with the same features is only 70 per cent. For comparison, DCT (full) and fuzzy features are tried on the CG and ORL databases, but with a well-known classifier, the support vector machine (SVM). Recognition rates with DCT features and the SVM classifier are 3 per cent higher than those obtained with PCA features and the Euclidean distance classifier on the CG database. Recognition improves to 96 per cent with fuzzy features on the ORL database with SVM.
Defence Science Journal, 2011, 61(5), pp. 431-442, DOI: http://dx.doi.org/10.14429/dsj.61.117
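A minimal sketch of how a segmental Euclidean distance classifier might combine the five per-segment PCA features is shown below. The summation rule and function names are assumptions, since the abstract does not spell out the combination step; the PCA projection itself is omitted:

```python
import numpy as np

SEGMENTS = ["forehead", "eyes", "nose", "mouth", "chin"]

def segmental_distance(test_feats, gallery_feats):
    """Sum of per-segment Euclidean distances between PCA feature vectors.

    test_feats / gallery_feats map each segment name to a 1-D PCA feature
    vector. Summing the five distances is an assumed combination rule.
    """
    return sum(np.linalg.norm(test_feats[s] - gallery_feats[s]) for s in SEGMENTS)

def classify(test_feats, gallery):
    # gallery: {subject_id: per-segment feature dict}; the nearest
    # stored face (smallest total distance) wins.
    return min(gallery, key=lambda sid: segmental_distance(test_feats, gallery[sid]))
```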

Automatic object classification for surveillance videos

The recent popularity of surveillance video systems, especially in urban scenarios, demands the development of visual techniques for monitoring purposes. A primary step towards intelligent surveillance video systems is automatic object classification, which remains an open research problem and the keystone for the development of more specific applications. Typically, object representation is based on inherent visual features. However, psychological studies have demonstrated that human beings routinely categorise objects according to their behaviour. The gap between the features automatically extracted by a computer, such as appearance-based features, and the concepts perceived by human beings but unattainable for machines, such as behaviour, is commonly known as the semantic gap. Consequently, this thesis proposes to narrow the semantic gap and bring machine and human understanding together for object classification. A Surveillance Media Management framework is proposed to automatically detect and classify objects by analysing both the physical properties inherent in their appearance (machine understanding) and the behaviour patterns that require a higher level of understanding (human understanding). Finally, a probabilistic multimodal fusion algorithm bridges the gap, performing an automatic classification that considers both machine and human understanding. The performance of the proposed Surveillance Media Management framework has been thoroughly evaluated on outdoor surveillance datasets. The experiments conducted demonstrated that the combination of machine and human understanding substantially enhanced object classification performance, and that the inclusion of human reasoning and understanding provides the essential information to bridge the semantic gap towards smart surveillance video systems.
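One common way to realise probabilistic multimodal fusion of two classifier branches is a weighted late fusion of their class posteriors. The sketch below uses a weighted geometric mean as an illustrative rule; the weight and the specific fusion formula are assumptions, and the thesis's actual algorithm may differ:

```python
import numpy as np

def fuse_posteriors(p_appearance, p_behaviour, w=0.5):
    """Weighted product fusion of two class-posterior vectors.

    p_appearance, p_behaviour: arrays of per-class probabilities from
    the appearance (machine-understanding) and behaviour
    (human-understanding) branches. The geometric-mean rule and the
    weight w are illustrative assumptions.
    """
    fused = (p_appearance ** w) * (p_behaviour ** (1.0 - w))
    return fused / fused.sum()  # renormalise to a probability distribution

# Example with three hypothetical object classes (person, car, cyclist).
p_app = np.array([0.6, 0.3, 0.1])
p_beh = np.array([0.2, 0.7, 0.1])
print(fuse_posteriors(p_app, p_beh))
```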

    Inner-Canthus Localization of Thermal Images in Face-View Invariant

Inner-canthus localization plays an essential role in measuring human body temperature, because human core body temperature can be measured at the inner canthus. Such measurement is useful for mass screening since it is non-contact, non-invasive and fast. This paper presents an algorithm developed to locate the inner canthus. The algorithm is robust to various face views, i.e., frontal, side and tilted. It consists of face segmentation, determination of face orientation, rotation of the face into a straight view, eye localization, and inner-canthus localization. The face segmentation uses a human temperature threshold of 34°C. The face orientation is estimated from the trend line through the midpoints between the bottom-most and top-most coordinates, and the face is rotated according to the gradient of this trend line. Once the face is rotated, the eye locations are determined using facial proportions, and the inner-canthus location is taken as the highest intensities within the eye frame. Tests on 15 thermal face images with various views showed a localization accuracy of 80% for eye-frame determination and 100% for inner-canthus localization.
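Two of the pipeline steps translate directly into code: thresholding the thermal image at 34°C and taking the hottest location inside the eye frame. A minimal sketch follows; it assumes a thermal image already calibrated in degrees Celsius and omits the orientation, rotation and facial-proportion steps:

```python
import numpy as np

def segment_face(thermal, threshold_c=34.0):
    """Binary face mask from a thermal image given in degrees Celsius."""
    return thermal > threshold_c

def inner_canthus(thermal, eye_frame):
    """Return the hottest pixel inside the eye frame (y0, y1, x0, x1).

    The paper takes the highest intensities in the eye frame as the
    inner-canthus location; this sketch returns the single hottest pixel.
    """
    y0, y1, x0, x1 = eye_frame
    roi = thermal[y0:y1, x0:x1]
    dy, dx = np.unravel_index(np.argmax(roi), roi.shape)
    return y0 + dy, x0 + dx
```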

    A Methodology for Extracting Human Bodies from Still Images

Monitoring and surveillance of humans is one of today's most prominent applications, and it is expected to be part of many future aspects of our life, for safety reasons, assisted living and many others. Many efforts have been made towards automatic and robust solutions, but the general problem is very challenging and remains open. In this PhD dissertation we examine the problem from many perspectives. First, we study the performance of a hardware architecture designed for large-scale surveillance systems. Then, we focus on the general problem of human activity recognition, present an extensive survey of methodologies that deal with this subject and propose a maturity metric to evaluate them. Image segmentation is one of the most popular image-processing algorithms in the field, and we propose a blind metric to evaluate segmentation results with respect to the activity in local regions. Finally, we propose a fully automatic system for segmenting and extracting human bodies from challenging single images, which is the main contribution of the dissertation. Our methodology is a novel bottom-up approach relying mostly on anthropometric constraints and is facilitated by our research in the fields of face, skin and hands detection. Experimental results and comparison with state-of-the-art methodologies demonstrate the success of our approach.
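Skin detection is one of the building blocks the dissertation mentions for its bottom-up approach. A common baseline, shown here only as an illustrative stand-in for the dissertation's own detector, is fixed thresholding in the YCrCb colour space; the Cr/Cb bounds below are typical literature values, not thresholds taken from this work:

```python
import cv2
import numpy as np

def skin_mask(bgr):
    """Rough skin mask via fixed YCrCb thresholds (illustrative values)."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, 133, 77], dtype=np.uint8)    # Y, Cr, Cb lower bounds
    upper = np.array([255, 173, 127], dtype=np.uint8)  # Y, Cr, Cb upper bounds
    return cv2.inRange(ycrcb, lower, upper)
```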

    Unmanned Aerial Systems for Wildland and Forest Fires

Wildfires represent an important natural risk, causing economic losses, human deaths and significant environmental damage. In recent years we have witnessed an increase in fire intensity and frequency. Research has been conducted towards dedicated solutions for wildland and forest fire assistance and fighting, and systems have been proposed for the remote detection and tracking of fires. These systems have shown improvements in efficient data collection and fire characterization within small-scale environments. However, wildfires cover large areas, making some of the proposed ground-based systems unsuitable for optimal coverage. To tackle this limitation, Unmanned Aerial Systems (UAS) were proposed. UAS have proven useful thanks to their maneuverability, which allows the implementation of remote sensing, allocation strategies and task planning. They can provide a low-cost alternative for the prevention, detection and real-time support of firefighting. In this paper we review previous work on the use of UAS in wildfires, considering onboard sensor instruments, fire perception algorithms and coordination strategies. In addition, we present some of the recent frameworks that propose using both aerial vehicles and Unmanned Ground Vehicles (UGV) for a more efficient wildland firefighting strategy at a larger scale.
Comment: A recently published version of this paper is available at: https://doi.org/10.3390/drones501001

    Artificial Vision in the Nao Humanoid Robot

Master's final project (UPC), carried out in collaboration with the Universitat Rovira i Virgili, Departament d'Enginyeria Informàtica i Matemàtiques.
RoboCup is an international robotic soccer competition held yearly to promote innovative research and applications in robotic intelligence. The Nao humanoid robot is the new RoboCup Standard Platform robot, designed and manufactured by the French company Aldebaran Robotics. It is an advanced platform for developing new computer vision and robotics methods. This Master's thesis studies some fundamental issues in artificial vision for Nao humanoid robots. In particular, colour representation models, real-time segmentation techniques, object detection and visual sonar approaches are the computer vision techniques applied to the Nao robot in this thesis. Nao's camera model, the robot's mathematical kinematics and stereo-vision techniques are also studied and developed. The thesis further studies the integration of the kinematic model with the robot perception model to perform RoboCup soccer games and RoboCup technical challenges. The work is focused on the RoboCup environment, but all of the computer vision and robotics algorithms can easily be extended to other robotics fields.
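Real-time colour segmentation is central to RoboCup vision. The sketch below shows one generic way to detect a coloured object (e.g., an orange ball) by HSV thresholding; the HSV bounds are illustrative values for indoor lighting, not calibrated constants from this thesis, and the detection logic is a generic example rather than the thesis's own method:

```python
import cv2
import numpy as np

def detect_ball(bgr):
    """Detect an orange ball by HSV thresholding (illustrative bounds)."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array([5, 120, 120]), np.array([20, 255, 255]))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    # Take the largest blob and fit its minimum enclosing circle.
    (x, y), r = cv2.minEnclosingCircle(max(contours, key=cv2.contourArea))
    return int(x), int(y), int(r)
```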

    A New Approach to Automatic Saliency Identification in Images Based on Irregularity of Regions

This research introduces an image retrieval system which is, in different ways, inspired by the human vision system. The main problems with existing machine vision systems and image understanding are studied and identified in order to design a system that relies on human image understanding. The main improvement of the developed system is that it uses human attention principles in the process of identifying image contents. Human attention is represented by saliency extraction algorithms, which extract the salient regions, in other words the regions of interest. This work presents a new approach to saliency identification which relies on the irregularity of a region. Irregularity is clearly defined and measuring tools are developed; these measures are derived from the formality and variation of the region with respect to the surrounding regions. Both local and global saliency have been studied, and appropriate algorithms were developed based on the local and global irregularity defined in this work. The need for suitable automatic clustering techniques motivated us to study the available clustering techniques and to develop a technique suitable for clustering salient points. Based on the fact that humans usually look at the region surrounding the gaze point, an agglomerative clustering technique is developed utilising the principles of blob extraction and intersection. Automatic thresholding was needed at different stages of the system's development, so a fuzzy thresholding technique was developed. Evaluation methods for saliency region extraction have been studied and analysed; subsequently we developed evaluation techniques based on the extracted regions (or points) and compared them with ground truth data. The proposed algorithms were tested against standard datasets and compared with existing state-of-the-art algorithms. Both quantitative and qualitative benchmarking are presented in this thesis, together with a detailed discussion of the results. The benchmarking showed promising results for the different algorithms. The developed algorithms have been utilised in designing an integrated saliency-based image retrieval system which uses the salient regions to give a description of the scene. The system auto-labels the objects in the image by identifying the salient objects and assigning labels based on the contents of the knowledge database. In addition, the system identifies the unimportant part of the image (the background) to give a full description of the scene.
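As an illustration of grouping salient points into regions, the sketch below uses standard single-linkage agglomerative clustering with a distance cutoff; this is a generic stand-in for, not a reproduction of, the blob-extraction-and-intersection scheme developed in the thesis, and the cutoff value is an assumption:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_salient_points(points, max_gap=25.0):
    """Group salient (x, y) points into regions by agglomerative clustering.

    points: (n, 2) array of salient-point coordinates. Points closer than
    max_gap pixels (single linkage) end up in the same cluster.
    """
    Z = linkage(points, method="single")
    return fcluster(Z, t=max_gap, criterion="distance")

# Example: two well-separated groups of salient points.
pts = np.array([[10, 12], [14, 15], [11, 13], [200, 210], [205, 208]])
print(cluster_salient_points(pts))  # e.g., [1 1 1 2 2]
```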