439 research outputs found

    Shape-based invariant features extraction for object recognition

    No full text
    International audienceThe emergence of new technologies enables generating large quantity of digital information including images; this leads to an increasing number of generated digital images. Therefore it appears a necessity for automatic systems for image retrieval. These systems consist of techniques used for query specification and re-trieval of images from an image collection. The most frequent and the most com-mon means for image retrieval is the indexing using textual keywords. But for some special application domains and face to the huge quantity of images, key-words are no more sufficient or unpractical. Moreover, images are rich in content; so in order to overcome these mentioned difficulties, some approaches are pro-posed based on visual features derived directly from the content of the image: these are the content-based image retrieval (CBIR) approaches. They allow users to search the desired image by specifying image queries: a query can be an exam-ple, a sketch or visual features (e.g., colour, texture and shape). Once the features have been defined and extracted, the retrieval becomes a task of measuring simi-larity between image features. An important property of these features is to be in-variant under various deformations that the observed image could undergo. In this chapter, we will present a number of existing methods for CBIR applica-tions. We will also describe some measures that are usually used for similarity measurement. At the end, and as an application example, we present a specific ap-proach, that we are developing, to illustrate the topic by providing experimental results

    Alignment of Hyperspectral Images Using KAZE Features

    Get PDF
    Image registration is a common operation in any type of image processing, specially in remote sensing images. Since the publication of the scale–invariant feature transform (SIFT) method, several algorithms based on feature detection have been proposed. In particular, KAZE builds the scale space using a nonlinear diffusion filter instead of Gaussian filters. Nonlinear diffusion filtering allows applying a controlled blur while the important structures of the image are preserved. Hyperspectral images contain a large amount of spatial and spectral information that can be used to perform a more accurate registration. This article presents HSI–KAZE, a method to register hyperspectral remote sensing images based on KAZE but considering the spectral information. The proposed method combines the information of a set of preselected bands, and it adapts the keypoint descriptor and the matching stage to take into account the spectral information. The method is adequate to register images in extreme situations in which the scale between them is very different. The effectiveness of the proposed algorithm has been tested on real images taken on different dates, and presenting different types of changes. The experimental results show that the method is robust achieving image registrations with scales of up to 24.0×This research was supported in part by the Consellería de Cultura, Educación e Ordenación Universitaria, Xunta de Galicia [grant numbers GRC2014/008 and ED431G/08] and Ministerio de Educación, Cultura y Deporte [grant number TIN2016-76373-P] both are co–funded by the European Regional Development Fund. The work of Álvaro Ordóñez was supported by the Ministerio de Educación, Cultura y Deporte under an FPU Grant [grant number FPU16/03537]. This work was also partially supported by Consejería de Educación, Junta de Castilla y León (PROPHET Project) [grant number VA082P17]S

    Video content analysis for intelligent forensics

    Get PDF
    The networks of surveillance cameras installed in public places and private territories continuously record video data with the aim of detecting and preventing unlawful activities. This enhances the importance of video content analysis applications, either for real time (i.e. analytic) or post-event (i.e. forensic) analysis. In this thesis, the primary focus is on four key aspects of video content analysis, namely; 1. Moving object detection and recognition, 2. Correction of colours in the video frames and recognition of colours of moving objects, 3. Make and model recognition of vehicles and identification of their type, 4. Detection and recognition of text information in outdoor scenes. To address the first issue, a framework is presented in the first part of the thesis that efficiently detects and recognizes moving objects in videos. The framework targets the problem of object detection in the presence of complex background. The object detection part of the framework relies on background modelling technique and a novel post processing step where the contours of the foreground regions (i.e. moving object) are refined by the classification of edge segments as belonging either to the background or to the foreground region. Further, a novel feature descriptor is devised for the classification of moving objects into humans, vehicles and background. The proposed feature descriptor captures the texture information present in the silhouette of foreground objects. To address the second issue, a framework for the correction and recognition of true colours of objects in videos is presented with novel noise reduction, colour enhancement and colour recognition stages. The colour recognition stage makes use of temporal information to reliably recognize the true colours of moving objects in multiple frames. The proposed framework is specifically designed to perform robustly on videos that have poor quality because of surrounding illumination, camera sensor imperfection and artefacts due to high compression. In the third part of the thesis, a framework for vehicle make and model recognition and type identification is presented. As a part of this work, a novel feature representation technique for distinctive representation of vehicle images has emerged. The feature representation technique uses dense feature description and mid-level feature encoding scheme to capture the texture in the frontal view of the vehicles. The proposed method is insensitive to minor in-plane rotation and skew within the image. The capability of the proposed framework can be enhanced to any number of vehicle classes without re-training. Another important contribution of this work is the publication of a comprehensive up to date dataset of vehicle images to support future research in this domain. The problem of text detection and recognition in images is addressed in the last part of the thesis. A novel technique is proposed that exploits the colour information in the image for the identification of text regions. Apart from detection, the colour information is also used to segment characters from the words. The recognition of identified characters is performed using shape features and supervised learning. Finally, a lexicon based alignment procedure is adopted to finalize the recognition of strings present in word images. Extensive experiments have been conducted on benchmark datasets to analyse the performance of proposed algorithms. The results show that the proposed moving object detection and recognition technique superseded well-know baseline techniques. The proposed framework for the correction and recognition of object colours in video frames achieved all the aforementioned goals. The performance analysis of the vehicle make and model recognition framework on multiple datasets has shown the strength and reliability of the technique when used within various scenarios. Finally, the experimental results for the text detection and recognition framework on benchmark datasets have revealed the potential of the proposed scheme for accurate detection and recognition of text in the wild

    A comprehensive review of fruit and vegetable classification techniques

    Get PDF
    Recent advancements in computer vision have enabled wide-ranging applications in every field of life. One such application area is fresh produce classification, but the classification of fruit and vegetable has proven to be a complex problem and needs to be further developed. Fruit and vegetable classification presents significant challenges due to interclass similarities and irregular intraclass characteristics. Selection of appropriate data acquisition sensors and feature representation approach is also crucial due to the huge diversity of the field. Fruit and vegetable classification methods have been developed for quality assessment and robotic harvesting but the current state-of-the-art has been developed for limited classes and small datasets. The problem is of a multi-dimensional nature and offers significantly hyperdimensional features, which is one of the major challenges with current machine learning approaches. Substantial research has been conducted for the design and analysis of classifiers for hyperdimensional features which require significant computational power to optimise with such features. In recent years numerous machine learning techniques for example, Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Decision Trees, Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN) have been exploited with many different feature description methods for fruit and vegetable classification in many real-life applications. This paper presents a critical comparison of different state-of-the-art computer vision methods proposed by researchers for classifying fruit and vegetable

    3D automatic target recognition for missile platforms

    Get PDF
    The quest for military Automatic Target Recognition (ATR) procedures arises from the demand to reduce collateral damage and fratricide. Although missiles with two-dimensional ATR capabilities do exist, the potential of future Light Detection and Ranging (LIDAR) missiles with three-dimensional (3D) ATR abilities shall significantly improve the missile’s effectiveness in complex battlefields. This is because 3D ATR can encode the target’s underlying structure and thus reinforce target recognition. However, the current military grade 3D ATR or military applied computer vision algorithms used for object recognition do not pose optimum solutions in the context of an ATR capable LIDAR based missile, primarily due to the computational and memory (in terms of storage) constraints that missiles impose. Therefore, this research initially introduces a 3D descriptor taxonomy for the Local and the Global descriptor domain, capable of realising the processing cost of each potential option. Through these taxonomies, the optimum missile oriented descriptor per domain is identified that will further pinpoint the research route for this thesis. In terms of 3D descriptors that are suitable for missiles, the contribution of this thesis is a 3D Global based descriptor and four 3D Local based descriptors namely the SURF Projection recognition (SPR), the Histogram of Distances (HoD), the processing efficient variant (HoD-S) and the binary variant B-HoD. These are challenged against current state-of-the-art 3D descriptors on standard commercial datasets, as well as on highly credible simulated air-to-ground missile engagement scenarios that consider various platform parameters and nuisances including simulated scale change and atmospheric disturbances. The results obtained over the different datasets showed an outstanding computational improvement, on average x19 times faster than state-of-the-art techniques in the literature, while maintaining or even improving on some occasions the detection rate to a minimum of 90% and over of correct classified targets

    Pattern Recognition

    Get PDF
    A wealth of advanced pattern recognition algorithms are emerging from the interdiscipline between technologies of effective visual features and the human-brain cognition process. Effective visual features are made possible through the rapid developments in appropriate sensor equipments, novel filter designs, and viable information processing architectures. While the understanding of human-brain cognition process broadens the way in which the computer can perform pattern recognition tasks. The present book is intended to collect representative researches around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters coved in this book disclose recent advances and new ideas in promoting the techniques, technology and applications of pattern recognition

    Signal processing and machine learning techniques for human verification based on finger textures

    Get PDF
    PhD ThesisIn recent years, Finger Textures (FTs) have attracted considerable attention as potential biometric characteristics. They can provide robust recognition performance as they have various human-speci c features, such as wrinkles and apparent lines distributed along the inner surface of all ngers. The main topic of this thesis is verifying people according to their unique FT patterns by exploiting signal processing and machine learning techniques. A Robust Finger Segmentation (RFS) method is rst proposed to isolate nger images from a hand area. It is able to detect the ngers as objects from a hand image. An e cient adaptive nger segmentation method is also suggested to address the problem of alignment variations in the hand image called the Adaptive and Robust Finger Segmentation (ARFS) method. A new Multi-scale Sobel Angles Local Binary Pattern (MSALBP) feature extraction method is proposed which combines the Sobel direction angles with the Multi-Scale Local Binary Pattern (MSLBP). Moreover, an enhanced method called the Enhanced Local Line Binary Pattern (ELLBP) is designed to e ciently analyse the FT patterns. As a result, a powerful human veri cation scheme based on nger Feature Level Fusion with a Probabilistic Neural Network (FLFPNN) is proposed. A multi-object fusion method, termed the Finger Contribution Fusion Neural Network (FCFNN), combines the contribution scores of the nger objects. The veri cation performances are examined in the case of missing FT areas. Consequently, to overcome nger regions which are poorly imaged a method is suggested to salvage missing FT elements by exploiting the information embedded within the trained Probabilistic Neural Network (PNN). Finally, a novel method to produce a Receiver Operating Characteristic (ROC) curve from a PNN is suggested. Furthermore, additional development to this method is applied to generate the ROC graph from the FCFNN. Three databases are employed for evaluation: The Hong Kong Polytechnic University Contact-free 3D/2D (PolyU3D2D), Indian Institute of Technology (IIT) Delhi and Spectral 460nm (S460) from the CASIA Multi-Spectral (CASIAMS) databases. Comparative simulation studies con rm the e ciency of the proposed methods for human veri cation. The main advantage of both segmentation approaches, the RFS and ARFS, is that they can collect all the FT features. The best results have been benchmarked for the ELLBP feature extraction with the FCFNN, where the best Equal Error Rate (EER) values for the three databases PolyU3D2D, IIT Delhi and CASIAMS (S460) have been achieved 0.11%, 1.35% and 0%, respectively. The proposed salvage approach for the missing feature elements has the capability to enhance the veri cation performance for the FLFPNN. Moreover, ROC graphs have been successively established from the PNN and FCFNN.the ministry of higher education and scientific research in Iraq (MOHESR); the Technical college of Mosul; the Iraqi Cultural Attach e; the active people in the MOHESR, who strongly supported Iraqi students

    Effective and efficient kernel-based image representations for classification and retrieval

    Get PDF
    Image representation is a challenging task. In particular, in order to obtain better performances in different image processing applications such as video surveillance, autonomous driving, crime scene detection and automatic inspection, effective and efficient image representation is a fundamental need. The performance of these applications usually depends on how accurately images are classified into their corresponding groups or how precisely relevant images are retrieved from a database based on a query. Accuracy in image classification and precision in image retrieval depend on the effectiveness of image representation. Existing image representation methods have some limitations. For example, spatial pyramid matching, which is a popular method incorporating spatial information in image-level representation, has not been fully studied to date. In addition, the strengths of pyramid match kernel and spatial pyramid matching are not combined for better image matching. Kernel descriptors based on gradient, colour and shape overcome the limitations of histogram-based descriptors, but suffer from information loss, noise effects and high computational complexity. Furthermore, the combined performance of kernel descriptors has limitations related to computational complexity, higher dimensionality and lower effectiveness. Moreover, the potential of a global texture descriptor which is based on human visual perception has not been fully explored to date. Therefore, in this research project, kernel-based effective and efficient image representation methods are proposed to address the above limitations. An enhancement is made to spatial pyramid matching in terms of improved rotation invariance. This is done by investigating different partitioning schemes suitable to achieve rotation-invariant image representation and the proposal of a weight function for appropriate level contribution in image matching. In addition, the strengths of pyramid match kernel and spatial pyramid are combined to enhance matching accuracy between images. The existing kernel descriptors are modified and improved to achieve greater effectiveness, minimum noise effects, less dimensionality and lower computational complexity. A novel fusion approach is also proposed to combine the information related to all pixel attributes, before the descriptor extraction stage. Existing kernel descriptors are based only on gradient, colour and shape information. In this research project, a texture-based kernel descriptor is proposed by modifying an existing popular global texture descriptor. Finally, all the contributions are evaluated in an integrated system. The performances of the proposed methods are qualitatively and quantitatively evaluated on two to four different publicly available image databases. The experimental results show that the proposed methods are more effective and efficient in image representation than existing benchmark methods.Doctor of Philosoph

    Saliency for Image Description and Retrieval

    Get PDF
    We live in a world where we are surrounded by ever increasing numbers of images. More often than not, these images have very little metadata by which they can be indexed and searched. In order to avoid information overload, techniques need to be developed to enable these image collections to be searched by their content. Much of the previous work on image retrieval has used global features such as colour and texture to describe the content of the image. However, these global features are insufficient to accurately describe the image content when different parts of the image have different characteristics. This thesis initially discusses how this problem can be circumvented by using salient interest regions to select the areas of the image that are most interesting and generate local descriptors to describe the image characteristics in that region. The thesis discusses a number of different saliency detectors that are suitable for robust retrieval purposes and performs a comparison between a number of these region detectors. The thesis then discusses how salient regions can be used for image retrieval using a number of techniques, but most importantly, two techniques inspired from the field of textual information retrieval. Using these robust retrieval techniques, a new paradigm in image retrieval is discussed, whereby the retrieval takes place on a mobile device using a query image captured by a built-in camera. This paradigm is demonstrated in the context of an art gallery, in which the device can be used to find more information about particular images. The final chapter of the thesis discusses some approaches to bridging the semantic gap in image retrieval. The chapter explores ways in which un-annotated image collections can be searched by keyword. Two techniques are discussed; the first explicitly attempts to automatically annotate the un-annotated images so that the automatically applied annotations can be used for searching. The second approach does not try to explicitly annotate images, but rather, through the use of linear algebra, it attempts to create a semantic space in which images and keywords are positioned such that images are close to the keywords that represent them within the space
    corecore