
    A comprehensive review of fruit and vegetable classification techniques

    Recent advances in computer vision have enabled applications across a wide range of fields. One such application is fresh produce classification, but classifying fruits and vegetables has proven to be a complex problem that needs further development. It presents significant challenges because of interclass similarities and irregular intraclass characteristics. Selecting appropriate data acquisition sensors and a suitable feature representation is also crucial, given the huge diversity of the field. Fruit and vegetable classification methods have been developed for quality assessment and robotic harvesting, but the current state of the art covers only limited classes and small datasets. The problem is multi-dimensional and yields hyperdimensional features, which is one of the major challenges for current machine learning approaches. Substantial research has been conducted on the design and analysis of classifiers for hyperdimensional features, which require significant computational power to optimise. In recent years, numerous machine learning techniques, for example Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Decision Trees, Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN), have been applied with many different feature description methods to fruit and vegetable classification in real-life applications. This paper presents a critical comparison of state-of-the-art computer vision methods proposed by researchers for classifying fruits and vegetables.

    Recognition and matching in the presence of deformation and lighting change

    Natural images of objects and scenes show a fascinating amount of variability due to factors such as lighting and viewpoint change, occlusion, articulation and non-rigid deformation. In certain cases, such as recognition of specular objects and images with arbitrary deformations, existing techniques do not perform well. For image deformation, we propose a method for faster keypoint matching with histogram descriptors and a completely deformation invariant representation. We also propose a method for improving specular object recognition. Histograms are a powerful statistical representation for keypoint matching and content based image retrieval. The earth mover's distance (EMD) is an important perceptually meaningful metric for comparing histograms, but it suffers from high (O(n^3 log n)) computational complexity. We propose a novel linear time algorithm for approximating EMD with the weighted L1 norm of the wavelet transform of the difference histogram. We prove that the resulting wavelet EMD metric is equivalent to EMD. We experimentally show that wavelet EMD is a good approximation to EMD, has similar performance, but requires much less computation. We also give a fast algorithm for the best partial EMD match between two histograms. Images of non-planar objects can undergo a large non-linear deformation due to a viewpoint change. Complex deformations occur in images of non-rigid objects, for example, in medical image sequences. We propose using the contour tree as a novel framework invariant to arbitrary deformations for representing and comparing images. It represents all the deformation invariant information in an image. Lighting changes greatly affect the appearance of specular objects and make recognition much more difficult than for Lambertian objects. In model based recognition of specular objects, an important constraint is that the estimated lighting should be non-negative everywhere. We propose a new method to enforce this constraint and explore its usefulness in specular object recognition, using the spherical harmonic representation of lighting. The new method is faster as well as more accurate than previous methods. Experiments on both synthetic and real data indicate that the constraint can improve recognition of specular objects by better separating the correct and incorrect models.
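
The idea of approximating EMD by a weighted L1 norm of the wavelet coefficients of the difference histogram can be sketched for 1D histograms as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the Haar wavelet is chosen for brevity, the weight (support width raised to the power 1.5) is an assumed 1D weighting, and the function names `haar_details`, `wavelet_emd` and `emd_1d` are hypothetical. The result matches EMD only up to scale, and Haar makes it sensitive to how bins align with dyadic boundaries.

```python
import math


def haar_details(signal):
    """Return [(support_width, detail_coeffs)] of an orthonormal Haar transform, finest level first."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    levels = []
    approx = list(signal)
    width = 2  # number of bins covered by one wavelet at the current level
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        levels.append((width, [(a - b) / math.sqrt(2) for a, b in pairs]))
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        width *= 2
    return levels


def wavelet_emd(hist_a, hist_b):
    """Weighted L1 norm of the Haar coefficients of the difference histogram."""
    diff = [a - b for a, b in zip(hist_a, hist_b)]
    total = 0.0
    for width, coeffs in haar_details(diff):
        weight = width ** 1.5  # assumption: coarser scales (longer moves) cost more
        total += weight * sum(abs(c) for c in coeffs)
    return total


def emd_1d(hist_a, hist_b):
    """Exact EMD for equal-mass 1D histograms with unit distance between adjacent bins."""
    total, running = 0.0, 0.0
    for a, b in zip(hist_a, hist_b):
        running += a - b
        total += abs(running)
    return total


if __name__ == "__main__":
    base = [1, 0, 0, 0, 0, 0, 0, 0]
    near = [0, 1, 0, 0, 0, 0, 0, 0]  # mass moved by one bin
    far = [0, 0, 1, 0, 0, 0, 0, 0]   # mass moved by two bins
    for other in (base, near, far):
        print(emd_1d(base, other), wavelet_emd(base, other))
```

The wavelet version runs in time linear in the number of bins, which is the property the abstract highlights; a smoother wavelet would normally be preferred over Haar when approximation quality matters.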

    Automated Semantic Content Extraction from Images

    In this study, an automatic semantic segmentation and object recognition methodology is implemented which bridges the semantic gap between low level features of image content and high level conceptual meaning. Semantically understanding an image is essential in modeling autonomous robots, targeting customers in marketing or reverse engineering of building information modeling in the construction industry. To achieve an understanding of a room from a single image, we propose a new object recognition framework which has four major components: segmentation, scene detection, conceptual cueing and object recognition. The new segmentation methodology developed in this research extends Felzenszwalb's cost function to include new surface index and depth features as well as color, texture and normal features to overcome issues of occlusion and shadowing commonly found in images. Adding depth allows new features to be captured for the object recognition stage, achieving high accuracy compared to the current state of the art. The goal was to develop an approach to capture and label perceptually important regions which often reflect global representation and understanding of the image. We developed a system that uses contextual and common sense information to improve object recognition and scene detection, and fused the information from scene and objects to reduce the level of uncertainty. In addition to improving segmentation, scene detection and object recognition, this study can be used in applications that require physical parsing of the image into objects, surfaces and their relations. The applications include robotics, social networking, intelligence and anti-terrorism efforts, criminal investigations and security, marketing, and building information modeling in the construction industry. In this dissertation a structural framework (ontology) is developed that generates text descriptions based on understanding of objects, structures and the attributes of an image.

    Video content analysis for intelligent forensics

    The networks of surveillance cameras installed in public places and private territories continuously record video data with the aim of detecting and preventing unlawful activities. This enhances the importance of video content analysis applications, either for real time (i.e. analytic) or post-event (i.e. forensic) analysis. In this thesis, the primary focus is on four key aspects of video content analysis, namely: (1) moving object detection and recognition; (2) correction of colours in the video frames and recognition of colours of moving objects; (3) make and model recognition of vehicles and identification of their type; and (4) detection and recognition of text information in outdoor scenes. To address the first issue, a framework is presented in the first part of the thesis that efficiently detects and recognizes moving objects in videos. The framework targets the problem of object detection in the presence of complex background. The object detection part of the framework relies on a background modelling technique and a novel post processing step where the contours of the foreground regions (i.e. moving objects) are refined by the classification of edge segments as belonging either to the background or to the foreground region. Further, a novel feature descriptor is devised for the classification of moving objects into humans, vehicles and background. The proposed feature descriptor captures the texture information present in the silhouette of foreground objects. To address the second issue, a framework for the correction and recognition of true colours of objects in videos is presented with novel noise reduction, colour enhancement and colour recognition stages. The colour recognition stage makes use of temporal information to reliably recognize the true colours of moving objects in multiple frames. 
The proposed framework is specifically designed to perform robustly on videos that have poor quality because of surrounding illumination, camera sensor imperfection and artefacts due to high compression. In the third part of the thesis, a framework for vehicle make and model recognition and type identification is presented. As a part of this work, a novel feature representation technique for distinctive representation of vehicle images has emerged. The feature representation technique uses dense feature description and a mid-level feature encoding scheme to capture the texture in the frontal view of the vehicles. The proposed method is insensitive to minor in-plane rotation and skew within the image. The proposed framework can be extended to any number of vehicle classes without re-training. Another important contribution of this work is the publication of a comprehensive, up-to-date dataset of vehicle images to support future research in this domain. The problem of text detection and recognition in images is addressed in the last part of the thesis. A novel technique is proposed that exploits the colour information in the image for the identification of text regions. Apart from detection, the colour information is also used to segment characters from the words. The recognition of identified characters is performed using shape features and supervised learning. Finally, a lexicon based alignment procedure is adopted to finalize the recognition of strings present in word images. Extensive experiments have been conducted on benchmark datasets to analyse the performance of the proposed algorithms. The results show that the proposed moving object detection and recognition technique outperformed well-known baseline techniques. The proposed framework for the correction and recognition of object colours in video frames achieved all the aforementioned goals. 
The performance analysis of the vehicle make and model recognition framework on multiple datasets has shown the strength and reliability of the technique when used within various scenarios. Finally, the experimental results for the text detection and recognition framework on benchmark datasets have revealed the potential of the proposed scheme for accurate detection and recognition of text in the wild.

    Directional wavelet based features for colonic polyp classification

    In this work, various wavelet based methods like the discrete wavelet transform, the dual-tree complex wavelet transform, the Gabor wavelet transform, curvelets, contourlets and shearlets are applied for the automated classification of colonic polyps. The methods are tested on 8 HD-endoscopic image databases, where each database is acquired using different imaging modalities (Pentax's i-Scan technology combined with or without staining the mucosa), 2 NBI high-magnification databases and one database with chromoscopy high-magnification images. To evaluate the suitability of the wavelet based methods with respect to the classification of colonic polyps, the classification performances of 3 wavelet transforms and the more recent curvelets, contourlets and shearlets are compared using a common framework. Wavelet transforms have already been applied often and successfully to the classification of colonic polyps, whereas curvelets, contourlets and shearlets have not been used for this purpose so far. We apply different feature extraction techniques to extract the information of the subbands of the wavelet based methods. Most of the 25 approaches have already been published in different texture classification contexts. Thus, the aim is also to assess and compare their classification performance using a common framework. Three of the 25 approaches are novel. These three approaches extract Weibull features from the subbands of curvelets, contourlets and shearlets. Additionally, 5 state-of-the-art non-wavelet based methods are applied to our databases so that we can compare their results with those of the wavelet based methods. It turned out that extracting Weibull distribution parameters from the subband coefficients generally leads to good classification results, especially for the dual-tree complex wavelet transform, the Gabor wavelet transform and the shearlet transform. 
These three wavelet based transforms in combination with Weibull features even outperform the state-of-the-art methods on most of the databases. We also show that the Weibull distribution is better suited to model the subband coefficient distribution than other commonly used probability distributions like the Gaussian distribution and the generalized Gaussian distribution. This work thus gives a reasonable summary of wavelet based methods for colonic polyp classification, and the large number of endoscopic polyp databases used in our experiments supports the significance of the achieved results.

    Colour and texture image analysis in a Local Binary Pattern framework

    In this thesis we use colour and Local Binary Pattern based texture analysis for image classification and reconstruction. In complementary work we offer a new texture description called the Sudoku transform, an extension of the Local Binary Pattern. When used to classify members of benchmark datasets, our new method shows a performance improvement over traditional methods, including the Local Binary Pattern. Finally, we consider the invertibility of texture descriptions and show how, with our new method, Quadratic Reconstruction, a highly accurate image can be recovered purely from its textural information.
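
For readers unfamiliar with the Local Binary Pattern that this thesis builds on, the basic 3x3 operator can be sketched as below. This is a minimal illustration of the standard LBP only. The Sudoku transform and Quadratic Reconstruction are not described in the abstract and are not shown. The function names `lbp_code` and `lbp_histogram`, the neighbour ordering and the `>=` threshold convention are assumptions made for the example.

```python
def lbp_code(image, row, col):
    """8-bit LBP code for the interior pixel at (row, col) of a 2D list of grey values."""
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left pixel (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels; usable as a texture feature."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


if __name__ == "__main__":
    patch = [
        [9, 1, 1],
        [1, 5, 1],
        [1, 1, 1],
    ]
    print(lbp_code(patch, 1, 1))  # only the top-left neighbour exceeds the centre
```

The histogram of codes, rather than the code image itself, is what is typically passed to a classifier, because it summarises local texture independently of position.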

    Computer Vision for Microscopy Applications
