100 research outputs found

    Investigation of new feature descriptors for image search and classification

    Get PDF
    Content-based image search, classification and retrieval is an active and important research area due to its broad applications as well as the complexity of the problem. Understanding the semantics and contents of images for recognition remains one of the most difficult and prevalent problems in the machine intelligence and computer vision community. With large variations in size, pose, illumination and occlusion, image classification is a very challenging task. A good classification framework should address the key issues of discriminative feature extraction as well as efficient and accurate classification. Towards that end, this dissertation focuses on exploring new image descriptors by incorporating cues from the human visual system, and integrating local, texture, shape as well as color information to construct robust and effective feature representations for advancing content-based image search and classification. Based on the Gabor wavelet transformation, whose kernels are similar to the 2D receptive field profiles of mammalian cortical simple cells, a series of new image descriptors is developed. Specifically, first, a new color Gabor-HOG (GHOG) descriptor is introduced by concatenating the Histograms of Oriented Gradients (HOG) of the component images produced by applying Gabor filters at multiple scales and orientations, to encode shape information. Second, the GHOG descriptor is analyzed in six different color spaces and grayscale to propose different color GHOG descriptors, which are further combined to present a new Fused Color GHOG (FC-GHOG) descriptor. Third, a novel Gabor-PHOG (GPHOG) descriptor is proposed which improves upon the Pyramid Histograms of Oriented Gradients (PHOG) descriptor, and subsequently a new FC-GPHOG descriptor is constructed by combining the multiple color GPHOG descriptors and employing Principal Component Analysis (PCA). Next, the Gabor-LBP (GLBP) descriptor is derived by accumulating the Local Binary Patterns (LBP) histograms of the local Gabor-filtered images to encode texture and local information of an image. Furthermore, a novel Gabor-LBP-PHOG (GLP) image descriptor is proposed which integrates the GLBP and the GPHOG descriptors as a feature set, and an innovative Fused Color Gabor-LBP-PHOG (FC-GLP) descriptor is constructed by fusing the GLP from multiple color spaces. The GLBP and the GHOG descriptors are then combined to produce the Gabor-LBP-HOG (GLH) feature vector, which performs well on different object and scene image categories. The six color GLH vectors are further concatenated to form the Fused Color GLH (FC-GLH) descriptor. Finally, the Wigner-based Local Binary Patterns (WLBP) descriptor is proposed, which combines multi-neighborhood LBP, the Pseudo-Wigner distribution of images and the popular bag-of-words model to effectively classify scene images. To assess the feasibility of the proposed new image descriptors, two classification methods are used: one method applies PCA and the Enhanced Fisher Model (EFM) for feature extraction and the nearest neighbor rule for classification, while the other method employs the Support Vector Machine (SVM). The classification performance of the proposed descriptors is tested on several publicly available popular image datasets.
The experimental results show that the proposed new image descriptors achieve image search and classification results better than or on par with other popular image descriptors, such as the Scale Invariant Feature Transform (SIFT), the Pyramid Histogram of Visual Words (PHOW), the Pyramid Histograms of Oriented Gradients (PHOG), the Spatial Envelope (SE), the Color SIFT four Concentric Circles (C4CC), the Object Bank (OB), the Context Aware Topic Model (CA-TM), the Hierarchical Matching Pursuit (HMP), the Kernel Spatial Pyramid Matching (KSPM), the SIFT Sparse-coded Spatial Pyramid Matching (Sc-SPM), the Kernel Codebook (KC) and the LBP.
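    To make the core GHOG construction above concrete, the following is a minimal illustrative sketch (not the dissertation's actual implementation): it assumes a grayscale input image, uses scikit-image's gabor and hog functions, and picks arbitrary example values for the Gabor frequencies, orientations and HOG parameters.

        # Sketch of a Gabor-HOG (GHOG) style descriptor: filter the image with a small
        # Gabor bank (several scales and orientations), compute a HOG histogram for each
        # filtered component image, and concatenate the histograms.
        import numpy as np
        from skimage.filters import gabor
        from skimage.feature import hog

        def ghog_descriptor(gray, frequencies=(0.1, 0.2, 0.3), n_orientations=4):
            """gray: 2-D float array in [0, 1]. Returns a 1-D feature vector."""
            parts = []
            for f in frequencies:                      # Gabor scales (spatial frequencies)
                for k in range(n_orientations):        # Gabor orientations
                    theta = k * np.pi / n_orientations
                    real, imag = gabor(gray, frequency=f, theta=theta)
                    magnitude = np.hypot(real, imag)   # Gabor-filtered component image
                    parts.append(hog(magnitude,
                                     orientations=9,
                                     pixels_per_cell=(16, 16),
                                     cells_per_block=(2, 2)))
            return np.concatenate(parts)               # GHOG = concatenated HOG histograms

    Under the same assumption, the color variants described above (e.g. FC-GHOG) would run this per color component in each color space and concatenate the results, with PCA applied afterwards for dimensionality reduction.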

    A survey on artificial intelligence-based acoustic source identification

    Get PDF
    The concept of Acoustic Source Identification (ASI), which refers to the process of identifying noise sources, has attracted increasing attention in recent years. ASI technology can be used for surveillance, monitoring, and maintenance applications in a wide range of sectors, such as defence, manufacturing, healthcare, and agriculture. Acoustic signature analysis and pattern recognition remain the core technologies for noise source identification. Manual identification of acoustic signatures, however, has become increasingly challenging as dataset sizes grow. As a result, the use of Artificial Intelligence (AI) techniques for identifying noise sources has become increasingly relevant and useful. In this paper, we provide a comprehensive review of AI-based acoustic source identification techniques. We analyze the strengths and weaknesses of AI-based ASI processes and associated methods proposed by researchers in the literature. Additionally, we present a detailed survey of ASI applications in machinery, underwater applications, environment/event source recognition, healthcare, and other fields. We also highlight relevant research directions.

    Holistic methods for visual navigation of mobile robots in outdoor environments

    Get PDF
    Differt D. Holistic methods for visual navigation of mobile robots in outdoor environments. Bielefeld: Universität Bielefeld; 2017

    Feature point classification and matching

    Get PDF
    Ankara: The Department of Electrical and Electronics Engineering and the Institute of Engineering and Sciences of Bilkent University, 2007. Thesis (Master's) -- Bilkent University, 2007. Includes bibliographical references (leaves 85-105). A feature point is a salient point which can be separated from its neighborhood. Widely used definitions assume that feature points are corners. However, some non-feature points also satisfy this assumption. Hence, non-feature points, which are highly undesired, are usually detected as feature points. Texture properties around detected points can be used to eliminate non-feature points by determining the distinctiveness of the detected points within their neighborhoods. There are many texture description methods, such as autoregressive models, Gibbs/Markov random field models, and time-frequency transforms. To increase the performance of feature point related applications, two new feature point descriptors are proposed and used in non-feature point elimination and feature point sorting and matching. To keep the descriptor algorithm computationally feasible, a single image resolution scale is selected for analyzing the texture properties around the detected points. To create a scale-space, wavelet decomposition is applied to the given images and neighborhood scale-spaces are formed for every detected point. The analysis scale of a point is selected according to the changes in the kurtosis values of histograms which are extracted from the neighborhood scale-space. By using the descriptors, detected non-feature points are eliminated, feature points are sorted, and, with the inclusion of conventional descriptors, feature points are matched. According to the scores obtained in the experiments, the proposed detection-matching scheme performs more reliably than the Harris detector with gray-level patch matching. However, the SIFT detection-matching scheme performs better than the proposed scheme. Ay, Avşar Polat. M.S.
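    The kurtosis-based scale selection described in this abstract could look roughly like the following sketch. It is an illustrative stand-in, not the thesis's method: the Haar wavelet, the neighbourhood size, the histogram binning and the "largest change in kurtosis" rule are all assumptions made here for concreteness.

        # Sketch: build a wavelet scale-space around a detected point and pick the
        # analysis scale from changes in the kurtosis of local histograms.
        import numpy as np
        import pywt
        from scipy.stats import kurtosis

        def select_analysis_scale(gray, point, levels=3, half_win=8):
            """gray: 2-D float array; point: (row, col) of a detected feature point.
            Returns the wavelet level (1..levels) chosen as the analysis scale."""
            r, c = point
            kurt_per_level = []
            approx = np.asarray(gray, dtype=float)
            for level in range(1, levels + 1):
                approx = pywt.dwt2(approx, 'haar')[0]            # approximation band at this level
                rr, cc = r >> level, c >> level                   # point coordinates at this level
                patch = approx[max(rr - half_win, 0): rr + half_win,
                               max(cc - half_win, 0): cc + half_win]
                hist, _ = np.histogram(patch, bins=32, density=True)
                kurt_per_level.append(kurtosis(hist))             # shape of the local histogram
            # pick the level where the kurtosis changes most between consecutive scales
            changes = np.abs(np.diff(kurt_per_level))
            return int(np.argmax(changes)) + 1 if changes.size else 1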

    Improvements of local directional pattern for texture classification.

    Get PDF
    Doctoral Degree. University of KwaZulu-Natal, Durban. The Local Directional Pattern (LDP) method has established its effectiveness and performance compared to the popular Local Binary Pattern (LBP) method in different applications. In this thesis, several extensions and modifications of LDP are proposed with the objective of increasing its robustness and discriminative power. Local Directional Pattern (LDP) depends on the empirical choice of three for the number of significant bits used to code the responses of the Kirsch mask operation. In a first study, we applied LDP to informal settlements using various values for the number of significant bits k. It was observed that changing the number of significant bits led to a change in performance, depending on the application. Local Directional Pattern (LDP) is based on the computation of Kirsch mask response values in eight directions. However, this method ignores the gray value of the center pixel, which may lead to a loss of significant information. Centered Local Directional Pattern (CLDP) is introduced to solve this issue, using the value of the center pixel based on its relations with neighboring pixels. Local Directional Pattern (LDP) also generates a code based on the absolute value of the edge response; however, the sign of the original value indicates two different trends (positive or negative) of the gradient. To capture the gradient trend, Signed Local Directional Pattern (SLDP) and Centered-SLDP (C-SLDP) are proposed, which compute the eight edge responses based on the two different directions (positive or negative) of the gradients. The Directional Local Binary Pattern (DLBP) is introduced, which adopts directional information to represent texture images. This method is more stable than both LDP and LBP because it uses the center pixel as a threshold for the edge response of a pixel in eight directions, instead of employing the center pixel as the threshold for the pixel intensities of the neighbors, as in the LBP method. Angled Local Directional Pattern (ALDP) is also presented, with the objective of resolving two problems in the LDP method: the choice of the number of significant bits k, and taking into account the center pixel value. It computes the angle values for the edge response of a pixel in eight directions for each angle (0°, 45°, 90°, 135°). Each angle vector contains three values. The central value in each vector is chosen as a threshold for the other two neighboring pixels. Circular Local Directional Pattern (CILDP) is also presented, with the objective of better analysis, especially for textures at different scales. The method is built around a circular shape to compute the directional edge vector using different radii. The performances of LDP, LBP, CLDP, SLDP, C-SLDP, DLBP, ALDP and CILDP are evaluated using five classifiers (K-nearest neighbour (k-NN), Support Vector Machine (SVM), Perceptron, Naive Bayes (NB), and Decision Tree (DT)) applied to two different texture datasets: the Kylberg dataset and the KTH-TIPS2-b dataset. The experimental results demonstrated that the proposed methods outperform both LDP and LBP.
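    For reference, the baseline LDP coding that these variants extend can be sketched as follows. This is a generic illustration of the published LDP idea (eight Kirsch mask responses, keep the k = 3 strongest), not the thesis's code; the mask ordering and bit assignment are arbitrary choices made here.

        # Sketch: Local Directional Pattern (LDP) codes from the eight Kirsch mask
        # responses, keeping the k most significant responses per pixel.
        import numpy as np
        from scipy.ndimage import convolve

        # Eight Kirsch edge masks (E, NE, N, NW, W, SW, S, SE).
        KIRSCH = [np.array(m, dtype=float) for m in (
            [[-3, -3,  5], [-3, 0,  5], [-3, -3,  5]],   # E
            [[-3,  5,  5], [-3, 0,  5], [-3, -3, -3]],   # NE
            [[ 5,  5,  5], [-3, 0, -3], [-3, -3, -3]],   # N
            [[ 5,  5, -3], [ 5, 0, -3], [-3, -3, -3]],   # NW
            [[ 5, -3, -3], [ 5, 0, -3], [ 5, -3, -3]],   # W
            [[-3, -3, -3], [ 5, 0, -3], [ 5,  5, -3]],   # SW
            [[-3, -3, -3], [-3, 0, -3], [ 5,  5,  5]],   # S
            [[-3, -3, -3], [-3, 0,  5], [-3,  5,  5]],   # SE
        )]

        def ldp_codes(gray, k=3):
            """gray: 2-D float array. Returns an 8-bit LDP code image."""
            responses = np.stack([np.abs(convolve(gray, m)) for m in KIRSCH])   # (8, H, W)
            top_k = np.argsort(responses, axis=0)[-k:]                          # top-k direction indices
            codes = np.zeros(gray.shape, dtype=np.uint8)
            for bit in range(k):
                codes |= (1 << top_k[bit]).astype(np.uint8)                     # set one bit per kept direction
            return codes

    The texture descriptor is then the histogram of these codes over the image or over local regions; the CLDP, SLDP, C-SLDP, DLBP, ALDP and CILDP variants described above change how the responses are computed, signed or thresholded rather than this overall histogramming step.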

    Irish Machine Vision and Image Processing Conference Proceedings 2017

    Get PDF

    Computational Imaging and Artificial Intelligence: The Next Revolution of Mobile Vision

    Full text link
    Signal capture stands at the forefront of perceiving and understanding the environment, and thus imaging plays a pivotal role in mobile vision. Recent explosive progress in Artificial Intelligence (AI) has shown great potential for developing advanced mobile platforms with new imaging devices. Traditional imaging systems based on the "capturing images first and processing afterwards" mechanism cannot meet this unprecedented demand. In contrast, Computational Imaging (CI) systems are designed to capture high-dimensional data in an encoded manner to provide more information for mobile vision systems. Thanks to AI, CI can now be used in real systems by integrating deep learning algorithms into the mobile vision platform to achieve the closed loop of intelligent acquisition, processing and decision making, thus leading to the next revolution of mobile vision. Starting from the history of mobile vision using digital cameras, this work first introduces the advances of CI in diverse applications and then conducts a comprehensive review of current research topics combining CI and AI. Motivated by the fact that most existing studies only loosely connect CI and AI (usually using AI to improve the performance of CI, with only limited works deeply connecting them), in this work we propose a framework to deeply integrate CI and AI, using the example of self-driving vehicles with high-speed communication, edge computing and traffic planning. Finally, we look ahead to the future of CI plus AI by investigating new materials, brain science and new computing techniques to shed light on new directions for mobile vision systems.

    Food Recognition and Volume Estimation in a Dietary Assessment System

    Full text link
    Recently, obesity has become an epidemic and one of the most serious worldwide public health concerns of the 21st century. Obesity diminishes the average life expectancy, and there is now convincing evidence that poor diet, in combination with physical inactivity, is a key determinant of an individual's risk of developing chronic diseases such as cancer, cardiovascular disease or diabetes. Assessing what people eat is fundamental to establishing the link between diet and disease. Food records are considered the best approach for assessing energy intake. However, this method requires literate and highly motivated subjects. This is a particular problem for adolescents and young adults, who are the least likely to undertake food records. The ready access of the majority of the population to mobile phones (with integrated camera, improved memory capacity, network connectivity and faster processing capability) has opened up new opportunities for dietary assessment. The dietary information extracted from such assessments provides valuable insights into the causes of disease and greatly helps practicing dietitians and researchers to develop approaches for mounting intervention programs for prevention. In such systems, the camera in the mobile phone is used for capturing images of the food consumed, and these images are then processed to automatically estimate the nutritional content of the food. However, food objects are deformable objects that exhibit variations in appearance, shape, texture and color, so food classification and volume estimation in these systems suffer from lower accuracy. Improving food recognition accuracy and volume estimation accuracy are challenging tasks. This thesis presents new techniques for food classification and food volume estimation. For food recognition, emphasis was given to texture features. Existing food recognition techniques assume that the food images will be viewed at similar scales and from the same viewpoints. However, this assumption fails in practical applications, because it is difficult to ensure that a user of a dietary assessment system will place his/her camera at the same scale and orientation to capture food images as that of the target food images in the database. A new scale and rotation invariant feature generation approach that applies Gabor filter banks is proposed. To obtain scale and rotation invariance, the proposed approach identifies the dominant orientation of the filtered coefficients and applies a circular shifting operation to place this value at the first scale of the dominant direction. The advantages of this technique are that it does not require the scale factor to be known in advance and that it is scale and rotation invariant both separately and concurrently. This approach is further modified to achieve improved accuracy by applying a Gaussian window along the scale dimension, which reduces the impact of high and low frequencies of the filter outputs, enabling better matching between samples of the same class. Besides automatic classification, semi-automatic classification and group classification are also considered to gauge the improvement. To estimate the volume of a food item, a stereo pair is used to recover the structure as a 3D point cloud. A slice-based volume estimation approach is proposed that converts the 3D point cloud to a series of 2D slices. The proposed approach eliminates the need to know the distance between the two cameras, with the help of disparities and depth information from a fiducial marker. The experimental results show that the proposed approach can provide an accurate estimate of food volume.
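    The slice-based volume idea can be illustrated with the following minimal sketch. It assumes the point cloud has already been rescaled to metric units (e.g. via the fiducial marker); the number of slices and the use of a 2-D convex hull as the per-slice area are simplifying assumptions, not the thesis's exact procedure.

        # Sketch: estimate volume from a 3-D point cloud by cutting it into horizontal
        # slices, approximating each slice's area, and summing area x thickness.
        import numpy as np
        from scipy.spatial import ConvexHull

        def slice_volume(points, n_slices=20):
            """points: (N, 3) array of x, y, z coordinates in a metric unit (e.g. cm).
            Returns an approximate volume in the cubed unit."""
            z = points[:, 2]
            edges = np.linspace(z.min(), z.max(), n_slices + 1)
            thickness = edges[1] - edges[0]
            volume = 0.0
            for lo, hi in zip(edges[:-1], edges[1:]):
                sl = points[(z >= lo) & (z < hi), :2]            # project slice onto the x-y plane
                if len(sl) >= 3:                                  # need at least a triangle for a hull
                    volume += ConvexHull(sl).volume * thickness   # a 2-D hull's .volume is its area
            return volume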