1,173 research outputs found

    An improved image segmentation algorithm for salient object detection

    Get PDF
    Semantic object detection is one of the most important and challenging problems in image analysis. Segmentation is an optimal approach to detect salient objects, but often fails to generate meaningful regions due to over-segmentation. This paper presents an improved semantic segmentation approach which is based on JSEG algorithm and utilizes multiple region merging criteria. The experimental results demonstrate that the proposed algorithm is encouraging and effective in salient object detection

    Saliency for Image Description and Retrieval

    Get PDF
    We live in a world where we are surrounded by ever increasing numbers of images. More often than not, these images have very little metadata by which they can be indexed and searched. In order to avoid information overload, techniques need to be developed to enable these image collections to be searched by their content. Much of the previous work on image retrieval has used global features such as colour and texture to describe the content of the image. However, these global features are insufficient to accurately describe the image content when different parts of the image have different characteristics. This thesis initially discusses how this problem can be circumvented by using salient interest regions to select the areas of the image that are most interesting and generate local descriptors to describe the image characteristics in that region. The thesis discusses a number of different saliency detectors that are suitable for robust retrieval purposes and performs a comparison between a number of these region detectors. The thesis then discusses how salient regions can be used for image retrieval using a number of techniques, but most importantly, two techniques inspired from the field of textual information retrieval. Using these robust retrieval techniques, a new paradigm in image retrieval is discussed, whereby the retrieval takes place on a mobile device using a query image captured by a built-in camera. This paradigm is demonstrated in the context of an art gallery, in which the device can be used to find more information about particular images. The final chapter of the thesis discusses some approaches to bridging the semantic gap in image retrieval. The chapter explores ways in which un-annotated image collections can be searched by keyword. Two techniques are discussed; the first explicitly attempts to automatically annotate the un-annotated images so that the automatically applied annotations can be used for searching. The second approach does not try to explicitly annotate images, but rather, through the use of linear algebra, it attempts to create a semantic space in which images and keywords are positioned such that images are close to the keywords that represent them within the space

    Extracting generic text information from images

    Full text link
    University of Technology, Sydney. Faculty of Engineering and Information Technology.As a vast amount of text appears everywhere, including natural scene, web pages and videos, text becomes very important information for different applications. Extracting text information from images and video frames is the first step of applying them to a specific application and this task is completed by a text information extraction (TIE) system. TIE consists of text detection, text binarisation and text recognition. For different applications or projects, one or more of these three TIE components may be embedded. Although many efforts have been made to extract text from images and videos, this problem is far from being solved due to the difficulties existing in different scenarios. This thesis focuses on the research of text detection and text binarisation. For the work on text detection in born-digital images, a new scheme for coarse text detection and a texture-based feature for fine text detection are proposed. In the coarse detection step, a novel scheme based on Maximum Gradient Difference (MGD) response of text lines is proposed. MGD values are classified into multiple clusters by a clustering algorithm to create multiple layer images. Then, the text line candidates are detected in different layer images. An SVM classifier trained by a novel texture-based feature is utilized to filter out the non-text regions. The superiority of the proposed feature is demonstrated by comparing with other features for text/non-text classification capability. Another algorithm is designed for detecting texts from natural scene images. Maximally Stable Extremal Regions (MSERs) as character candidates are classified into character MSERs and non-character MSERs based on geometry-based, stroke-based, HOG-based and colour-based features. Two types of misclassified character MSERs are retrieved by two different schemes respectively. A false alarm elimination step is performed for increasing the text detection precision and the bootstrap strategy is used to enhance the power of suppressing false positives. Both promising recall rate and precision rate are achieved. In the aspect of text binarisation research, the combination of the selected colour channel image and graph-based technique are explored firstly. The colour channel image with the histogram having the biggest distance, estimated by mean-shift procedure, between the two main peaks is selected before the graph model is constructed. Then, Normalised cut is employed on the graph to get the binarisation result. For circumventing the drawbacks of the grayscale-based method, a colour-based text binarisation method is proposed. A modified Connected Component (CC)-based validation measurement and a new objective segmentation evaluation criterion are applied as sequential processing. The experimental results show the effectiveness of our text binarisation algorithms

    Unsupervised segmentation of natural images based on the adaptive integration of colour-texture descriptors

    Get PDF

    A comprehensive review of fruit and vegetable classification techniques

    Get PDF
    Recent advancements in computer vision have enabled wide-ranging applications in every field of life. One such application area is fresh produce classification, but the classification of fruit and vegetable has proven to be a complex problem and needs to be further developed. Fruit and vegetable classification presents significant challenges due to interclass similarities and irregular intraclass characteristics. Selection of appropriate data acquisition sensors and feature representation approach is also crucial due to the huge diversity of the field. Fruit and vegetable classification methods have been developed for quality assessment and robotic harvesting but the current state-of-the-art has been developed for limited classes and small datasets. The problem is of a multi-dimensional nature and offers significantly hyperdimensional features, which is one of the major challenges with current machine learning approaches. Substantial research has been conducted for the design and analysis of classifiers for hyperdimensional features which require significant computational power to optimise with such features. In recent years numerous machine learning techniques for example, Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Decision Trees, Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN) have been exploited with many different feature description methods for fruit and vegetable classification in many real-life applications. This paper presents a critical comparison of different state-of-the-art computer vision methods proposed by researchers for classifying fruit and vegetable

    Colour Texture analysis

    Get PDF
    This chapter presents a novel and generic framework for image segmentation using a compound image descriptor that encompasses both colour and texture information in an adaptive fashion. The developed image segmentation method extracts the texture information using low-level image descriptors (such as the Local Binary Patterns (LBP)) and colour information by using colour space partitioning. The main advantage of this approach is the analysis of the textured images at a micro-level using the local distribution of the LBP values, and in the colour domain by analysing the local colour distribution obtained after colour segmentation. The use of the colour and texture information separately has proven to be inappropriate for natural images as they are generally heterogeneous with respect to colour and texture characteristics. Thus, the main problem is to use the colour and texture information in a joint descriptor that can adapt to the local properties of the image under analysis. We will review existing approaches to colour and texture analysis as well as illustrating how our approach can be successfully applied to a range of applications including the segmentation of natural images, medical imaging and product inspection

    Features for matching people in different views

    No full text
    There have been significant advances in the computer vision field during the last decade. During this period, many methods have been developed that have been successful in solving challenging problems including Face Detection, Object Recognition and 3D Scene Reconstruction. The solutions developed by computer vision researchers have been widely adopted and used in many real-life applications such as those faced in the medical and security industry. Among the different branches of computer vision, Object Recognition has been an area that has advanced rapidly in recent years. The successful introduction of approaches such as feature extraction and description has been an important factor in the growth of this area. In recent years, researchers have attempted to use these approaches and apply them to other problems such as Content Based Image Retrieval and Tracking. In this work, we present a novel system that finds correspondences between people seen in different images. Unlike other approaches that rely on a video stream to track the movement of people between images, here we present a feature-based approach where we locate a target’s new location in an image, based only on its visual appearance. Our proposed system comprises three steps. In the first step, a set of features is extracted from the target’s appearance. A novel algorithm is developed that allows extraction of features from a target that is particularly suitable to the modelling task. In the second step, each feature is characterised using a combined colour and texture descriptor. Inclusion of information relating to both colour and texture of a feature add to the descriptor’s distinctiveness. Finally, the target’s appearance and pose is modelled as a collection of such features and descriptors. This collection is then used as a template that allows us to search for a similar combination of features in other images that correspond to the target’s new location. We have demonstrated the effectiveness of our system in locating a target’s new position in an image, despite differences in viewpoint, scale or elapsed time between the images. The characterisation of a target as a collection of features also allows our system to robustly deal with the partial occlusion of the target
    corecore