
    Image Annotation and Topic Extraction Using Super-Word Latent Dirichlet Allocation

    This research presents a multi-domain solution that uses text and images to iteratively improve automated information extraction. Stage I uses the local text surrounding an embedded image to provide clues that help rank-order candidate image annotations. These annotations are forwarded to Stage II, where they are used as highly relevant super-words to improve topic extraction. The model probabilities from the super-words in Stage II are forwarded to Stage III, where they are used to refine the automated image annotation developed in Stage I. All stages demonstrate improvement over equivalent existing algorithms in the literature.
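    The three-stage loop can be sketched in a few lines. This is a minimal toy illustration, not the paper's models: the function names, the frequency-based ranking, and the fixed super-word weight are all hypothetical stand-ins.

```python
# Toy sketch of the three-stage text/image loop described in the abstract.
# All heuristics here (word counts, a flat 3x super-word boost) are
# hypothetical stand-ins for the paper's actual ranking and topic models.
from collections import Counter

def stage1_rank_annotations(surrounding_text, vocabulary):
    """Stage I stand-in: rank candidate annotations by how often they
    appear in the text surrounding the embedded image."""
    counts = Counter(w for w in surrounding_text.lower().split() if w in vocabulary)
    return [w for w, _ in counts.most_common()]

def stage2_topic_probs(document_words, super_words):
    """Stage II stand-in: weight super-words more heavily when estimating
    a toy topic distribution over the vocabulary."""
    weights = Counter()
    for w in document_words:
        weights[w] += 3.0 if w in super_words else 1.0  # super-words count extra
    total = sum(weights.values())
    return {w: c / total for w, c in weights.items()}

def stage3_refine(ranked_annotations, topic_probs):
    """Stage III stand-in: re-rank Stage I annotations by topic probability."""
    return sorted(ranked_annotations, key=lambda w: topic_probs.get(w, 0.0), reverse=True)

vocab = {"cat", "dog", "grass"}
ranked = stage1_rank_annotations("a dog and a cat on the grass dog", vocab)
probs = stage2_topic_probs("the dog ran across the grass".split(), set(ranked[:2]))
print(stage3_refine(ranked, probs))
```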

    The Optimisation of Elementary and Integrative Content-Based Image Retrieval Techniques

    Image retrieval plays a major role in many image processing applications. However, a number of factors (e.g. rotation, non-uniform illumination, noise and lack of spatial information) can disrupt the outputs of image retrieval systems so that they cannot produce the desired results. In recent years, many researchers have introduced different approaches to overcome this problem. Colour-based CBIR (content-based image retrieval) and shape-based CBIR have been the most commonly used techniques for obtaining image signatures. Although the colour histogram and shape descriptor have produced satisfactory results for certain applications, they still suffer from many theoretical and practical problems, a prominent one being the well-known "curse of dimensionality". In this research, a new Fuzzy Fusion-based Colour and Shape Signature (FFCSS) approach for integrating colour-only and shape-only features is investigated to produce an effective image feature vector for database retrieval. The proposed technique is based on an optimised fuzzy colour scheme and robust shape descriptors. Experimental tests were carried out to check the behaviour of the FFCSS-based system, including the sensitivity and robustness of the proposed signature for the sampled images, especially under varied conditions of rotation, scaling, noise and light intensity. To further improve the retrieval efficiency of the devised signature model, the target image repositories were clustered into several groups using the k-means clustering algorithm at system runtime, and the search begins at the centre of each cluster. The FFCSS-based approach proved superior to other benchmarked classic CBIR methods, hence this research makes a substantial contribution on both theoretical and practical fronts.
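    The runtime clustering step lends itself to a short sketch. Assuming the FFCSS signatures are plain fixed-length feature vectors (random stand-ins below), scikit-learn's k-means can group the repository so a query is matched only against members of its nearest cluster.

```python
# Hedged sketch of cluster-then-search retrieval: group image signatures
# with k-means, then compare a query only against its nearest cluster.
# The 64-dimensional random vectors stand in for real FFCSS signatures.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
signatures = rng.random((1000, 64))                 # stand-in FFCSS feature vectors
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(signatures)

query = rng.random(64)
nearest = int(kmeans.predict(query[None, :])[0])    # start at a cluster centre
members = np.flatnonzero(kmeans.labels_ == nearest) # search only that group
dists = np.linalg.norm(signatures[members] - query, axis=1)
top10 = members[np.argsort(dists)[:10]]             # best matches within the cluster
print(top10)
```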

    Shape-based image retrieval in iconic image databases.

    by Chan Yuk Ming. Thesis (M.Phil.), Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 117-124). Abstract also in Chinese. Contents:

    Chapter 1: Introduction
      1.1 Content-based Image Retrieval
      1.2 Designing a Shape-based Image Retrieval System
      1.3 Information on Trademark
        1.3.1 What is a Trademark?
        1.3.2 Search for Conflicting Trademarks
        1.3.3 Research Scope
      1.4 Information on Chinese Cursive Script Character
      1.5 Problem Definition
      1.6 Contributions
      1.7 Thesis Organization
    Chapter 2: Literature Review
      2.1 Trademark Retrieval using QBIC Technology
      2.2 STAR
      2.3 ARTISAN
      2.4 Trademark Retrieval using a Visually Salient Feature
      2.5 Trademark Recognition using Closed Contours
      2.6 Trademark Retrieval using a Two Stage Hierarchy
      2.7 Logo Matching using Negative Shape Features
      2.8 Chapter Summary
    Chapter 3: Background on Shape Representation and Matching
      3.1 Simple Geometric Features
        3.1.1 Circularity
        3.1.2 Rectangularity
        3.1.3 Hole Area Ratio
        3.1.4 Horizontal Gap Ratio
        3.1.5 Vertical Gap Ratio
        3.1.6 Central Moments
        3.1.7 Major Axis Orientation
        3.1.8 Eccentricity
      3.2 Fourier Descriptors
      3.3 Chain Codes
      3.4 Seven Invariant Moments
      3.5 Zernike Moments
      3.6 Edge Direction Histogram
      3.7 Curvature Scale Space Representation
      3.8 Chapter Summary
    Chapter 4: Genetic Algorithm for Weight Assignment
      4.1 Genetic Algorithm (GA)
        4.1.1 Basic Idea
        4.1.2 Genetic Operators
      4.2 Why GA?
      4.3 Weight Assignment Problem
        4.3.1 Integration of Image Attributes
      4.4 Proposed Solution
        4.4.1 Formalization
        4.4.2 Proposed Genetic Algorithm
      4.5 Chapter Summary
    Chapter 5: Shape-based Trademark Image Retrieval System
      5.1 Problems of Existing Methods
        5.1.1 Edge Direction Histogram
        5.1.2 Boundary Based Techniques
      5.2 Proposed Solution
        5.2.1 Image Preprocessing
        5.2.2 Automatic Feature Extraction
        5.2.3 Approximated Boundary
        5.2.4 Integration of Shape Features and Query Processing
      5.3 Experimental Results
        5.3.1 Experiment 1: Weight Assignment using Genetic Algorithm
        5.3.2 Experiment 2: Speed of Feature Extraction and Retrieval
        5.3.3 Experiment 3: Evaluation by Precision
        5.3.4 Experiment 4: Evaluation by Recall for Deformed Images
        5.3.5 Experiment 5: Evaluation by Recall for Hand-Drawn Query Trademarks
        5.3.6 Experiment 6: Evaluation by Recall for Rotated, Scaled and Mirrored Images
        5.3.7 Experiment 7: Comparison of Different Integration Methods
      5.4 Chapter Summary
    Chapter 6: Shape-based Chinese Cursive Script Character Image Retrieval System
      6.1 Comparison to the Trademark Retrieval Problem
        6.1.1 Feature Selection
        6.1.2 Speed of System
        6.1.3 Variation of Style
      6.2 Target of the Research
      6.3 Proposed Solution
        6.3.1 Image Preprocessing
        6.3.2 Automatic Feature Extraction
        6.3.3 Thinned Image and Linearly Normalized Image
        6.3.4 Edge Directions
        6.3.5 Integration of Shape Features
      6.4 Experimental Results
        6.4.1 Experiment 8: Weight Assignment using Genetic Algorithm
        6.4.2 Experiment 9: Speed of Feature Extraction and Retrieval
        6.4.3 Experiment 10: Evaluation by Recall for Deformed Images
        6.4.4 Experiment 11: Evaluation by Recall for Rotated and Scaled Images
        6.4.5 Experiment 12: Comparison of Different Integration Methods
      6.5 Chapter Summary
    Chapter 7: Conclusion
      7.1 Summary
      7.2 Future Research
        7.2.1 Limitations
        7.2.2 Future Directions
    Appendix A: A Representative Subset of Trademark Images
    Appendix B: A Representative Subset of Cursive Script Character Images
    Appendix C: Shape Feature Extraction Toolbox for Matlab V5.3
      C.1 central_moment
      C.2 centroid
      C.3 cir
      C.4 css
      C.5 css_match
      C.6 ecc
      C.7 edge_directions
      C.8 fourier_d
      C.9 gen_shape
      C.10 hu7
      C.11 isclockwise
      C.12 moment
      C.13 normalized_moment
      C.14 orientation
      C.15 resample_pts
      C.16 rectangularity
      C.17 trace_points
      C.18 warp_conv
    Bibliography

    Text Detection in Natural Scenes and Technical Diagrams with Convolutional Feature Learning and Cascaded Classification

    An enormous number of digital images are generated and stored every day. Understanding text in these images is an important challenge, with large impacts for academic, industrial and domestic applications. Recent studies address the difficulty of separating text targets from noise and background, all of which vary greatly in natural scenes. To tackle this problem, we develop a text detection system that analyzes and utilizes visual information in a data-driven, automatic and intelligent way. The proposed method incorporates features learned from data, including patch-based coarse-to-fine detection (Text-Conv), connected component extraction using region growing, and graph-based word segmentation (Word-Graph). Text-Conv is a sliding-window detector whose convolution masks are learned using the Convolutional k-means algorithm (Coates et al., 2011). Unlike convolutional neural networks (CNNs), a single vector/layer of convolution mask responses is used to classify patches. An initial coarse detection considers both local and neighboring patch responses, followed by refinement using varying aspect ratios and rotations for a smaller local detection window. Different levels of visual detail from ground truth are utilized in each step, first using constraints on bounding box intersections, and then a combination of bounding box and pixel intersections. Combining masks from different Convolutional k-means initializations, e.g., seeded using random vectors and then support vectors, improves performance. The Word-Graph algorithm uses contextual information to improve word segmentation and prune false character detections based on visual features and spatial context. Our system obtains pixel, character, and word detection f-measures of 93.14%, 90.26%, and 86.77% respectively on the ICDAR 2015 Robust Reading Focused Scene Text dataset, outperforming state-of-the-art systems and producing highly accurate text detection masks at the pixel level. To investigate the utility of our feature learning approach for other image types, we perform tests on 8-bit greyscale USPTO patent drawing diagram images. An ensemble of AdaBoost classifiers with different convolutional features (MetaBoost) is used to classify patches as text or background. The Tesseract OCR system is used to recognize characters in detected labels and enhance performance. With appropriate pre-processing and post-processing, f-measures of 82% for part label location, and 73% for valid part label locations and strings, are obtained, which are the best to date for the USPTO patent diagram dataset used in our experiments. To sum up, an intelligent refinement of convolutional k-means-based feature learning and novel automatic classification methods are proposed for text detection, obtaining state-of-the-art results without the need for strong prior knowledge. Different ground-truth representations, along with features including edges, color, shape and spatial relationships, are used coherently to improve accuracy. Different variations of feature learning are explored, e.g. support-vector-seeded clustering and MetaBoost, with results suggesting that increased diversity in learned features benefits convolution-based text detectors.
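    The core feature-learning step can be illustrated briefly. The sketch below follows the general Convolutional k-means recipe of Coates et al. (2011) rather than the authors' implementation: cluster contrast-normalized image patches to obtain convolution masks, then use the vector of mask responses as a patch's feature for a downstream classifier. Patch size, mask count, and the random stand-in images are all illustrative choices.

```python
# Sketch of Convolutional k-means feature learning in the spirit of
# Coates et al. (2011): learn convolution masks by clustering normalized
# patches; a patch's feature is its response to every learned mask
# (a single vector of responses, rather than a deep CNN stack).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
images = rng.random((50, 32, 32))                 # stand-in greyscale images

def sample_patches(imgs, n=5000, k=8):
    """Sample and contrast-normalize random k x k patches."""
    patches = []
    for _ in range(n):
        img = imgs[rng.integers(len(imgs))]
        y, x = rng.integers(0, 32 - k, size=2)
        p = img[y:y + k, x:x + k].ravel()
        patches.append((p - p.mean()) / (p.std() + 1e-8))
    return np.array(patches)

patches = sample_patches(images)
masks = KMeans(n_clusters=64, n_init=4, random_state=0).fit(patches).cluster_centers_

def patch_features(patch):
    """Feature vector for one patch: its response to all 64 learned masks."""
    p = (patch.ravel() - patch.mean()) / (patch.std() + 1e-8)
    return masks @ p

print(patch_features(images[0][:8, :8]).shape)    # (64,)
```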

    Content-Based Image Retrieval Using Self-Organizing Maps


    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps, namely core technological gaps that involve research challenges, and "enablers", which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    Image retrieval using automatic region tagging

    The task of automatically tagging, annotating or labelling image content with semantic keywords is a challenging problem. Automatically tagging images semantically, based on the objects that they contain, is essential for image retrieval. In addressing these problems, we explore techniques developed to combine textual descriptions of images with visual features, automatic region tagging and region-based ontology image retrieval. To evaluate the techniques, we use three corpora: Lonely Planet travel guide articles with images, Wikipedia articles with images, and Goats comic strips. In searching for similar images or textual information specified in a query, we explore the unification of textual descriptions and visual features (such as colour and texture) of the images. We compare the effectiveness of different retrieval similarity measures for the textual component, and analyse the effectiveness of different visual features extracted from the images. We then investigate the best weight combination for using textual and visual features together. Using the queries from the Multimedia Track of INEX 2005 and 2006, we found that the best weight combination significantly improves the effectiveness of the retrieval system (a weighted fusion is sketched below). Our findings suggest that image regions are better at capturing semantics, since we can identify specific regions of interest in an image. In this context, we develop a technique to tag image regions with high-level semantics by combining several shape feature descriptors and colour in an equal-weight linear combination. We experimentally compare this technique with more complex machine-learning algorithms, and show that the equal-weight linear combination of shape features is simpler and at least as effective as using a machine-learning algorithm. We then focus on the synergy between ontologies and image annotations, with the aim of reducing the gap between image features and high-level semantics. Ontologies ease information retrieval: they are used to mine, interpret, and organise knowledge. An ontology may be seen as a knowledge base that can be used to improve the image retrieval process, and conversely keywords obtained from automatic tagging of image regions may be useful for creating an ontology. We engineer an ontology that surrogates concepts derived from image feature descriptors, and test its usability by querying it via the Visual Ontology Query Interface, which has a formally specified grammar known as the Visual Ontology Query Language. We show that synergy between ontology and image annotations is possible, and that this method can reduce the gap between image features and high-level semantics by providing the relationships between objects in the image. In this thesis, we conclude that suitable techniques for image retrieval include fusing the text accompanying images with visual features, automatic region tagging, and using an ontology to enrich the semantic meaning of tagged image regions.
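    The weighted text/visual fusion mentioned above can be shown with a toy scoring function. Here alpha stands in for the tuned "best weight combination", and the cosine similarities over random vectors are deliberately simple placeholders for real tf-idf and colour/texture features.

```python
# Hedged sketch of weighted fusion of textual and visual similarity for
# retrieval ranking. alpha is a hypothetical tuned weight, not the
# thesis's reported value.
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def fused_score(text_q, text_d, vis_q, vis_d, alpha=0.6):
    """Combine textual and visual similarity into one retrieval score."""
    return alpha * cosine(text_q, text_d) + (1 - alpha) * cosine(vis_q, vis_d)

rng = np.random.default_rng(1)
tq, td = rng.random(100), rng.random(100)   # stand-ins for tf-idf vectors
vq, vd = rng.random(48), rng.random(48)     # stand-ins for colour/texture features
print(fused_score(tq, td, vq, vd))
```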

    The 1993 Goddard Conference on Space Applications of Artificial Intelligence

    This publication comprises the papers presented at the 1993 Goddard Conference on Space Applications of Artificial Intelligence, held at the NASA/Goddard Space Flight Center, Greenbelt, MD, on May 10-13, 1993. The purpose of this annual conference is to provide a forum in which current research and development directed at space applications of artificial intelligence can be presented and discussed.

    Automatic image annotation and object detection

    We live in the midst of the information era, in which organising and indexing information effectively is of essential importance. With the fast development of digital imagery, how to search images - a rich form of information - more efficiently by their content has become one of the biggest challenges. Content-based image retrieval (CBIR) has been the traditional and dominant technique for searching images for decades. However, only recently have researchers started to recognise some vital problems with CBIR systems. Perhaps the most important is what is called the "semantic gap": the gap between the information that can be extracted from images and the interpretation of the images by humans. As an attempt to bridge the semantic gap, automatic image annotation has been gaining more and more attention in recent years. This thesis explores a number of different approaches to automatic image annotation and some related issues. It begins with an introduction to different techniques for image description, which form the foundation of research on image auto-annotation. The thesis then gives an in-depth examination of some quality issues of the dataset used for evaluating auto-annotation systems. A series of approaches to auto-annotation is presented in the follow-up chapters. Firstly, we describe an approach that incorporates a saliency-based image representation into a statistical model for better annotation performance. Secondly, we explore the use of non-negative matrix factorisation (NMF), a matrix decomposition technique, for two tasks: object class detection and automatic annotation of images. The results imply that NMF is a promising sub-space technique for these purposes. Finally, we propose a model named the image-based feature space (IBFS) model for linking image regions and keywords, and for image auto-annotation. Both image regions and keywords are mapped into the same space, in which their relationships can be measured. The idea of multiple segmentations is then implemented in the model, and better results are achieved than with a single segmentation.
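    The NMF sub-space idea can be sketched with scikit-learn's implementation (not the thesis's own formulation): a non-negative image-by-feature matrix V is factored into W and H, so each image becomes a non-negative mixture of learned "parts". The matrix sizes and rank below are illustrative.

```python
# Minimal sketch of NMF as a sub-space technique: V (images x features)
# is approximated by W (images x parts) @ H (parts x features), with all
# factors non-negative. Random data stands in for real image features.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
V = rng.random((200, 300))                 # stand-in non-negative image features
model = NMF(n_components=20, init="nndsvda", max_iter=400, random_state=0)
W = model.fit_transform(V)                 # per-image weights over 20 parts
H = model.components_                      # per-part feature patterns
print(W.shape, H.shape, np.linalg.norm(V - W @ H))  # reconstruction error
```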

    Weed/Plant Classification Using Evolutionary Optimised Ensemble Based On Local Binary Patterns

    This thesis presents a novel pixel-level weed classification approach based on rotation-invariant uniform local binary pattern (LBP) features for precision weed control. It uses a two-level optimisation structure: first, Genetic Algorithm (GA) optimisation selects the best rotation-invariant uniform LBP configurations; second, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) selects, within a Neural Network (NN) ensemble, the best combination of voting weights over each classifier's predicted outcome. The model obtained 87.9% accuracy on the CWFID public benchmark.
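    The LBP feature stage can be sketched with scikit-image; the GA and CMA-ES optimisation layers are omitted, and the (P, R) configuration below is just one illustrative candidate of the kind the GA would select among.

```python
# Hedged sketch of rotation-invariant uniform LBP feature extraction.
# skimage's method="uniform" computes rotation-invariant uniform codes;
# the histogram of codes over a window is the feature vector a classifier
# (e.g. one NN in the ensemble) would consume.
import numpy as np
from skimage.feature import local_binary_pattern

rng = np.random.default_rng(0)
image = rng.integers(0, 256, (64, 64), dtype=np.uint8)  # stand-in greyscale crop
P, R = 8, 1                                             # one candidate (P, R) configuration
codes = local_binary_pattern(image, P, R, method="uniform")
hist, _ = np.histogram(codes, bins=np.arange(P + 3), density=True)
print(hist)                                             # P + 2 bins: the feature vector
```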