569 research outputs found

    When images work faster than words: The integration of content-based image retrieval with the Northumbria Watermark Archive

    Information on the manufacture, history, provenance, identification, care and conservation of paper-based artwork and objects is disparate and not always readily available. The Northumbria Watermark Archive will incorporate such material into a database, which will be made freely available on the Internet, providing an invaluable resource for conservation, research and education. The efficiency of a database is highly dependent on its search mechanism. Text-based mechanisms are frequently ineffective when a range of descriptive terminologies might be used, e.g. when describing images or translating from foreign languages. In such cases a Content Based Image Retrieval (CBIR) system can be more effective. Watermarks provide paper with unique visual identification characteristics and have been used to provide a point of entry to the archive that is more efficient and effective than a text-based search mechanism. The research has the potential to be applied to any numerically large collection of images with distinctive features of colour, shape or texture, e.g. coins, architectural features, picture frame profiles, hallmarks and Japanese artists' stamps. Although the establishment of an electronic archive incorporating a CBIR system can undoubtedly improve access to large collections of images and related data, the development is rarely trouble free. This paper discusses some of the issues that must be considered: collaboration between disciplines; project management; copying and digitising objects; content-based image retrieval; the Northumbria Watermark Archive; the use of standardised terminology within a database; and copyright issues.

    Plant image retrieval using color, shape and texture features

    We present a content-based image retrieval system for plant image retrieval, intended especially for the house plant identification problem. A plant image consists of a collection of overlapping leaves and possibly flowers, which makes the problem challenging. We studied the suitability of various well-known color, shape and texture features for this problem, and introduced some new texture matching techniques and shape features. Feature extraction is applied after segmenting the plant region from the background using the max-flow min-cut technique. Results on a database of 380 plant images belonging to 78 different types of plants show the promise of the proposed new techniques and the overall system: in 55% of the queries, the correct plant image is retrieved among the top-15 results. Furthermore, the accuracy goes up to 73% when a 132-image subset of well-segmented plant images is considered.

    SVS-JOIN : efficient spatial visual similarity join for geo-multimedia

    In the big data era, a massive amount of geo-tagged multimedia data has been generated and collected by smart devices equipped with mobile communication and position sensor modules. This trend has raised the demand for large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial databases. Previous works focused on the spatial textual document search problem rather than geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to find geo-image pairs that are similar in both geo-location and visual content. First, we define SVS-JOIN and present the geographical similarity and visual similarity measures. Inspired by approaches for textual similarity join, we develop an algorithm named SVS-JOIN B by combining the PPJOIN algorithm with visual similarity. An extension named SVS-JOIN G is also developed, which uses a spatial grid strategy to improve search efficiency. To further speed up the search, a novel approach called SVS-JOIN Q is designed, in which a quadtree and a global inverted index are employed. Comprehensive experiments are conducted on two geo-image datasets, and the results demonstrate that our solution can address the SVS-JOIN problem effectively and efficiently.
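
    The abstract above does not give the algorithms themselves, so the following is only a minimal sketch of the general idea behind a grid-based spatial visual similarity join. It assumes Euclidean distance for the geographical measure and Jaccard similarity over sets of visual words for the visual measure; the function names, thresholds and data layout are hypothetical, and the PPJOIN-style prefix filtering mentioned in the abstract is replaced here by plain pairwise verification.

```python
import math
from collections import defaultdict
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed visual similarity measure)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def grid_join(images, max_dist, min_sim):
    """Return id pairs that are close in space and similar in visual words.

    images: dict mapping id -> ((x, y), frozenset of visual words).
    max_dist: maximum Euclidean distance (must be > 0), also the cell size.
    min_sim: minimum Jaccard similarity.
    """
    # Bucket every image into a square grid cell of side max_dist.
    grid = defaultdict(list)
    for img_id, ((x, y), _) in images.items():
        cell = (math.floor(x / max_dist), math.floor(y / max_dist))
        grid[cell].append(img_id)

    result = set()
    for (cx, cy), members in list(grid.items()):
        # Any pair within max_dist lies in the same or an adjacent cell.
        for dx, dy in product((-1, 0, 1), repeat=2):
            neighbours = grid.get((cx + dx, cy + dy), ())
            for a in members:
                for b in neighbours:
                    if a >= b:  # report each unordered pair once
                        continue
                    (pa, wa), (pb, wb) = images[a], images[b]
                    if math.dist(pa, pb) <= max_dist and jaccard(wa, wb) >= min_sim:
                        result.add((a, b))
    return result
```

    Using the distance threshold as the cell size means only the 3x3 block of cells around each image needs checking, which is the kind of pruning a grid strategy is meant to provide.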

    A Privacy-Preserving Framework for Large-Scale Content-Based Information Retrieval Using K-Secure Sum Protocol

    We propose a privacy protection framework for large-scale content-based information retrieval. It offers two layers of protection. First, robust hash values are used as queries to avoid revealing original content or features. Second, the client can choose to omit certain bits in a hash value to further increase the ambiguity for the server. Because of the reduced information, it is computationally difficult for the server to know the client's interest. The server has to return the hash values of all possible candidates to the client. The client performs a search within the candidate list to find the best match. Since only hash values are exchanged between the client and the server, the privacy of both sides is protected. We introduce the concept of tunable privacy, where the privacy protection level can be adjusted according to a policy. It is realized through hash-based piecewise inverted indexing. The idea is to divide a feature vector into pieces and index each piece with a sub hash value. Each sub hash value is associated with an inverted index list. The framework has been extensively tested using a large-scale image database. We have evaluated both retrieval performance and privacy-preserving performance for a specific content identification application. Two different constructions of robust hash algorithms are used. One is based on random projections; the other is based on the discrete wavelet transform. Both algorithms exhibit satisfactory performance in comparison with state-of-the-art retrieval schemes. The results show that the privacy enhancement slightly improves the retrieval performance. We consider the majority voting attack for estimating the query category and identification. The experimental results show that this attack is a threat when there are near-duplicates, but the success rate decreases with the number of omitted bits and the number of distinct items.
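
    The piecewise inverted indexing described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes hash values are bit strings of equal length, that the client withholds whole pieces rather than individual bits, and that the client ranks candidates by Hamming distance. The class and function names are hypothetical.

```python
from collections import defaultdict


def split_hash(bits, piece_len):
    """Split a bit string into consecutive pieces (sub hash values)."""
    return [bits[i:i + piece_len] for i in range(0, len(bits), piece_len)]


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


class PiecewiseInvertedIndex:
    """Server side: one inverted index list per (piece position, sub hash)."""

    def __init__(self, piece_len):
        self.piece_len = piece_len
        self.tables = defaultdict(lambda: defaultdict(set))
        self.hashes = {}

    def add(self, item_id, bits):
        self.hashes[item_id] = bits
        for pos, piece in enumerate(split_hash(bits, self.piece_len)):
            self.tables[pos][piece].add(item_id)

    def candidates(self, partial_query):
        """Return hashes of every item matching any revealed piece.

        partial_query maps piece position -> sub hash; pieces the client
        chose to withhold are simply absent, so the server never sees them.
        """
        found = set()
        for pos, piece in partial_query.items():
            found |= self.tables[pos].get(piece, set())
        return {item_id: self.hashes[item_id] for item_id in found}


def best_match(full_query, candidates):
    """Client side: pick the candidate closest to the full query hash."""
    return min(candidates,
               key=lambda item_id: hamming(full_query, candidates[item_id]),
               default=None)
```

    In this sketch, revealing fewer pieces gives the server less information but tends to enlarge the candidate list the client must search, which is one way to picture the tunable trade-off the abstract refers to.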

    Digital Image Users and Reuse: Enhancing practitioner discoverability of digital library reuse based on user file naming behavior

    This dissertation explores the methods practitioners use to discover the reuse of digital library materials. The author performs two verification studies investigating two previously employed strategies that practitioners use to identify digital object reuse, specifically Google Images reverse image lookup (RIL) and embedded metadata. The dissertation describes the limitations of these strategies and offers a new approach for tracking reuse, based on the author's search approach derived from user file naming behavior. While exploring the utility and limitations of Google Images and embedded metadata, the author observes and documents a pattern of user file naming behavior that shows promise for improving practitioners' discovery of reuse. The author conducts a file naming assessment to examine this pattern and the impact of file naming on search engine optimization. The study yields several significant findings. The author establishes that, because of an algorithm change, Google Images is no longer a viable tool for discovering reuse by the general public or other users, with the exception of industry users. Embedded metadata is not a reliable assessment tool because it is non-persistent. The author finds that many users generate their own file names, almost exclusively human-readable, when saving and sharing digital images. The author argues that when practitioners model search terms after the "aggregated file names", they increase their discovery of reused digital objects.