    A Ground-truth Training set for Hierarchical Clustering in Content-based Image Retrieval

    Progress in Content-Based Image Retrieval (CBIR) is hampered by the absence of well-documented and validated test sets that provide ground truth for the performance evaluation of image indexing, retrieval and clustering tasks. For quick access to large digital image collections (tens of thousands or millions of images), a hierarchically structured indexing or browsing mechanism based on clusters of similar images at various coarse-to-fine levels is highly desirable. The Leiden 19th-Century Portrait Database (LCPD), which consists of over 16,000 scanned studio portraits (so-called Cartes de Visite, CdV), happens to have a clearly delineated set of clusters in the backside images that show the studio logos. Clusters of similar or semantically identical logos can also be formed on a number of levels that show a clear hierarchy. The Leiden Imaging and Multimedia Group is constructing a CD-ROM with a well-documented set of studio portraits and logos that can serve as ground truth for feature performance evaluation in domains besides color indexing. Its grey-level image layout characteristics are also described by various precalculated feature vector sets. For both portraits (near-copy pairs) and studio logos (clusters of identical logos), test sets will be provided and described at various clustering levels. The statistically significant number of test-set images, embedded in a realistically large environment of narrow-domain images, is presented to the CBIR community to enable the selection of better indexing and retrieval approaches, as part of an internationally defined test set that comprises test sets specifically designed for the evaluation of color, texture and shape retrieval.