165 research outputs found

    Hybrid image representation methods for automatic image annotation: a survey

    In most automatic image annotation systems, images are represented with low-level features using either global or local methods. Global methods treat the entire image as a single unit. Local methods divide the image into sub-units, either fixed-size sub-image blocks or segmented regions. In contrast to typical automatic image annotation methods, which use global or local features exclusively, several recent methods incorporate both kinds of information on the premise that combining the two levels of features benefits annotation. In this paper, we survey automatic image annotation techniques from one aspect, feature extraction. To complement existing surveys in the literature, we focus on the emerging hybrid methods, which combine global and local features for image representation.
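As a rough illustration of the global, block-based local, and hybrid representations described in this abstract, the sketch below uses mean intensity as a stand-in low-level feature. The function names, the choice of mean intensity, and the simple concatenation are assumptions made for illustration; the surveyed methods use richer features and their own fusion schemes.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def global_feature(image: Image) -> List[float]:
    """Global method: the whole image is one unit (here, its mean intensity)."""
    return [mean([p for row in image for p in row])]


def block_features(image: Image, block: int) -> List[float]:
    """Local method: split into fixed-size blocks, one feature per block."""
    feats = []
    for top in range(0, len(image), block):
        for left in range(0, len(image[0]), block):
            pixels = [p for row in image[top:top + block]
                      for p in row[left:left + block]]
            feats.append(mean(pixels))
    return feats


def hybrid_feature(image: Image, block: int) -> List[float]:
    """Hybrid (assumed here to be plain concatenation): global + local."""
    return global_feature(image) + block_features(image, block)


# Hypothetical 4x4 image made of four uniform 2x2 quadrants.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [20, 20, 30, 30],
    [20, 20, 30, 30],
]
print(hybrid_feature(image, block=2))
```

The first entry of the result summarizes the whole image, and the remaining entries describe one block each, so a downstream annotator sees both levels at once.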

    Image Labeling on a Network: Using Social-Network Metadata for Image Classification

    Large-scale image retrieval benchmarks invariably consist of images from the Web. Many of these benchmarks are derived from online photo-sharing networks, like Flickr, which in addition to hosting images also provide a highly interactive social community. Such communities generate rich metadata that can naturally be harnessed for image classification and retrieval. Here we study four popular benchmark datasets, extending them with social-network metadata, such as the groups to which each image belongs, the comment thread associated with the image, who uploaded it, their location, and their network of friends. Since these types of data are inherently relational, we propose a model that explicitly accounts for the interdependencies between images sharing common properties. We model the task as a binary labeling problem on a network, and use structured learning techniques to learn model parameters. We find that social-network metadata are useful in a variety of classification tasks, in many cases outperforming methods based on image content. Comment: ECCV 2012; 14 pages, 4 figures

    Analysis of Censored Sample Population with GA-SVM

    This paper proposes a class of shrunken estimators for the kth power of the scale parameter in censored samples from a one-parameter exponential population, for the case where an a priori or guessed value of the parameter is available in addition to the sample information, and analyses their properties. The proposed class of shrunken estimators is compared with the usual unbiased estimator and the minimum mean square error (MMSE) estimator. An empirical study is then carried out to show the performance of some shrunken estimators of the proposed class relative to the MMSE estimator. It is found that certain of these estimators substantially improve on the classical estimators even when the guessed value of the kth power of the scale parameter is far from the true value, especially for censored samples of small size.

    In Defense of MinHash Over SimHash

    MinHash and SimHash are the two widely adopted Locality Sensitive Hashing (LSH) algorithms for large-scale data processing applications. Deciding which LSH to use for a particular problem at hand is an important question, which has no clear answer in the existing literature. In this study, we provide a theoretical answer (validated by experiments) that MinHash virtually always outperforms SimHash when the data are binary, as common in practice such as search. The collision probability of MinHash is a function of resemblance similarity ($\mathcal{R}$), while the collision probability of SimHash is a function of cosine similarity ($\mathcal{S}$). To provide a common basis for comparison, we evaluate retrieval results in terms of $\mathcal{S}$ for both MinHash and SimHash. This evaluation is valid as we can prove that MinHash is a valid LSH with respect to $\mathcal{S}$, by using a general inequality $\mathcal{S}^2 \leq \mathcal{R} \leq \frac{\mathcal{S}}{2-\mathcal{S}}$. Our worst case analysis can show that MinHash significantly outperforms SimHash in high similarity region. Interestingly, our intensive experiments reveal that MinHash is also substantially better than SimHash even in datasets where most of the data points are not too similar to each other. This is partly because, in practical data, often $\mathcal{R} \geq \frac{\mathcal{S}}{z-\mathcal{S}}$ holds where $z$ is only slightly larger than 2 (e.g., $z \leq 2.1$). Our restricted worst case analysis by assuming $\frac{\mathcal{S}}{z-\mathcal{S}} \leq \mathcal{R} \leq \frac{\mathcal{S}}{2-\mathcal{S}}$ shows that MinHash indeed significantly outperforms SimHash even in low similarity region. We believe the results in this paper will provide valuable guidelines for search in practice, especially when the data are sparse.
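The inequality relating resemblance and cosine similarity in this abstract can be checked numerically on binary data. The sketch below represents each binary vector as the set of its nonzero indices; the function names and the example sets are assumptions for illustration and are not taken from the paper.

```python
import math
from typing import Set


def resemblance(a: Set[int], b: Set[int]) -> float:
    """Resemblance (Jaccard) similarity R = |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_binary(a: Set[int], b: Set[int]) -> float:
    """Cosine similarity S = |A ∩ B| / sqrt(|A| * |B|) for binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def bounds_hold(a: Set[int], b: Set[int], eps: float = 1e-12) -> bool:
    """Check S^2 <= R <= S / (2 - S), with a small floating-point tolerance."""
    r = resemblance(a, b)
    s = cosine_binary(a, b)
    return s * s - eps <= r <= s / (2.0 - s) + eps


# Equal-size sets: R meets the upper bound S / (2 - S).
x, y = {1, 2, 3, 4}, {3, 4, 5, 6}
print(resemblance(x, y), cosine_binary(x, y), bounds_hold(x, y))

# One set contained in the other: R meets the lower bound S^2.
u, v = {1}, {1, 2, 3, 4}
print(resemblance(u, v), cosine_binary(u, v), bounds_hold(u, v))
```

The two examples sit at the extremes of the inequality: sets of equal size give $\mathcal{R} = \frac{\mathcal{S}}{2-\mathcal{S}}$, and a set contained in another gives $\mathcal{R} = \mathcal{S}^2$.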