
    Image Annotation and Retrieval with Generalized Gaussian Mixture Model Algorithm and Split Merge Expectation Maximization Algorithm

    ABSTRACT: Research on Image Annotation and Retrieval has advanced considerably. It began with the dream of organizing large-scale image collections without first inspecting their content; in the early 1990s the idea emerged of organizing images based on their content, more commonly known as Content-Based Image Retrieval (CBIR). One of the newer methods that can be used to retrieve images is Supervised Learning of Semantic Classes. To build its mathematical model, standard Supervised Learning uses a Gaussian Mixture Model, with Expectation Maximization for Maximum Likelihood Estimation. In this final project, the author replaces that mathematical model, using a Generalized Gaussian Mixture Model as the mixture model and Split Merge Expectation Maximization for Maximum Likelihood Estimation. Test results show that, in general, the Supervised Learning method with GGMM-SMEM retrieves images more accurately than with GMM-EM.
    Keywords: content based image retrieval, image annotation, image retrieval, supervised learning of semantic classes, gaussian mixture model, generalized gaussian mixture model, expectation maximization, split merge expectation maximization, maximum likelihood
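
    As a rough illustration only, the sketch below shows the standard GMM-EM baseline that the abstract compares against: one Gaussian mixture is fitted per semantic class by maximum likelihood (EM), and a query image is annotated with the class under which its descriptors are most likely. This is not the thesis's GGMM-SMEM method; the class names, feature dimensions and data are hypothetical.

```python
# Hedged sketch of the GMM-EM baseline (not the GGMM-SMEM method proposed
# in the thesis). Class names, feature dimensions and data are made up.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical per-class training descriptors (e.g. colour/texture features):
# one (n_descriptors, n_dims) array per semantic class.
class_features = {
    "sky":   rng.normal(loc=0.0, scale=1.0, size=(500, 8)),
    "grass": rng.normal(loc=3.0, scale=1.0, size=(500, 8)),
}

# Fit one Gaussian mixture per class; scikit-learn's GaussianMixture uses EM
# for maximum likelihood estimation.
class_models = {
    label: GaussianMixture(n_components=4, covariance_type="diag",
                           random_state=0).fit(feats)
    for label, feats in class_features.items()
}

# Annotate a query image: sum the log-likelihood of its descriptors under
# each class model and pick the best-scoring class.
query = rng.normal(loc=2.8, scale=1.0, size=(40, 8))
scores = {label: model.score_samples(query).sum()
          for label, model in class_models.items()}
print(max(scores, key=scores.get))  # expected: "grass"
```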

    Using segmented objects in ostensive video shot retrieval

    This paper presents a system for video shot retrieval in which shots are retrieved based on matching video objects using a combination of colour, shape and texture. Rather than matching on individual objects, our system supports sets of query objects which in total reflect the user's object-based information need. Our work also adapts to a shifting user information need by initiating the partitioning of a user's search into two or more distinct search threads, which can be followed by the user in sequence. This is an automatic process which maps neatly to the ostensive model of information retrieval in that it allows a user to place a virtual checkpoint on their search, explore one thread or aspect of their information need, and then return to that checkpoint to explore an alternative thread. Our system is fully functional and operational, and in this paper we illustrate several design decisions we have made in building it.
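
    A minimal sketch (not the authors' implementation) of the matching step described above: each query object is compared to shot objects by a weighted combination of colour, shape and texture distances, and a shot is scored against the whole set of query objects. The feature vectors, their sizes and the weights are assumed placeholders.

```python
# Hedged sketch of object-based shot scoring; feature vectors and weights
# are hypothetical placeholders, not the paper's actual descriptors.
import numpy as np

def object_distance(query_obj, shot_obj, weights=(0.5, 0.3, 0.2)):
    """Weighted combination of colour, shape and texture distances
    between two segmented objects (smaller is better)."""
    w_c, w_s, w_t = weights
    return (w_c * np.linalg.norm(query_obj["colour"] - shot_obj["colour"])
            + w_s * np.linalg.norm(query_obj["shape"] - shot_obj["shape"])
            + w_t * np.linalg.norm(query_obj["texture"] - shot_obj["texture"]))

def shot_score(query_objects, shot_objects):
    """Score a shot against a *set* of query objects: match each query
    object to its closest object in the shot and average the distances."""
    return float(np.mean([min(object_distance(q, s) for s in shot_objects)
                          for q in query_objects]))

# Toy usage with made-up feature vectors.
q = {"colour": np.ones(16), "shape": np.zeros(8), "texture": np.ones(32)}
s = {"colour": 0.9 * np.ones(16), "shape": np.zeros(8), "texture": 1.1 * np.ones(32)}
print(shot_score([q], [s]))
```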

    Fisher Vectors Derived from Hybrid Gaussian-Laplacian Mixture Models for Image Annotation

    In the traditional object recognition pipeline, descriptors are densely sampled over an image, pooled into a high-dimensional non-linear representation, and then passed to a classifier. In recent years, Fisher Vectors have proven empirically to be the leading representation for a large variety of applications. The Fisher Vector is typically taken as the gradients of the log-likelihood of the descriptors with respect to the parameters of a Gaussian Mixture Model (GMM). Motivated by the assumption that different distributions should be applied for different datasets, we present two other mixture models and derive their Expectation-Maximization and Fisher Vector expressions. The first is a Laplacian Mixture Model (LMM), which is based on the Laplacian distribution. The second is a Hybrid Gaussian-Laplacian Mixture Model (HGLMM), which is based on a weighted geometric mean of the Gaussian and Laplacian distributions. An interesting property of the Expectation-Maximization algorithm for the latter is that in the maximization step, each dimension in each component is chosen to be either a Gaussian or a Laplacian. Finally, by using the new Fisher Vectors derived from HGLMMs, we achieve state-of-the-art results for both the image annotation task and the task of searching images by sentence.
    Comment: the new version includes text synthesis by an RNN and experiments with the COCO benchmark.
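
    For reference, the abstract's starting point is the standard GMM-based Fisher Vector: the normalised gradient of the descriptors' log-likelihood with respect to the mixture parameters. The mean-gradient block is sketched below in commonly used notation (which may differ from the paper's); the LMM and HGLMM variants replace the Gaussian component densities.

```latex
% Soft assignment of descriptor x_t to component k, and the Fisher Vector
% block for the mean of component k (diagonal covariances assumed).
\[
  \gamma_k(x_t) = \frac{w_k\,\mathcal{N}(x_t;\mu_k,\sigma_k^2)}
                       {\sum_{j=1}^{K} w_j\,\mathcal{N}(x_t;\mu_j,\sigma_j^2)},
  \qquad
  \mathcal{G}^{X}_{\mu_k}
    = \frac{1}{T\sqrt{w_k}} \sum_{t=1}^{T}
      \gamma_k(x_t)\,\frac{x_t-\mu_k}{\sigma_k},
\]
% where X = {x_1, ..., x_T} are the image descriptors and (w_k, mu_k, sigma_k)
% are the weight, mean and standard deviation of mixture component k.
```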

    Large Scale Visual Recommendations From Street Fashion Images

    We describe a completely automated, large-scale visual recommendation system for fashion. Our focus is to efficiently harness the availability of large quantities of online fashion images and their rich meta-data. Specifically, we propose four data-driven models, in the form of Complementary Nearest Neighbor Consensus, Gaussian Mixture Models, Texture Agnostic Retrieval and Markov Chain LDA, for solving this problem. We analyze the relative merits and pitfalls of these algorithms through extensive experimentation on a large-scale data set and baseline them against existing ideas from color science. We also illustrate key fashion insights learned through these experiments and show how they can be employed to design better recommendation systems. Finally, we also outline a large-scale annotated data set of fashion images (Fashion-136K) that can be exploited for future vision research.
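
    As a rough sketch of the retrieval primitive that models such as Complementary Nearest Neighbor Consensus build on (not the paper's actual pipeline), the code below indexes hypothetical catalogue image descriptors and returns the visually closest items for a query image; the descriptor dimensionality and distance metric are assumptions.

```python
# Hedged sketch: nearest-neighbour retrieval over image descriptors.
# The catalogue, query features and cosine metric are illustrative choices.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
catalogue = rng.random((10_000, 128))        # precomputed item descriptors
index = NearestNeighbors(n_neighbors=5, metric="cosine").fit(catalogue)

query = rng.random((1, 128))                 # descriptor of the query garment
distances, item_ids = index.kneighbors(query)
print(item_ids[0])                           # indices of recommended items
```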