2 research outputs found

    Enhancing Automatic Annotation for Optimal Image Retrieval

    Get PDF
    Image search and retrieval based on content is very cumbersome task particularly when the image database is large. The accuracy of the retrieval as well as the processing speed are two important measures used for assessing and comparing the effectiveness of various systems. Text retrieval is more mature and advanced than image content retrieval. In this dissertation, the focus is on converting image content into text tags that can be easily searched using standard search engines where the size and speed issues of the database have been already dealt with. Therefore, image tagging becomes an essential tool for image retrieval from large image databases. Automation of image tagging has received considerable attention by many researchers in recent years. The optimal goal of image description is to automatically annotate images with tags that semantically represent the image content. The speed and accuracy of Image retrieval from large databases are few of the important domains that can benefit from automatic tagging. In this work, several state of the art image classification and image tagging techniques are reviewed. We propose a new self-learning multilayered tagging framework that can address the limitations of current approaches and provide mutual accuracy improvement between the recognition layer and the annotation layer. Our results indicate that the proposed framework can improve the overall accuracy of information retrieval in a variety of image databases

    Multi Domain Semantic Information Retrieval Based on Topic Model

    Get PDF
    Over the last decades, there have been remarkable shifts in the area of Information Retrieval (IR) as huge amount of information is increasingly accumulated on the Web. The gigantic information explosion increases the need for discovering new tools that retrieve meaningful knowledge from various complex information sources. Thus, techniques primarily used to search and extract important information from numerous database sources have been a key challenge in current IR systems. Topic modeling is one of the most recent techniquesthat discover hidden thematic structures from large data collections without human supervision. Several topic models have been proposed in various fields of study and have been utilized extensively for many applications. Latent Dirichlet Allocation (LDA) is the most well-known topic model that generates topics from large corpus of resources, such as text, images, and audio.It has been widely used in many areas in information retrieval and data mining, providing efficient way of identifying latent topics among document collections. However, LDA has a drawback that topic cohesion within a concept is attenuated when estimating infrequently occurring words. Moreover, LDAseems not to consider the meaning of words, but rather to infer hidden topics based on a statisticalapproach. However, LDA can cause either reduction in the quality of topic words or increase in loose relations between topics. In order to solve the previous problems, we propose a domain specific topic model that combines domain concepts with LDA. Two domain specific algorithms are suggested for solving the difficulties associated with LDA. The main strength of our proposed model comes from the fact that it narrows semantic concepts from broad domain knowledge to a specific one which solves the unknown domain problem. Our proposed model is extensively tested on various applications, query expansion, classification, and summarization, to demonstrate the effectiveness of the model. Experimental results show that the proposed model significantly increasesthe performance of applications
    corecore