
    Hybrid Information Retrieval Model For Web Images

    The Big Bang of the Internet in the early 1990s dramatically increased the number of images distributed and shared over the web. As a result, image information retrieval systems were developed to index and retrieve image files spread over the Internet. Most of these systems are keyword-based: they search for images using their textual metadata, and are therefore imprecise, because describing an image in human language is inherently vague. There are also content-based image retrieval systems, which search for images based on their visual information. However, content-based systems are still immature and not very effective, as they suffer from low retrieval recall and precision. This paper proposes a new hybrid image information retrieval model for indexing and retrieving web images published in HTML documents. The distinguishing feature of the proposed model is that it is based on both graphical content and textual metadata. The graphical content is represented by the color features and color histogram of the image, while the textual metadata are represented by the terms that surround the image in the HTML document, in particular the terms that appear in the tags p, h1, and h2, in addition to the terms that appear in the image's alt attribute, filename, and class-label. Moreover, this paper presents a new term weighting scheme called VTF-IDF, short for Variable Term Frequency-Inverse Document Frequency, which, unlike traditional schemes, exploits the HTML tag structure and assigns a bonus weight to terms that appear within particular HTML tags that are correlated with the semantics of the image. Experiments conducted to evaluate the proposed IR model showed a high retrieval precision rate that outperformed other current models. Comment: LACSC - Lebanese Association for Computational Sciences, http://www.lacsc.org/; International Journal of Computer Science & Emerging Technologies (IJCSET), Vol. 3, No. 1, February 201
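
    The tag-aware weighting idea described in this abstract can be illustrated with a minimal sketch. It is not the paper's VTF-IDF formula, which the abstract does not give: the bonus multipliers in `TAG_BONUS`, the function names, and the plain `log(N / df)` IDF are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical bonus multipliers per HTML source; the abstract gives no values.
TAG_BONUS = {"alt": 3.0, "h1": 2.0, "h2": 1.5, "p": 1.0}


def weighted_tf(fields):
    """fields maps a tag name to the list of terms found in that tag."""
    tf = Counter()
    for tag, terms in fields.items():
        bonus = TAG_BONUS.get(tag, 1.0)  # unknown tags get no bonus
        for term in terms:
            tf[term] += bonus
    return tf


def tag_weighted_tf_idf(docs):
    """docs is a list of field dicts; returns one {term: weight} dict per image."""
    tfs = [weighted_tf(d) for d in docs]
    df = Counter()
    for tf in tfs:
        df.update(tf.keys())  # document frequency: count each term once per image
    n = len(docs)
    return [{t: w * math.log(n / df[t]) for t, w in tf.items()} for tf in tfs]


docs = [
    {"alt": ["cat"], "p": ["cat", "photo"]},
    {"p": ["dog", "photo"]},
]
weights = tag_weighted_tf_idf(docs)
```

    In this toy input, "cat" gains weight from appearing in the alt attribute, while "photo" occurs in every document and so receives an IDF of zero.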

    A model for mobile content filtering on non-interactive recommendation systems

    To overcome the problem of information overload in mobile communication, a recommendation system can be used to help mobile device users. However, there are problems relating to the sparsity of information from first-time users with regard to the initial rating of content and the retrieval of relevant items. For users to experience personalized content delivery via a mobile recommendation system, content filtering is necessary. This paper proposes an integrated method that uses classification and association rule techniques to extract knowledge from the mobile content in a user's profile. The knowledge can be used to establish a model for new users and first raters of mobile content. The model recommends relevant content at an early stage of the connection, based on the user's profile. The proposed method also allows associations to be generated that link first-rater items to the top items identified from the outcomes of the classification and clustering processes. This can address the problem of sparsity in initial ratings and new-user connections for non-interactive recommendation systems

    Using association rule mining to enrich semantic concepts for video retrieval

    To achieve true content-based information retrieval on video, we should analyse and index video with high-level semantic concepts in addition to user-generated tags and structured metadata such as title and date. However, the range of such high-level semantic concepts, whether detected manually or automatically, is usually limited compared to the richness of the information content in video and the potential vocabulary of concepts available for indexing. Even though there is work to improve the performance of individual concept classifiers, we should strive to make the best use of whatever partial sets of semantic concept occurrences are available to us. In this paper we describe our method for using association rule mining to automatically enrich the representation of video content through a set of semantic concepts based on concept co-occurrence patterns. We describe our experiments on the TRECVid 2005 video corpus annotated with the 449 concepts of the LSCOM ontology. The evaluation of our results shows the usefulness of our approach
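
    Concept enrichment from co-occurrence patterns might be sketched as follows. This is a simplified illustration, not the authors' method: it mines only single-antecedent rules (A implies B), and the thresholds, function names, and toy concept labels are assumptions.

```python
from collections import Counter
from itertools import permutations


def mine_rules(shots, min_support=0.5, min_conf=0.8):
    """shots is a list of concept sets; returns {antecedent: {implied concepts}}."""
    n = len(shots)
    single = Counter()
    pair = Counter()
    for concepts in shots:
        single.update(concepts)
        # Ordered pairs (a, b) with a != b, so that a -> b and b -> a are scored separately.
        pair.update(permutations(sorted(concepts), 2))
    rules = {}
    for (a, b), count in pair.items():
        support = count / n             # fraction of shots containing both a and b
        confidence = count / single[a]  # estimate of P(b | a)
        if support >= min_support and confidence >= min_conf:
            rules.setdefault(a, set()).add(b)
    return rules


def enrich(concepts, rules):
    """Add every concept implied by a detected concept (one pass, no chaining)."""
    enriched = set(concepts)
    for c in concepts:
        enriched |= rules.get(c, set())
    return enriched


shots = [
    {"boat", "water"},
    {"boat", "water", "sky"},
    {"water", "sky"},
    {"boat", "water"},
]
rules = mine_rules(shots)
```

    With these toy annotations, a shot in which only "boat" was detected would be enriched with "water", because every annotated shot containing a boat also contains water.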

    Association-based image retrieval

    With advances in computer technology and the World Wide Web, there has been an explosion in the amount and complexity of multimedia data that are generated, stored, transmitted, analyzed, and accessed. To extract useful information from this huge amount of data, many content-based image retrieval (CBIR) systems have been developed in the last decade. A typical CBIR system captures image features that represent image properties such as the color, texture, or shape of objects in the query image, and tries to retrieve images with similar features from the database. Recent advances in CBIR systems include interactive systems based on relevance feedback. The main advantage of CBIR systems with relevance feedback is that they take into account the gap between high-level concepts and low-level features, as well as the subjectivity of human perception of visual content. In this paper, we propose a new approach to image storage and retrieval called association-based image retrieval (ABIR). We try to mimic human memory: the human brain stores and retrieves images by association. We use a generalized bi-directional associative memory (GBAM) to store associations between feature vectors. The results of our simulation are presented in the paper
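
    To make the idea of storing associations between feature vectors concrete, here is a minimal sketch of a classic bidirectional associative memory with bipolar (+1/-1) vectors. It is not the generalized GBAM used in the paper, whose details the abstract does not give; the function names and the toy vectors are assumptions.

```python
def sign(values, previous):
    # On a tie (zero net input) keep the previous state of that unit.
    return [1 if v > 0 else -1 if v < 0 else p for v, p in zip(values, previous)]


def train(pairs):
    """Weight matrix W = sum over stored pairs of the outer product x * y^T."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    W = [[0] * m for _ in range(n)]
    for x, y in pairs:
        for i in range(n):
            for j in range(m):
                W[i][j] += x[i] * y[j]
    return W


def recall(W, x, steps=10):
    """Bounce between the two layers until the (x, y) pair stops changing."""
    n, m = len(W), len(W[0])
    y = [1] * m  # arbitrary initial state for the output layer
    for _ in range(steps):
        new_y = sign([sum(W[i][j] * x[i] for i in range(n)) for j in range(m)], y)
        new_x = sign([sum(W[i][j] * new_y[j] for j in range(m)) for i in range(n)], x)
        if new_x == x and new_y == y:
            break
        x, y = new_x, new_y
    return x, y


pairs = [
    ([1, -1, 1, -1], [1, 1, -1]),
    ([1, 1, -1, -1], [-1, 1, 1]),
]
W = train(pairs)
clean = recall(W, [1, -1, 1, -1])
noisy = recall(W, [1, -1, 1, 1])  # last component flipped
```

    In this toy case the probe with one flipped component settles on the first stored pair, which is the associative recall behaviour the abstract alludes to.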

    Information Retrieval using Context Based Document Indexing

    Information retrieval is the task of retrieving relevant information according to a user's query. This paper briefly presents an approach to document retrieval using context-based indexing. Lexical association is used to separate content-carrying terms from background terms. Content-carrying terms are used because they indicate the theme of the document. An indexing weight is calculated for each content-carrying term using a lexical association measure. Terms with a higher indexing weight are considered important, and so are the sentences that contain them. When a user enters a search query, the important terms are matched against the terms with higher weights in order to retrieve documents. Relevant documents are retrieved according to the importance of their sentences. Using this approach, information can be retrieved efficiently

    Information Retrieval Using Context Based Document Indexing and Term Graph

    Information retrieval is the task of retrieving relevant information according to a user's query. This paper presents an approach to document retrieval using context-based indexing and term weighting. Lexical association is used to separate content-carrying terms from background terms. Content-carrying terms are used because they indicate the theme of the document. An indexing weight is calculated for each content-carrying term using a lexical association measure. Terms with a higher indexing weight are considered important, and so are the sentences that contain them. A summary of the document is prepared. A graph-of-words approach is used for information retrieval: terms are weighted according to the in-degree of their vertices in the document graph. When a user enters a search query, the important terms are matched against the terms with higher weights in order to retrieve documents. Relevant documents are retrieved according to the weights of their terms, which are determined using the term graph. A term weight-inverse document frequency scoring function is used to retrieve relevant documents. Using this approach, information can be retrieved efficiently, and retrieval performance improves because less time is required to search the documents
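
    The in-degree term weighting mentioned in this abstract might look like the sketch below. It is a simplified illustration under stated assumptions: the sliding-window size, the function names, the `log((N + 1) / df)` IDF, and the absence of any document-length normalisation are choices made here, not details taken from the paper.

```python
import math
from collections import defaultdict


def in_degree_weights(tokens, window=3):
    """Graph of words: an edge runs from each term to the terms following it in the window."""
    predecessors = defaultdict(set)
    for i, src in enumerate(tokens):
        for dst in tokens[i + 1 : i + window]:
            if dst != src:
                predecessors[dst].add(src)
    # The weight of a term is the number of distinct terms pointing at it.
    return {t: len(predecessors[t]) for t in set(tokens)}


def rank(query, docs, window=3):
    """Return document indices ordered by a term-weight times IDF score, best first."""
    weights = [in_degree_weights(d, window) for d in docs]
    n = len(docs)
    df = defaultdict(int)
    for w in weights:
        for t in w:
            df[t] += 1
    scores = [
        sum(w.get(t, 0) * math.log((n + 1) / df[t]) for t in query if df.get(t))
        for w in weights
    ]
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


docs = [
    ["information", "retrieval", "uses", "term", "graph"],
    ["graph", "theory", "basics"],
]
tw = in_degree_weights(docs[0])
```

    Unlike raw term frequency, the in-degree rewards terms that co-occur with many distinct neighbours, so a term repeated next to the same words gains nothing.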

    Topic-based mixture language modelling

    This paper describes an approach for constructing a mixture of language models based on simple statistical notions of semantics using probabilistic models developed for information retrieval. The approach encapsulates corpus-derived semantic information and is able to model varying styles of text. Using such information, the corpus texts are clustered in an unsupervised manner and a mixture of topic-specific language models is automatically created. The principal contribution of this work is to characterise the document space resulting from information retrieval techniques and to demonstrate the approach for mixture language modelling. A comparison is made between manual and automatic clustering in order to elucidate how the global content information is expressed in the space. We also compare (in terms of association with manual clustering and language modelling accuracy) alternative term-weighting schemes and the effect of singular value decomposition dimension reduction (latent semantic analysis). Test set perplexity results using the British National Corpus indicate that the approach can improve the potential of statistical language modelling. Using an adaptive procedure, the conventional model may be tuned to track text data with a slight increase in computational cost
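
    The mixture step can be illustrated with a minimal sketch that linearly interpolates topic-specific unigram models and measures test-set perplexity. It leaves out the unsupervised clustering and the latent semantic analysis described in the abstract; the two hand-made topics, the add-one smoothing, and the fixed mixture weights are assumptions for illustration only.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def mixture_prob(word, models, weights):
    """P(word) = sum over topics k of lambda_k * P_k(word)."""
    return sum(lam * model[word] for lam, model in zip(weights, models))


def perplexity(tokens, models, weights):
    log_sum = sum(math.log(mixture_prob(w, models, weights)) for w in tokens)
    return math.exp(-log_sum / len(tokens))


vocab = {"goal", "match", "bank", "loan"}
models = [
    unigram_model(["goal", "match", "goal"], vocab),  # hypothetical "sport" cluster
    unigram_model(["bank", "loan", "bank"], vocab),   # hypothetical "finance" cluster
]
test_text = ["goal", "match"]
sport_heavy = perplexity(test_text, models, [0.9, 0.1])
finance_heavy = perplexity(test_text, models, [0.1, 0.9])
```

    Shifting the mixture weights toward the topic that matches the text lowers perplexity, which is the effect an adaptive weighting procedure would try to exploit.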

    Context based Document Indexing and Retrieval using Big Data Analytics - A Review

    In the past few years, internet usage has grown all over the world, and the generation and use of data by users has therefore increased rapidly; the data generated takes different forms and may or may not be structured. As internet usage by individuals and organizations has grown, an increasing quantity and diversity of digital data in the form of documents has become available to end users. The storage, maintenance, and organization of such huge volumes of data in databases is a challenging task. There is therefore a great need for efficient and effective retrieval techniques that focus on improving the accuracy of document retrieval. In this paper we discuss document retrieval using a context-based indexing approach. Lexical association between terms is used to separate content-carrying terms from other terms. Content-carrying terms are used because they indicate the theme of the document. An indexing weight is calculated for each content-carrying term using a lexical association measure. Terms with a higher indexing weight are considered important, and so are the sentences that contain them. When a user enters a search query, the important terms are matched against the terms with higher weights in order to retrieve documents. The explicit semantic relation or frequent co-occurrence of terms is considered in this context-based indexing

    Video Retrieval using Content Based Approach

    Information retrieval systems occupy a significant position in our daily life as a means of getting necessary information. Many text retrieval systems are available and function effectively. Even though the internet is full of media such as images, audio, and video, retrieval systems for these media are uncommon and have not achieved the success of text retrieval systems. Video retrieval systems are helpful in many applications. There is high demand for a useful and capable tool for video association and retrieval based on each user's requirements. Videos are divided into sets of frames (images), which are then handled with content-based image retrieval. We are developing a content-based video retrieval system that makes use of an ontology to make the retrieval process intelligent

    A Study of Information Fragment Association in Information Management and Retrieval Applications

    As we strive to identify useful information by sifting through the vast number of resources available to us, we often find that the desired information resides in a small section within a larger body of content that does not necessarily contain similar information. This can make such an Information Fragment difficult to find. A Web search engine may not give a good ranking to a page of unrelated content if it contains only a very small yet invaluable piece of relevant information. This means that our processes often fail to bring together related Information Fragments. We can easily conceive of two Information Fragments which, according to a scholar, bear a strong association with each other, yet share no common keywords that would enable them to be collocated by a keyword search. This dissertation attempts to address this issue by determining the benefits of enhancing information management and retrieval applications with the capability for users to establish and store associations between Information Fragments. It estimates the extent to which the efficiency and quality of information retrieval can be improved if users are allowed to capture the mental associations they form while reading Information Fragments and to share these associations with others using a functional registry-based design. To test these benefits, three subject groups were recruited and assigned tasks involving Information Fragments. The first two tasks compared the performance and usability of a mainstream social bookmarking tool with a tool enhanced with Information Fragment Association capabilities. The tests demonstrated that the use of Information Fragment Association offers significant advantages in both the efficiency of retrieval and user satisfaction. Analysis of the results of the third task demonstrated that a mainstream Web search engine performed poorly in collocating interrelated fragments when a query designed to retrieve one of these fragments was submitted. The fourth task demonstrated that Information Fragment Association improves the precision and recall of searches performed on Information Fragment datasets. The results of this study indicate that mainstream information management and retrieval applications provide inadequate support for Information Fragment retrieval and that their enhancement with Information Fragment Association capabilities would be beneficial