8 research outputs found

    Overview of the wikipediaMM task at ImageCLEF 2008

    The wikipediaMM task provides a testbed for the system-oriented evaluation of ad-hoc retrieval from a large collection of Wikipedia images. It became a part of the ImageCLEF evaluation campaign in 2008 with the aim of investigating the use of visual and textual sources in combination for improving the retrieval performance. This paper presents an overview of the task's resources, topics, assessments, participants' approaches, and main results.

    Document expansion for image retrieval

    Successful information retrieval requires effective matching between the user's search request and the contents of relevant documents. Often the request entered by a user does not use the same topic-relevant terms as the authors of the documents. One potential approach to this query-document term mismatch is document expansion: adding topically relevant indexing terms to a document so that it is more likely to be retrieved for relevant queries that do not match its original contents well. We propose and evaluate a new document expansion method using external resources. While results of previous research have been inconclusive in determining the impact of document expansion on retrieval effectiveness, our method is shown to work effectively for text-based image retrieval of short image annotation documents. Our approach uses the Okapi query expansion algorithm as a method for document expansion. We further show that improved performance can be achieved by using a "document reduction" approach to include only the significant terms in a document in the expansion process. Our experiments on the WikipediaMM task at ImageCLEF 2008 show an increase of 16.5% in mean average precision (MAP) compared to a variation of the Okapi BM25 retrieval model. To compare document expansion with query expansion, we also test query expansion from an external resource, which leads to an improvement of 9.84% in MAP over our baseline. Our conclusion is that document expansion with document reduction, in combination with query expansion, produces the overall best retrieval results for short-length document retrieval. For this image retrieval task, we also conclude that query expansion from an external resource does not outperform the document expansion method.
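    The document expansion idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy external collection, the BM25 parameters, and the use of the Robertson offer weight for term selection are all assumptions made for illustration. The short document is issued as a query against an external collection, the top-ranked external documents are treated as pseudo-relevant, and the highest-weighted new terms are appended to the document.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score every tokenized document in `docs` against `query_terms` (basic BM25)."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in set(query_terms):
            if tf[t] == 0:
                continue
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def expand_document(doc, external_docs, top_docs=2, top_terms=3):
    """Append up to `top_terms` new terms to `doc`, chosen from pseudo-relevant external docs.

    Hypothetical helper: the short document itself acts as the query, and
    candidate terms are ranked by offer weight (r times the relevance weight).
    """
    scores = bm25_scores(doc, external_docs)
    ranked = sorted(range(len(external_docs)), key=lambda i: scores[i], reverse=True)
    feedback = [external_docs[i] for i in ranked[:top_docs] if scores[i] > 0]

    n_docs, n_rel = len(external_docs), len(feedback)
    df = Counter(t for d in external_docs for t in set(d))
    rel_df = Counter(t for d in feedback for t in set(d))

    weights = {}
    for term, r in rel_df.items():
        if term in doc:
            continue  # only terms that are new to the document are candidates
        n = df[term]
        relevance_weight = math.log(
            ((r + 0.5) * (n_docs - n - n_rel + r + 0.5))
            / ((n - r + 0.5) * (n_rel - r + 0.5))
        )
        weights[term] = r * relevance_weight

    # Highest weight first; ties are broken alphabetically for determinism.
    best = sorted(weights, key=lambda t: (-weights[t], t))[:top_terms]
    return list(doc) + best
```

    For instance, calling `expand_document(["eiffel", "tower"], external_docs, top_docs=1)` on a small tokenized collection would append terms that co-occur with the annotation in the best-matching external document. The "document reduction" step mentioned in the abstract would correspond to filtering `doc` down to its significant terms before it is used as the query; that step is not shown here.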

    From XML Retrieval to Semantic Search and Beyond: The INEX, SBS, and MC2 Labs of CLEF 2012-2018

    Utilizing external resources for enriching information retrieval

    Information retrieval (IR) seeks to support users in finding information relevant to their information needs. One obstacle that prevents many IR algorithms from achieving better results is that there is insufficient information available to enable relevant content to be identified. For example, users typically enter very short queries; in text-based image retrieval, textual annotations often describe the content of the images inadequately; and there is often insufficient user log data for personalization of the search process. This thesis explores the problem of inadequate data in IR tasks. We propose methods for Enriching Information Retrieval (ENIR) which address various challenges relating to insufficient data in IR. Applying standard methods to address these problems can face unexpected challenges. For example, standard query expansion methods assume that the target collection contains sufficient data to identify relevant terms to add to the original query to improve retrieval effectiveness. In the case of short documents, this assumption is not valid. One strategy to address this problem is document-side expansion, which has been largely overlooked in past research. Similarly, topic modeling in personalized search often lacks the knowledge required to form adequate models, leading to mismatch problems when trying to apply these models to improve search. This thesis focuses on methods of ENIR for tasks affected by problems of insufficient data. Our overall solution is to draw on external resources. This research focuses on developing methods for two typical ENIR tasks: text-based image retrieval and personalized web data search. The main relevant areas within existing IR research are relevance feedback and personalized modeling, and ENIR is shown to be effective in augmenting existing knowledge in these classical areas. The areas of relevance feedback and personalized modeling are strongly correlated, since user modeling and document modeling in personalized retrieval enrich the data from both the query side and the document side, which is similar to query and document expansion in relevance feedback. Enriching IR is the key challenge in these areas. By addressing these two research areas, this thesis provides a prototype for an external-resource-based search solution. The experimental results show that external resources can play a key role in enriching IR.