
    MIRACLE-FI at ImageCLEFphoto 2008: Experiences in merging text-based and content-based retrievals

    This paper describes the participation of the MIRACLE consortium in the Photographic Retrieval task of ImageCLEF 2008. In this new participation of the group, our first purpose is to evaluate our own tools for text-based and content-based retrieval, using different similarity metrics and the OWA aggregation operator to fuse the three topic images. Building on MIRACLE's experience from last year, we implemented a new merging module that combines the text-based and content-based information in three different ways: FILTER-N, ENRICH and TEXT-FILTER. The first two approaches try to improve the text-based baseline results using the content-based result lists. The last one was used to select the relevant images passed to the content-based module. No clustering strategies were analyzed. Finally, 41 runs were submitted: 1 text-based baseline, 10 content-based runs, and 30 mixed experiments merging text-based and content-based results. Overall, the results can be considered nearly acceptable compared with the best results of other groups. The results obtained with text-based retrieval are better than the content-based ones. By merging textual and visual retrieval we improve the text-based baseline when applying the ENRICH merging algorithm, even though the visual results are lower than the textual ones. Building on these results, we plan to improve the merged results with clustering methods applied to this image collection.
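    The OWA (Ordered Weighted Averaging) operator mentioned above fuses several similarity scores by first sorting them, then applying a fixed weight vector to the sorted values. A minimal sketch of how scores for the three topic images could be fused this way (function names and the weight vector are illustrative, not taken from the paper):

```python
def owa(scores, weights):
    """Ordered Weighted Averaging: sort scores in descending order,
    then take the weighted sum against a fixed weight vector."""
    assert len(scores) == len(weights)
    assert abs(sum(weights) - 1.0) < 1e-9
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))

# Fuse the similarity of one database image to the three topic images.
# An "optimistic" weighting emphasises the best-matching topic image.
topic_similarities = [0.82, 0.40, 0.55]
fused = owa(topic_similarities, weights=[0.6, 0.3, 0.1])
```

    With equal weights OWA reduces to the plain mean; skewing the weights toward the first positions makes the fusion behave more like a max, which rewards images that strongly match any one of the topic images.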

    Físchlár-TRECVid2004: Combined text- and image-based searching of video archives

    The Físchlár-TRECVid-2004 system was developed for Dublin City University's participation in the 2004 TRECVid video information retrieval benchmarking activity. The system allows search and retrieval of video shots from over 60 hours of content. The shot retrieval engine employed is based on a combination of query text matched against spoken dialogue, combined with image-image matching in which a still image (sourced externally) or a keyframe (from within the video archive itself) is matched against all keyframes in the video archive. Three separate text retrieval engines are employed for closed-caption text, automatic speech recognition and video OCR. Visual shot matching is primarily based on MPEG-7 low-level descriptors. The system supports relevance feedback at the shot level, enabling augmentation and refinement using relevant shots located by the user. Two variants of the system were developed: one that supports both text- and image-based searching and one that supports image-only search. A user evaluation experiment compared the use of the two systems. Results show that while the system combining text- and image-based searching achieves greater retrieval effectiveness, users make more varied and extensive queries with the image-only version.

    Applying digital content management to support localisation

    The retrieval and presentation of digital content such as that on the World Wide Web (WWW) is a substantial area of research. While recent years have seen huge expansion in the size of web-based archives that can be searched efficiently by commercial search engines, the presentation of potentially relevant content is still limited to ranked document lists represented by simple text snippets or image keyframe surrogates. There is expanding interest in techniques to personalise the presentation of content to improve the richness and effectiveness of the user experience. One of the most significant challenges to achieving this is the increasingly multilingual nature of this data, and the need to provide suitably localised responses to users based on this content. The Digital Content Management (DCM) track of the Centre for Next Generation Localisation (CNGL) seeks to develop technologies to support advanced personalised access and presentation of information by combining elements from the existing research areas of Adaptive Hypermedia and Information Retrieval. The combination of these technologies is intended to produce significant improvements in the way users access information. We review key features of these technologies and introduce early ideas for how they can support localisation and localised content, before concluding with some impressions of future directions in DCM.

    Information-Theoretic Active Learning for Content-Based Image Retrieval

    We propose Information-Theoretic Active Learning (ITAL), a novel batch-mode active learning method for binary classification, and apply it to acquiring meaningful user feedback in the context of content-based image retrieval. Instead of combining different heuristics such as uncertainty, diversity, or density, our method is based on maximizing the mutual information between the predicted relevance of the images and the expected user feedback regarding the selected batch. We propose suitable approximations to this computationally demanding problem and also integrate an explicit model of user behavior that accounts for possible incorrect labels and unnameable instances. Furthermore, our approach takes into account not only the structure of the data but also the expected model output change caused by the user feedback. In contrast to other methods, ITAL turns out to be highly flexible and provides state-of-the-art performance across various datasets, such as MIRFLICKR and ImageNet.
    Comment: GCPR 2018 paper (14 pages text + 2 pages references + 6 pages appendix)
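    The core quantity here is the mutual information between an image's (unknown) relevance and the feedback the user is expected to give, under a noise model for incorrect labels. The full ITAL objective models the joint distribution over the whole batch; the toy sketch below, purely for illustration, scores each image independently with a symmetric label-flip noise model (all names and the greedy selection rule are assumptions, not the paper's algorithm):

```python
import math

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mutual_info(r, eps):
    """I(relevance; feedback) for one image: relevance ~ Bernoulli(r),
    and the user's feedback flips the true label with probability eps."""
    f = r * (1 - eps) + (1 - r) * eps      # P(feedback says "relevant")
    cond = r * h(1 - eps) + (1 - r) * h(eps)  # H(feedback | relevance)
    return h(f) - cond

def select_batch(relevance_probs, eps=0.1, k=3):
    """Greedy batch selection: pick the k images whose individual MI
    with the expected feedback is largest (independence assumption)."""
    order = sorted(range(len(relevance_probs)),
                   key=lambda i: mutual_info(relevance_probs[i], eps),
                   reverse=True)
    return order[:k]
```

    With a noise-free user (eps = 0), MI reduces to the predictive entropy, so the sketch degenerates to plain uncertainty sampling; a noisy user shrinks the information gained from every query.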

    Exploiting multimedia content : a machine learning based approach

    Advisors: Prof. M Gopal, Prof. Santanu Chaudhury. Date and location of PhD thesis defense: 10 September 2013, Indian Institute of Technology Delhi.
    This thesis explores the use of machine learning for multimedia content management involving single/multiple features, modalities and concepts. We introduce a shape-based feature for binary patterns and apply it to recognition and retrieval applications in single- and multiple-feature-based architectures. The multiple-feature-based recognition and retrieval frameworks are based on the theory of multiple kernel learning (MKL). A binary pattern recognition framework is presented by combining binary MKL classifiers using a decision directed acyclic graph. The evaluation is shown for Indian script character recognition and MPEG-7 shape symbol recognition. A word-image-based document indexing framework is presented using distance-based hashing (DBH) defined on learned pivot centres. We use a new multi-kernel learning scheme based on a genetic algorithm to develop a kernel-DBH-based document image retrieval system. The experimental evaluation is presented on document collections in the Devanagari, Bengali and English scripts. Next, methods for document retrieval using multi-modal information fusion are presented. A text/graphics segmentation framework is presented for documents with a complex layout. We present a novel multi-modal document retrieval framework using the segmented regions. The approach is evaluated on English magazine pages. A document script identification framework is presented using decision-level aggregation of page-, paragraph- and word-level predictions. Latent Dirichlet Allocation based topic modelling with a modified edit distance is introduced for the retrieval of documents containing recognition inaccuracies. A multi-modal indexing framework for such documents is presented via a learning-based combination of text- and image-based properties. Experimental results are shown on Devanagari script documents.
Finally, we have investigated concept-based approaches for multimedia analysis. A multi-modal document retrieval framework is presented that combines generative and discriminative modelling to exploit the cross-modal correlation between modalities. The combination is also explored for semantic concept recognition using multi-modal components of the same document, and of different documents over a collection. An experimental evaluation of the framework is shown for semantic event detection in sports videos and semantic labelling of components of multi-modal document images.
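    Distance-based hashing, as used in the indexing framework above, maps a feature vector to binary codes using only pairwise distances: each bit comes from projecting the vector onto the "line" through a pair of pivots and thresholding. A minimal sketch under Euclidean distance (the thesis learns the pivot centres with a genetic algorithm, which is not shown; all names here are illustrative):

```python
import math

def dbh_bit(x, p1, p2, t):
    """One distance-based hash bit: project x onto the line through
    pivots p1 and p2 using distances only, then threshold at t."""
    d12 = math.dist(p1, p2)
    proj = (math.dist(x, p1) ** 2 + d12 ** 2 - math.dist(x, p2) ** 2) / (2 * d12)
    return 1 if proj >= t else 0

def dbh_code(x, pivot_pairs):
    """Full hash code: one bit per (p1, p2, threshold) triple."""
    return tuple(dbh_bit(x, p1, p2, t) for p1, p2, t in pivot_pairs)

pairs = [((0, 0), (2, 0), 1.0), ((0, 0), (0, 2), 1.0)]
code = dbh_code((2, 2), pairs)
```

    At query time, only items whose codes collide with the query's code in some hash table need to be compared exactly, which is what makes the indexing scheme sublinear in collection size.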

    Combining textual and visual information processing for interactive video retrieval: SCHEMA's participation in TRECVID 2004

    In this paper, the two applications based on the Schema Reference System that were developed by the SCHEMA NoE for participation in the search task of TRECVID 2004 are illustrated. The first application, named "Schema-Text", is an interactive retrieval application that employs only textual information, while the second one, named "Schema-XM", is an extension of the former, employing algorithms and methods for combining textual, visual and higher-level information. Two runs were submitted for each application: I_A_2_SCHEMA-Text_3 and I_A_2_SCHEMA-Text_4 for Schema-Text, and I_A_2_SCHEMA-XM_1 and I_A_2_SCHEMA-XM_2 for Schema-XM. The comparison of these two applications in terms of retrieval efficiency revealed that the combination of information from different data sources can provide higher efficiency for retrieval systems. Experimental testing additionally revealed that initially performing a text-based query and subsequently proceeding with visual similarity search, using one of the returned relevant keyframes as an example image, is a good scheme for combining visual and textual information.

    Overview of the ImageCLEFphoto 2008 photographic retrieval task

    ImageCLEFphoto 2008 is an ad-hoc photo retrieval task and part of the ImageCLEF evaluation campaign. This task provides both the resources and the framework necessary to perform comparative laboratory-style evaluation of visual information retrieval systems. In 2008, the evaluation task concentrated on promoting diversity within the top 20 results from a multilingual image collection. This new challenge attracted a record number of submissions: a total of 24 participating groups submitting 1,042 system runs. Among the findings are that the choice of annotation language has an almost negligible effect, and that the best runs combine concept- and content-based retrieval methods.

    An Efficient Gabor Walsh-Hadamard Transform Based Approach for Retrieving Brain Tumor Images from MRI

    Brain tumors are a serious, life-threatening disease. Discovering an appropriate brain tumor image in a magnetic resonance imaging (MRI) archive is a challenging job for the radiologist. Most search engines retrieve images on the basis of traditional text-based approaches. The main challenge in MRI image analysis is the gap between the low-level visual information captured by the MRI machine and the high-level information identified by the human assessor. This semantic gap is addressed in this study by designing a new feature extraction technique. In this paper, we introduce a Content-Based Medical Image Retrieval (CBMIR) system for retrieving brain tumor images from large archives. Firstly, we remove noise from the MRI images using several filtering techniques. Afterward, we design a feature extraction scheme combining the Gabor filtering technique (which focuses on specific frequency content in image regions) and the Walsh-Hadamard transform (WHT) (a computationally simple transform for image decomposition) to discover representative features in the MRI images. After that, to retrieve accurate and reliable images, we employ Fuzzy C-Means clustering with the Minkowski distance metric to evaluate the similarity between the query image and the database images. The proposed methodology was tested on a publicly available brain tumor MRI image database. The experimental results demonstrate that our proposed approach outperforms most existing techniques, such as Gabor, wavelet, and Hough transform features, in detecting brain tumors, and also takes less time. The proposed approach will help radiologists and technologists build an automatic decision support system that produces reproducible and objective results with high accuracy.
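    The retrieval step above ranks database images by Minkowski distance between feature vectors, with smaller distances meaning greater similarity. A minimal sketch of that ranking step, assuming features have already been extracted (the order p, function names, and top-k cutoff are illustrative; the Gabor/WHT extraction and fuzzy clustering are not shown):

```python
def minkowski(u, v, p=3):
    """Minkowski distance of order p between two feature vectors.
    p=1 is Manhattan distance, p=2 is Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p)

def retrieve(query_feats, database, p=3, top_k=5):
    """Rank (image_id, feature_vector) entries by Minkowski distance
    to the query features; smaller distance = more similar."""
    ranked = sorted(database, key=lambda item: minkowski(query_feats, item[1], p))
    return [img_id for img_id, _ in ranked[:top_k]]

db = [("scan_a", [1.0, 0.0]), ("scan_b", [5.0, 5.0]), ("scan_c", [0.0, 0.5])]
top = retrieve([0.0, 0.0], db, p=2, top_k=2)
```

    In a clustered index, the same distance would first select the nearest cluster centre and then rank only that cluster's members, which avoids scanning the whole archive.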

    Overview of the 2005 cross-language image retrieval track (ImageCLEF)

    The purpose of this paper is to outline efforts from the 2005 CLEF cross-language image retrieval campaign (ImageCLEF). The aim of this CLEF track is to explore the use of both text- and content-based retrieval methods for cross-language image retrieval. Four tasks were offered in the ImageCLEF track: ad-hoc retrieval from a historic photographic collection, ad-hoc retrieval from a medical collection, an automatic image annotation task, and a user-centered (interactive) evaluation task that is explained in the iCLEF summary. 24 research groups from a variety of backgrounds and nationalities (14 countries) participated in ImageCLEF. In this paper we describe the ImageCLEF tasks, submissions from participating groups, and summarise the main findings.

    Dublin City University at CLEF 2005: Experiments with the ImageCLEF St Andrew’s collection

    The aim of the Dublin City University participation in the CLEF 2005 ImageCLEF St Andrew's Collection task was to explore an alternative approach to exploiting text annotation and content-based retrieval in a novel combined way for pseudo relevance feedback (PRF). This method combines evidence from retrieved lists generated using text-based and content-based retrieval to determine which documents will be assumed relevant for the PRF process. Unfortunately, the results show that while standard text-based PRF improves upon a no-feedback text baseline, at present our new approach to combining evidence from text-based and content-based retrieval does not give further improvement.
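    The abstract does not spell out the exact evidence-combination rule, but one simple instance of the idea is to assume relevant for PRF only those documents that appear near the top of both the text-based and content-based result lists. A minimal intersection-vote sketch under that assumption (function name, depth, and voting rule are all illustrative):

```python
def prf_candidates(text_ranked, visual_ranked, depth=20, min_lists=2):
    """Assume relevant for PRF the documents appearing in the top
    `depth` of at least `min_lists` of the two runs. With the default
    min_lists=2 this is the intersection of the two top-`depth` sets."""
    text_top = set(text_ranked[:depth])
    visual_top = set(visual_ranked[:depth])
    votes = {doc: (doc in text_top) + (doc in visual_top)
             for doc in text_top | visual_top}
    return [doc for doc, v in votes.items() if v >= min_lists]
```

    The candidates returned here would then feed a standard PRF step, e.g. expanding the text query with terms drawn from the assumed-relevant annotations; lowering `min_lists` to 1 falls back to the union, trading precision of the feedback set for coverage.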