55 research outputs found

    Concept-based video search with the PicSOM multimedia retrieval system


    Content-based search of hierarchical objects with PicSOM

    The amount of multimedia content available to the public has been increasing rapidly over the last decades and is expected to grow exponentially in the years to come. This development puts an increasing emphasis on automated content-based information retrieval (CBIR) methods, which index and retrieve multimedia based on its contents. Such methods can automatically process huge amounts of data without the human intervention required by traditional methods (e.g. manual categorisation or entering of keywords). Unfortunately, CBIR methods suffer from a serious problem: the so-called semantic gap between the low-level descriptions used by computer systems and the high-level concepts of humans. However, by emulating human skills such as understanding the contexts and relationships of multimedia objects, one may be able to bridge the semantic gap. To this end, this thesis proposes a method that combines hierarchical objects with relevance sharing. The proposed method can incorporate natural relationships between multimedia objects and exploit them in the retrieval process, potentially improving retrieval accuracy considerably. The literature survey part of the thesis reviews content-based information retrieval in general and also examines multimodal fusion in CBIR systems and how it has previously been implemented in different scenarios. The work performed for this thesis includes the implementation of hierarchical objects and multimodal relevance sharing in the PicSOM CBIR system. Extensive experiments with different kinds of multimedia and other hierarchical objects (segmented images, web-link structures and video retrieval) were also performed to evaluate the usefulness of the hierarchical-objects paradigm.
    Keywords: content-based retrieval, self-organizing map, multimedia database
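
    The core idea, relevance feedback on one object being shared along the hierarchy with its parents and children, can be shown in a short sketch. This is a toy illustration, not the PicSOM implementation: the MediaObject class, the up/down damping weights and the example hierarchy are all invented here.

```python
# Minimal sketch of relevance sharing over hierarchical multimedia objects.
# Illustrative toy, not the PicSOM implementation: the object graph,
# damping weights and propagation rule below are assumptions.

class MediaObject:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.relevance = 0.0
        if parent is not None:
            parent.children.append(self)

def share_relevance(obj, score, up=0.5, down=0.5):
    """Assign feedback to one object and share a damped portion of it
    with its parent (e.g. the video a keyframe belongs to) and its
    children (e.g. the segments of an image)."""
    obj.relevance += score
    if obj.parent is not None:
        obj.parent.relevance += up * score
    for child in obj.children:
        child.relevance += down * score

# Example: a video, its keyframe, and two segments of that keyframe.
video = MediaObject("video")
keyframe = MediaObject("keyframe", parent=video)
seg_a = MediaObject("segment-a", parent=keyframe)
seg_b = MediaObject("segment-b", parent=keyframe)

share_relevance(keyframe, 1.0)   # user marks the keyframe relevant
for o in (video, keyframe, seg_a, seg_b):
    print(f"{o.name}: {o.relevance:.2f}")
```

    Keeping the sharing weights below one ensures that directly marked objects stay ranked above objects that only inherit relevance from their relatives.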

    Measuring concept similarities in multimedia ontologies: analysis and evaluations

    The recent development of large-scale multimedia concept ontologies has provided new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been studied extensively, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing.
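
    As a concrete, minimal example of an annotation-based concept similarity of the kind surveyed above (not necessarily one of the paper's exact measures), the sketch below scores two concepts by the Jaccard overlap of the sets of shots annotated with them; the concept names and annotations are made up.

```python
# One simple annotation-based similarity: Jaccard overlap of the sets of
# video shots annotated with each concept. Concepts and shot IDs here are
# hypothetical and exist only for illustration.

def jaccard(a, b):
    """|A ∩ B| / |A ∪ B| over the sets of shots annotated with each concept."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical shot-level annotations: concept -> shots it appears in.
annotations = {
    "outdoor": {1, 2, 3, 5, 8},
    "sky":     {1, 2, 5, 8, 9},
    "indoor":  {4, 6, 7},
}

for c1 in annotations:
    for c2 in annotations:
        if c1 < c2:   # print each unordered pair once
            sim = jaccard(annotations[c1], annotations[c2])
            print(f"sim({c1}, {c2}) = {sim:.2f}")
```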

    Content-Based Image Retrieval Using Self-Organizing Maps


    So what can we actually do with content-based video retrieval?

    In this talk I will give a roller-coaster survey of the state of the art in automatic video analysis, indexing, summarisation, search and browsing, as demonstrated in the annual TRECVid benchmarking evaluation campaign. I will concentrate on content-based techniques for video management, which complement the dominant paradigm of metadata- or tag-based video management, and I will use example techniques to illustrate these.

    Evaluation of pointer click relevance feedback in PicSOM: deliverable D1.2 of FP7 project nº 216529 PinView

    This report presents the results of a series of experiments in which knowledge of the most relevant part of images is given as additional information to a content-based image retrieval system. The most relevant parts have been identified by search-task-dependent pointer clicks on the images. As such, they provide a rudimentary form of explicit enriched relevance feedback and to some extent mimic the genuine implicit eye-movement measurements that are an essential ingredient of the PinView project.
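
    One way a pointer click could be turned into region-level feedback is to compute the query feature only from a window around the clicked point, so the most relevant part of the image dominates the feature. The sketch below illustrates that idea under assumptions made here, not PicSOM's actual mechanism: the window size, histogram binning and the use of a plain colour histogram are all illustrative choices.

```python
import numpy as np

# Sketch: restrict feature extraction to a window around a pointer click,
# a crude form of region-based relevance feedback. Window size and binning
# are illustrative assumptions.

def click_region_histogram(image, x, y, half=32, bins=8):
    """image: H x W x 3 uint8 array; (x, y): click position in pixels.
    Returns a normalised colour histogram of the clicked region."""
    h, w, _ = image.shape
    y0, y1 = max(0, y - half), min(h, y + half)
    x0, x1 = max(0, x - half), min(w, x + half)
    patch = image[y0:y1, x0:x1].reshape(-1, 3)
    hist, _ = np.histogramdd(patch, bins=(bins,) * 3,
                             range=((0, 256),) * 3)
    hist = hist.ravel()
    return hist / hist.sum()  # normalise so window size does not matter

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(240, 320, 3), dtype=np.uint8)
feature = click_region_histogram(img, x=160, y=120)
print(feature.shape, feature.sum())  # (512,) 1.0
```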

    Discriminative learning with application to interactive facial image retrieval

    The number of digital images is growing drastically, and advanced tools for searching large image collections are therefore urgently needed. Content-based image retrieval is advantageous for such a task, as it extracts and indexes features automatically, without the human labor and subjectivity involved in manual image annotation. The semantic gap between high-level semantics and low-level visual features can be reduced by the relevance feedback technique. However, most existing interactive content-based image retrieval (ICBIR) systems require a substantial amount of human evaluation labor, which leads to the evaluation fatigue problem and heavily restricts the application of ICBIR. This thesis presents a solution based on discriminative learning. It extends an existing ICBIR system, PicSOM, towards practical applications. The enhanced ICBIR system allows users to input partial relevance, which includes not only the extent of relevance but also the reason for it. A multi-phase retrieval with partial relevance can adapt to the user's search intention in a coarse-to-fine manner. Retrieval performance can be improved by employing supervised learning as a preprocessing step before unsupervised content-based indexing. In this work, Parzen Discriminant Analysis (PDA) is proposed to extract discriminative components from images. PDA regularizes the Informative Discriminant Analysis (IDA) objective and comes with a greatly accelerated optimization algorithm. Moreover, discriminative self-organizing maps trained on the resulting features can easily handle fuzzy categorizations. The proposed techniques have been applied to interactive facial image retrieval. Both a query example and a benchmark simulation study are presented, indicating that the first image depicting the target subject can be retrieved in a small number of rounds.
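
    The thesis's Parzen Discriminant Analysis is not reproduced here. As a hedged stand-in for the general recipe, supervised feature extraction followed by unsupervised SOM indexing, the sketch below projects labelled vectors with ordinary linear discriminant analysis (a swapped-in substitute for PDA) and then trains a tiny self-organizing map on the projection; all data and sizes are synthetic toys.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Stand-in for "supervised learning before unsupervised indexing":
# LDA (substituting for PDA) projects labelled vectors, then a minimal
# self-organizing map is fitted on the projected data. Everything is toy.

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 64))            # 300 images, 64-d raw features
y = rng.integers(0, 3, size=300)          # 3 hypothetical subject identities
X[y == 1] += 2.0                          # make the classes separable
X[y == 2] -= 2.0

Z = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

# Minimal 4x4 SOM trained on the discriminative 2-d projection.
grid = np.stack(np.meshgrid(np.arange(4), np.arange(4)), axis=-1).reshape(-1, 2)
W = rng.normal(size=(16, 2))              # one prototype per map unit
for t in range(2000):
    lr = 0.5 * (1 - t / 2000)             # decaying learning rate
    sigma = 2.0 * (1 - t / 2000) + 0.5    # decaying neighbourhood width
    z = Z[rng.integers(len(Z))]
    bmu = np.argmin(((W - z) ** 2).sum(axis=1))    # best-matching unit
    d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)     # grid distances to BMU
    W += lr * np.exp(-d2 / (2 * sigma ** 2))[:, None] * (z - W)

print("map units:\n", W.round(2))
```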

    Object identification within images

    Master's thesis in Computer and Telematics Engineering. The growth of digital content stored in large databases has increased the importance of efficient algorithms for information retrieval. These algorithms are usually based on keywords, which do not work well for image retrieval, since “images are beyond words”. To improve image retrieval it is necessary to analyze the contents of each image. This work proposes a system that first obtains, from all the images in a database, a subset of high-quality images for further processing; this method is based on analyzing each image's histogram and the distribution of the edges of the objects it contains. Features are then extracted from each image in the resulting set to identify it; this step relies on image segmentation and feature classification with a neural network. To test the efficiency of the method, each image's features are compared with those of all the others, and a list of images ordered by decreasing similarity is returned for each one. Our results show that the system can produce better results than some existing systems.
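
    The first filtering step described above, scoring images by their histogram and edge content, might look roughly like the following sketch. The entropy and gradient measures and the thresholds are illustrative assumptions, not the thesis's actual criteria.

```python
import numpy as np

# Sketch of quality filtering by histogram and edge analysis: score each
# greyscale image by histogram entropy and by edge density (a simple
# gradient magnitude standing in for a proper edge detector) and keep
# images that score well on both. Thresholds are illustrative.

def histogram_entropy(img, bins=64):
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def edge_density(img, thresh=30.0):
    gy, gx = np.gradient(img.astype(float))
    return (np.hypot(gx, gy) > thresh).mean()

def keep(img, min_entropy=4.0, min_edges=0.02):
    """Keep only images with both varied intensities and visible edges."""
    return histogram_entropy(img) >= min_entropy and edge_density(img) >= min_edges

rng = np.random.default_rng(2)
flat = np.full((128, 128), 128, dtype=np.uint8)               # featureless image
textured = rng.integers(0, 256, (128, 128), dtype=np.uint8)   # busy image
print(keep(flat), keep(textured))   # False True
```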

    Video-4-Video: using video for searching, classifying and summarising video

    YouTube has meant that we are now accustomed to searching for video clips, and finding them, for both work and leisure pursuits. But YouTube, like the Internet Archive, OpenVideo and almost every other video library, doesn't use video to find video; it uses metadata, usually based on user-generated content (UGC). But what if we don't know what we're looking for, the metadata doesn't help, or we have poor metadata or no UGC: can we use the video itself to find video? Can we automatically derive semantic concepts directly from video which we can use for retrieval or summarisation? Many dozens of research groups throughout the world work on the problems associated with content-based video search, content-based detection of semantic concepts, shot boundary detection, content-based summarisation and content-based event detection. In this presentation we give a summary of the achievements of almost a decade of research by the TRECVid community, including a report on the performance of groups in the different TRECVid tasks. We present the modus operandi of the annual TRECVid benchmarking, the problems associated with running an annual evaluation for nearly 100 research groups every year, and an overview of the most successful approaches to each task.