83,868 research outputs found

    Post processing of multimedia information - concepts, problems, and techniques

    Currently, most research on multimedia information processing focuses on multimedia information storage and retrieval, especially indexing and content-based access. We argue that multimedia information processing should include one more level: post-processing. Here "post-processing" means further processing of retrieved multimedia information, which includes fusion of multimedia information and reasoning with multimedia information to reach new conclusions. In this paper, the three levels of multimedia information processing (storage, retrieval, and post-processing) are discussed. The concepts and problems of multimedia information post-processing are identified, and potential techniques that can be used in post-processing are suggested. By highlighting these problems, we hope this paper will stimulate further research on this important but neglected topic.

    An affect-based video retrieval system with open vocabulary querying

    Content-based video retrieval (CBVR) systems are creating new search and browse capabilities using metadata describing significant features of the data. An often overlooked aspect of human interpretation of multimedia data is the affective dimension. Incorporating affective information into multimedia metadata can potentially enable search using this alternative interpretation of multimedia content. Recent work has described methods to automatically assign affective labels to multimedia data using various approaches. However, the subjective and imprecise nature of affective labels makes it difficult to bridge the semantic gap between system-detected labels and user expression of information requirements in multimedia retrieval. We present a novel affect-based video retrieval system incorporating an open-vocabulary query stage based on WordNet, enabling search using an unrestricted query vocabulary. The system automatically annotates video data with labels drawn from well-defined affective terms. At retrieval time, annotated documents are ranked against open-vocabulary text queries using the standard Okapi retrieval model. We present experimental results examining the behaviour of the system for retrieval from a collection of automatically annotated feature films of different genres. Our results indicate that affective annotation can potentially provide a useful augmentation to more traditional objective content description in multimedia retrieval.
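    The ranking step described in this abstract can be illustrated with a minimal sketch, assuming the common BM25 form of the Okapi model with typical parameter values (k1 = 1.2, b = 0.75). The abstract does not give the system's actual formula, parameters, or expansion procedure, so `bm25_scores`, `expand_query`, and the toy synonym table (a stand-in for a WordNet lookup) are all hypothetical.

```python
import math
from collections import Counter


def expand_query(query_terms, synonyms):
    """Map open-vocabulary query words onto annotation labels.

    `synonyms` is a hypothetical stand-in for a WordNet lookup.
    """
    expanded = list(query_terms)
    for term in query_terms:
        expanded.extend(synonyms.get(term, []))
    return expanded


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each document (a list of label tokens) with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy "films", each annotated with affective labels (invented data).
films = [
    ["fear", "fear", "tension"],
    ["joy", "calm"],
    ["sadness", "fear"],
]
query = expand_query(["scared"], {"scared": ["fear"]})
scores = bm25_scores(query, films)
```

    In this toy run the film with two "fear" annotations scores highest and the film with none scores zero, which is the behaviour one would expect from a term-frequency-based ranking over affective labels.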

    Language-based multimedia information retrieval

    This paper describes various methods and approaches for language-based multimedia information retrieval, which have been developed in the projects POP-EYE and OLIVE and which will be developed further in the MUMIS project. All of these projects aim at supporting automated indexing of video material by use of human language technologies. Thus, in contrast to image or sound-based retrieval methods, where both the query language and the indexing methods build on non-linguistic data, these methods attempt to exploit advanced text retrieval technologies for the retrieval of non-textual material. While POP-EYE built on subtitles or captions as the prime language key for disclosing video fragments, OLIVE makes use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which then serve as the basis for text-based retrieval functionality.

    Unsupervised Visual and Textual Information Fusion in Multimedia Retrieval - A Graph-based Point of View

    Multimedia collections are growing more than ever in size and diversity. Effective multimedia retrieval systems are thus critical for accessing these datasets from the end-user perspective and in a scalable way. We are interested in repositories of image/text multimedia objects, and we study multimodal information fusion techniques in the context of content-based multimedia information retrieval. We focus on graph-based methods, which have proven to provide state-of-the-art performance. We particularly examine two such methods: cross-media similarities and random-walk-based scores. From a theoretical viewpoint, we propose a unifying graph-based framework which encompasses the two aforementioned approaches. Our proposal allows us to highlight the core features one should consider when using a graph-based technique for the combination of visual and textual information. We compare cross-media and random-walk-based results using three different real-world datasets. From a practical standpoint, our extended empirical analysis allows us to provide insights and guidelines about the use of graph-based methods for multimodal information fusion in content-based multimedia information retrieval. Comment: An extended version of the paper "Visual and Textual Information Fusion in Multimedia Retrieval using Semantic Filtering and Graph based Methods", by J. Ah-Pine, G. Csurka and S. Clinchant, submitted to ACM Transactions on Information Systems.

    Large scale evaluations of multimedia information retrieval: the TRECVid experience

    Information Retrieval is a supporting technique which underpins a broad range of content-based applications including retrieval, filtering, summarisation, browsing, classification, clustering, automatic linking, and others. Multimedia information retrieval (MMIR) covers those applications when applied to multimedia information such as image, video, music, etc. In this presentation and extended abstract we are primarily concerned with MMIR as applied to information in digital video format. We begin with a brief overview of large scale evaluations of IR tasks in areas such as text, image and music, to illustrate that this phenomenon is not restricted to MMIR on video. The main contribution, however, is a set of pointers and a summarisation of the work done as part of TRECVid, the annual benchmarking exercise for video retrieval tasks.

    Abductive retrieval for multimedia information seeking

    In this paper we discuss an approach to the retrieval of data annotated using the MPEG-7 multimedia description schema. In particular, we describe a framework for the retrieval of annotated video samples that is based on principles from the area of abductive reasoning.

    The relationship between IR and multimedia databases

    Modern extensible database systems support multimedia data through ADTs. However, because of the problems with multimedia query formulation, this support is not sufficient. Multimedia querying requires an iterative search process involving many different representations of the objects in the database. The support that is needed is very similar to the processes in information retrieval. Based on this observation, we develop the miRRor architecture for multimedia query processing. We design a layered framework based on information retrieval techniques, to provide a usable query interface to the multimedia database. First, we introduce a concept layer to enable reasoning over low-level concepts in the database. Second, we add an evidential reasoning layer as an intermediate between the user and the concept layer. Third, we add the functionality to process the users' relevance feedback. We then adapt the inference network model from text retrieval to an evidential reasoning model for multimedia query processing. We conclude with an outline for implementation of miRRor on top of the Monet extensible database system.

    Information extraction from multimedia web documents: an open-source platform and testbed

    The LivingKnowledge project aimed to enhance the current state of the art in search, retrieval and knowledge management on the web by advancing the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques has been integrated into a single but extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents, and unlike earlier platforms, it exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software in the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system and describes two applications that utilise the system for multimedia information retrieval.

    Multimedia Information Retrieval

    With recent advances in screen and mass storage technology, together with the on-going advances in computer power, many users of personal computers and low end workstations are now regularly manipulating non-textual information. This information may be in the form of, for example, drawings, graphs, animations, sound, or video. Despite the increased usage of these media on computer systems, there has not been much work on the provision of access methods to non-textual computer-based information. An increasingly common method for accessing large document bases of textual information is free text retrieval. In such systems users typically enter natural language queries. These are then matched against the textual documents in the system. It is often possible for the user to re-formulate a query by providing relevance feedback; this usually takes the form of the user informing the system that certain documents are indeed relevant to the current search. This information, together with the original query, is then used by the retrieval engine to provide an improved list of matched documents. Although free text retrieval provides reasonably effective access to large document bases, it does not provide easy access to non-textual information. Various query-based access methods to non-textual document bases are presented, but these are all restricted to specific domains and cannot be used in mixed media systems. Hypermedia, on the other hand, is an access method for document bases which is based on the user browsing through the document base rather than issuing queries. A set of interconnected paths is constructed through the base which the user may follow. Although it provides poorer access to large document bases, the browsing approach does provide very natural access to non-textual information. The recent explosion in hypermedia systems and discussion has been partly due to the requirement for access to mixed media document bases.
    Some work is reported which presents an integration of free text retrieval based queries with hypermedia. This provides a solution to the scaling problem of browsing-based systems: these systems provide access to textual nodes by query or by browsing. Non-textual nodes are, however, still only accessible by browsing, either from the starting point of the document base or from a textual document which matched the query. A model of retrieval for non-textual documents is developed; this model is based on the document's context within the hypermedia document base, as opposed to the document's content. If a non-textual document is connected to several textual documents by paths in the hypermedia, then it is likely that the non-textual document will match the query whenever a high enough proportion of the textual documents match. This model of retrieval uses clustering techniques to calculate a descriptor for non-textual nodes so that they may be retrieved directly in response to a query. To establish that this model of retrieval for non-textual documents is worthwhile, an experiment was run which used the text-only CACM collection. Each record within the collection was initially treated as if it were non-textual and had a cluster-based description calculated from its citations. This cluster-based descriptor was then compared with the actual descriptor (calculated from the record's content) to establish how accurate the cluster descriptor was. As a base case the experiment was repeated using randomly created links, as opposed to citations. The results showed that for citation-based links the cluster-based descriptions had a mean correlation of 0.230 with the content-based description (on a range from 0 to 1, where 1 represents a perfect match) and performed approximately six times better than when random links were used (mean random correlation was 0.037).
    This shows that citation-based cluster descriptions of documents are significantly closer to the actual descriptions than descriptions built from random links, and although the correlation is quite low, the cluster approach provides a useful technique for describing documents. The model of retrieval presented for non-textual documents relies upon a hypermedia structure existing in the document base, since the model cannot work if the documents are not linked together. A user interface to a document base which gives access to a retrieval engine and to hypermedia links can be based around three main categories: browsing-only access, using the retrieval engine to support link creation; query-only access, using links to provide access to non-text; and query and browsing access. Although the last user interface may initially appear most suitable for a document base which can support queries and browsing, it is also potentially the most complex interface, and may require a more complex model of retrieval for users to successfully search the document base. A set of user tests was carried out to establish user behaviour and to consider interface issues concerning easy access to documents which are held on such document bases. These tests showed that, overall, no access method was clearly better or poorer than any other method. (Abstract shortened by ProQuest.)
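    The context-based descriptor idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the clustering technique or the correlation measure, so the sketch simply sums the term counts of linked textual documents and compares descriptors with cosine similarity (which also ranges from 0 to 1 for non-negative weights). The function names and the three toy records are hypothetical.

```python
import math
from collections import Counter


def cluster_descriptor(node, links, descriptors):
    """Build a descriptor for `node` from the documents it is linked to.

    Assumption: linked documents' term counts are simply summed.
    """
    combined = Counter()
    for neighbour in links.get(node, []):
        combined.update(descriptors[neighbour])
    return combined


def cosine(a, b):
    """Cosine similarity between two term-weight mappings."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy records with content-based descriptors (invented data).
descriptors = {
    "a": Counter(["retrieval", "query"]),
    "b": Counter(["retrieval", "hypermedia"]),
    "c": Counter(["query", "browsing"]),
}
# Record "a" is linked to "b" and "c", e.g. by citation.
links = {"a": ["b", "c"]}

# Treat "a" as if it were non-textual: describe it by its context only,
# then check how close that is to its real content-based descriptor.
context_only = cluster_descriptor("a", links, descriptors)
similarity = cosine(context_only, descriptors["a"])
```

    Repeating the comparison with randomly assigned links in place of `links` would give the kind of baseline the abstract describes for the CACM experiment.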