15 research outputs found

    Automated annotation of multimedia audio data with affective labels for information management

    The emergence of digital multimedia systems is creating many new opportunities for rapid access to huge content archives. In order to fully exploit these information sources, the content must be annotated with significant features. An important aspect of human interpretation of multimedia data, which is often overlooked, is the affective dimension. Such information is a potentially useful component for content-based classification and retrieval. Much of the affective information of multimedia content is contained within the audio data stream. Emotional features can be defined in terms of arousal and valence levels. In this study low-level audio features are extracted to calculate arousal and valence levels of multimedia audio streams. These are then mapped onto a set of keywords with predetermined emotional interpretations. Experimental results illustrate the use of this system to assign affective annotation to multimedia data
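    The mapping from arousal and valence levels to keywords described above can be sketched as a nearest-neighbour lookup. This is only an illustration: the keyword list, its coordinates, and the function name `nearest_affect_keyword` are assumptions and are not taken from the study.

```python
import math

# Hypothetical lexicon: keyword -> (valence, arousal), each in [-1, 1].
# The coordinates are illustrative assumptions, not values from the study.
AFFECT_KEYWORDS = {
    "excited": (0.7, 0.8),
    "calm": (0.6, -0.6),
    "angry": (-0.7, 0.7),
    "sad": (-0.6, -0.5),
}


def nearest_affect_keyword(valence, arousal, lexicon=AFFECT_KEYWORDS):
    """Return the keyword whose (valence, arousal) point is closest (Euclidean)."""
    point = (valence, arousal)
    return min(lexicon, key=lambda word: math.dist(lexicon[word], point))
```

    A call such as `nearest_affect_keyword(0.8, 0.9)` picks the keyword closest to a high-valence, high-arousal segment; how the levels are computed from low-level audio features is outside this sketch.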

    Measuring the impact of temporal context on video retrieval

    In this paper we describe the findings from the K-Space interactive video search experiments in TRECVid 2007, which examined the effects of including temporal context in video retrieval. The traditional approach to presenting video search results is to maximise recall by offering a user as many potentially relevant shots as possible within a limited amount of time. ‘Context’-oriented systems opt to allocate a portion of the results presentation space to providing additional contextual cues about the returned results. In video retrieval these cues often include temporal information such as a shot’s location within the overall video broadcast and/or its neighbouring shots. We developed two interfaces with identical retrieval functionality in order to measure the effects of such context on user performance. The first system had a ‘recall-oriented’ interface, where results from a query were presented as a ranked list of shots. The second was ‘context-oriented’, with results presented as a ranked list of broadcasts. Ten users participated in the experiments, of whom 8 were novices and 2 were experts. Participants completed a number of retrieval topics using both the recall-oriented and context-oriented systems

    Evaluation campaigns and TRECVid

    The TREC Video Retrieval Evaluation (TRECVid) is an international benchmarking activity to encourage research in video information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. TRECVid completed its fifth annual cycle at the end of 2005 and in 2006 TRECVid will involve almost 70 research organizations, universities and other consortia. Throughout its existence, TRECVid has benchmarked both interactive and automatic/manual searching for shots from within a video corpus, automatic detection of a variety of semantic and low-level video features, shot boundary detection and the detection of story boundaries in broadcast TV news. This paper will give an introduction to information retrieval (IR) evaluation from both a user and a system perspective, highlighting that system evaluation is by far the most prevalent type of evaluation carried out. We also include a summary of TRECVid as an example of a system evaluation benchmarking campaign and this allows us to discuss whether such campaigns are a good thing or a bad thing. There are arguments for and against these campaigns and we present some of them in the paper concluding that on balance they have had a very positive impact on research progress

    Learning the semantics of multimedia queries and concepts from a small number of examples


    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

    Image Annotation and Topic Extraction Using Super-Word Latent Dirichlet

    This research presents a multi-domain solution that uses text and images to iteratively improve automated information extraction. Stage I uses local text surrounding an embedded image to provide clues that help rank-order possible image annotations. These annotations are forwarded to Stage II, where the image annotations from Stage I are used as highly-relevant super-words to improve extraction of topics. The model probabilities from the super-words in Stage II are forwarded to Stage III where they are used to refine the automated image annotation developed in Stage I. All stages demonstrate improvement over existing equivalent algorithms in the literature

    Affect-based indexing and retrieval of multimedia data

    Digital multimedia systems are creating many new opportunities for rapid access to content archives. In order to explore these collections using search, the content must be annotated with significant features. An important and often overlooked aspect of human interpretation of multimedia data is the affective dimension. The hypothesis of this thesis is that affective labels of content can be extracted automatically from within multimedia data streams, and that these can then be used for content-based retrieval and browsing. A novel system is presented for extracting affective features from video content and mapping it onto a set of keywords with predetermined emotional interpretations. These labels are then used to demonstrate affect-based retrieval on a range of feature films. Because of the subjective nature of the words people use to describe emotions, an approach towards an open vocabulary query system utilizing the electronic lexical database WordNet is also presented. This gives flexibility for search queries to be extended to include keywords without predetermined emotional interpretations using a word-similarity measure. The thesis presents the framework and design for the affect-based indexing and retrieval system along with experiments, analysis, and conclusions

    A Human-Centric Approach to Data Fusion in Post-Disaster Managment: The Development of a Fuzzy Set Theory Based Model

    It is critical to provide an efficient and accurate information system in the post-disaster phase so that individuals can access and obtain the necessary resources in a timely manner; but current map-based post-disaster management systems provide all emergency resource lists without filtering them, which usually leads to high levels of energy consumed in calculation. An effective post-disaster management system (PDMS) will also distribute emergency resources such as hospitals, storage and transportation much more reasonably and be more beneficial to individuals in the post-disaster period. In this dissertation, firstly, a semi-supervised learning (SSL) based graph system was constructed for the PDMS. The graph-based PDMS resource map was converted to a directed graph represented by an adjacency matrix, and decision information is then derived from the PDMS in two ways: a clustering operation and a graph-based semi-supervised optimization process. In this study, the PDMS was applied to emergency resource distribution in the post-disaster (response) phase, and a path optimization algorithm based on ant colony optimization (ACO) was used to minimize cost; simulation results show the effectiveness of the proposed methodology. The analysis compared it with clustering-based algorithms under the improved ACO with the tour improvement algorithm (TIA) and the Min-Max Ant System (MMAS), and the results also show that the SSL-based graph is more effective for calculating the optimized path in the PDMS. This research improved the map by combining the disaster map with the initial GIS-based map, which located the target area considering the influence of the disaster. First, all initial maps and disaster maps undergo a Gaussian transformation while the histograms of all map pictures are acquired. All pictures then undergo a discrete wavelet transform (DWT), and a Gaussian fusion algorithm is applied to the DWT pictures. Second, the inverse DWT (iDWT) was applied to generate a new map for the post-disaster management system. Finally, simulations were carried out, and the results showed the effectiveness of the proposed method compared with other fusion algorithms, such as mean-mean fusion and max-UD fusion, on evaluation indices including entropy, spatial frequency (SF) and image quality index (IQI). A fuzzy set model was proposed to improve the representation capacity of nodes in this GIS-based PDMS
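    Two of the evaluation indices named above, entropy and spatial frequency (SF), can be sketched as follows. This is a minimal illustration using the standard textbook definitions on a greyscale image stored as a list of rows; the function names are assumptions, and the dissertation's exact formulation (and its IQI computation) may differ.

```python
import math
from collections import Counter


def entropy(image):
    """Shannon entropy (in bits) of the grey-level distribution of the image."""
    pixels = [p for row in image for p in row]
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def spatial_frequency(image):
    """SF = sqrt(RF^2 + CF^2), from squared row-wise and column-wise differences."""
    rows, cols = len(image), len(image[0])
    # Sum of squared differences between horizontally adjacent pixels.
    row_sum = sum((image[i][j] - image[i][j - 1]) ** 2
                  for i in range(rows) for j in range(1, cols))
    # Sum of squared differences between vertically adjacent pixels.
    col_sum = sum((image[i][j] - image[i - 1][j]) ** 2
                  for i in range(1, rows) for j in range(cols))
    return math.sqrt((row_sum + col_sum) / (rows * cols))
```

    Under these definitions, a fused map with higher entropy and higher SF is usually read as carrying more information and more detail, which is how such indices are typically used to compare fusion algorithms.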

    An investigation into weighted data fusion for content-based multimedia information retrieval

    Content Based Multimedia Information Retrieval (CBMIR) is characterised by the combination of noisy sources of information which, in unison, are able to achieve strong performance. In this thesis we focus on the combination of ranked results from the independent retrieval experts which comprise a CBMIR system through linearly weighted data fusion. The independent retrieval experts are low-level multimedia features, each of which contains an indexing function and ranking algorithm. This thesis comprises two halves. In the first half, we perform a rigorous empirical investigation into the factors which impact upon performance in linearly weighted data fusion. In the second half, we leverage these findings to create a new class of weight generation algorithms for data fusion which are capable of determining weights at query-time, such that the weights are topic dependent
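    Linearly weighted data fusion of ranked results can be sketched as a weighted sum of normalised scores. This is a generic illustration, not the thesis's algorithm: the min-max normalisation step, the expert names, and the fixed weights are assumptions, and the query-time weight generation the thesis proposes is not shown.

```python
def min_max_normalise(scores):
    """Rescale one expert's scores to [0, 1] so experts are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_fusion(expert_scores, weights):
    """Fuse per-expert scores linearly and return documents ranked best first.

    expert_scores: {expert_name: {doc_id: score}}
    weights:       {expert_name: weight} (hypothetical, fixed per query here)
    """
    fused = {}
    for name, scores in expert_scores.items():
        for doc, score in min_max_normalise(scores).items():
            fused[doc] = fused.get(doc, 0.0) + weights[name] * score
    return sorted(fused, key=lambda doc: fused[doc], reverse=True)
```

    Changing the weights changes the fused ranking, which is why topic-dependent weights chosen at query time can matter.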