7 research outputs found

    UTwente does Rich Speech Retrieval at MediaEval 2011

    This paper describes the participation of the University of Twente team in the Rich Speech Retrieval Task of the MediaEval Benchmark Initiative 2011. The goal of the task is to find entry points to the relevant parts of videos, reducing the browsing effort of searchers. As this is our first participation, our main focus is to create a baseline system that can be improved in the future. We experiment with different evidence sources (ASR and metadata) together with a basic score combination function. We also experiment with different entry points relative to the segments found by the contained evidence.
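The abstract does not say which score combination function was used. As a minimal sketch only, assuming a linear interpolation of per-video scores from the two evidence sources, the idea could look like the following. The names `combine_scores`, `entry_point`, `weight` and `offset` are hypothetical and do not come from the paper.

```python
def combine_scores(asr_scores, meta_scores, weight=0.5):
    """Linearly interpolate scores from two evidence sources (assumed scheme).

    asr_scores and meta_scores map a video id to a retrieval score.
    A video missing from one source contributes 0.0 for that source.
    """
    videos = set(asr_scores) | set(meta_scores)
    combined = {
        v: weight * asr_scores.get(v, 0.0)
        + (1.0 - weight) * meta_scores.get(v, 0.0)
        for v in videos
    }
    # Highest combined score first.
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


def entry_point(segment_start, offset=0.0):
    """Shift the entry point earlier than the matched segment start (hypothetical)."""
    return max(0.0, segment_start - offset)


asr = {"a": 1.0, "b": 0.2}
meta = {"b": 1.0, "c": 0.4}
ranked = combine_scores(asr, meta, weight=0.5)
```

Varying `weight` shifts the balance between transcript and metadata evidence, and `offset` stands in for the different entry points the abstract mentions.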

    UTwente does Brave New Tasks for MediaEval 2012: Searching and Hyperlinking

    In this paper we report our experiments and results for the Brave New Searching and Hyperlinking tasks of the MediaEval Benchmark Initiative 2012. The searching task involves finding target video segments based on a short natural-language sentence query. The hyperlinking task involves finding links from the target video segments to other related video segments in the collection, using a set of anchor segments in the videos that correspond to the textual search queries. To find the starting points in the video, we used only speech transcripts and metadata as evidence sources; however, visual features (e.g., faces, shots and keyframes) might also affect results for a query. We indexed speech transcripts and metadata. The speech transcripts were indexed at both speech-segment level and sentence level to improve the likelihood of finding jump-in points. For linking video segments, we computed the k-nearest neighbours of video segments using Euclidean distance.
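The linking step can be illustrated with a small sketch of k-nearest-neighbour search under Euclidean distance. The abstract does not state how segments were represented, so the fixed-length feature vectors, the segment ids and the function names below are assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def k_nearest(anchor_id, vectors, k=3):
    """Return the ids of the k segments closest to the anchor segment.

    vectors maps a segment id to its feature vector (hypothetical
    representation, e.g. term weights from the transcript).
    """
    anchor = vectors[anchor_id]
    distances = [
        (euclidean(anchor, vec), seg_id)
        for seg_id, vec in vectors.items()
        if seg_id != anchor_id  # never link a segment to itself
    ]
    distances.sort()  # closest first; ties broken by segment id
    return [seg_id for _, seg_id in distances[:k]]


segments = {
    "s1": [0.0, 0.0],
    "s2": [1.0, 0.0],
    "s3": [0.0, 3.0],
    "s4": [5.0, 5.0],
}
links = k_nearest("s1", segments, k=2)
```

This brute-force version compares the anchor with every other segment, which is adequate for a sketch but would likely need an index structure on a large collection.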

    Enabling automatic provenance-based trust assessment of web content


    Towards effective cross-lingual search of user-generated internet speech

    The very rapid growth in user-generated social spoken content on online platforms is creating new challenges for Spoken Content Retrieval (SCR) technologies. There are many potential choices for how to design a robust SCR framework for UGS content, but the current lack of detailed investigation means that the specific challenges are poorly understood, and little or no guidance is available to inform these choices. This thesis investigates the challenges of effective SCR for UGS content, and proposes novel SCR methods designed to cope with them. The work presented in this thesis can be divided into three areas of contribution. The first contribution is a critique of the issues and challenges that influence the effectiveness of searching UGS content in both mono-lingual and cross-lingual settings. The second contribution is the development of an effective Query Expansion (QE) method for UGS. This research reports that the variation in the length, quality and structure of the relevant documents encountered in UGS content can harm the effectiveness of QE techniques across different queries. To address this issue, this work examines the use of Query Performance Prediction (QPP) techniques for improving QE in UGS, and presents a novel framework specifically designed for predicting the effectiveness of QE. Thirdly, this work extends the use of QPP in UGS search to improve cross-lingual search for UGS by predicting translation effectiveness. The thesis proposes novel methods to estimate the quality of translation for cross-lingual UGS search. An empirical evaluation demonstrates the quality of the proposed method on alternative translation outputs extracted from several Machine Translation (MT) systems developed for this task. The research then shows how this framework can be integrated into cross-lingual UGS search to find relevant translations for improved retrieval performance.

    Spoken content retrieval beyond pipeline integration of automatic speech recognition and information retrieval

    The dramatic increase in the creation of multimedia content is leading to the development of large archives in which a substantial amount of the information is in spoken form. Efficient access to this information requires effective spoken content retrieval (SCR) methods. Traditionally, SCR systems have focused on a pipeline integration of two fundamental technologies: transcription using automatic speech recognition (ASR) and search supported by text-based information retrieval (IR). Existing SCR approaches estimate the relevance of a spoken retrieval item based on the lexical overlap between a user's query and the textual transcriptions of the items. However, the speech signal contains other potentially valuable non-lexical information that remains largely unexploited by SCR approaches. In particular, acoustic correlates of speech prosody, which have been shown to be useful for identifying salient words and determining topic changes, have not been exploited by existing SCR approaches. In addition, the temporal nature of multimedia content means that accessing content is a user-intensive, time-consuming process. To minimise user effort in locating relevant content, SCR systems could suggest playback points in retrieved content, indicating the locations where the system believes relevant information may be found. This typically requires adopting a segmentation mechanism for splitting documents into smaller “elements” to be ranked and from which suitable playback points could be selected. Existing segmentation approaches do not generalise well to every possible information need or provide robustness to ASR errors.
This thesis extends SCR beyond the standard ASR and IR pipeline approach by: (i) exploring the utilisation of prosodic information as complementary evidence of topical relevance to enhance current SCR approaches; (ii) determining elements of content that, when retrieved, minimise user search effort and provide increased robustness to ASR errors; and (iii) developing enhanced evaluation measures that could better capture the factors that affect user satisfaction in SCR.

    Making music out of architecture and from-architecture-music-an oddyssey

    These are the documents submitted for the First Review as work-in-progress: the first (longer) and the second (shorter) versions of the PhD research project to date, together with a summary titled The Final Proposal for PhD for First Review September 2019. Please note that the first version is unfinished and needs approximately another 30,000 words, questions answered, some further exploration of points raised in discussion and other relevant points, revision and editing. The second version is ongoing. Please note: the file titled Latest save of Making music out of architecture apparently cannot be viewed in Preview, perhaps due to its size. It can, however, be viewed via Download, in which case please allow some time for this to occur. The other two documents can be viewed in Previe