62 research outputs found

    The search and hyperlinking task at MediaEval 2014

    The Search and Hyperlinking Task at MediaEval 2014 is the third edition of this task. As in previous editions, it consisted of two sub-tasks: (i) answering search queries from a collection of roughly 2700 hours of BBC broadcast TV material, and (ii) linking anchor segments from within the videos to other target segments within the video collection. For MediaEval 2014, both sub-tasks were based on an ad-hoc retrieval scenario, and were evaluated using a pooling procedure across participants' submissions, with crowdsourced relevance assessment using Amazon Mechanical Turk.

    DCU search runs at MediaEval 2014 search and hyperlinking

    We describe Dublin City University (DCU)'s participation in the Search sub-task of the Search and Hyperlinking Task at MediaEval 2014. Exploratory experiments were carried out to investigate the utility of prosodic prominence features in the task of retrieving relevant video segments from a collection of BBC videos. Normalised acoustic correlates of loudness, pitch, and duration were incorporated in a standard TF-IDF weighting scheme to increase weights for terms that were prominent in speech. Prosodic models outperformed a text-based TF-IDF baseline on the training set but failed to surpass the baseline on the test set.
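
One way to picture the prosodic weighting described in this abstract is the sketch below. It is an illustration only, not the DCU system: the abstract does not give the formula, so the multiplicative boost `1 + alpha * prominence`, the function name `prosodic_tfidf`, the parameter `alpha`, and the assumption that each term occurrence carries a prominence score already normalised to [0, 1] are all hypothetical.

```python
import math
from collections import Counter


def prosodic_tfidf(segments, alpha=0.5):
    """Return one {term: weight} dict per segment.

    segments: list of segments, each a list of (term, prominence) pairs,
    where prominence is an assumed, pre-normalised score in [0, 1].
    alpha: hypothetical knob controlling how much prominence boosts a term.
    """
    n = len(segments)

    # Document frequency: number of segments containing each term.
    df = Counter()
    for seg in segments:
        df.update({term for term, _ in seg})

    weighted = []
    for seg in segments:
        # Prominence-boosted term frequency: each occurrence counts as
        # 1 + alpha * prominence instead of a plain 1.
        tf = Counter()
        for term, prominence in seg:
            tf[term] += 1.0 + alpha * prominence
        # Standard log-scaled inverse document frequency.
        weighted.append({t: f * math.log(n / df[t]) for t, f in tf.items()})
    return weighted
```

With `alpha=0` this reduces to a plain TF-IDF weighting, which gives a convenient text-only baseline for comparison.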

    Hierarchical Topic Models for Language-based Video Hyperlinking

    We investigate video hyperlinking based on speech transcripts, leveraging a hierarchical topical structure to address two essential aspects of hyperlinking, namely, serendipity control and link justification. We propose and compare different approaches exploiting a hierarchy of topic models as an intermediate representation to compare the transcripts of video segments. These hierarchical representations offer a basis to characterize the hyperlinks, thanks to the knowledge of the topics that contributed to the creation of the links, and to control serendipity by choosing to give more weight to either general or specific topics. Experiments are performed on BBC videos from the Search and Hyperlinking task at MediaEval. Link precisions similar to those of direct text comparison are achieved, while exhibiting different targets along with a potential control of serendipity.

    Investigating domain-independent NLP techniques for precise target selection in video hyperlinking

    Automatic generation of hyperlinks in multimedia video data is a subject of growing interest, as demonstrated by recent work undertaken in the framework of the Search and Hyperlinking task within the MediaEval benchmark initiative. In this paper, we compare NLP-based strategies for precise target selection in video hyperlinking exploiting speech material, with the goal of providing hyperlinks from a specified anchor to help information retrieval. We experimentally compare two approaches that select short portions of videos which are relevant and possibly complementary with respect to the anchor. The first approach exploits a bipartite graph relating utterances and words to find the most relevant utterances. The second one uses explicit topic segmentation, whether hierarchical or not, to select the target segments. Experimental results are reported on the MediaEval 2013 Search and Hyperlinking dataset, which consists of BBC videos, demonstrating the interest of hierarchical topic segmentation for precise target selection.

    IRISA and KUL at MediaEval 2014: Search and Hyperlinking Task

    This paper presents our approach and results in the hyperlinking sub-task at MediaEval 2014. A two-step approach is implemented: relying on a topic segmentation technique, the first step consists of generating potential target segments; then, for each anchor, the best 20 target segments are selected according to two distinct strategies: the first one focuses on the identification of very similar targets using n-grams and named entities; the second one makes use of an intermediate structure built from topic models, which offers the possibility to control serendipity and to explain the links created.

    SAVA at MediaEval 2015: search and anchoring in video archives

    The Search and Anchoring in Video Archives (SAVA) task at MediaEval 2015 consists of two sub-tasks: (i) search for multimedia content within a video archive using multimodal queries referring to information contained in the audio and visual streams/content, and (ii) automatic selection of video segments within a list of videos that can be used as anchors for further hyperlinking within the archive. The task used a collection of roughly 2700 hours of BBC broadcast TV material for the former sub-task, and about 70 files taken from this collection for the latter sub-task. The search sub-task is based on an ad-hoc retrieval scenario, and is evaluated using a pooling procedure across participants' submissions, with crowdsourced relevance assessment using Amazon Mechanical Turk (MTurk). The evaluation used metrics that are variations of MAP adjusted for this task. For the anchor selection sub-task, overlapping regions of interest across participants' submissions were assessed by MTurk workers, and mean reciprocal rank (MRR), precision and recall were calculated for evaluation.
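
For readers unfamiliar with the anchor-selection metrics named here, the sketch below gives the textbook definitions of reciprocal rank, mean reciprocal rank, precision and recall over ranked lists of item identifiers. It is not the task's official scoring tool: it assumes relevance is a simple set-membership judgement per item and ignores the overlap handling between time segments that the task applied. All function names are illustrative.

```python
def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_list, relevant_set) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def precision_recall(ranked, relevant):
    """Precision and recall of a retrieved list against a relevant set."""
    hits = sum(1 for item in ranked if item in relevant)
    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, a run whose first relevant item sits at rank 2 contributes 0.5 to the mean, and a run with a relevant item at rank 1 contributes 1.0.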

    Pursuing a moving target: iterative use of benchmarking of a task to understand the task

    Individual tasks carried out within benchmarking initiatives, or campaigns, enable direct comparison of alternative approaches to tackling shared research challenges and ideally promote new research ideas and foster communities of researchers interested in common or related scientific topics. When a task has a clear predefined use case, it might straightforwardly adopt a well-established framework and methodology; for example, an ad hoc information retrieval task can adopt the standard Cranfield paradigm. On the other hand, in cases of new and emerging tasks which pose more complex challenges in terms of use scenarios or dataset design, the development of a new task is far from a straightforward process. This letter summarises our reflections on our experiences as task organisers of the Search and Hyperlinking task from its origins as a Brave New Task at the MediaEval benchmarking campaign (2011–2014) to its current instantiation as a task at the NIST TRECVid benchmark (since 2015). We highlight the challenges encountered in the development of the task over a number of annual iterations, the solutions found so far, and our process for maintaining a vision for the ongoing advancement of the task’s ambition.

    On the Selection of Anchors and Targets for Video Hyperlinking

    A problem not well understood in video hyperlinking is what qualifies a fragment as an anchor or target. Ideally, anchors provide good starting points for navigation, and targets supplement anchors with additional details while not distracting users with irrelevant, false and redundant information. The problem is not trivial because of the intertwined relationship between data characteristics and user expectations. Imagine that in a large dataset, there are clusters of fragments spreading over the feature space. The nature of each cluster can be described by its size (implying popularity) and structure (implying complexity). A principled way of hyperlinking can be carried out by picking centers of clusters as anchors and from there reaching out to targets within or outside of clusters with consideration of neighborhood complexity. The question is which fragments should be selected as anchors or targets, so as to reflect the rich content of a dataset while minimizing the risk of a frustrating user experience. This paper provides some insights into this question from the perspective of hubness and local intrinsic dimensionality, which are two statistical properties for assessing the popularity and complexity of a data space. Based on these properties, two novel algorithms are proposed for low-risk automatic selection of anchors and targets. Comment: ACM International Conference on Multimedia Retrieval (ICMR), 2017 (Oral).
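
Hubness is commonly measured through the k-occurrence count: the number of times a point appears in the k-nearest-neighbour lists of the other points. The sketch below computes that count by brute force. It only illustrates the statistic itself and is not either of the paper's two selection algorithms; the Euclidean distance, the function name `k_occurrence`, and the tuple-based feature vectors are assumptions made for the example.

```python
import math
from collections import Counter


def k_occurrence(points, k):
    """Count how often each point index appears among the k nearest
    neighbours of the other points (brute force, Euclidean distance)."""
    # Start every index at zero so points that are never a neighbour
    # still appear in the result.
    counts = Counter({i: 0 for i in range(len(points))})
    for i, p in enumerate(points):
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )[:k]
        counts.update(neighbours)
    return counts
```

Points with a high count behave as hubs (popular fragments), while points with a count of zero are never anyone's near neighbour. Note that `math.dist` requires Python 3.8 or later.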

    Multimodal Reranking of Content-based Recommendations for Hyperlinking Video Snippets

    In this paper, we present an approach for topic-level search and hyperlinking of video snippets, which relies on content-based recommendation and multimodal re-ranking techniques. We identify topic-level segments using transcripts or subtitles and enrich them with other metadata. Segments are indexed in a word vector space. Given a text query or an anchor, the most similar segments are retrieved using cosine similarity scores, which are then combined with visual similarity scores, computed as the distance from the anchor's visual concept vector. This approach has performed well on the MediaEval 2013 Search and Hyperlinking task, evaluated over 1260 hours of BBC TV broadcast, in terms of overall mean average precision. Experiments showed that topic segments based on transcripts from automatic speech recognition (ASR) systems led to better performance than the ones based on subtitles for both search and hyperlinking. Moreover, by analyzing the effect of multimodal re-ranking on hyperlinking performance, we emphasize the merits of rich visual information available in the anchors for the hyperlinking task, and the merits of ASR for large-scale search and hyperlinking.
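
The retrieve-then-re-rank idea in this abstract can be sketched as follows. The abstract does not say how the text and visual scores are combined, so the linear interpolation with weight `lam`, the use of cosine similarity for the visual concept vectors as well, and the names `cosine` and `rerank` are assumptions made purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rerank(anchor_text, anchor_visual, candidates, lam=0.7):
    """Rank candidate segments by an interpolated text/visual score.

    candidates: iterable of (segment_id, text_vector, visual_vector).
    lam: assumed weight on the text score; 1 - lam goes to the visual score.
    """
    scored = []
    for seg_id, text_vec, visual_vec in candidates:
        score = (lam * cosine(anchor_text, text_vec)
                 + (1.0 - lam) * cosine(anchor_visual, visual_vec))
        scored.append((seg_id, score))
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Setting `lam=1.0` recovers a text-only ranking, which makes it easy to measure what the visual term contributes.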

    Utilisation of metadata fields and query expansion in cross-lingual search of user-generated Internet video

    Recent years have seen significant efforts in the area of Cross Language Information Retrieval (CLIR) for text retrieval. This work initially focused on formally published content, but more recently research has begun to concentrate on CLIR for informal social media content. However, despite the current expansion in online multimedia archives, there has been little work on CLIR for this content. While there has been some limited work on Cross-Language Video Retrieval (CLVR) for professional videos, such as documentaries or TV news broadcasts, there has, to date, been no significant investigation of CLVR for the rapidly growing archives of informal user-generated content (UGC). Key differences between such UGC and professionally produced content are the nature and structure of the textual UGC metadata associated with it, as well as the form and quality of the content itself. In this setting, retrieval effectiveness may suffer not only from translation errors common to all CLIR tasks, but also from recognition errors associated with the automatic speech recognition (ASR) systems used to transcribe the spoken content of the video, and from the informality and inconsistency of the associated user-created metadata for each video. This work proposes and evaluates techniques to improve CLIR effectiveness of such noisy UGC content. Our experimental investigation shows that different sources of evidence, e.g. the content from different fields of the structured metadata, significantly affect CLIR effectiveness. Results from our experiments also show that each metadata field has a varying robustness to query expansion (QE) and hence can have a negative impact on CLIR effectiveness. Our work proposes a novel adaptive QE technique that predicts the most reliable source for expansion and shows how this technique can be effective for improving CLIR effectiveness for UGC content.