10 research outputs found

    Idiap at MediaEval 2013: Search and Hyperlinking Task

    The Idiap system for search and hyperlinking uses topic-based segmentation, content-based recommendation algorithms, and multimodal re-ranking. For both sub-tasks, our system performs better with automatic speech recognition output than with manual subtitles. For linking, the results benefit from the fusion of text with visual concepts detected in the anchors.

    Multimodal Reranking of Content-based Recommendations for Hyperlinking Video Snippets

    In this paper, we present an approach for topic-level search and hyperlinking of video snippets, which relies on content-based recommendation and multimodal re-ranking techniques. We identify topic-level segments using transcripts or subtitles and enrich them with other metadata. Segments are indexed in a word vector space. Given a text query or an anchor, the most similar segments are retrieved using cosine similarity scores, which are then combined with visual similarity scores, computed from the distance to the anchor's visual concept vector. This approach performed well on the MediaEval 2013 Search and Hyperlinking task, evaluated over 1,260 hours of BBC TV broadcasts, in terms of overall mean average precision. Experiments showed that topic segments based on transcripts from automatic speech recognition (ASR) systems led to better performance than segments based on subtitles, for both search and hyperlinking. Moreover, by analyzing the effect of multimodal re-ranking on hyperlinking performance, we highlight the merits of the rich visual information available in the anchors for the hyperlinking task, and the merits of ASR for large-scale search and hyperlinking.
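
    The retrieval-and-fusion step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the fusion weight alpha, the inverse-distance conversion of the visual distance into a similarity, and all function names are ours.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two vectors (zero-safe)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom else 0.0

def rank_segments(query_text_vec, anchor_visual_vec,
                  seg_text_vecs, seg_visual_vecs, alpha=0.7):
    """Rank candidate segments by fusing text cosine similarity with a
    visual score derived from the distance to the anchor's visual
    concept vector; returns indices sorted by decreasing fused score."""
    scores = []
    for t_vec, v_vec in zip(seg_text_vecs, seg_visual_vecs):
        text_score = cosine_sim(query_text_vec, t_vec)
        # the paper describes a distance to the anchor's visual concept
        # vector; converting it to a similarity this way is our assumption
        visual_score = 1.0 / (1.0 + np.linalg.norm(anchor_visual_vec - v_vec))
        scores.append(alpha * text_score + (1.0 - alpha) * visual_score)
    return sorted(range(len(scores)), key=lambda i: -scores[i])
```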

    Hierarchical Topic Models for Language-based Video Hyperlinking

    We investigate video hyperlinking based on speech transcripts, leveraging a hierarchical topical structure to address two essential aspects of hyperlinking, namely serendipity control and link justification. We propose and compare different approaches that exploit a hierarchy of topic models as an intermediate representation for comparing the transcripts of video segments. These hierarchical representations offer a basis for characterizing the hyperlinks, thanks to the knowledge of which topics contributed to the creation of each link, and for controlling serendipity, by choosing to give more weight to either general or specific topics. Experiments are performed on BBC videos from the Search and Hyperlinking task at MediaEval. Link precision similar to that of direct text comparison is achieved, while exhibiting different link targets and offering potential control over serendipity.
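
    As a rough illustration of level-weighted topic comparison, the sketch below scores a candidate link from per-level topic distributions; the Bhattacharyya coefficient and the explicit weight vector are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def level_similarity(p, q):
    """Similarity of two topic distributions at one hierarchy level
    (Bhattacharyya coefficient; any histogram similarity would do)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(np.sqrt(p * q)))

def hierarchical_link_score(src_levels, tgt_levels, level_weights):
    """Compare two transcripts through their topic mixtures at each
    level of the hierarchy (index 0 = most general level). Emphasizing
    the shallow levels favors serendipitous, broadly related links;
    emphasizing the deep levels favors precise, specific ones."""
    w = np.asarray(level_weights, dtype=float)
    w = w / w.sum()
    sims = [level_similarity(p, q) for p, q in zip(src_levels, tgt_levels)]
    return float(np.dot(w, sims))
```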

    Deliverable D1.6 Intelligent hypervideo analysis evaluation, final results

    This deliverable describes the evaluation activities conducted to assess the performance of a number of developed methods for intelligent hypervideo analysis, and the usability of the implemented Editor Tool for supporting video annotation and enrichment. Building on the performance evaluations reported in D1.4 for a set of LinkedTV analysis components, we extended our experiments to assess the effectiveness of newer versions of these methods, as well as of entirely new techniques, in terms of both the accuracy and the time efficiency of the analysis. For this purpose, in-house experiments and participations in international benchmarking activities were carried out, and the outcomes are reported in this deliverable. Moreover, we present the results of user trials of the developed Editor Tool, in which groups of experts assessed its usability and the supported functionalities, and evaluated the usefulness and accuracy of the implemented video segmentation approaches against the analysis requirements of the LinkedTV scenarios. This deliverable completes the reporting of the WP1 evaluations, which assessed the efficiency of the developed multimedia analysis methods throughout the project, according to the analysis requirements of the LinkedTV scenarios.

    Topic-Level Extractive Summarization of Lectures and Meetings Using a Snippet Similarity Graph

    In this paper, we present an approach for topic-level, video-snippet-based extractive summarization, which relies on content-based recommendation techniques. We identify topic-level snippets using the transcripts of all videos in the dataset and index these snippets globally in a word vector space. We then generate a matrix of snippet cosine-similarity scores, which is used to select the top snippets for the summary. We also compare snippet similarity globally, across all video snippets, and locally, within a single video. This approach has performed well on the AMI meeting corpus, in terms of ROUGE scores, compared to state-of-the-art methods. Experiments showed that for a corpus like AMI, the overlap between global and local snippet similarity is large (80%) and the ROUGE scores are comparable. Moreover, we applied the proposed TopS summarizer in different scenarios on Video Lectures, to highlight how easily a summarizer based on such content-based recommendation techniques can be put to use.
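
    A minimal sketch of the snippet similarity idea, assuming TF-IDF vectors and a sum-of-similarities centrality criterion (the actual TopS selection rule may differ):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def summarize_snippets(snippets, top_k=5):
    """Index topic-level snippets in a shared word vector space, build
    the pairwise cosine-similarity matrix, and keep the top_k snippets
    with the highest total similarity to the rest of the corpus
    (a simple graph centrality criterion)."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(snippets)
    sim = cosine_similarity(tfidf)
    np.fill_diagonal(sim, 0.0)             # ignore self-similarity
    centrality = sim.sum(axis=1)
    top = np.argsort(-centrality)[:top_k]
    return [snippets[i] for i in sorted(top)]  # preserve original order
```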

    The MGB Challenge: Evaluating Multi-genre Broadcast Media Recognition

    This paper describes the Multi-Genre Broadcast (MGB) Challenge at ASRU 2015, an evaluation focused on speech recognition, speaker diarization, and "lightly supervised" alignment of BBC TV recordings. The challenge training data covered seven weeks of BBC TV output across four channels, resulting in about 1,600 hours of broadcast audio. In addition, several hundred million words of BBC subtitle text were provided for language modelling. A novel aspect of the evaluation was the exploration of speech recognition and speaker diarization in a longitudinal setting, i.e. recognition of several episodes of the same show, and speaker diarization across these episodes, linking speakers. The longitudinal tasks also offered the opportunity for systems to make use of supplied metadata including show title, genre tag, and date/time of transmission. This paper describes the task data and evaluation process used in the MGB challenge, and summarises the results obtained.
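
    The abstract does not state the scoring details, but ASR evaluations of this kind are conventionally scored by word error rate (WER); a minimal, self-contained WER computation looks like this (illustrative background only, not the challenge's scoring tool):

```python
def word_error_rate(ref, hyp):
    """Standard word error rate via Levenshtein distance:
    (substitutions + insertions + deletions) / reference length."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(r)][len(h)] / max(len(r), 1)

print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))  # ~0.167
```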

    Enabling automatic provenance-based trust assessment of web content

    Learning Explainable User Sentiment and Preferences for Information Filtering

    In the last decade, online social networks have enabled people to interact in many ways with each other and with content. The digital traces of such actions reveal people's preferences towards online content such as news or products. These traces often result from interactions such as sharing or liking, but also from interactions in natural language. The continuous growth of the amount of content and of digital traces has led to information overload: surrounded by large volumes of information, people are facing difficulties when searching for information relevant to their interests. To improve user experience, information systems must be able to assist users in achieving their search goals, effectively and efficiently. This thesis is concerned with two important challenges that information systems need to address in order to significantly improve search experience and overcome information overload. First, these systems need to model accurately the variety of user traces, and second, they need to meaningfully explain search results and recommendations to users. To address these challenges, this thesis proposes novel methods based on machine learning to model user sentiment and preferences for information filtering systems, which are effective, scalable, and easily interpretable by humans. We focus on two prominent types of user traces in social networks: on the one hand, user comments accompanied by unary preferences such as likes, and on the other hand, user reviews accompanied by numerical preferences such as star ratings. In both cases, we advocate that by better understanding user text through mining its semantics and modeling its structure, we can not only improve information filtering, but also explain predictions to users. Within this context, we aim to answer three main research questions, namely: (i) how do item semantics help to predict unary preferences; (ii) how do sentiments of free-form user texts help to predict unary preferences; and (iii) how to model fine-grained numerical preferences from user review texts. Our goal is to model and extract from user text the knowledge required to answer these questions, and to obtain insights on how to design better information filtering systems that are more effective and improve user experience. To answer the first question, we formulate the recommendation problem based on unary preferences as a top-N retrieval task and we define an appropriate dataset and metrics for measuring performance. Then, we propose and evaluate several content-based methods based on semantic similarities under presence or absence of preferences. To answer the second question, we propose a sentiment-aware neighborhood model which integrates the sentiment of user comments with unary preferences, either through fixed or through learned mapping functions. For the latter type, we propose a learning algorithm which adapts the sentiment of user comments to unary preferences at collective or individual levels. To answer the third question, we cast the problem of modeling user attitude toward aspects of items as a weakly supervised problem, and we propose a weighted multiple-instance learning method for solving it. Lastly, we show that the learned saliency weights, apart from being easily interpretable, are useful indicators for review segmentation and summarization.
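
    As a hedged sketch of the sentiment-aware neighborhood idea described above, the toy code below maps comment sentiment to a pseudo-preference and scores an unseen item from item-item similarities; the affine mapping, the similarity store, and all names are illustrative assumptions, not the thesis' learned model.

```python
import numpy as np

def sentiment_to_preference(sentiment, a=0.5, b=0.5):
    """Map a comment sentiment score in [-1, 1] to a pseudo-preference
    in [0, 1] with a fixed affine function; the thesis also learns such
    mappings, so a and b here are purely illustrative constants."""
    return float(np.clip(a * sentiment + b, 0.0, 1.0))

def score_item(target_item, user_events, item_sim):
    """Item-based neighborhood score for target_item. user_events is a
    list of (item_id, liked, sentiment) tuples with sentiment in
    [-1, 1] (0.0 when there is no comment); item_sim maps an item pair
    (i, j) to a precomputed similarity."""
    num = den = 0.0
    for item, liked, sentiment in user_events:
        pref = 1.0 if liked else sentiment_to_preference(sentiment)
        sim = item_sim.get((target_item, item), 0.0)
        num += sim * pref
        den += abs(sim)
    return num / den if den else 0.0
```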

    Modeling Users' Information Needs in a Document Recommender for Meetings

    People are surrounded by an unprecedented wealth of information. Access to it depends on the availability of suitable search engines, but even when these are available, people often do not initiate a search, either because their current activity does not allow it, or because they are not aware that the information exists. Just-in-time retrieval brings a radical change to the process of query-based retrieval by proactively retrieving documents relevant to users' current activities, in an easily accessible and non-intrusive manner. This thesis presents a novel set of methods intended to improve the relevance of a just-in-time retrieval system, specifically a document recommender system designed for conversations, in terms of precision and diversity of results. Additionally, we designed an evaluation protocol based on crowdsourcing to compare the methods proposed in the thesis with existing ones. In contrast to previous systems, which model users' information needs by extracting keywords from clean and well-structured texts, our system models them from conversation transcripts, which contain noise from automatic speech recognition (ASR) and have a free structure, often switching between several topics. To deal with these issues, we first propose a novel keyword extraction method which preserves both the relevance and the topical diversity of the conversation, to properly capture users' possible needs with minimal ASR noise. Implicit queries are then built from these keywords. However, the presence of multiple unrelated topics in one query introduces significant noise into the retrieval results. To reduce this effect, we separate users' needs by topically clustering the keyword set into several subsets, or implicit queries. We introduce a merging method which combines the results of the multiple queries prepared from the users' conversation into a concise, diverse, and relevant list of documents; this method ensures that the system does not distract its users from their current conversation by frequently recommending a large number of documents (a sketch of such a merging step follows below). Moreover, we address the problem of explicit queries that may be asked by users during a conversation. We introduce a query refinement method which leverages the conversation context to answer users' information needs without asking for additional clarifications, therefore, again, avoiding distracting users during their conversation. Finally, we implemented the end-to-end document recommender system by integrating the ideas proposed in this thesis, and evaluated it with human users in a brainstorming meeting scenario.
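
    A minimal sketch of the merging step, assuming a simple round-robin interleaving with de-duplication (the thesis' actual merging method may weight queries differently):

```python
def merge_result_lists(ranked_lists, max_results=10):
    """Merge the ranked results of several implicit queries (one per
    topical keyword cluster) by round-robin interleaving with
    de-duplication, so every detected topic contributes near the top
    while the final list stays short and diverse."""
    merged, seen = [], set()
    depth = 0
    while len(merged) < max_results and any(depth < len(l) for l in ranked_lists):
        for results in ranked_lists:
            if depth < len(results) and results[depth] not in seen:
                seen.add(results[depth])
                merged.append(results[depth])
                if len(merged) == max_results:
                    break
        depth += 1
    return merged

# e.g. merge_result_lists([["d1", "d2"], ["d3", "d1"], ["d4"]], 4)
# -> ["d1", "d3", "d4", "d2"]
```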