10 research outputs found

    Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown

    The Video Browser Showdown addresses difficult video search challenges through an annual interactive evaluation campaign that attracts research teams focusing on interactive video retrieval. The campaign aims to provide insights into the performance of participating interactive video retrieval systems, tested by selected search tasks on large video collections. For the first time in its ten-year history, the Video Browser Showdown 2021 was organized in a fully remote setting and hosted a record number of sixteen scoring systems. In this paper, we describe the competition setting, tasks, and results, and give an overview of the state-of-the-art methods used by the competing systems. By looking at query result logs provided by ten systems, we analyze differences in retrieval model performance and in browsing times before a correct submission. Through advances in data gathering methodology and tools, we provide a comprehensive analysis of ad-hoc video search tasks and discuss results, task design, and methodological challenges. We highlight that almost all top-performing systems utilize some form of joint embedding for text-image retrieval and enable the specification of temporal context in queries for known-item search. While a combination of these techniques drives the currently top-performing systems, we identify several future challenges for interactive video search engines and for the Video Browser Showdown competition itself.
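The joint text-image embedding retrieval highlighted above amounts to nearest-neighbour search in a shared vector space. The sketch below illustrates that idea with toy vectors and a hypothetical `cosine_retrieve` helper; it assumes the embeddings already exist and is not any competing system's actual code.

```python
import numpy as np

def cosine_retrieve(query_vec, frame_vecs, k=3):
    """Rank frame embeddings by cosine similarity to a text query embedding.

    Assumes text and frames were already projected into the same joint
    embedding space by a dual encoder (e.g. a CLIP-style model); the
    encoders themselves are out of scope for this sketch.
    """
    q = query_vec / np.linalg.norm(query_vec)
    f = frame_vecs / np.linalg.norm(frame_vecs, axis=1, keepdims=True)
    scores = f @ q                    # cosine similarity per frame
    top = np.argsort(-scores)[:k]     # best-matching frame indices first
    return top, scores[top]

# Toy 2-D embeddings: four frames and a query vector closest to frame 2.
frames = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.5, 0.5]])
query = np.array([0.85, 0.15])
idx, sc = cosine_retrieve(query, frames, k=2)
```

In practice the frame matrix holds millions of vectors, which is why systems pair this scoring with approximate-nearest-neighbour index structures.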

    Interactive video retrieval in the age of effective joint embedding deep models: lessons from the 11th VBS

    This paper presents findings of the eleventh Video Browser Showdown competition, in which sixteen teams competed in known-item and ad-hoc search tasks. Many of the teams utilized state-of-the-art video retrieval approaches that demonstrated high effectiveness in challenging search scenarios. In this paper, a broad survey of all utilized approaches is presented alongside an analysis of the performance of participating teams. Specifically, high-level performance indicators are presented with overall statistics, together with an in-depth analysis of the performance of selected tools implementing result set logging. The analysis reveals evidence that the CLIP model represents a versatile tool for cross-modal video retrieval when combined with interactive search capabilities. Furthermore, the analysis investigates the effect of different users and of text query properties on performance in search tasks. Last but not least, lessons learned from search task preparation are presented, and a new direction for ad-hoc search tasks at the Video Browser Showdown is introduced.
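CLIP-style cross-modal retrieval becomes interactive once user feedback reshapes the query vector. A minimal, assumed Rocchio-style update is sketched below; it is a generic illustration of interactive refinement, not a method any particular team is documented to use.

```python
import numpy as np

def rocchio_update(query, pos, neg, alpha=1.0, beta=0.75, gamma=0.15):
    """One round of embedding-space relevance feedback.

    An assumed, generic Rocchio-style update: pull the query toward
    frames the user marked relevant, push it away from frames marked
    irrelevant, then renormalize to unit length.
    """
    q = alpha * query
    if len(pos):
        q = q + beta * np.mean(pos, axis=0)
    if len(neg):
        q = q - gamma * np.mean(neg, axis=0)
    return q / np.linalg.norm(q)

# The user marks one frame embedding as relevant; the query shifts toward it.
updated = rocchio_update(np.array([1.0, 0.0]),
                         pos=np.array([[0.0, 1.0]]),
                         neg=np.empty((0, 2)))
```

Re-running the similarity search with the updated vector then surfaces results closer to what the user confirmed.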

    Towards Explainable Interactive Multi-Modal Video Retrieval with vitrivr

    This paper presents the most recent iteration of the vitrivr multimedia retrieval system for its participation in the Video Browser Showdown (VBS) 2021. Building on existing functionality for interactive multi-modal retrieval, we overhaul query formulation and results presentation for queries which specify temporal context, extend our database with index structures for similarity search, and present experimental functionality aimed at improving the explainability of results, with the objective of better supporting users in the selection of results and the provision of relevance feedback.

    FIRST - Flexible interactive retrieval SysTem for visual lifelog exploration at LSC 2020

    Lifelogs can provide useful insights into our daily activities. It is essential to offer users a flexible way to retrieve events or moments of interest corresponding to a wide variety of query types. This motivates us to develop FIRST, a Flexible Interactive Retrieval SysTem, which helps users combine or integrate various query components in a flexible manner to handle different query scenarios, such as clustering visual data based on color histograms, visual similarity, GPS location, or scene attributes. We also employ personalized concept detection and image captioning to enhance image understanding of visual lifelog data, and develop an autoencoder-like approach for mapping between query text and image features. Furthermore, we refine the user interface of the retrieval system to better assist users in query expansion and in verifying sequential events at a flexible temporal resolution that controls the navigation speed through sequences of images.
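Combining query components such as color histograms, visual similarity, GPS location, and scene attributes is commonly done with late fusion of per-component scores. The sketch below shows a generic weighted-sum fusion; the component names and weights are illustrative assumptions, since the abstract does not specify FIRST's actual fusion logic.

```python
def combine_scores(component_scores, weights):
    """Weighted-sum late fusion of per-component retrieval scores.

    A generic sketch of flexible query-component combination, not
    FIRST's actual implementation.

    component_scores: {component_name: {item_id: score in [0, 1]}}
    weights: {component_name: weight}
    """
    fused = {}
    for name, scores in component_scores.items():
        w = weights.get(name, 0.0)
        for item, s in scores.items():
            fused[item] = fused.get(item, 0.0) + w * s
    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: -kv[1])

# Item "b" scores moderately on color but perfectly on GPS, so it wins.
ranked = combine_scores(
    {"color_histogram": {"a": 1.0, "b": 0.2}, "gps": {"b": 1.0}},
    {"color_histogram": 0.5, "gps": 0.5},
)
```

Letting the user adjust the weights per query is one simple way to realize the "flexible manner" of combining components described above.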

    Temporal multimodal video and lifelog retrieval

    The past decades have seen exponential growth in both the consumption and production of data, with multimedia such as images and videos contributing significantly to that growth. The widespread proliferation of smartphones has given everyday users the ability to consume and produce such content easily. As the complexity and diversity of multimedia data have grown, so has the need for more complex retrieval models which address the information needs of users. Finding relevant multimedia content is central in many scenarios, from internet search engines and medical retrieval to querying one's personal multimedia archive, also called a lifelog. Traditional retrieval models have often focused on queries targeting small units of retrieval, yet users usually remember temporal context and expect results to reflect it. However, there is little research into supporting these information needs in interactive multimedia retrieval. In this thesis, we aim to close this research gap by making several contributions to multimedia retrieval, with a focus on two scenarios: video and lifelog retrieval. We provide a retrieval model for complex information needs with temporal components, including a data model for multimedia retrieval, a query model for complex information needs, and a modular and adaptable query execution model which includes novel algorithms for result fusion. The concepts and models are implemented in vitrivr, an open-source multimodal multimedia retrieval system, which covers all aspects from extraction to query formulation and browsing. vitrivr has proven its usefulness in evaluation campaigns and is now used in two large-scale interdisciplinary research projects. We show the feasibility and effectiveness of our contributions in two ways: firstly, through results from user-centric evaluations which pit different user-system combinations against one another; secondly, through a system-centric evaluation based on a new dataset for temporal information needs in video and lifelog retrieval, with which we quantitatively evaluate our models. The results show significant benefits for systems that enable users to specify more complex information needs with temporal components. Participation in interactive retrieval evaluation campaigns over multiple years provides insight into possible future developments and challenges of such campaigns.
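A two-part temporal query ("A, then B") can be answered by fusing the result lists of its parts under an ordering constraint. The following is a simplified sketch of that idea under assumed inputs and scoring, not the thesis's actual fusion algorithm.

```python
def temporal_fuse(hits_a, hits_b, max_gap=30.0):
    """Fuse result lists of a two-part temporal query ("A, then B").

    A simplified illustrative sketch: keep a part-A hit only if the
    same video has a part-B hit starting within max_gap seconds after
    it, and score the pair by the sum of the two hit scores.

    hits_a, hits_b: lists of (video_id, start_time_sec, score) tuples.
    """
    best = {}
    for vid_a, t_a, s_a in hits_a:
        for vid_b, t_b, s_b in hits_b:
            if vid_a == vid_b and 0 < t_b - t_a <= max_gap:
                key = (vid_a, t_a)
                best[key] = max(best.get(key, 0.0), s_a + s_b)
    # Best-supported temporal match first.
    return sorted(best.items(), key=lambda kv: -kv[1])

# v1 satisfies the ordering constraint (gap 15 s); v2 does not (gap 95 s).
ranked = temporal_fuse([("v1", 10.0, 0.9), ("v2", 5.0, 0.8)],
                       [("v1", 25.0, 0.7), ("v2", 100.0, 0.9)])
```

The constraint prunes individually strong matches that lack the expected temporal follow-up, which is exactly where temporal queries outperform single-segment ones.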

    Exquisitor: Interactive Learning for Multimedia


    VIREO @ Video Browser Showdown 2020
