15 research outputs found

    Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown

    The Video Browser Showdown addresses difficult video search challenges through an annual interactive evaluation campaign that attracts research teams focusing on interactive video retrieval. The campaign aims to provide insights into the performance of participating interactive video retrieval systems, tested on selected search tasks over large video collections. For the first time in its ten-year history, the Video Browser Showdown 2021 was organized in a fully remote setting and hosted a record number of sixteen scoring systems. In this paper, we describe the competition setting, tasks and results, and give an overview of state-of-the-art methods used by the competing systems. By examining query result logs provided by ten systems, we analyze differences in retrieval model performance and browsing times before a correct submission. Through advances in data-gathering methodology and tools, we provide a comprehensive analysis of ad-hoc video search tasks and discuss results, task design, and methodological challenges. We highlight that almost all top-performing systems utilize some form of joint embedding for text-image retrieval and enable specification of temporal context in queries for known-item search. While a combination of these techniques drives the currently top-performing systems, we identify several future challenges for interactive video search engines and for the Video Browser Showdown competition itself.
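    The joint text-image embedding that the abstract highlights reduces, at query time, to nearest-neighbor search in a shared vector space. The sketch below is an illustrative assumption rather than any competitor's actual code: random placeholder vectors stand in for real CLIP-style embeddings, and `cosine_rank` and the dimensions are hypothetical.

```python
import numpy as np

def cosine_rank(query_vec, image_vecs):
    """Rank image embeddings by cosine similarity to a text query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    imgs = image_vecs / np.linalg.norm(image_vecs, axis=1, keepdims=True)
    scores = imgs @ q                      # cosine similarity per image
    return np.argsort(-scores), scores     # best-first ranking

# Placeholder embeddings standing in for a jointly trained text-image model.
rng = np.random.default_rng(0)
image_vecs = rng.normal(size=(1000, 512))              # pre-extracted shot embeddings
query_vec = image_vecs[42] + 0.1 * rng.normal(size=512)  # a query "close to" shot 42

ranking, scores = cosine_rank(query_vec, image_vecs)
print(ranking[0])  # shot 42 ranks first
```

    In a real system the image embeddings are extracted offline and indexed, so only the query is embedded at search time.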

    Temporal multimodal video and lifelog retrieval

    The past decades have seen exponential growth of both consumption and production of data, with multimedia such as images and videos contributing significantly to said growth. The widespread proliferation of smartphones has provided everyday users with the ability to consume and produce such content easily. As the complexity and diversity of multimedia data have grown, so has the need for more complex retrieval models which address the information needs of users. Finding relevant multimedia content is central in many scenarios, from internet search engines and medical retrieval to querying one's personal multimedia archive, also called a lifelog. Traditional retrieval models have often focused on queries targeting small units of retrieval, yet users usually remember temporal context and expect results to reflect it. However, there is little research into supporting these information needs in interactive multimedia retrieval. In this thesis, we aim to close this research gap by making several contributions to multimedia retrieval, focusing on two scenarios: video and lifelog retrieval. We provide a retrieval model for complex information needs with temporal components, including a data model for multimedia retrieval, a query model for complex information needs, and a modular and adaptable query execution model which includes novel algorithms for result fusion. The concepts and models are implemented in vitrivr, an open-source multimodal multimedia retrieval system, which covers all aspects from extraction to query formulation and browsing. vitrivr has proven its usefulness in evaluation campaigns and is now used in two large-scale interdisciplinary research projects. We show the feasibility and effectiveness of our contributions in two ways: firstly, through results from user-centric evaluations which pit different user-system combinations against one another. Secondly, we perform a system-centric evaluation by creating a new dataset for temporal information needs in video and lifelog retrieval, with which we quantitatively evaluate our models. The results show significant benefits for systems that enable users to specify more complex information needs with temporal components. Participation in interactive retrieval evaluation campaigns over multiple years provides insight into possible future developments and challenges of such campaigns.
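    A temporal query of the kind described ("A followed by B") can be answered by fusing two per-moment result lists, rewarding pairs where the second match occurs shortly after the first. This is an illustrative sketch, not vitrivr's actual fusion algorithm; `temporal_fuse` and `max_gap` are hypothetical names.

```python
def temporal_fuse(results_a, results_b, max_gap=30.0):
    """Fuse two result lists (segment_id, start_time, score) for a
    temporal query "A then B": a pair scores well if B starts within
    max_gap seconds after A."""
    fused = []
    for seg_a, t_a, s_a in results_a:
        for seg_b, t_b, s_b in results_b:
            gap = t_b - t_a
            if 0 < gap <= max_gap:
                fused.append((seg_a, seg_b, s_a + s_b))
    return sorted(fused, key=lambda pair: -pair[2])  # best-first

a = [("a1", 10.0, 0.9), ("a2", 100.0, 0.5)]   # matches for sub-query A
b = [("b1", 25.0, 0.8), ("b2", 300.0, 0.7)]   # matches for sub-query B
fused = temporal_fuse(a, b)
print(fused[0][:2])  # ('a1', 'b1') -- the only pair within the time window
```

    Production systems replace the quadratic pairing loop with time-indexed lookups, but the scoring idea is the same.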

    Open Challenges of Interactive Video Search and Evaluation

    During the last 10 years of the Video Browser Showdown (VBS), many different approaches were tested for known-item search and ad-hoc search tasks. Undoubtedly, teams incorporating state-of-the-art models from the machine learning domain had an advantage over teams focusing only on interactive interfaces. On the other hand, VBS results indicate that effective means of interaction with a search system are still necessary to accomplish challenging search tasks. In this tutorial, we summarize successful deep models tested at the Video Browser Showdown, as well as interfaces designed on top of the corresponding distance/similarity spaces. We also present our broad experience with competition organization and evaluation, focusing on promising findings as well as challenging problems from the most recent iterations of the Video Browser Showdown.

    Memento: a prototype lifelog search engine for LSC’21

    In this paper, we introduce a new lifelog retrieval system called Memento that leverages semantic representations of images and textual queries projected into a common latent space to facilitate effective retrieval. It bridges the semantic gap between complex visual scenes/events and user information needs expressed as textual and faceted queries. The system, developed for the 2021 Lifelog Search Challenge, also has a minimalist user interface that includes primary search, temporal search, and visual data filtering components.

    LifeSeeker 3.0: an interactive lifelog search engine for LSC’21

    In this paper, we present the interactive lifelog retrieval engine developed for the LSC’21 comparative benchmarking challenge. The LifeSeeker 3.0 interactive lifelog retrieval engine is an enhanced version of our previous system, LifeSeeker 2.0, which participated in LSC’20. The system is developed jointly by Dublin City University and the Ho Chi Minh City University of Science. The implementation of LifeSeeker 3.0 focuses on searching and filtering by text query using a weighted Bag-of-Words model with visual concept augmentation and three weighted vocabularies. Visual similarity search is improved using a bag of local convolutional features. Compared to the previous version, LifeSeeker 3.0 also improves query processing time, result display, and browsing support.
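    A weighted Bag-of-Words model over several vocabularies scores a document by summing term-overlap counts per vocabulary, each scaled by that vocabulary's weight. The sketch below is a minimal illustration under assumed vocabulary names and weights; the actual vocabularies and weights in LifeSeeker 3.0 are not given here.

```python
from collections import Counter

# Hypothetical weights for three vocabularies; illustrative only.
VOCAB_WEIGHTS = {"concepts": 2.0, "ocr": 1.0, "metadata": 0.5}

def score(query_terms, doc):
    """Weighted bag-of-words match: sum term-frequency overlaps per
    vocabulary, scaled by that vocabulary's weight."""
    total = 0.0
    for vocab, weight in VOCAB_WEIGHTS.items():
        bag = Counter(doc.get(vocab, []))
        total += weight * sum(bag[t] for t in query_terms)
    return total

doc = {"concepts": ["car", "street", "tree"], "ocr": ["stop"], "metadata": ["2019"]}
print(score(["car", "stop"], doc))  # 2.0*1 + 1.0*1 = 3.0
```

    Splitting terms into weighted vocabularies lets a match on a curated visual concept count for more than a match on noisy OCR text.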

    LifeMon: A MongoDB-Based Lifelog Retrieval Prototype


    Flexible interactive retrieval SysTem 3.0 for visual lifelog exploration at LSC 2022

    Building a retrieval system for lifelogging data is more complicated than for ordinary data due to redundancy, blurriness, the massive amount of data, the various sources of information accompanying lifelogging data, and especially the ad-hoc nature of queries. The Lifelog Search Challenge (LSC) is a benchmarking challenge that encourages researchers and developers to push the boundaries in lifelog retrieval. For LSC'22, we develop FIRST 3.0, a novel and flexible system that leverages expressive cross-domain embeddings to enhance the search process. Our system aims to adaptively capture the semantics of an image at different levels of detail. We also propose to augment our system with an external search engine that provides initial visual examples for unfamiliar concepts. Finally, we organize image data in hierarchical clusters based on visual similarity and location to assist users in data exploration. Experiments show that our system is both fast and effective in handling various retrieval scenarios.
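    Organizing images hierarchically by location and visual similarity, as described above, can be sketched as a two-level grouping: partition by location, then greedily cluster images whose feature vectors are nearly parallel. This is a minimal illustrative sketch, not the clustering used in FIRST 3.0; the field names, threshold, and 2-D features are assumptions.

```python
import math
from collections import defaultdict

def group_images(images, sim_threshold=0.9):
    """Two-level grouping: first by location, then greedy clusters of
    visually similar images (cosine similarity on feature vectors)."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))

    by_loc = defaultdict(list)
    for img in images:
        by_loc[img["loc"]].append(img)

    clusters = {}
    for loc, imgs in by_loc.items():
        groups = []
        for img in imgs:
            for g in groups:           # join the first similar-enough group
                if cos(img["feat"], g[0]["feat"]) >= sim_threshold:
                    g.append(img)
                    break
            else:                       # no match: start a new group
                groups.append([img])
        clusters[loc] = groups
    return clusters

images = [
    {"id": 1, "loc": "home", "feat": (1.0, 0.0)},
    {"id": 2, "loc": "home", "feat": (0.99, 0.1)},   # near-duplicate of id 1
    {"id": 3, "loc": "home", "feat": (0.0, 1.0)},
    {"id": 4, "loc": "work", "feat": (1.0, 0.0)},
]
clusters = group_images(images)
print(len(clusters["home"]))  # 2 visual clusters at "home"
```

    Collapsing near-duplicate lifelog frames into one cluster per location keeps the browsing view small despite the raw data's redundancy.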

    PhotoCube at the Lifelog Search Challenge 2021


    Myscéal 2.0: a revised experimental interactive lifelog retrieval system for LSC'21

    Building an interactive retrieval system for lifelogging poses many challenges due to massive multi-modal personal data, besides the requirement of accuracy and rapid response for such a tool. The Lifelog Search Challenge (LSC) is the international lifelog retrieval competition that inspires researchers to develop systems to cope with these challenges and evaluates the effectiveness of their solutions. In this paper, we upgrade our previous system, Myscéal, and present the Myscéal 2.0 system for LSC'21, with improved features inspired by experiments with novice users. The experiments show that a novice user achieved more than half of the expert score on average. To narrow this gap, some potential enhancements were identified and integrated into the enhanced version.