27 research outputs found

    A VR interface for browsing visual spaces at VBS2021

    The Video Browser Showdown (VBS) is an annual competition in which each participant prepares an interactive video retrieval system and takes part in a live comparative evaluation at the annual MMM conference. In this paper, we introduce Eolas, a prototype video/image retrieval system incorporating a novel virtual reality (VR) interface. For VBS’21, Eolas represented each keyframe of the collection by an embedded feature in a latent vector space, into which a query would also be projected to facilitate retrieval within a VR environment. A user could then explore the space and perform one of a number of filter operations to traverse the space and locate the correct result.
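
    The retrieval step the abstract describes, projecting a query into the same latent vector space as the keyframe embeddings and ranking keyframes by proximity, can be sketched as follows. This is a minimal illustration under assumed names, not the Eolas implementation: the encoder that produces query_vec and the precomputed keyframe_matrix stand in for components the abstract does not detail.

        import numpy as np

        def cosine_topk(query_vec, keyframe_matrix, k=10):
            """Rank keyframes by cosine similarity to a query projected
            into the same latent space (hypothetical encoder assumed)."""
            q = query_vec / np.linalg.norm(query_vec)
            M = keyframe_matrix / np.linalg.norm(keyframe_matrix, axis=1, keepdims=True)
            scores = M @ q                      # cosine similarity per keyframe
            top = np.argsort(-scores)[:k]       # indices of the k nearest keyframes
            return list(zip(top.tolist(), scores[top].tolist()))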

    Temporal multimodal video and lifelog retrieval

    The past decades have seen exponential growth in both the consumption and production of data, with multimedia such as images and videos contributing significantly to this growth. The widespread proliferation of smartphones has given everyday users the ability to consume and produce such content easily. As the complexity and diversity of multimedia data have grown, so has the need for more complex retrieval models which address the information needs of users. Finding relevant multimedia content is central to many scenarios, from internet search engines and medical retrieval to querying one's personal multimedia archive, also called a lifelog. Traditional retrieval models have often focused on queries targeting small units of retrieval, yet users usually remember temporal context and expect results to reflect it. However, there is little research into supporting these information needs in interactive multimedia retrieval. In this thesis, we aim to close this research gap by making several contributions to multimedia retrieval, focusing on two scenarios: video and lifelog retrieval. We provide a retrieval model for complex information needs with temporal components, including a data model for multimedia retrieval, a query model for complex information needs, and a modular and adaptable query execution model which includes novel algorithms for result fusion. The concepts and models are implemented in vitrivr, an open-source multimodal multimedia retrieval system which covers all aspects from extraction to query formulation and browsing. vitrivr has proven its usefulness in evaluation campaigns and is now used in two large-scale interdisciplinary research projects. We show the feasibility and effectiveness of our contributions in two ways: first, through results from user-centric evaluations which pit different user-system combinations against one another; second, through a system-centric evaluation in which we create a new dataset for temporal information needs in video and lifelog retrieval and use it to quantitatively evaluate our models. The results show significant benefits for systems that enable users to specify more complex information needs with temporal components. Participation in interactive retrieval evaluation campaigns over multiple years provides insight into possible future developments and challenges of such campaigns.
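
    The temporal query model the abstract describes can be illustrated with a simplified fusion routine: each sub-query (an ordered "shot" the user remembers) returns scored matches, and a candidate video is rewarded when its matches occur in the remembered order within a time window. This is a sketch of the general idea, not the thesis's actual fusion algorithms; the greedy best-in-window choice and the max_gap parameter are assumptions.

        from collections import defaultdict

        def temporal_fusion(result_lists, max_gap=30.0):
            """Fuse per-sub-query results (video_id, time, score) into a
            best fused score per video, requiring temporal order."""
            by_video = [defaultdict(list) for _ in result_lists]
            for stage, results in zip(by_video, result_lists):
                for video_id, t, score in results:
                    stage[video_id].append((t, score))

            fused = {}
            for video_id, first_hits in by_video[0].items():
                best = 0.0
                for t0, s0 in first_hits:
                    total, t_prev, ok = s0, t0, True
                    for stage in by_video[1:]:
                        # keep only matches after the previous stage, within the window
                        later = [(t, s) for t, s in stage.get(video_id, [])
                                 if t_prev < t <= t_prev + max_gap]
                        if not later:
                            ok = False
                            break
                        t_prev, s = max(later, key=lambda x: x[1])
                        total += s
                    if ok:
                        best = max(best, total)
                if best:
                    fused[video_id] = best
            return fused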

    Myscéal 2.0: a revised experimental interactive lifelog retrieval system for LSC'21

    Building an interactive retrieval system for lifelogging poses many challenges, owing both to the massive volume of multi-modal personal data and to the requirement for accuracy and rapid response in such a tool. The Lifelog Search Challenge (LSC) is an international lifelog retrieval competition that encourages researchers to develop systems that cope with these challenges and evaluates the effectiveness of their solutions. In this paper, we upgrade our previous Myscéal system and present Myscéal 2.0 for LSC'21, with improved features inspired by experiments with novice users. These experiments showed that a novice user achieved, on average, more than half of an expert's score. To narrow this gap, several potential enhancements were identified and integrated into the revised version.

    LifeSeeker 3.0: an interactive lifelog search engine for LSC’21

    In this paper, we present the interactive lifelog retrieval engine developed for the LSC’21 comparative benchmarking challenge. The LifeSeeker 3.0 interactive lifelog retrieval engine is an enhanced version of our previous system, LifeSeeker 2.0, which participated in LSC’20. The system is developed jointly by Dublin City University and the Ho Chi Minh City University of Science. The implementation of LifeSeeker 3.0 focuses on searching and filtering by text query using a weighted Bag-of-Words model with visual concept augmentation and three weighted vocabularies. Visual similarity search is improved using a bag of local convolutional features, while query processing time, result display, and browsing support are all enhanced relative to the previous version.
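
    The weighted Bag-of-Words scoring the abstract names might look roughly like the following sketch. The three vocabularies, their weights, and the term sets are illustrative assumptions; the actual weighting scheme is not given in the abstract.

        def weighted_bow_score(query_terms, image_terms, vocabularies):
            """Score a lifelog image against a text query, weighting each
            matched term by the vocabulary it comes from."""
            score = 0.0
            for term in query_terms & image_terms:
                _, weight = vocabularies.get(term, ("default", 1.0))
                score += weight
            return score

        # Hypothetical vocabularies with per-vocabulary weights:
        vocab = {"car": ("visual_concept", 2.0),
                 "park": ("location", 1.5),
                 "walking": ("activity", 1.0)}
        print(weighted_bow_score({"car", "park"}, {"car", "tree", "park"}, vocab))  # 3.5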

    Exquisitor: Interactive Learning for Multimedia


    Visual access to lifelog data in a virtual environment

    Continuous image capture via a wearable camera is currently one of the most popular methods of establishing a comprehensive record of the entirety of an individual’s life experience, referred to in the research community as a lifelog. These vast image corpora are further enriched by content analysis and combined with additional data, such as biometrics, to generate as extensive a record of a person’s life as possible. However, interfacing with such datasets remains an active area of research, and despite the advent of new technology and a plethora of competing mediums for processing digital information, there has been little focus on newly emerging platforms such as virtual reality. We hypothesise that its increased immersion, accessible spatial dimensions, and other affordances could provide significant benefits in the lifelogging domain over more conventional media. In this work, we motivate virtual reality as a viable method of lifelog exploration by performing an in-depth analysis using a novel application prototype built for the HTC Vive. This research also includes the development of a governing design framework for lifelog applications, which supported the development of our prototype and is intended to support the development of future lifelog systems.

    Graph-based indexing and retrieval of lifelog data

    Understanding the relationships between objects in an image is an important challenge because it can help to describe the actions in the image. In this paper, a graphical data structure named a “scene graph” is utilized to encode the visual relationships within an image, which we suggest has a wide range of potential applications. This scene graph is applied and tested in the popular domain of lifelogs, and specifically in the challenge of known-item retrieval. In this work, every lifelog image is represented by a scene graph, and at retrieval time this scene graph is compared with a semantic graph parsed from the textual query. The result is combined with location or date information to determine the matching items. Experiments show that this technique can outperform a conventional method.
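
    A minimal sketch of the comparison step, assuming both graphs are reduced to (subject, predicate, object) triples: exact triple matches score fully, while matching subject-object pairs with a different predicate earn partial credit. The scoring weights are assumptions; the paper's actual graph-matching method is not detailed in the abstract.

        def scene_graph_score(query_graph, image_graph):
            """Compare a semantic graph parsed from a text query against an
            image's scene graph, both given as sets of triples."""
            full = len(query_graph & image_graph)
            q_pairs = {(s, o) for s, _, o in query_graph}
            i_pairs = {(s, o) for s, _, o in image_graph}
            partial = len(q_pairs & i_pairs) - full   # same pair, different predicate
            return full + 0.5 * max(partial, 0)

        # Query parsed from "a man riding a bicycle on the road":
        query = {("man", "riding", "bicycle"), ("bicycle", "on", "road")}
        image = {("man", "riding", "bicycle"), ("tree", "beside", "road")}
        print(scene_graph_score(query, image))  # 1.0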

    FIRST - Flexible interactive retrieval SysTem for visual lifelog exploration at LSC 2020

    Lifelogs can provide useful insights into our daily activities. It is essential to provide a flexible way for users to retrieve events or moments of interest corresponding to a wide variety of query types. This motivates us to develop FIRST, a Flexible Interactive Retrieval SysTem, which helps users combine or integrate various query components in a flexible manner to handle different query scenarios, such as clustering visual data based on color histograms, visual similarity, GPS location, or scene attributes. We also employ personalized concept detection and image captioning to enhance image understanding of visual lifelog data, and develop an autoencoder-like approach for mapping between query text and image features. Furthermore, we refine the user interface of the retrieval system to better assist users in query expansion and in verifying sequential events at a flexible temporal resolution, which controls the navigation speed through sequences of images.
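
    The autoencoder-like mapping between query text features and image features might look like the following sketch: a small bottleneck network, trained offline on paired features, maps a text vector into the image-feature space where ordinary similarity search applies. The weight matrices, the ReLU bottleneck, and the dot-product ranking are all assumptions; the abstract names the approach without specifying it.

        import numpy as np

        class TextToImageMapper:
            """Map a text-feature vector into the image-feature space using
            pre-trained projection matrices (hypothetical weights W1, W2)."""
            def __init__(self, W1, W2):
                self.W1, self.W2 = W1, W2

            def __call__(self, text_vec):
                hidden = np.maximum(self.W1 @ text_vec, 0.0)  # ReLU bottleneck
                return self.W2 @ hidden                       # image-space vector

        def retrieve(mapped_query, image_features, k=5):
            """Rank lifelog images by dot product in the shared space."""
            scores = image_features @ mapped_query
            return np.argsort(-scores)[:k]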