
    Temporal multimodal video and lifelog retrieval

    The past decades have seen exponential growth in both the consumption and production of data, with multimedia such as images and videos contributing significantly to this growth. The widespread proliferation of smartphones has enabled everyday users to consume and produce such content with ease. As the complexity and diversity of multimedia data have grown, so has the need for more sophisticated retrieval models that address users' information needs. Finding relevant multimedia content is central in many scenarios, from internet search engines and medical retrieval to querying one's personal multimedia archive, also called a lifelog. Traditional retrieval models have often focused on queries targeting small units of retrieval, yet users usually remember temporal context and expect results to reflect it. However, there is little research into supporting these information needs in interactive multimedia retrieval. In this thesis, we aim to close this research gap by making several contributions to multimedia retrieval, focusing on two scenarios: video and lifelog retrieval. We provide a retrieval model for complex information needs with temporal components, comprising a data model for multimedia retrieval, a query model for complex information needs, and a modular, adaptable query execution model that includes novel algorithms for result fusion. These concepts and models are implemented in vitrivr, an open-source multimodal multimedia retrieval system which covers all aspects from extraction to query formulation and browsing. vitrivr has proven its usefulness in evaluation campaigns and is now used in two large-scale interdisciplinary research projects. We show the feasibility and effectiveness of our contributions in two ways: firstly, through results from user-centric evaluations which pit different user-system combinations against one another; secondly, through a system-centric evaluation based on a new dataset for temporal information needs in video and lifelog retrieval, with which we quantitatively evaluate our models. The results show significant benefits for systems that enable users to specify more complex information needs with temporal components. Participation in interactive retrieval evaluation campaigns over multiple years provides insight into possible future developments and challenges of such campaigns.
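The temporal result fusion mentioned above can be illustrated with a minimal sketch: each sub-query of an ordered temporal query returns scored hits, and chains of hits whose timestamps are increasing and no further apart than some gap are fused into a single ranked result. The `Hit` structure, the `max_gap` parameter, and the additive chain scoring below are illustrative assumptions, not vitrivr's actual fusion algorithm:

```python
from collections import namedtuple

# A hit for one sub-query: the matched segment, its start time
# (seconds) in the video or lifelog, and a relevance score.
Hit = namedtuple("Hit", ["segment", "start", "score"])

def temporal_fusion(result_lists, max_gap=30.0):
    """Fuse one ranked hit list per sub-query (in temporal order).

    A chain is a sequence of hits, one per sub-query, whose start
    times strictly increase and are at most `max_gap` seconds apart.
    Each chain is attributed to the segment of its first hit and
    scored by the sum of its hits' scores.
    """
    # Seed chains with the hits of the first sub-query:
    # (end time of chain, accumulated score, segment of first hit)
    chains = [(h.start, h.score, h.segment) for h in result_lists[0]]
    for hits in result_lists[1:]:
        chains = [
            (h.start, score + h.score, head)
            for end, score, head in chains
            for h in hits
            if 0 < h.start - end <= max_gap
        ]
    # Keep the best-scoring chain per starting segment, rank descending.
    fused = {}
    for _, score, head in chains:
        fused[head] = max(fused.get(head, 0.0), score)
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

For example, a two-part query ("person enters room", then "person sits down") would rank a segment highly only if both parts match within the allowed temporal window.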

    Visual access to lifelog data in a virtual environment

    Continuous image capture via a wearable camera is currently one of the most popular methods of establishing a comprehensive record of the entirety of an individual’s life experience, referred to in the research community as a lifelog. These vast image corpora are further enriched by content analysis and combined with additional data such as biometrics to generate as extensive a record of a person’s life as possible. However, interfacing with such datasets remains an active area of research, and despite the advent of new technology and a plethora of competing media for processing digital information, there has been little focus on newly emerging platforms such as virtual reality. We hypothesise that the increased immersion, additional accessible spatial dimensions, and other affordances of virtual reality could provide significant benefits in the lifelogging domain over more conventional media. In this work, we motivate virtual reality as a viable method of lifelog exploration by performing an in-depth analysis using a novel application prototype built for the HTC Vive. This research also includes the development of a governing design framework for lifelog applications, which guided the development of our prototype and is also intended to support the development of future lifelog systems.

    Improving instance search performance in video collections

    This thesis presents methods to improve instance search and enhance user performance while browsing unstructured video collections. Through the use of computer vision and information retrieval techniques, we propose novel solutions to analyse visual content and build a search algorithm that addresses the challenges of visual instance search while considering the constraints of practical applications. Firstly, we investigate methods to improve the effectiveness of instance search systems for finding object instances occurring in unstructured video content. Within the bag-of-features framework, we propose a novel algorithm that uses the geometric correlation between local features to improve the accuracy of local feature matching, thereby improving the performance of instance search systems without introducing much computational cost. Secondly, we consider the scenario in which the performance of instance search systems degrades due to the volume of visual content in large video collections. We introduce a search algorithm based on embedded coding to increase the effectiveness and efficiency of instance search systems, and we participate in the international video evaluation campaign, TREC Video Retrieval Evaluation (TRECVid), to comparatively evaluate the performance of our proposed methods. Finally, we consider the exploration and navigation of visual content when browsing large unstructured video collections. We propose methods to address these challenges and build an interactive video browsing tool that improves user performance while seeking interesting content in video collections. We construct a structured content representation with a similarity graph using our proposed instance search technologies. Considering the constraints of real-world usability, we present a flexible interface based on faceted navigation to enhance user performance when completing video browsing tasks. This thesis shows that user performance can be enhanced by improving the effectiveness of instance search approaches when seeking information in unstructured video collections. While covering many different aspects of improving instance search in this work, we outline three potential directions for future work: advanced feature representation, data-driven ranking, and cloud-based search algorithms.
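The idea of exploiting geometric correlation between local feature matches can be sketched with a weak geometric consistency check: candidate matches share a visual word, and only matches that agree on a dominant translation between the two images contribute to the score. This is an illustrative simplification, not the thesis's actual algorithm; the feature representation `(x, y, visual_word)` and the `bin_size` parameter are assumptions:

```python
from collections import Counter

def geometric_match_score(query_feats, db_feats, bin_size=20):
    """Weak geometric consistency for bag-of-features matching.

    Each feature is a tuple (x, y, visual_word). Two features are a
    candidate match if they were quantized to the same visual word.
    Each candidate match votes for the image-to-image translation it
    implies (quantized into bins of `bin_size` pixels); the score is
    the size of the largest consistent group of matches.
    """
    votes = Counter()
    for qx, qy, qword in query_feats:
        for dx, dy, dword in db_feats:
            if qword == dword:
                # Quantize the translation implied by this match.
                bin_key = (round((dx - qx) / bin_size),
                           round((dy - qy) / bin_size))
                votes[bin_key] += 1
    return max(votes.values()) if votes else 0
```

Random word collisions scatter their votes over many translation bins, while matches from a true object instance pile up in one bin, which is why spatial agreement filters false matches at little extra cost.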

    The Video Browser Showdown: a live evaluation of interactive video search tools

    The Video Browser Showdown evaluates the performance of exploratory video search tools on a common dataset, in a common environment, and in the presence of a live audience. The main goal of this competition is to enable researchers in the field of interactive video search to directly compare their tools at work. In this paper, we present results from the second Video Browser Showdown (VBS2013) and describe and evaluate the tools of all participating teams in detail. The evaluation results give insight into how exploratory video search tools are used and how they perform in direct comparison. Moreover, we compare the achieved performance to results from another user study in which 16 participants employed a standard video player to complete the same tasks as performed in VBS2013. This comparison shows that the sophisticated tools enable better performance in general, but for some tasks common video players provide similar performance and can even outperform the expert tools. Our results highlight the need for further improvement of professional tools for interactive search in videos.

    Adapting content based video retrieval systems to accommodate the novice user on mobile devices.

    With the usage of mobile devices such as smartphones and tablets increasing at an exponential rate, these devices have become part of everyday life. This ease of information access comes at a cost: since input methods on such devices remain limited, it is prudent to develop content-based techniques to filter the amount of content returned, for example, from search requests to video search engines. In addition, such handheld devices are used by a highly heterogeneous user community, including people with little or no experience. In this work, we focus on the latter, i.e. casual users (‘novices’), and target video search and retrieval. We begin by examining new methods of developing Content-Based Multimedia Information Retrieval systems for novices on handheld tablet devices. We analyze the shortcomings of traditional desktop systems, which favor expert users formulating complex queries, and focus on simplicity of design and interaction on tablet devices. We create and test three prototype demonstrators over three years of the TRECVid known-item search task in order to determine the best features and their appropriate usage to attain high quality, usability, and precision for our novice users. In the first experiment, we determine that novice users perform similarly to an expert user group, one major premise of this research. In our second experiment, we analyze methods which can be applied automatically to aid novice users, thus enhancing their search performance. Our final experiment deals with different visualization approaches which can further aid users. Overall, our results show that each year our systems made an incremental improvement. The 2011 TRECVid system performed best of all submissions that year, despite its reduced complexity, enabling novice users to perform as well as experts and experienced searchers.

    DCU at MMM 2013 video browser showdown

    This paper describes a handheld video browser that incorporates shot boundary detection, key frame extraction, semantic content analysis, key frame browsing, and similarity search.
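Shot boundary detection, the first component listed, is commonly realized by thresholding the distance between color histograms of consecutive frames. The following is a minimal grayscale sketch of that general technique; the `bins` and `threshold` values are assumptions, and the paper's browser may use a different method:

```python
def detect_shot_boundaries(frames, bins=8, threshold=0.5):
    """Detect hard cuts by thresholding the L1 distance between
    normalized intensity histograms of consecutive frames.

    `frames` is a list of 2D lists of pixel intensities (0-255).
    Returns the indices of frames that start a new shot.
    """
    def hist(frame):
        h = [0] * bins
        n = 0
        for row in frame:
            for px in row:
                h[px * bins // 256] += 1
                n += 1
        return [count / n for count in h]

    boundaries = []
    prev = hist(frames[0])
    for i in range(1, len(frames)):
        cur = hist(frames[i])
        # L1 distance ranges from 0 (identical) to 2 (disjoint).
        if sum(abs(a - b) for a, b in zip(prev, cur)) > threshold:
            boundaries.append(i)
        prev = cur
    return boundaries
```

A key frame for each detected shot can then be picked, e.g. the middle frame, which is what the browsing and similarity-search components would operate on.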