
    Temporal multimodal video and lifelog retrieval

    The past decades have seen exponential growth in both the consumption and production of data, with multimedia such as images and videos contributing significantly to that growth. The widespread proliferation of smartphones has provided everyday users with the ability to consume and produce such content easily. As the complexity and diversity of multimedia data have grown, so has the need for more complex retrieval models that address the information needs of users. Finding relevant multimedia content is central in many scenarios, from internet search engines and medical retrieval to querying one's personal multimedia archive, also called a lifelog. Traditional retrieval models have often focused on queries targeting small units of retrieval, yet users usually remember temporal context and expect results to include it. However, there is little research into supporting these information needs in interactive multimedia retrieval. In this thesis, we aim to close this research gap by making several contributions to multimedia retrieval with a focus on two scenarios, namely video and lifelog retrieval. We provide a retrieval model for complex information needs with temporal components, including a data model for multimedia retrieval, a query model for complex information needs, and a modular and adaptable query execution model which includes novel algorithms for result fusion. The concepts and models are implemented in vitrivr, an open-source multimodal multimedia retrieval system, which covers all aspects from extraction to query formulation and browsing. vitrivr has proven its usefulness in evaluation campaigns and is now used in two large-scale interdisciplinary research projects. We show the feasibility and effectiveness of our contributions in two ways. Firstly, we present results from user-centric evaluations which pit different user-system combinations against one another.
Secondly, we perform a system-centric evaluation by creating a new dataset for temporal information needs in video and lifelog retrieval with which we quantitatively evaluate our models. The results show significant benefits for systems that enable users to specify more complex information needs with temporal components. Participation in interactive retrieval evaluation campaigns over multiple years provides insight into possible future developments and challenges of such campaigns.

    Overview of ImageCLEFlifelog 2019: Solve My Life Puzzle and Lifelog Moment Retrieval

    This paper describes ImageCLEFlifelog 2019, the third edition of the Lifelog task. In this edition, the task was composed of two subtasks (challenges): the Lifelog Moments Retrieval (LMRT) challenge, which followed the same format as in the previous edition, and Solve My Life Puzzle (Puzzle), a brand new task that focused on rearranging lifelog moments in temporal order. ImageCLEFlifelog 2019 received noticeably more submissions than the previous editions, with ten teams participating and a total of 109 runs submitted.

    Evaluating Information Retrieval and Access Tasks

    This open access book summarizes the first two decades of the NII Testbeds and Community for Information access Research (NTCIR). NTCIR is a series of evaluation forums run by a global team of researchers and hosted by the National Institute of Informatics (NII), Japan. The book is unique in that it discusses not just what was done at NTCIR, but also how it was done and the impact it has achieved. For example, in some chapters the reader sees the early seeds of what eventually grew to be the search engines that provide access to content on the World Wide Web, today’s smartphones that can tailor what they show to the needs of their owners, and the smart speakers that enrich our lives at home and on the move. We also get glimpses into how new search engines can be built for mathematical formulae, or for the digital record of a lived human life. Key to the success of the NTCIR endeavor was early recognition that information access research is an empirical discipline and that evaluation therefore lay at the core of the enterprise. Evaluation is thus at the heart of each chapter in this book. The chapters show, for example, how the recognition that some documents are more important than others has shaped thinking about evaluation design. The thirty-three contributors to this volume speak for the many hundreds of researchers from dozens of countries around the world who together shaped NTCIR as organizers and participants. This book is suitable for researchers, practitioners, and students: anyone who wants to learn about past and present evaluation efforts in information retrieval, information access, and natural language processing, as well as those who want to participate in an evaluation task or even to design and organize one.

    Recuperação e identificação de momentos em imagens

    In our modern society almost anyone is able to capture moments and record events thanks to easy access to smartphones. This raises the question: if we record so much of our lives, how can we easily retrieve specific moments? An answer to this question would open the door to a big leap in quality of life. The possibilities are endless, from trivial problems like finding a photo of a birthday cake to analyzing the progress of mental illnesses in patients or even tracking people with infectious diseases. With so much data being created every day, answering this question becomes more complex. There is no streamlined approach to solving the problem of moment localization in a large dataset of images, and investigation of this problem started only a few years ago. ImageCLEF is one competition where researchers participate and try to achieve new and better results in the task of moment retrieval each year. This complex problem, along with the interest in participating in the ImageCLEF Lifelog Moment Retrieval Task, posed a good challenge for the development of this dissertation. The proposed solution consists of a system capable of retrieving images automatically according to specified moments described in a corpus of text, without any user interaction and using only state-of-the-art image and text processing methods. The developed retrieval system achieves this objective by extracting and categorizing relevant information from the text and computing a similarity score against the labels extracted in the image processing stage. In this way, the system can tell whether images are related to the moment specified in the text and can therefore retrieve the pictures accordingly. In the ImageCLEF Life Moment Retrieval 2020 subtask, the proposed automatic retrieval system achieved a score of 0.03 in the F1-measure@10 evaluation methodology.
Even though this score is not competitive when compared with the scores of other teams' systems, the built system presents a good baseline for future work.
Master's degree in Electronics and Telecommunications Engineering.

    Organising and structuring a visual diary using visual interest point detectors

    As wearable cameras become more popular, researchers are increasingly focusing on novel applications to manage the large volume of data these devices produce. One such application is the construction of a Visual Diary from an individual’s photographs. Microsoft’s SenseCam, a device designed to passively record a Visual Diary and cover a typical day of the user wearing the camera, is an example of one such device. The vast quantity of images generated by these devices means that the management and organisation of these collections is not a trivial matter. We believe wearable cameras, such as SenseCam, will become more popular in the future and the management of the volume of data generated by these devices is a key issue. Although there is a significant volume of work in the literature in the object detection and recognition and scene classification fields, there is little work in the area of setting detection. Furthermore, few authors have examined the issues involved in analysing extremely large image collections (like a Visual Diary) gathered over a long period of time. An algorithm developed for setting detection should be capable of clustering images captured at the same real world locations (e.g. in the dining room at home, in front of the computer in the office, in the park, etc.). This requires the selection and implementation of suitable methods to identify visually similar backgrounds in images using their visual features. We present a number of approaches to setting detection based on the extraction of visual interest point detectors from the images. We also analyse the performance of two of the most popular descriptors: Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF). We present an implementation of a Visual Diary application and evaluate its performance via a series of user experiments.
Finally, we also outline some techniques to allow the Visual Diary to automatically detect new settings, to scale as the image collection continues to grow substantially over time, and to allow the user to generate a personalised summary of their data.

    Semantic interpretation of events in lifelogging

    The topic of this thesis is lifelogging, the automatic, passive recording of a person’s daily activities, and in particular the semantic analysis and enrichment of lifelogged data. Our work centers on visual lifelogged data, such as that taken from wearable cameras. Such wearable cameras generate an archive of a person’s day taken from a first-person viewpoint, but one of the problems with this is the sheer volume of information that can be generated. In order to make this potentially very large volume of information more manageable, our analysis of this data is based on segmenting each day’s lifelog data into discrete and non-overlapping events corresponding to activities in the wearer’s day. To manage lifelog data at an event level, we define a set of concepts using an ontology which is appropriate to the wearer, apply automatic detection of concepts to these events, and then semantically enrich each of the detected lifelog events, making the concepts an index into the events. Once this enrichment is complete we can use the lifelog to support semantic search for everyday media management, as a memory aid, or as part of medical analysis on the activities of daily living (ADL), and so on. In the thesis, we address the problem of how to select the concepts to be used for indexing events and we propose a semantic, density-based algorithm to cope with concept selection issues for lifelogging. We then apply activity detection to classify everyday activities by employing the selected concepts as high-level semantic features. Finally, the activity is modeled by multi-context representations and enriched by Semantic Web technologies. The thesis includes an experimental evaluation using real data from users and shows the performance of our algorithms in capturing the semantics of everyday concepts and their efficacy in activity recognition and semantic enrichment.

    System for activity tracking of patients with chronic kidney disease

    Many people suffering from chronic kidney disease are in need of a kidney transplant. A problem in health care is that patients cannot undergo surgery if they have too much belly fat. Regular exercise and physical activity are therefore crucial for this group of people. A project group at the Skåne University Hospital has recently been established to help patients with chronic kidney disease to lose weight. A question they asked themselves was whether it was possible to use activity tracking devices in the project. The purpose of this thesis is to design and evaluate a system that can be used to track the physical activities of the patients. The system will be built using a Sony SmartBand, a wristband collecting and analyzing data about a user’s daily activities.

    Data Management for Dynamic Multimedia Analytics and Retrieval

    Multimedia data in its various manifestations poses a unique challenge from a data storage and data management perspective, especially if search, analysis and analytics in large data corpora are considered. The inherently unstructured nature of the data itself and the curse of dimensionality that afflicts the representations we typically work with in its stead are cause for a broad range of issues that require sophisticated solutions at different levels. This has given rise to a huge corpus of research that puts focus on techniques that allow for effective and efficient multimedia search and exploration. Many of these contributions have led to an array of purpose-built multimedia search systems. However, recent progress in multimedia analytics and interactive multimedia retrieval has demonstrated that several of the assumptions usually made for such multimedia search workloads do not hold once a session has a human user in the loop. Firstly, many of the required query operations cannot be expressed by mere similarity search and, since the concrete requirement cannot always be anticipated, one needs a flexible and adaptable data management and query framework. Secondly, the widespread notion of staticity of data collections does not hold if one considers analytics workloads, whose purpose is to produce and store new insights and information. And finally, it is impossible even for an expert user to specify exactly how a data management system should produce and arrive at the desired outcomes of the potentially many different queries. Guided by these shortcomings and motivated by the fact that similar questions have once been answered for structured data in classical database research, this thesis presents three contributions that seek to mitigate the aforementioned issues. We present a query model that generalises the notion of proximity-based query operations and formalises the connection between those queries and high-dimensional indexing.
We complement this with a cost model that makes the often implicit trade-off between query execution speed and result quality transparent to the system and the user. And we describe a model for the transactional and durable maintenance of high-dimensional index structures. All contributions are implemented in the open-source multimedia database system Cottontail DB, on top of which we present an evaluation that demonstrates the effectiveness of the proposed models. We conclude by discussing avenues for future research in the quest for converging the fields of databases on the one hand and (interactive) multimedia retrieval and analytics on the other.

    A privacy-aware and secure system for human memory augmentation

    The ubiquity of digital sensors embedded in today's mobile and wearable devices (e.g., smartphones, wearable cameras, wristbands) has made technology more intertwined with our lives. Among many other things, this allows us to seamlessly log our daily experiences in increasing quantity and quality, a process known as "lifelogging". This practice produces a great number of pictures and videos that can potentially improve human memory. Consider how a single photograph can bring back distant childhood memories, or how a song can help us reminisce about our last vacation. Such a vision of a "memory augmentation system" can offer considerable benefits, but it also raises new security and privacy challenges. Perhaps obviously, a system that captures everywhere we go, and everything we say, see, and do, greatly increases the danger to our privacy. Any data breach of such a memory repository, whether accidental or malicious, could negatively impact both our professional and private reputation. In addition, the threat of memory manipulation might be the most worrisome aspect of a memory augmentation system: if an attacker is able to remove, add, or change our captured information, the resulting data may implant memories in our heads of events that never took place, or, in turn, accelerate the loss of other memories. Starting from such key challenges, this thesis investigates how to design secure memory augmentation systems. In the course of this research, we develop tools and prototypes that can be applied by researchers and system engineers to develop pervasive applications that help users capture and later recall episodic memories in a secure fashion. We build trusted sensors and protocols to securely capture and store experience data, and software for the secure and privacy-aware exchange of experience data with others. We explore the suitability of various access control models to put users in control of the plethora of data that the system captures on their behalf.
We also explore the possibility of using in situ physical gestures to control different aspects of the capturing and sharing of experience data. Ultimately, this thesis contributes to the design and development of secure systems for memory augmentation.

    The future of social is personal: the potential of the personal data store

    This chapter argues that technical architectures that facilitate the longitudinal, decentralised and individual-centric personal collection and curation of data will be an important, but partial, response to the pressing problem of the autonomy of the data subject, and the asymmetry of power between the subject and large scale service providers/data consumers. Towards framing the scope and role of such Personal Data Stores (PDSes), the legalistic notion of personal data is examined, and it is argued that a more inclusive, intuitive notion expresses more accurately what individuals require in order to preserve their autonomy in a data-driven world of large aggregators. Six challenges towards realising the PDS vision are set out: the requirement to store data for long periods; the difficulties of managing data for individuals; the need to reconsider the regulatory basis for third-party access to data; the need to comply with international data handling standards; the need to integrate privacy-enhancing technologies; and the need to future-proof data gathering against the evolution of social norms. The open experimental PDS platform INDX is introduced and described, as a means of beginning to address at least some of these six challenges.