4 research outputs found

    Advanced content-based semantic scene analysis and information retrieval: the SCHEMA project

    The aim of the SCHEMA Network of Excellence is to bring together a critical mass of universities, research centers, industrial partners and end users in order to design a reference system for content-based semantic scene analysis, interpretation and understanding. Relevant research areas include content-based multimedia analysis and automatic annotation of semantic multimedia content, combined textual and multimedia information retrieval, the semantic web, the MPEG-7 and MPEG-21 standards, and user interfaces and human factors. In this paper, recent advances in content-based analysis, indexing and retrieval of digital media within the SCHEMA Network are presented. These advances will be integrated into the SCHEMA module-based, expandable reference system.

    Similarity Pyramids for Browsing and Organization of Large Image Databases

    The advent of large image databases (>10,000 images) has created a need for tools which can search and organize images automatically by their content. This paper presents a method for designing a hierarchical browsing environment which we call a similarity pyramid. The similarity pyramid groups similar images together while allowing users to view the database at varying levels of resolution. We show that the similarity pyramid is best constructed using agglomerative (bottom-up) clustering methods, and present a fast-sparse clustering method which dramatically reduces both memory and computation relative to conventional methods. We then present an objective measure of pyramid organization called dispersion, and use it to show that our fast-sparse clustering method produces better similarity pyramids than top-down approaches.
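The bottom-up clustering the abstract describes can be sketched as follows. This is an illustrative, hypothetical implementation of plain centroid-linkage agglomerative clustering, not the paper's fast-sparse variant (which additionally sparsifies the distance computations to cut memory and time); all function names here are assumptions for illustration.

```python
def euclidean(a, b):
    # Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerative_cluster(features, target_clusters):
    """Repeatedly merge the two closest clusters until only
    `target_clusters` remain. Each cluster is a list of indices
    into `features`; inter-cluster distance is the distance
    between cluster centroids (centroid linkage)."""
    clusters = [[i] for i in range(len(features))]

    def centroid(cluster):
        dim = len(features[0])
        return [sum(features[i][d] for i in cluster) / len(cluster)
                for d in range(dim)]

    while len(clusters) > target_clusters:
        # Naive O(n^2) scan for the closest pair; the paper's
        # fast-sparse method avoids exactly this full scan.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = euclidean(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Running the merges to completion and recording each merge, rather than stopping at a target count, would yield the full tree from which a similarity pyramid's levels of resolution are read off.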

    Organising and structuring a visual diary using visual interest point detectors

    As wearable cameras become more popular, researchers are increasingly focusing on novel applications to manage the large volume of data these devices produce. One such application is the construction of a Visual Diary from an individual's photographs. Microsoft's SenseCam, designed to passively record a Visual Diary covering a typical day of the user wearing the camera, is an example of one such device. The vast quantity of images these devices generate means that managing and organising the resulting collections is not a trivial matter, and we believe this will become a key issue as wearable cameras grow in popularity. Although there is a significant volume of work in the literature on object detection and recognition and on scene classification, there is little work in the area of setting detection. Furthermore, few authors have examined the issues involved in analysing extremely large image collections (such as a Visual Diary) gathered over a long period of time. An algorithm for setting detection should be capable of clustering images captured at the same real-world locations (e.g. in the dining room at home, in front of the computer in the office, in the park, etc.). This requires the selection and implementation of suitable methods to identify visually similar backgrounds in images using their visual features. We present a number of approaches to setting detection based on the extraction of visual interest points from the images, and we analyse the performance of two of the most popular descriptors: the Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF). We present an implementation of a Visual Diary application and evaluate its performance via a series of user experiments.
Finally, we outline some techniques that allow the Visual Diary to automatically detect new settings, to scale as the image collection continues to grow substantially over time, and to allow the user to generate a personalised summary of their data.
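Identifying visually similar backgrounds with local descriptors such as SIFT or SURF typically reduces to matching descriptor vectors between images. Below is a minimal, hypothetical sketch of the widely used nearest-neighbour ratio test (due to Lowe) on toy vectors; the thesis's actual pipeline is not specified here, and real descriptors would come from a feature-extraction library rather than being hand-written.

```python
def distance(a, b):
    # Euclidean distance between two descriptor vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Count descriptors in desc_a whose nearest neighbour in
    desc_b is clearly closer than the second-nearest neighbour
    (the ratio test), a standard proxy for a reliable match."""
    matches = 0
    for d in desc_a:
        dists = sorted(distance(d, e) for e in desc_b)
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            matches += 1
    return matches
```

Two images sharing many such matches are likely to show the same background, which is the basic signal a setting-detection clustering step could build on.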

    Designing and evaluating a user interface for continuous embedded lifelogging based on physical context

    PhD thesis. An increase in both personal information and storage capacity has encouraged people to store and archive their life experiences in multimedia formats. The usefulness of such large amounts of data will remain limited without the development of both retrieval techniques and interfaces that help people access and navigate their personal collections. The research described in this thesis investigates lifelogging technology from the perspectives of the psychology of memory and human-computer interaction. It seeks to increase my understanding of what data can trigger memories and how I might use this insight to retrieve past life experiences through interfaces to lifelogging technology. The review of memory and of previous research on lifelogging technology allowed me to establish a clear understanding of how memory works and to design novel and effective memory cues, while at the same time critiquing existing lifelogging systems and approaches to retrieving memories of past actions and activities. In the initial experiments I evaluated the design and implementation of a prototype, which exposed numerous problems in both the visualisation of data and usability. These findings informed the design of a novel lifelogging prototype to facilitate retrieval. I assessed the second prototype and determined how the improved system supported access to and retrieval of users' past life experiences: in particular, how users group their data into events, how they interact with their data, and the classes of memories that it supported. In this doctoral thesis I found that visualising the movements of users' hands and bodies facilitated grouping activities into events when combined with the photos and other data captured at the same time. In addition, the movements of the user's hands and body, and the movements of some objects, can support activity recognition or help the user detect activities and group them into events.
Furthermore, the ability to search for specific movements significantly reduced the amount of time that it took to retrieve data related to specific events. I identified three major strategies that users followed to understand the combined data: skimming sequences, cross-sensor jumping and continued scanning.