25 research outputs found

    Learning and mining from personal digital archives

    Given the explosion of new sensing technologies, data storage has become significantly cheaper and, consequently, people increasingly rely on wearable devices to create personal digital archives. Lifelogging is the act of recording aspects of life in digital format for a variety of purposes such as aiding human memory, analysing human lifestyle and diet monitoring. In this dissertation we are concerned with Visual Lifelogging, a form of lifelogging based on the passive capture of photographs by a wearable camera. Cameras such as Microsoft's SenseCam can record up to 4,000 images per day as well as logging data from several incorporated sensors. Considering the volume, complexity and heterogeneous nature of such data collections, it is a significant challenge to interpret and extract knowledge for the practical use of lifeloggers and others. In this dissertation, time series analysis methods have been used to identify and extract useful information from temporal lifelogging image data, without the benefit of prior knowledge. We focus, in particular, on three fundamental topics: noise reduction, structure and characterization of the raw data; the detection of multi-scale patterns; and the mining of important, previously unknown repeated patterns in the time series of lifelog image data. Firstly, we show that Detrended Fluctuation Analysis (DFA) highlights the feature of very high correlation in lifelogging image collections. Secondly, we show that study of the equal-time Cross-Correlation Matrix demonstrates atypical or non-stationary characteristics in these images. Next, noise reduction in the Cross-Correlation Matrix is addressed by Random Matrix Theory (RMT) before Wavelet multiscaling is used to characterize the 'most important' or 'unusual' events through analysis of the associated dynamics of the eigenspectrum. A motif discovery technique is explored for detection of recurring and recognizable episodes in an individual's image data.
Finally, we apply these motif discovery techniques to two known lifelog data collections, All I Have Seen (AIHS) and NTCIR-12 Lifelog, in order to examine multivariate recurrent patterns of multiple lifelogging users.

    AH '16: Proceedings of the 7th Augmented Human International Conference 2016

    Digital life stories: Semi-automatic (auto)biographies within lifelog collections

    Our life stories enable us to reflect upon and share our personal histories. Through emerging digital technologies the possibility of collecting life experiences digitally is increasingly feasible; consequently so is the potential to create a digital counterpart to our personal narratives. In this work, lifelogging tools are used to collect digital artifacts continuously and passively throughout our day. These include images, documents, emails and webpages accessed; text messages and mobile activity. This range of data when brought together is known as a lifelog. Given the complexity, volume and multimodal nature of such collections, it is clear that there are significant challenges to be addressed in order to achieve coherent and meaningful digital narratives of the events from our life histories. This work investigates the construction of personal digital narratives from lifelog collections. It examines the underlying questions, issues and challenges relating to the construction of personal digital narratives from lifelogs. Fundamentally, it addresses how to organize and transform data sampled from an individual’s day-to-day activities into a coherent narrative account. This enquiry is enabled by three 20-month long-term lifelogs collected by participants and produces a narrative system which enables the semi-automatic construction of digital stories from lifelog content. Inspired by probative studies conducted into current practices of curation, from which a set of fundamental requirements is established, this solution employs a 2-dimensional spatial framework for storytelling. It delivers integrated support for the structuring of lifelog content and its distillation into storyform through information retrieval approaches. We describe and contribute flexible algorithmic approaches to achieve both. Finally, this research inquiry yields qualitative and quantitative insights into such digital narratives and their generation, composition and construction.
The opportunities for such personal narrative accounts to enable recollection, reminiscence and reflection with the collection owners are established and their benefit in sharing past personal experiences is outlined. Finally, in a novel investigation with motivated third parties we demonstrate the opportunities such narrative accounts may have beyond the scope of the collection owner in: personal, societal and cultural explorations, artistic endeavours and as a generational heirloom.

    The role of context in image annotation and recommendation

    With the rise of smart phones, lifelogging devices (e.g. Google Glass) and the popularity of image sharing websites (e.g. Flickr), users are capturing and sharing every aspect of their life online, producing a wealth of visual content. Of these uploaded images, the majority are poorly annotated or exist in complete semantic isolation, making the process of building retrieval systems difficult as one must first understand the meaning of an image in order to retrieve it. To alleviate this problem, many image sharing websites offer manual annotation tools which allow the user to “tag” their photos; however, these techniques are laborious and as a result have been poorly adopted: Sigurbjörnsson and van Zwol (2008) showed that 64% of images uploaded to Flickr are annotated with < 4 tags. Due to this, an entire body of research has focused on the automatic annotation of images (Hanbury, 2008; Smeulders et al., 2000; Zhang et al., 2012a) where one attempts to bridge the semantic gap between an image’s appearance and meaning, e.g. the objects present. Despite two decades of research the semantic gap still largely exists and as a result automatic annotation models often offer unsatisfactory performance for industrial implementation. Further, these techniques can only annotate what they see, thus ignoring the “bigger picture” surrounding an image (e.g. its location, the event, the people present etc). Much work has therefore focused on building photo tag recommendation (PTR) methods which aid the user in the annotation process by suggesting tags related to those already present. These works have mainly focused on computing relationships between tags based on historical images, e.g. that NY and timessquare co-exist in many images and are therefore highly correlated. However, tags are inherently noisy, sparse and ill-defined, often resulting in poor PTR accuracy, e.g. does NY refer to New York or New Year?
This thesis proposes the exploitation of an image’s context which, unlike textual evidence, is always present, in order to alleviate this ambiguity in the tag recommendation process. Specifically we exploit the “what, who, where, when and how” of the image capture process in order to complement textual evidence in various photo tag recommendation and retrieval scenarios. In part II, we combine text, content-based (e.g. # of faces present) and contextual (e.g. day-of-the-week taken) signals for tag recommendation purposes, achieving up to a 75% improvement to precision@5 in comparison to a text-only TF-IDF baseline. We then consider external knowledge sources (i.e. Wikipedia & Twitter) as an alternative to (slower moving) Flickr on which to build recommendation models, showing that similar accuracy could be achieved on these faster moving, yet entirely textual, datasets. In part II, we also highlight the merits of diversifying tag recommendation lists before discussing at length various problems with existing automatic image annotation and photo tag recommendation evaluation collections. In part III, we propose three new image retrieval scenarios, namely “visual event summarisation”, “image popularity prediction” and “lifelog summarisation”. In the first scenario, we attempt to produce a ranking of relevant and diverse images for various news events by (i) removing irrelevant images such as memes and visual duplicates, before (ii) semantically clustering images based on the tweets in which they were originally posted. Using this approach, we were able to achieve over 50% precision for images in the top 5 ranks. In the second retrieval scenario, we show that by combining contextual and content-based features from an image, we are able to predict whether it will become “popular” (or not) with 74% accuracy, using an SVM classifier.
Finally, in chapter 9 we employ blur detection and perceptual-hash clustering in order to remove noisy images from lifelogs, before combining visual and geo-temporal signals in order to capture a user’s “key moments” within their day. We believe that the results of this thesis show an important step towards building effective image retrieval models when sufficient textual content is lacking (i.e. a cold start).

    The corporate blog as an emerging genre of computer-mediated communication: features, constraints, discourse situation

    Digital technology is increasingly impacting how we keep informed, how we communicate professionally and privately, and how we initiate and maintain relationships with others. The function and meaning of new forms of computer-mediated communication (CMC) are not always clear to users at the outset and must be negotiated by communities, institutions and individuals alike. Are chatrooms and virtual environments suitable for business communication? Is email increasingly a channel for work-related, formal communication and thus "for old people", as especially young Internet users flock to Social Networking Sites (SNSs)? Cornelius Puschmann examines the linguistic and rhetorical properties of the weblog, another relatively young genre of CMC, to determine its function in private and professional (business) communication. He approaches the question of what functions blogs realize for authors and readers and argues that corporate blogs, which, like blogs by private individuals, are highly diverse in terms of their form, function and intended audience, essentially mimic key characteristics of private blogs in order to appear open, non-persuasive and personal, all essential qualities for companies that wish to make a positive impression on their constituents.

    Representation of subjectivity in the diary films and videos of Jonas Mekas

    This thesis explores the representation of subjectivity in the diary films, videos and online projects of Jonas Mekas. In Chapter 1 the intersection of avant-garde and documentary practices is traced to establish that avant-garde filmmakers prioritised subjective representations of the historical world before such representations became more widely adopted by documentarians. Chapters 2 and 3 then focus on the representation of Mekas’ subjectivity in film, while Chapters 4 and 5 focus on video and the Internet. By employing this chronological and technological structure the theoretical and historical debates around subjectivity in film are established in the first three chapters, then re-explored through the prism of new technology. The chapters also operate in pairs that bridge the technological divide to emphasise the different ways that the textual representation of subjectivity is fragmented: Chapters 2 and 4 both focus on the articulation of subjectivity at the moment of shooting, while Chapters 3 and 5 explore the organisation of that material, where new layers of self-inscription are applied. The chapters therefore expand and complicate issues of split subjectivity within new technological frameworks, while remaining focused on attempting to understand Mekas’ diary practice.

    Data Mining and Visualization of Large Human Behavior Data Sets

    A quantified past: fieldwork and design for remembering a data-driven life

    PhD Thesis. A ‘data-driven life’ has become an established feature of present and future technological visions. Smart homes, smart cities, an Internet of Things, and particularly the Quantified Self movement are all premised on the pervasive datafication of many aspects of everyday life. This thesis interrogates the human experience of such a data-driven life, by conceptualising, investigating, and speculating about these personal informatics tools as new technologies of memory. With respect to existing discourses in Human-Computer Interaction, Memory Studies and Critical Data Studies, I argue that the prevalence of quantified data and metrics is creating fundamentally new and distinct records of everyday life: a quantified past. To address this, I first conduct qualitative and idiographic fieldwork – with long-term self-trackers, and subsequently with users of ‘smart journals’ – to investigate how this data-driven record mediates the experience of remembering. Further, I undertake a speculative and design-led inquiry to explore the context of a ‘quantified wedding’. Adopting a context where remembering is centrally valued, this Research through Design project demonstrates opportunities and develops considerations for the design of data-driven tools for remembering. Crucially, while speculative, this project maintains a central focus on individual experience, and introduces an innovative methodological approach, ‘Speculative Enactments’, for engaging participants meaningfully in speculative inquiry. The outcomes of this conceptual, empirical and speculative inquiry are multiple. I present, and interpret, a variety of rich descriptions of existing and anticipated practices of remembering with data. Introducing six experiential qualities of data, and reflecting on how data requires selectivity and construction to meaningfully account for one’s life, I argue for the design of ‘Documentary Informatics’.
This perspective fundamentally reimagines the roles and possibilities for personal informatics tools; it looks beyond the current present-focused and goal-oriented paradigm of a data-driven life, to propose a more poetic orientation to recording one’s life with quantified data.