19 research outputs found

    Continuous Health Interface Event Retrieval

    Knowing the state of our health at every moment in time is critical for advances in health science. Using data obtained outside an episodic clinical setting is the first step towards building a continuous health estimation system. In this paper, we explore a system that allows users to combine events and data streams from different sources to retrieve complex biological events, such as cardiovascular volume overload. These complex events, which have been explored in the biomedical literature and which we call interface events, have a direct causal impact on relevant biological systems. They are the interface through which lifestyle events influence our health. We retrieve interface events from existing events and data streams by encoding domain knowledge using an event operator language. (ACM International Conference on Multimedia Retrieval 2020 (ICMR 2020), held in Dublin, Ireland, June 8-11, 2020.)
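    The abstract does not spell out the event operator language, so the following is a minimal sketch assuming timestamped events and a hypothetical temporal-join operator; the Event type, the within operator, the detect_volume_overload rule and its thresholds are illustrative inventions, not the paper's actual design.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

@dataclass
class Event:
    name: str            # e.g. "high_sodium_meal", "weight_gain"
    start: datetime
    value: float = 0.0   # magnitude, e.g. kg gained

def within(a: Iterable[Event], b: Iterable[Event],
           window: timedelta) -> List[Tuple[Event, Event]]:
    """Temporal-join operator: pair each event in `a` with every event
    in `b` that follows it within `window`."""
    return [(x, y) for x in a for y in b
            if timedelta(0) <= (y.start - x.start) <= window]

def detect_volume_overload(meals: List[Event],
                           weights: List[Event]) -> List[Event]:
    """Hypothetical rule: rapid weight gain shortly after high-sodium
    meals is flagged as a candidate volume-overload interface event
    (the 2-day window and 1.5 kg threshold are assumptions)."""
    return [Event("volume_overload_candidate", w.start, w.value)
            for _, w in within(meals, weights, timedelta(days=2))
            if w.value > 1.5]
```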

    A lifelogging system supporting multimodal access

    Today, technology allows us to capture our lives digitally: smartphones let us take pictures, record videos and share experiences over WiFi. People’s lifestyles are changing accordingly, one example being the shift from traditional memo writing to the digital lifelog. Lifelogging is the process of using digital tools to collect personal data in order to illustrate the user’s daily life (Smith et al., 2011). The availability of smartphones embedded with different sensors, such as the camera and GPS, has encouraged the development of lifelogging. It has also brought new challenges in multi-sensor data collection, large-volume data storage, data analysis and the appropriate representation of lifelog data across different devices. This study is designed to address these challenges. A lifelogging system was developed to collect, store, analyse and display data from multiple sensors, i.e. supporting multimodal access. In this system, the multi-sensor data (also called data streams) is first transmitted from the smartphone to the server, and only while the phone is charging. On the server side, six contexts are detected, namely personal, time, location, social, activity and environment. Events are then segmented and a related narrative is generated. Finally, lifelog data is presented differently on three widely used devices: the computer, the smartphone and the e-book reader. Lifelogging is likely to become a well-accepted technology in the coming years. Manual logging is not possible for most people and is not feasible in the long term, so automatic lifelogging is needed. This study presents a lifelogging system which can automatically collect multi-sensor data, detect contexts, segment events, generate meaningful narratives and display the appropriate data on different devices based on their unique characteristics. The work in this thesis therefore contributes to the development of automatic lifelogging and in doing so makes a valuable contribution to the field.
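    As a rough illustration of the event-segmentation step described above, here is a minimal sketch assuming timestamped sensor samples; the gap heuristic, the SensorSample/LifelogEvent types and the template narrator are illustrative stand-ins, not the thesis's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class SensorSample:
    timestamp: float     # seconds since epoch
    source: str          # e.g. "gps", "camera", "accelerometer"
    payload: dict

@dataclass
class LifelogEvent:
    start: float
    end: float
    contexts: Dict[str, str] = field(default_factory=dict)

# The six contexts detected on the server side, per the abstract.
CONTEXTS = ["personal", "time", "location", "social", "activity", "environment"]

def segment_events(samples: List[SensorSample],
                   gap: float = 600.0) -> List[LifelogEvent]:
    """Naive segmentation: start a new event whenever the stream is
    silent for more than `gap` seconds (an assumed heuristic)."""
    events, start, prev = [], None, None
    for s in sorted(samples, key=lambda x: x.timestamp):
        if start is None:
            start = prev = s.timestamp
        elif s.timestamp - prev > gap:
            events.append(LifelogEvent(start, prev))
            start = s.timestamp
        prev = s.timestamp
    if start is not None:
        events.append(LifelogEvent(start, prev))
    return events

def narrate(event: LifelogEvent) -> str:
    """Template-based narrative generation, one of many possible strategies."""
    spans = ", ".join(f"{k}={v}" for k, v in event.contexts.items())
    return f"From {event.start:.0f} to {event.end:.0f}: {spans}"
```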

    Smartphone picture organization: a hierarchical approach

    We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which are typically pre-processed by the user, who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured, and because smartphones are ubiquitous, they present larger variability compared to pictures captured by a digital camera. To address the need to organize large smartphone photo collections automatically, we propose here a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach successfully estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to each topic by relying on a lexical database. Topic-related categories are then estimated by using a set of topic-specific Convolutional Neural Networks. To validate our approach, we assemble and make public a large dataset of more than 8,000 smartphone pictures from 40 persons. Experimental results demonstrate major user satisfaction with respect to state-of-the-art solutions in terms of organization.
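    A sketch of the topic level of this hierarchy: scikit-learn ships no pLSA implementation, so NMF with a Kullback-Leibler loss is used below as a stand-in (its objective is known to be equivalent to pLSA's up to normalisation); the toy tag documents and the crude top-term labels replace the paper's actual picture representations and lexical-database naming.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import NMF

# Toy "documents": per-picture tag strings standing in for real features.
docs = ["beach sand sea sunset", "birthday cake candles party",
        "sea waves surf beach", "party friends cake balloons"]

vec = CountVectorizer()
X = vec.fit_transform(docs)                # picture-term count matrix
nmf = NMF(n_components=2, beta_loss="kullback-leibler",
          solver="mu", max_iter=500, init="nndsvda")
doc_topics = nmf.fit_transform(X)          # picture -> latent topic weights

terms = vec.get_feature_names_out()
for k, comp in enumerate(nmf.components_):
    top = [terms[i] for i in comp.argsort()[-3:][::-1]]
    # The paper names each topic via a lexical database; here we just
    # print the top terms as a crude label.
    print(f"topic {k}: {top}")

# Second level (not shown): route each picture to its dominant topic and
# classify it with that topic's dedicated CNN.
```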

    Semantic interpretation of events in lifelogging

    The topic of this thesis is lifelogging, the automatic, passive recording of a person’s daily activities, and in particular the semantic analysis and enrichment of lifelogged data. Our work centers on visual lifelog data, such as that taken from wearable cameras. Such wearable cameras generate an archive of a person’s day taken from a first-person viewpoint, but one of the problems with this is the sheer volume of information that can be generated. In order to make this potentially very large volume of information more manageable, our analysis of this data is based on segmenting each day’s lifelog data into discrete and non-overlapping events corresponding to activities in the wearer’s day. To manage lifelog data at an event level, we define a set of concepts using an ontology which is appropriate to the wearer, apply automatic detection of these concepts to the events, and then semantically enrich each of the detected lifelog events, using the concepts as an index into the events. Once this enrichment is complete we can use the lifelog to support semantic search for everyday media management, as a memory aid, or as part of medical analysis of the activities of daily living (ADL), and so on. In the thesis, we address the problem of how to select the concepts to be used for indexing events and we propose a semantic, density-based algorithm to cope with concept selection issues for lifelogging. We then apply activity detection to classify everyday activities by employing the selected concepts as high-level semantic features. Finally, the activity is modeled by multi-context representations and enriched by Semantic Web technologies. The thesis includes an experimental evaluation using real data from users and shows the performance of our algorithms in capturing the semantics of everyday concepts and their efficacy in activity recognition and semantic enrichment.
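    The thesis's density-based concept selection algorithm is not detailed in the abstract; the following is a hedged sketch of one plausible greedy variant, with a toy hand-written similarity matrix standing in for ontology- or embedding-derived semantic similarity.

```python
import numpy as np

# Toy semantic similarity between candidate concepts (rows/cols aligned
# with `concepts`); in practice this would come from an ontology or
# from word embeddings.
concepts = ["eating", "dining", "cooking", "driving", "walking"]
sim = np.array([
    [1.0, 0.9, 0.7, 0.1, 0.2],
    [0.9, 1.0, 0.6, 0.1, 0.2],
    [0.7, 0.6, 1.0, 0.1, 0.2],
    [0.1, 0.1, 0.1, 1.0, 0.3],
    [0.2, 0.2, 0.2, 0.3, 1.0],
])

def select_concepts(sim: np.ndarray, k: int) -> list:
    """Greedy density-based selection (our assumption, not the thesis's
    exact algorithm): repeatedly pick the concept with the highest
    semantic density, then down-weight its neighbours so the next pick
    covers a different semantic region."""
    density = sim.mean(axis=1).copy()
    chosen = []
    for _ in range(k):
        i = int(np.argmax(density))
        chosen.append(i)
        density -= sim[i] * density[i]   # penalise neighbours of the pick
        density[i] = -np.inf             # never pick the same concept twice
    return chosen

print([concepts[i] for i in select_concepts(sim, 2)])
```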

    The role of context in image annotation and recommendation

    With the rise of smartphones, lifelogging devices (e.g. Google Glass) and the popularity of image sharing websites (e.g. Flickr), users are capturing and sharing every aspect of their lives online, producing a wealth of visual content. Of these uploaded images, the majority are poorly annotated or exist in complete semantic isolation, making the process of building retrieval systems difficult, as one must first understand the meaning of an image in order to retrieve it. To alleviate this problem, many image sharing websites offer manual annotation tools which allow the user to “tag” their photos; however, these techniques are laborious and as a result have been poorly adopted: Sigurbjörnsson and van Zwol (2008) showed that 64% of images uploaded to Flickr are annotated with fewer than 4 tags. Because of this, an entire body of research has focused on the automatic annotation of images (Hanbury, 2008; Smeulders et al., 2000; Zhang et al., 2012a), where one attempts to bridge the semantic gap between an image’s appearance and its meaning, e.g. the objects present. Despite two decades of research the semantic gap still largely exists, and as a result automatic annotation models often offer unsatisfactory performance for industrial implementation. Further, these techniques can only annotate what they see, thus ignoring the “bigger picture” surrounding an image (e.g. its location, the event, the people present, etc.). Much work has therefore focused on building photo tag recommendation (PTR) methods which aid the user in the annotation process by suggesting tags related to those already present. These works have mainly focused on computing relationships between tags based on historical images, e.g. that NY and timessquare co-exist in many images and are therefore highly correlated. However, tags are inherently noisy, sparse and ill-defined, often resulting in poor PTR accuracy: does NY refer to New York or New Year? This thesis proposes the exploitation of an image’s context which, unlike textual evidence, is always present, in order to alleviate this ambiguity in the tag recommendation process. Specifically, we exploit the “what, who, where, when and how” of the image capture process in order to complement textual evidence in various photo tag recommendation and retrieval scenarios.

    In part II, we combine text, content-based (e.g. number of faces present) and contextual (e.g. day-of-the-week taken) signals for tag recommendation purposes, achieving up to a 75% improvement in precision@5 in comparison to a text-only TF-IDF baseline. We then consider external knowledge sources (i.e. Wikipedia & Twitter) as an alternative to the (slower moving) Flickr on which to build recommendation models, showing that similar accuracy can be achieved on these faster moving, yet entirely textual, datasets. In part II, we also highlight the merits of diversifying tag recommendation lists, before discussing at length various problems with existing automatic image annotation and photo tag recommendation evaluation collections.

    In part III, we propose three new image retrieval scenarios, namely “visual event summarisation”, “image popularity prediction” and “lifelog summarisation”. In the first scenario, we attempt to produce a rank of relevant and diverse images for various news events by (i) removing irrelevant images such as memes and visual duplicates and (ii) semantically clustering images based on the tweets in which they were originally posted. Using this approach, we were able to achieve over 50% precision for images in the top 5 ranks. In the second retrieval scenario, we show that by combining contextual and content-based features from images, we are able to predict whether an image will become “popular” (or not) with 74% accuracy, using an SVM classifier. Finally, in chapter 9 we employ blur detection and perceptual-hash clustering in order to remove noisy images from lifelogs, before combining visual and geo-temporal signals in order to capture a user’s “key moments” within their day. We believe that the results of this thesis represent an important step towards building effective image retrieval models when sufficient textual content is lacking (i.e. a cold start).
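    A minimal sketch of the text-plus-context idea described above, assuming simple tag co-occurrence statistics blended with a day-of-week prior; the toy history, the mixing weight alpha and the recommend function are all illustrative assumptions, not the thesis's actual models.

```python
from collections import Counter
from itertools import combinations

# Historical photos as (tags, context) pairs; context here is just the
# day of the week the photo was taken.
history = [
    ({"ny", "timessquare", "night"}, "sat"),
    ({"ny", "skyline"}, "sun"),
    ({"newyear", "fireworks", "night"}, "sat"),
]

cooc, day_freq = Counter(), Counter()
for tags, day in history:
    for a, b in combinations(sorted(tags), 2):
        cooc[(a, b)] += 1
        cooc[(b, a)] += 1
    for t in tags:
        day_freq[(t, day)] += 1

def recommend(existing: set, day: str, alpha: float = 0.5, n: int = 5):
    """Score candidate tags by co-occurrence with the tags already
    present, blended with a contextual day-of-week prior (alpha is an
    assumed mixing weight)."""
    scores = Counter()
    for t in existing:
        for (a, b), c in cooc.items():
            if a == t and b not in existing:
                scores[b] += c
    blended = {t: (1 - alpha) * s + alpha * day_freq[(t, day)]
               for t, s in scores.items()}
    return sorted(blended, key=blended.get, reverse=True)[:n]

print(recommend({"ny"}, "sat"))
```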

    Organising and structuring a visual diary using visual interest point detectors

    As wearable cameras become more popular, researchers are increasingly focusing on novel applications to manage the large volume of data these devices produce. One such application is the construction of a Visual Diary from an individual’s photographs. Microsoft’s SenseCam, a device designed to passively record a Visual Diary covering a typical day of the user wearing the camera, is an example of one such device. The vast quantity of images generated by these devices means that the management and organisation of these collections is not a trivial matter. We believe wearable cameras such as SenseCam will become more popular in the future, making the management of the volume of data they generate a key issue. Although there is a significant volume of work in the literature in the fields of object detection and recognition and scene classification, there is little work in the area of setting detection. Furthermore, few authors have examined the issues involved in analysing extremely large image collections (like a Visual Diary) gathered over a long period of time. An algorithm developed for setting detection should be capable of clustering images captured at the same real-world locations (e.g. in the dining room at home, in front of the computer in the office, in the park, etc.). This requires the selection and implementation of suitable methods to identify visually similar backgrounds in images using their visual features. We present a number of approaches to setting detection based on the extraction of visual interest points from the images, and analyse the performance of two of the most popular descriptors: Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF). We present an implementation of a Visual Diary application and evaluate its performance via a series of user experiments. Finally, we outline some techniques to allow the Visual Diary to automatically detect new settings, to scale as the image collection continues to grow substantially over time, and to allow the user to generate a personalised summary of their data.
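    A small sketch of interest-point-based setting comparison, using OpenCV's SIFT implementation with Lowe's ratio test as one plausible concrete instantiation; the 0.75 ratio and any clustering threshold are conventional assumptions rather than values from the thesis.

```python
import cv2

def setting_similarity(path_a: str, path_b: str) -> int:
    """Count SIFT matches passing Lowe's ratio test; images of the same
    setting (same background) should share many stable interest points."""
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    _, des_a = sift.detectAndCompute(img_a, None)
    _, des_b = sift.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return 0  # no interest points found in one of the images
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
    return sum(1 for p in matches
               if len(p) == 2 and p[0].distance < 0.75 * p[1].distance)

# Setting detection sketch: link two images into the same setting cluster
# when their match count exceeds a threshold (the threshold is an assumption).
```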

    Retrieval and identification of moments in images

    In our modern society almost anyone is able to capture moments and record events thanks to easy access to smartphones. This leads to the question: if we record so much of our lives, how can we easily retrieve specific moments? The answer to this question would open the door to a big leap in human quality of life. The possibilities are endless, from trivial problems like finding a photo of a birthday cake to being capable of analysing the progress of mental illnesses in patients, or even tracking people with infectious diseases. With so much data being created every day, the answer to this question becomes more complex. There is no streamlined approach to solving the problem of moment localisation in a large dataset of images, and investigation into this problem began only a few years ago. ImageCLEF is one competition where researchers participate and try to achieve new and better results in the task of moment retrieval each year. This complex problem, along with the interest in participating in the ImageCLEF Lifelog Moment Retrieval Task, posed a good challenge for the development of this dissertation. The proposed solution consists in developing a system capable of retrieving images automatically according to specified moments described in a corpus of text, without any sort of user interaction and using only state-of-the-art image and text processing methods. The developed retrieval system achieves this objective by extracting and categorising relevant information from text and computing a similarity score against the labels extracted during the image processing stage. In this way, the system is capable of telling whether images are related to the moment specified in the text and is therefore able to retrieve the pictures accordingly. In the ImageCLEF Lifelog Moment Retrieval 2020 subtask, the proposed automatic retrieval system achieved a score of 0.03 under the F1-measure@10 evaluation methodology. Even though this score is not competitive when compared to the scores of other teams’ systems, the system built presents a good baseline for future work.
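    A toy sketch of the matching step described above, assuming the vision stage emits a set of concept labels per image and the text stage reduces a moment description to a set of query concepts; the Jaccard scoring and the index contents are illustrative assumptions, not the dissertation's actual method.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between query concepts and an image's predicted labels."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical index: image id -> concept labels from the vision stage.
index = {
    "img_001": {"cake", "candles", "indoor", "table"},
    "img_002": {"beach", "sea", "outdoor", "sunset"},
    "img_003": {"cake", "party", "balloons", "indoor"},
}

def retrieve(query_concepts: set, k: int = 10) -> list:
    """Rank images by the similarity between concepts extracted from the
    moment description and the labels predicted for each image."""
    scored = {img: jaccard(query_concepts, labels)
              for img, labels in index.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]

print(retrieve({"birthday", "cake", "indoor"}))
```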

    An Outlook into the Future of Egocentric Vision

    What will the future be? We wonder! In this survey, we explore the gap between current research in egocentric vision and the ever-anticipated future, where wearable computing, with outward-facing cameras and digital overlays, is expected to be integrated in our everyday lives. To understand this gap, the article starts by envisaging the future through character-based stories, showcasing through examples the limitations of current technology. We then provide a mapping between this future and previously defined research tasks. For each task, we survey its seminal works, current state-of-the-art methodologies and available datasets, then reflect on the shortcomings that limit its applicability to future research. Note that this survey focuses on software models for egocentric vision, independent of any specific hardware. The paper concludes with recommendations for areas of immediate exploration so as to unlock our path to the future of always-on, personalised and life-enhancing egocentric vision. (Comments, suggestions and corrections are invited at: https://openreview.net/forum?id=V3974SUk1)

    Advancing the objective measurement of physical activity and sedentary behaviour context

    Objective data from national surveillance programmes show that, on average, individuals accumulate high amounts of sedentary time per day and only a small minority of adults achieve physical activity guidelines. One potential explanation for the failure of interventions to increase population levels of physical activity or decrease sedentary time is that research to date has been unable to identify the specific behavioural levers, in specific contexts, needed to change behaviour. Novel technology is emerging with the potential to elucidate these specific behavioural contexts and thus identify these behavioural levers. The aims of this four-study thesis were therefore to identify novel technologies capable of measuring behavioural context, to evaluate and validate the most promising technology, and to pilot this technology to assess the behavioural context of older adults, shown by surveillance programmes to be the least physically active and most sedentary age group.

    Study One. Purpose: To identify, via a systematic review, technologies which have been used or could be used to measure the location of physical activity or sedentary behaviour. Methods: Four electronic databases were searched using key terms built around behaviour, technology and location. To be eligible for inclusion, papers were required to be published in English and to describe a wearable or portable technology or device capable of measuring location. Searches were performed from the inception of each database up to 04/02/2015. Searches were also performed using three internet search engines; specialised software was used to download search results and thus mitigate the potential pitfalls of changing search algorithms. Results: 188 research papers met the inclusion criteria. Global positioning systems were the most widely used location technology in the published research, followed by wearable cameras and radio-frequency identification. The internet search engines identified 81 global positioning systems, 35 real-time locating systems and 21 wearable cameras. Conclusion: The addition of location information to existing measures of physical activity and sedentary behaviour will provide important behavioural information.

    Study Two. Purpose: This study investigated the Actigraph proximity feature across three experiments. Experiment One assessed the basic characteristics of the Actigraph RSSI signal across a range of straight-line distances. Experiment Two assessed the level of receiver-device signal detection in a single room under unobstructed conditions and when various obstructions were introduced, and the impact these obstructions have on the intra- and inter-unit variability of the RSSI signal. Experiment Three assessed signal contamination across multiple rooms (i.e. one beacon being detected in multiple rooms). Methods: Across all experiments, the receivers collected data in 10-second epochs, the highest resolution possible. In Experiment One, two devices, one receiver and one beacon, were placed opposite each other at 10 cm increments, for one minute at each distance; the RSSI-distance relationship was then visually assessed for linearity. In Experiment Two, a test room was demarcated into 0.5 x 0.5 m grids with receivers simultaneously placed in each demarcated grid; this process was then repeated under wood, metal and human obstruction conditions. Descriptive tallies were used to assess the signal detection achieved for each receiver from each beacon in each grid, and the mean RSSI signal was calculated for each condition alongside intra- and inter-unit standard deviation, coefficient of variation and standard error of measurement. In Experiment Three, a test apartment was used with three beacons placed across two rooms; the researcher completed simulated conditions for 10 minutes each across the two rooms, and the percentage of epochs in which a signal was detected from each of the three beacons was calculated for each test condition. Results: In Experiment One, the relationship between RSSI and distance was found to be non-linear. In Experiment Two, high signal detection was achieved in all conditions; however, there was a large degree of intra- and inter-unit variability in RSSI. In Experiment Three, there was a large degree of multi-room signal contamination. Conclusion: The Actigraph proximity feature can provide a binary indicator of room-level location.

    Study Three. Purpose: To use novel technology in three small feasibility trials to ascertain where the greatest utility can be demonstrated. Methods: Feasibility Trial One assessed the concurrent validity of electrical energy monitoring and wearable cameras as measures of television viewing. Feasibility Trial Two utilised indoor location monitoring to assess where older adult care home residents accumulate their sedentary time. Lastly, Feasibility Trial Three investigated the use of proximity sensors to quantify exposure to a height-adjustable desk. Results: Feasibility Trial One found that on average the television was switched on for 202 minutes per day but was visible in just 90 minutes of wearable camera images, with a further 52 minutes in which the participant was in the living room but the television was not visible in the image. Feasibility Trial Two found that residents were highly sedentary (sitting for an average of 720 minutes per day) and spent the majority of their time in their own rooms, with more time spent in communal areas in the morning than in the afternoon. Feasibility Trial Three found a discrepancy between self-reported work hours and objectively measured office dwell time. Conclusion: The feasibility trials outlined in this study show the utility of objectively measuring context to provide more detailed and refined data.

    Study Four. Purpose: To objectively measure the context of sedentary behaviour in the most sedentary age group, older adults. Methods: 26 residents and 13 staff were recruited from two care homes. Each participant wore an Actigraph GT9X on their non-dominant wrist and a LumoBack posture sensor on their lower back for one week. The Actigraph recorded proximity every 10 seconds and acceleration at 100 Hz; LumoBack data were provided as 5-minute summaries. Beacon Actigraphs were placed around each care home in the residents’ rooms, communal areas and corridors. Proximity and posture data were combined in 5-minute epochs, with descriptive analysis of the average time spent sitting in each area produced. Acceleration data were summarised into 10-second epochs and combined with proximity data to show the average count per epoch in each area of the care home. Mann-Whitney tests were performed to test for differences between care homes. Results: No significant differences were found between Care Home One and Care Home Two in the amount of time residents spent sitting in communal areas (301 and 39 minutes per day respectively, U=23, p=0.057) or in their own rooms (215 and 337 minutes per day respectively, U=32, p=0.238). In both care homes, accelerometer-measured average movement increased with the number of residents in the communal area. Conclusion: The Actigraph proximity system was able to quantify the context of sedentary behaviour in older adults, enabling the identification of levers for behaviour change which can be used to reduce sedentary time in this group.

    Overall conclusion: There are a large number of technologies available with the potential to measure the context of physical activity or sedentary time, and the Actigraph proximity feature is one such technology. It is able to provide a binary measure of proximity via the detection or non-detection of a Bluetooth signal; however, the variability of the signal prohibits distance estimation. In combination with a posture sensor, the Actigraph proximity feature is able to elucidate the context of physical activity and sedentary time.
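    As a rough illustration of using RSSI only as a binary room-level indicator (never for distance estimation, in line with the thesis's findings), here is a minimal sketch; the beacon-to-room map, the epoch format and the strongest-beacon rule are illustrative assumptions.

```python
from collections import defaultdict

# Beacons are assigned to rooms when placed; ids and rooms are illustrative.
beacon_room = {"b1": "bedroom", "b2": "lounge", "b3": "corridor"}

def room_for_epoch(detections: dict) -> str:
    """Assign a 10-second epoch to the room of the strongest detected
    beacon. RSSI is used only ordinally (which beacon is strongest),
    never metrically, since its variability prohibits distance estimates."""
    if not detections:
        return "unknown"
    strongest = max(detections, key=detections.get)
    return beacon_room.get(strongest, "unknown")

def daily_room_minutes(epochs: list, epoch_seconds: int = 10) -> dict:
    """Tally minutes per room from a day's sequence of epoch detections,
    each a dict mapping beacon id to RSSI (dBm)."""
    minutes = defaultdict(float)
    for detections in epochs:
        minutes[room_for_epoch(detections)] += epoch_seconds / 60
    return dict(minutes)

print(daily_room_minutes([{"b1": -60, "b2": -75}, {"b2": -58}, {}]))
```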

    Urban Informatics

    This open access book is the first to systematically introduce the principles of urban informatics and its application to every aspect of the city that involves its functioning, control, management, and future planning. It introduces new models and tools being developed to understand and implement these technologies that enable cities to function more efficiently – to become ‘smart’ and ‘sustainable’. The smart city has quickly emerged as computers have become ever smaller to the point where they can be embedded into the very fabric of the city, as well as being central to new ways in which the population can communicate and act. When cities are wired in this way, they have the potential to become sentient and responsive, generating massive streams of ‘big’ data in real time as well as providing immense opportunities for extracting new forms of urban data through crowdsourcing. This book offers a comprehensive review of the methods that form the core of urban informatics from various kinds of urban remote sensing to new approaches to machine learning and statistical modelling. It provides a detailed technical introduction to the wide array of tools information scientists need to develop the key urban analytics that are fundamental to learning about the smart city, and it outlines ways in which these tools can be used to inform design and policy so that cities can become more efficient with a greater concern for environment and equity