176 research outputs found

    Privacy-preserving human mobility and activity modelling

    The exponential proliferation of digital trends and worldwide responses to the COVID-19 pandemic thrust the world into digitalization and interconnectedness, pushing a growing number of new technologies, devices, and applications into the market. More and more intimate user data are collected for beneficial purposes, such as improving well-being, but are shared with or without the user's consent, which underlines the importance of making human mobility and activity models inclusive, private, and fair. In this thesis, I develop and implement methods and algorithms to model human mobility and activity in terms of temporal-context dynamics, multi-occupancy impacts, privacy protection, and fair analysis. The following research questions are investigated: i) whether temporal information integrated into deep learning networks can improve accuracy in predicting both the next activity and its timing; ii) what the trade-off between cost and performance is when optimizing the sensor network for multiple-occupancy smart homes; iii) whether malicious uses of human mobility models, such as user re-identification, can be mitigated by adversarial learning; iv) what the fairness implications of mobility models are, and whether privacy-preserving techniques perform equally for different groups of users. To answer these research questions, I develop different architectures to model human activity and mobility. I first clarify the temporal-context dynamics in human activity modelling and achieve better prediction accuracy by appropriately using the temporal information. I then design a framework, MoSen, to simulate the interaction dynamics among residents and intelligent environments and generate an effective sensor network strategy. To relieve users' privacy concerns, I design Mo-PAE and show that the privacy of mobility traces can be well protected at a marginal utility cost. Last but not least, I investigate the relations between fairness and privacy and conclude that while the privacy-aware model guarantees group fairness, it violates the individual fairness criteria.

    Mobility in Unsupervised Word Embeddings for Knowledge Extraction—The Scholars’ Trajectories across Research Topics

    In the knowledge discovery field of the Big Data domain, the analysis of geographic positioning and mobility information plays a key role. At the same time, in the Natural Language Processing (NLP) domain, pre-trained models such as BERT and word embedding algorithms such as Word2Vec have enabled a rich encoding of words that maps textual data to points in a multi-dimensional space, in which proximity reflects an association among terms or topics. The main contribution of this paper is to show how analytical tools traditionally used on geographic data to measure the mobility of an agent in a time interval can also be effectively applied to extract knowledge in a semantic realm, such as a semantic space of words and topics, by looking for latent trajectories that exploit the properties of neural network latent representations. As a case study, the Scopus database was queried for works of highly cited researchers in recent years. On this basis, we performed a dynamic analysis, measuring the Radius of Gyration as an index of the mobility of researchers across scientific topics. The semantic space is built from the automatic analysis of the paper abstracts of each author. In particular, we evaluated two different methodologies to build the semantic space and found that Word2Vec embeddings perform better than BERT embeddings for this task. Finally, the scholars' trajectories show some latent properties of this model, which also represent new scientific contributions of this work. These properties include (i) the correlation between scientific mobility and the achievement of scientific results, measured through the H-index; (ii) differences in the behavior of researchers working in different countries and subjects; and (iii) some interesting similarities between mobility patterns in this semantic realm and those typically observed in the case of human mobility.
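The Radius of Gyration mentioned in the abstract above is commonly defined as the root-mean-square distance of an agent's visited points from their centroid. The following is a minimal sketch of that standard, unweighted definition applied to points in an embedding space. The function name and the plain-tuple input format are assumptions made for illustration, and the paper's exact formulation (for example any time or frequency weighting) may differ.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of the points from their centroid.

    `points` is a non-empty sequence of equal-length coordinate tuples,
    e.g. one embedding vector per paper abstract (hypothetical input format).
    """
    if not points:
        raise ValueError("at least one point is required")
    n = len(points)
    dim = len(points[0])
    # Centroid: the per-dimension mean of all points.
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    # Mean squared Euclidean distance from the centroid.
    mean_sq = sum(
        sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in points
    ) / n
    return math.sqrt(mean_sq)
```

A researcher whose abstracts all map to the same point has a radius of zero, while one whose work spreads across distant topics has a larger value.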

    Contextualized Diachronic Word Representations

    Diachronic word embeddings play a key role in capturing interesting patterns about how language evolves over time. Most of the existing work focuses on studying corpora spanning several decades, which is understandably still not a possibility when working on social media-based user-generated content. In this work, we address the problem of studying semantic changes in a large Twitter corpus collected over five years, a much shorter period than is usually the norm in diachronic studies. We devise a novel attentional model, based on Bernoulli word embeddings, that is conditioned on contextual extra-linguistic (social) features such as network, spatial and socioeconomic variables, which are associated with Twitter users, as well as topic-based features. We posit that these social features provide an inductive bias that helps our model to overcome the narrow time-span regime problem. Our extensive experiments reveal that our proposed model is able to capture subtle semantic shifts without being biased towards frequency cues and also works well when certain contextual features are absent. Our model fits the data better than current state-of-the-art dynamic word embedding models and therefore is a promising tool to study diachronic semantic changes over small time periods.

    DouFu: A Double Fusion Joint Learning Method For Driving Trajectory Representation

    Driving trajectory representation learning is of great significance for various location-based services, such as driving pattern mining and route recommendation. However, previous representation generation approaches rarely address three challenges: 1) how to represent the intricate semantic intentions of mobility inexpensively; 2) complex and weak spatial-temporal dependencies due to the sparsity and heterogeneity of the trajectory data; 3) route selection preferences and their correlation to driving behavior. In this paper, we propose a novel multimodal fusion model, DouFu, for trajectory representation joint learning, which applies multimodal learning and an attention fusion module to capture the internal characteristics of trajectories. We first design movement, route, and global features generated from the trajectory data and urban functional zones and then analyze each of them with an attention encoder or a feed-forward network. The attention fusion module incorporates route features with movement features to create a better spatial-temporal embedding. With the global semantic feature, DouFu produces a comprehensive embedding for each trajectory. We evaluate representations generated by our method and other baseline models on classification and clustering tasks. Empirical results show that DouFu outperforms other models by more than 10% with most of the learning algorithms, such as linear regression and the support vector machine. Comment: 11 pages, 7 figures

    Annotating, Understanding, and Predicting Long-term Video Memorability

    Memorability can be regarded as a useful metric of video importance to help choose between competing videos. Research on computational understanding of video memorability is, however, in its early stages. There is no available dataset for modelling purposes, and the few previous attempts provided protocols to collect video memorability data that would be difficult to generalize. Furthermore, the computational features needed to build a robust memorability predictor remain largely undiscovered. In this article, we propose a new protocol to collect long-term video memorability annotations. We measure the memory performance of 104 participants from weeks to years after memorization to build a dataset of 660 videos for video memorability prediction. This dataset is made available to the research community. We then analyze the collected data in order to better understand video memorability, in particular the effects of response time, duration of memory retention and repetition of visualization on video memorability. We finally investigate the use of various types of audio and visual features and build a computational model for video memorability prediction. We conclude that high-level visual semantics help better predict the memorability of videos.

    Context-aware multi-head self-attentional neural network model for next location prediction

    Accurate activity location prediction is a crucial component of many mobility applications and is particularly required to develop personalized, sustainable transportation systems. Despite the widespread adoption of deep learning models, next location prediction models lack a comprehensive discussion and integration of mobility-related spatio-temporal contexts. Here, we utilize a multi-head self-attentional (MHSA) neural network that learns location transition patterns from historical location visits, their visit time and activity duration, as well as their surrounding land use functions, to infer an individual's next location. Specifically, we adopt point-of-interest data and latent Dirichlet allocation for representing locations' land use contexts at multiple spatial scales, generate embedding vectors of the spatio-temporal features, and learn to predict the next location with an MHSA network. Through experiments on two large-scale GNSS tracking datasets, we demonstrate that the proposed model outperforms other state-of-the-art prediction models, and reveal the contribution of various spatio-temporal contexts to the model's performance. Moreover, we find that the model trained on population data achieves higher prediction performance with fewer parameters than individual-level models due to learning from collective movement patterns. We also reveal that mobility conducted in the recent past and one week before has the largest influence on the current prediction, showing that learning from a subset of the historical mobility is sufficient to obtain an accurate location prediction result. We believe that the proposed model is vital for context-aware mobility prediction. The gained insights will help to understand location prediction models and promote their implementation for mobility applications. Comment: updated Discussion section; accepted by Transportation Research Part

    A CNN-LSTM for predicting mortality in the ICU

    Accurate mortality prediction is crucial to healthcare as it provides an empirical risk estimate for prognostic decision making, patient stratification and hospital benchmarking. Current prediction methods in practice are severity-of-disease scoring systems that usually involve a fixed set of admission attributes and summarized physiological data. These systems are prone to bias and require substantial manual effort, which calls for an updated approach that addresses most of these shortcomings. Clinical observation notes allow for recording highly subjective data on the patient that can possibly facilitate higher discrimination. Moreover, deep learning models can automatically extract and select features without human input. This thesis investigates the potential of combining a deep learning model with notes to predict mortality with higher accuracy. A custom architecture, called CNN-LSTM, is conceptualized for mapping multiple notes compiled in a hospital stay to a mortality outcome. It employs both convolutional and recurrent layers, with the former capturing semantic relationships in individual notes independently and the latter capturing temporal relationships between concurrent notes in a hospital stay. This approach is compared to three severity-of-disease scoring systems in a case study on the MIMIC-III dataset. Experiments are set up to assess the CNN-LSTM for predicting mortality using only the notes from the first 12, 24 and 48 hours of a patient stay. The model is trained using K-fold cross-validation with k=5, and the mortality probability calculated by the three severity scores on the held-out set is used as the baseline. It is found that the CNN-LSTM outperforms the baseline in all experiments, which serves as a proof of concept of how notes and deep learning can improve outcome prediction.
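The K-fold cross-validation with k=5 mentioned in the abstract above can be illustrated with a small index-splitting sketch. This is a generic, unshuffled partition written for illustration only. The function name `k_fold_indices` is hypothetical, and the thesis may shuffle, stratify, or group by patient, none of which is shown here.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, held_out) index lists for k contiguous folds.

    Illustrative only: no shuffling or stratification is applied.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        held_out = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, held_out
        start += size
```

Each sample is held out exactly once, so every fold's evaluation uses data the model did not see during that fold's training.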