118 research outputs found

    Modeling, Predicting and Capturing Human Mobility

    Realistic models of human mobility are critical for modern-day applications, particularly in recommendation systems, resource planning and process optimization. Given the rapid proliferation of mobile devices equipped with Internet connectivity and GPS functionality, aggregating large volumes of individual geolocation data is now feasible. The thesis focuses on methodologies to facilitate data-driven mobility modeling by drawing parallels between the inherent nature of mobility trajectories, statistical physics and information theory. On the applied side, the thesis contributions lie in leveraging the formulated mobility models to construct prediction workflows from a privacy-by-design perspective. This enables end users to derive utility from location-based services while preserving their location privacy. Finally, the thesis presents several machine learning approaches to generating large-scale synthetic mobility datasets that facilitate experimental reproducibility

    On the prediction of clinical outcomes using Heart Rate Variability estimated from wearable devices

    This thesis explores the use of Heart Rate Variability as a tool for predicting health outcomes, focusing on data derived from photoplethysmography (PPG) sensors in wrist-worn wearable devices such as smartwatches. These devices offer a unique opportunity for cost-effective, continuous, and unobtrusive monitoring of heart health. However, PPG data is susceptible to motion artefacts, challenging the reliability of Heart Rate Variability metrics derived from it. A critical finding of this research is the unreliability of specific frequency-domain Heart Rate Variability features, such as the Sympathovagal Balance Index (SVI), due to low signal-to-noise ratio in certain frequency bands. Conversely, the thesis demonstrates that most HRV features, including Root Mean Square of Successive Differences between normal heartbeats (RMSSD) and Standard Deviation of Normal heartbeats (SDNN), can be reliably extracted under conditions of motion, such as during physical activity or recovery from exercise. This is achieved by employing accelerometry data from wearable devices to filter out unreliable PPG data. The thesis also addresses the issue of missing data in Heart Rate Variability analysis, a consequence of motion artefacts and the energy-saving strategies of wearable devices. By exploring different interpolation methods and their effects on Heart Rate Variability features, this research identifies the best approaches for handling missing data. Particularly, it recommends operating on timestamp time-series over duration time-series, contradicting traditional Heart Rate Variability preprocessing practices. Quadratic interpolation in the time domain was identified as the most effective method, introducing minimal error across numerous Heart Rate Variability features, contrary to interpolation in the duration domain. 
The research presented in this thesis evaluates Heart Rate Variability features derived from ultra-short measurement windows, demonstrating the feasibility of accurately estimating RMSSD and SDNN using 30-second and 1-minute time windows, respectively. This study, unique in assessing the effect of missing values on ultra-short Heart Rate Variability data, reveals that missing values significantly impact SDNN estimations while moderately affecting RMSSD. The analysis highlights that ultra-short inter-beat interval time series limit the assessment of very low frequency (VLF) components, increasing bias in SDNN estimates. This finding is particularly significant in light of the prevalent use of SDNN in commercial wearables, underscoring its importance for continuous heart health monitoring. The study notes that the shorter the measurement window and the greater the amount of missing values, the larger the bias observed in SDNN. A novel aspect of the thesis is the creation of an innovative mathematical model designed to estimate the impact of circadian rhythms on resting heart rate. This model stands out for its computational efficiency, making it particularly suitable for data obtained from wearable devices. It surpasses the single component cosinor model in accuracy, demonstrated by a lower root mean square error (RMSE) in predicting future heart rate values. Additionally, it retains the advantage of providing easily interpretable parameters, such as MESOR, Acrophase, and Amplitude, which are essential for assessing changes in heart activity. The thesis demonstrates that Heart Rate data can accurately estimate SDNN24 (the Standard Deviation of NN intervals over 24 hours), with a difference of about 0.22±11.47 (RMSE = 53.81 and r^2 = 0.97). This finding indicates that despite being fragmentary, 24-hour HR data from wrist-worn fitness devices is adequate for estimating SDNN24 and assessing health status, as evidenced by an F1 score of 0.97.
The robustness of SDNN24 estimation against noisy data suggests that wrist-worn wearables are capable of reliably monitoring cardiovascular health on a continuous basis, thus facilitating early interventions in response to changes in Sinoatrial Node activity. The final part of the thesis introduces an innovative approach to health outcome prediction, employing Heart Rate Variability data gathered during exercise alongside Electronic Health Record data. Employing Large Language Models to process EHR data and Convolutional AutoEncoders for Heart Rate Variability analysis, this approach reveals the untapped potential of exercise Heart Rate Variability data in health monitoring and prediction. Deep Learning models incorporating Heart Rate Variability data demonstrated enhanced predictive accuracy for cardiovascular diseases (CVD), coronary heart disease (CHD), and Angina, evidenced by higher Area Under the Curve (AUC) scores compared to models using only Electronic Health Records and demographic/behavioural data. The highest AUC scores achieved were 0.71 for CVD, 0.74 for CHD, and 0.73 for Angina. In conclusion, this thesis contributes to the field of biomedical engineering by enhancing the understanding and application of HRV analysis in health outcome prediction using wearable device data. It offers insights for future work in continuous, unobtrusive health monitoring and underscores the need for further research in this rapidly evolving domain

    Sequence Determinants of the Individual and Collective Behaviour of Intrinsically Disordered Proteins

    Intrinsically disordered proteins and protein regions (IDPs) represent around thirty percent of the eukaryotic proteome. IDPs do not fold into a set three-dimensional structure, but instead exist in an ensemble of inter-converting states. Despite being disordered, IDPs are decidedly not random; well-defined - albeit transient - local and long-range interactions give rise to an ensemble with distinct statistical biases over many length-scales. Among a variety of cellular roles, IDPs drive and modulate the formation of phase separated intracellular condensates, non-stoichiometric assemblies of protein and nucleic acid that serve many functions. In this work, we have explored how the amino acid sequence of IDPs determines their conformational behaviour, and how sequence and single chain behaviour influence their collective behaviour in the context of phase separation. In part I, in a series of studies, we used simulation, theory, and statistical analysis coupled with a wide range of experimental approaches to uncover novel rules that further explore how primary sequence and local structure influence the global and local behaviour of disordered proteins, with direct implications for protein function and evolution. We found that amino acid sidechains counteract the intrinsic collapse of the peptide backbone, priming the backbone for interaction and providing a fully reconciliatory explanation for the mechanism of action associated with the denaturants urea and GdmCl. We discovered that proline can engender a conformational buffering effect in IDPs to counteract standard electrostatic effects, and that the patterning of those proline residues can be a crucial determinant of the conformational ensemble. We developed a series of tools for analysing primary sequences on a proteome-wide scale and used them to discover that different organisms can have substantially different average sequence properties.
Finally, we determined that for the normally folded protein NTL9, the unfolded state under folding conditions is relatively expanded but has well defined native and non-native structural preferences. In part II, we identified a novel mode of phase separation in biology, and explored how this could be tuned through sequence design. We discovered that phase separated liquids can be many orders of magnitude more dilute than simple mean-field theories would predict, and developed an analytic framework to explain and understand this phenomenon. Finally, we designed, developed and implemented a novel lattice-based simulation engine (PIMMS) to provide sequence-specific insight into the determinants of conformational behaviour and phase separation. PIMMS allows us to accurately and rapidly generate sequence-specific conformational ensembles and run simulations of hundreds of polymers, with the goal of systematically elucidating the link between primary sequence and phase separation

    Seventh Biennial Report : June 2003 - March 2005

    Data Mining and Visualization of Large Human Behavior Data Sets

    Structure-oriented prediction in complex networks

    Complex systems are extremely hard to predict due to their highly nonlinear interactions and rich emergent properties. Thanks to the rapid development of network science, our understanding of the structure of real complex systems and the dynamics on them has deepened remarkably, which has in turn stimulated the growth of effective prediction approaches for these systems. In this article, we aim to review different network-related prediction problems, summarize and classify relevant prediction methods, analyze their advantages and disadvantages, and point out the forefront as well as critical challenges of the field

    Modelling individual accessibility using Bayesian networks: A capabilities approach

    The ability of an individual to reach and engage with basic services such as healthcare and education, and with activities such as employment, is a fundamental aspect of their wellbeing. Within transport studies, accessibility is considered to be a valuable concept that can be used to generate insights on issues related to social exclusion due to limited access to transport options. Recently, researchers have attempted to link accessibility with popular theories of social justice such as Amartya Sen's Capabilities Approach (CA). Such studies have set the theoretical foundations for the way accessibility can be expressed through the CA; however, attempts to operationalise this approach remain fragmented and predominantly qualitative in nature. The data landscape, however, has changed over the last decade, providing an unprecedented quantity of transport-related data at an individual level. Mobility data from different sources have the potential to contribute to the understanding of individual accessibility and its relation to phenomena such as social exclusion. At the same time, the unlabelled nature of such data presents a considerable challenge, as a non-trivial step of inference is required if one is to deduce the transportation modes used and activities reached. This thesis develops a novel framework for accessibility modelling using the CA as its theoretical foundation. Within the scope of this thesis, this is used to assess the levels of equality experienced by individuals belonging to different population groups and its link to transport-related social exclusion. In the proposed approach, activities reached and transportation modes used are considered manifestations of individual hidden capabilities. A modelling framework using dynamic Bayesian networks is developed to quantify and assess the relationships and dynamics of the different components influencing the capabilities sets.
The developed approach can also provide inferential capabilities for activity type and transportation mode detection, making it suitable for use with unlabelled mobility data such as Automatic Fare Collection (AFC) system, mobile phone and social media data. The usefulness of the proposed framework is demonstrated through three case studies. In the first case study, mobile phone data were used to explore the interaction of individuals with different public transportation modes. It was found that assumptions about individual mobility preferences derived from travel surveys may not always hold, providing evidence for the significance of personal characteristics to the choices of transportation modes. In the second case, the proposed framework is used for activity type inference, testing the limits of accuracy that can be achieved from unlabelled social media data. Combining the previous two case studies, the third case further defines a generative model which is used to develop the proposed capabilities approach to accessibility modelling. Using data from London's Automatic Fare Collection (AFC) system, the elements of the capabilities set are explicitly defined and linked with an individual's personal characteristics, external variables and functionings. The results are used to explore the link between social exclusion and transport disadvantage, revealing distinct patterns that can be attributed to different accessibility levels