6,308 research outputs found

    Mining top-k regular episodes from sensor streams

    The monitoring of human activities plays an important role in health-care applications and for the data mining community. Existing approaches focus on recognizing activities that occur in sensor data streams. However, regular behaviors have not been studied. We therefore introduce TKRES, a new approach to discover the top-k most regular episodes from sensor streams. The top-k approach allows us to control the size of the output, which keeps the supervisor from being overwhelmed when analyzing results. TKRES is based on a simple top-k list and a k-tree structure for maintaining the top-k episodes and their occurrence information. We also investigate and report the performance of TKRES on two real-life smart home datasets.
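
To illustrate the idea of ranking by regularity, here is a minimal sketch. It is not the TKRES algorithm: the abstract does not define regularity or describe the k-tree, so this sketch assumes that an "episode" is a single event label and that its regularity is the largest gap between consecutive occurrences (counting the gaps to the start and end of the stream). The function name `top_k_regular` is hypothetical.

```python
import heapq
from collections import defaultdict


def top_k_regular(stream, k):
    """Return the k events with the smallest maximum gap between occurrences.

    stream: list of (timestamp, event) pairs sorted by timestamp.
    Assumption: regularity = largest gap between consecutive occurrences,
    including the gap from the stream start and to the stream end.
    """
    if not stream:
        return []
    start, end = stream[0][0], stream[-1][0]
    last_seen = {}
    max_gap = defaultdict(int)
    for ts, event in stream:
        # Gap since the previous occurrence (or since the stream start).
        prev = last_seen.get(event, start)
        max_gap[event] = max(max_gap[event], ts - prev)
        last_seen[event] = ts
    # Account for the gap between the last occurrence and the stream end.
    for event, ts in last_seen.items():
        max_gap[event] = max(max_gap[event], end - ts)
    # Smaller maximum gap means more regular; ties are broken by event name.
    return heapq.nsmallest(k, max_gap.items(), key=lambda item: (item[1], item[0]))


stream = [(0, "door"), (1, "kettle"), (2, "door"), (4, "door"),
          (6, "door"), (7, "kettle"), (8, "door")]
print(top_k_regular(stream, 1))  # [('door', 2)]
```

Bounding the result to k entries is what keeps the output size under the analyst's control, whatever the number of distinct events in the stream.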

    Trade marketing analytics in consumer goods industry

    Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Information Systems and Technologies Management. We address the transparency of trade spends in the consumer goods industry and propose a set of business performance indicators that follow the Pareto (80/20) rule, a popular concept in optimization problem solving. The discovery of power laws in the behavior of travelling salespersons, the buying patterns of customers, the popularity of products, and market demand fluctuations leads to better-informed decisions among all those involved in planning, execution, and post-promotion evaluation. The practical result of our work is a prototype implementation of the proposed measures. The most remarkable finding is the consistency of the travelling salesperson's journey between customer locations. Loyalty to the brand, or brand market power, whatever it is that drives field sales representatives to put at least one product of the market player of interest into nearly every market basket, fits a small-world model. This behavior varies from person to person, yet for a given person it remains the same after reassignment to a different territory. For the industrialization stage of this project, we outline key design considerations for an information system capable of handling a real-time workload scalable to petabytes. We built our analyses for collaborative processes of integrated planning that require the joint effort of a multidisciplinary team. Field tests demonstrate how insights from data can trigger business transformation. We therefore conclude with a recommendation for system integrators to include Knowledge Discovery in information system deployment projects.
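
As a rough illustration of a Pareto-style indicator, the sketch below computes the smallest fraction of items (for example customers or products) whose combined value reaches a target share of the total. The abstract does not list the actual indicators, so the function name `pareto_share` and its interface are assumptions made for this example only.

```python
def pareto_share(values, target=0.8):
    """Smallest fraction of items whose combined value reaches `target` of the total.

    values: non-negative numbers, one per item (hypothetical sales figures).
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive total")
    running = 0.0
    # Take items from largest to smallest until the target share is covered.
    for n, value in enumerate(sorted(values, reverse=True), start=1):
        running += value
        if running >= target * total:
            return n / len(values)
    return 1.0


# One dominant item out of five already covers 80% of the total.
print(pareto_share([90, 4, 3, 2, 1]))
```

A result close to 0.2 for a target of 0.8 would correspond to the classic 80/20 pattern; values much larger than that suggest a flatter distribution.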

    Context-aware Background Application Scheduling in Interactive Mobile Systems

    Department of Computer Science and Engineering. Each individual's usage behavior on mobile devices depends on a variety of factors such as time, location, and previous actions. Hence, context-awareness provides great opportunities to make the networking and computing capabilities of mobile systems more personalized and more efficient in managing their resources. To this end, we first reveal new findings from our own Android user experiment: (i) the launching probabilities of applications follow Zipf's law, and (ii) the inter-running and running times of applications conform to log-normal distributions. We also find contextual dependencies between application usage patterns, for which we classify contexts autonomously with unsupervised learning methods. Using the knowledge acquired, we develop a context-aware application scheduling framework, CAS, that adaptively unloads and preloads background applications for a joint optimization in which the energy saving is maximized and the user discomfort from the scheduling is minimized. Our trace-driven simulations with 96 user traces demonstrate that the context-aware design of CAS enables it to outperform existing process scheduling algorithms. Our implementation of CAS on Android platforms and its end-to-end evaluations verify that its human-involved design indeed provides substantial user-experience gains in both energy and application launching latency.
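
One simple way to check whether launch frequencies look Zipf-like is to fit a straight line to log(frequency) against log(rank) and read off the slope. The sketch below does that with ordinary least squares. It is only a rough diagnostic under that assumption, not the fitting procedure used in the thesis (which the abstract does not describe), and the function name `zipf_exponent` is hypothetical.

```python
import math
from collections import Counter


def zipf_exponent(launches):
    """Estimate the Zipf exponent s from a list of launched app names.

    Fits log(count) = c - s * log(rank) by least squares and returns s.
    """
    counts = sorted(Counter(launches).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct applications")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope


# Synthetic trace in which the app at rank r is launched 120 / r times.
trace = ["mail"] * 120 + ["chat"] * 60 + ["maps"] * 40 + ["music"] * 30
print(zipf_exponent(trace))
```

An exponent near 1 on such a plot is the textbook Zipf case; real traces are noisier, and a maximum-likelihood fit would be a more careful choice than this log-log regression.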

    Identifying road user classes based on repeated trip behaviour using Bluetooth data

    © 2018 The Authors. Analysing the repeated trip behaviour of travellers, including trip frequency and intrapersonal variability, can provide insights into traveller needs, flexibility and knowledge of the network, as well as inputs for models including learning and/or behaviour change. Data from emerging data sources provide new opportunities to examine repeated trip making on the road network. Point-to-point sensor data, for example from Bluetooth detectors, is collected using fixed detectors installed next to roads which can record unique identifiers of passing vehicles or travellers which can then be matched across space and time. Such data is used in this research to segment road users based on their repeated trip making behaviour, as has been done in public transportation research using smart card data to understand different categories of users. Rather than deciding on traveller segmentation based on a priori assumptions, the method provides a data driven approach to cluster together travellers who have similar trip regularity and variability between days. Measures which account for the strengths and weaknesses of point-to-point sensor data are presented for (a) spatial variability, using Sequence Alignment, and (b) time of day variability, using Model Based Clustering. The proposed method is also applied to one year of data from 23 fixed Bluetooth detectors in a town in northwest England. The data consists of almost 7.5 million trips made by over 300,000 travellers. Applying the proposed methods allows three traveller user classes to be identified: infrequent, frequent, and very frequent. Interestingly, the spatial and time of day variability characteristics of each user class are distinct and are not linearly correlated with trip frequency. The frequent travellers are observed 1–5 times per week on average and make up 57% of the trips recorded during the year. Focusing on these frequent travellers, it is shown that these can be further separated into those with high spatial and time of day variability and those with low spatial and time of day variability. Understanding the distribution of travellers and trips across these user classes, as well as the repeated trip characteristics of each user class, can inform further data collection and the development of policies targeting the needs of specific travellers.
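
As a sketch of how sequence alignment can quantify spatial variability, the example below compares two trips, each written as the ordered list of detectors that observed the traveller, using plain edit distance. The abstract does not give the alignment scoring actually used, so unit costs for insertion, deletion and substitution are an assumption, and the function names and detector labels are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of detector IDs."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def spatial_dissimilarity(a, b):
    """Edit distance scaled to [0, 1] by the length of the longer trip."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


monday = ["D1", "D4", "D7"]
tuesday = ["D1", "D5", "D7"]
print(edit_distance(monday, tuesday))  # 1
```

Averaging such pairwise scores over a traveller's trips would give one possible per-traveller variability measure to feed into a clustering step.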

    Prediction assisted fast handovers for seamless IP mobility

    Word processed copy. Includes bibliographical references (leaves 94-98). This research investigates the techniques used to improve the standard Mobile IP handover process and provide proactivity in network mobility management. Numerous fast handover proposals in the literature have recently adopted a cross-layer approach to enhance movement detection functionality and make terminal mobility more seamless. Such fast handover protocols are dependent on an anticipated link-layer trigger or pre-trigger to perform pre-handover service establishment operations. This research identifies the practical difficulties involved in implementing this type of trigger and proposes an alternative solution that integrates the concept of mobility prediction into a reactive fast handover scheme.

    Identifying Hidden Visits from Sparse Call Detail Record Data

    Despite a large body of literature on trip inference using call detail record (CDR) data, a fundamental understanding of their limitations is lacking. In particular, because of the sparse nature of CDR data, users may travel to a location without being revealed in the data, which we refer to as a "hidden visit". The existence of hidden visits hinders our ability to extract reliable information about human mobility and travel behavior from CDR data. In this study, we propose a data fusion approach to obtain labeled data for statistical inference of hidden visits. In the absence of complementary data, this can be accomplished by extracting labeled observations from more granular cellular data access records, and extracting features from voice call and text messaging records. The proposed approach is demonstrated using a real-world CDR dataset of 3 million users from a large Chinese city. Logistic regression, support vector machine, random forest, and gradient boosting are used to infer whether a hidden visit exists during a displacement observed from CDR data. The test results show significant improvement over the naive no-hidden-visit rule, which is an implicit assumption adopted by most existing studies. Based on the proposed model, we estimate that over 10% of the displacements extracted from CDR data involve hidden visits. The proposed data fusion method offers a systematic statistical approach to inferring individual mobility patterns based on telecommunication records

    What Does The Crowd Say About You? Evaluating Aggregation-based Location Privacy

    Information about people’s movements and the locations they visit enables an increasing number of mobility analytics applications, e.g., in the context of urban and transportation planning. In this setting, rather than collecting or sharing raw data, entities often use aggregation as a privacy protection mechanism, aiming to hide individual users’ location traces. Furthermore, to bound information leakage from the aggregates, they can perturb the input of the aggregation or its output to ensure that these are differentially private. In this paper, we set out to evaluate the impact of releasing aggregate location time-series on the privacy of individuals contributing to the aggregation. We introduce a framework allowing us to reason about privacy against an adversary attempting to predict users’ locations or recover their mobility patterns. We formalize these attacks as inference problems, and discuss a few strategies to model the adversary’s prior knowledge based on the information she may have access to. We then use the framework to quantify the privacy loss stemming from aggregate location data, with and without the protection of differential privacy, using two real-world mobility datasets. We find that aggregates do leak information about individuals’ punctual locations and mobility profiles. The density of the observations, as well as timing, play important roles, e.g., regular patterns during peak hours are better protected than sporadic movements. Finally, our evaluation shows that both output and input perturbation offer little additional protection, unless they introduce large amounts of noise, ultimately destroying the utility of the data.
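
For readers unfamiliar with output perturbation, the sketch below adds Laplace noise to per-location counts, which is the standard Laplace mechanism. It is not the paper's evaluation framework. It assumes each user contributes to at most one location count per time slot, so the sensitivity is 1; the function names and the example counts are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Return a noisy copy of per-location counts (output perturbation).

    counts: dict mapping a location to the number of users seen there.
    Assumption: one user changes any single count by at most `sensitivity`.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon  # smaller epsilon means more noise
    return {loc: count + laplace_noise(scale, rng) for loc, count in counts.items()}


aggregates = {"station_a": 120, "station_b": 8}
print(perturb_counts(aggregates, epsilon=0.5))
```

Because the noise scale is the same for every location, small counts such as `station_b` lose proportionally more accuracy than large ones, which is one way to see the tension between protection and utility that the abstract reports.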

    Protecting privacy of semantic trajectory

    The growing ubiquity of GPS-enabled devices in everyday life has made large-scale collection of trajectories feasible, providing ever-growing opportunities for human movement analysis. However, publishing this vulnerable data is accompanied by increasing concerns about individuals’ geoprivacy. This thesis has two objectives: (1) to propose a privacy protection framework for semantic trajectories and (2) to develop a Python toolbox in the ArcGIS Pro environment that enables non-expert users to anonymize trajectory data. The former aims to prevent re-identification of users by an adversary who knows their important locations or any random spatiotemporal points of theirs; it does so by swapping their important locations for new locations with the same semantics and unlinking the users from their trajectories. This is accomplished by converting GPS points into sequences of visited meaningful locations and moves and integrating several anonymization techniques. The second component of this thesis implements privacy protection as an all-in-one toolbox, so that even users without deep knowledge of anonymization or coding skills can anonymize their data. By proposing and implementing this framework and toolbox, we hope that trajectory privacy is better protected in research.
