Mining top-k regular episodes from sensor streams
The monitoring of human activities plays an important role in health-care applications and for the data mining community. Existing approaches focus on recognizing activities that occur in sensor data streams. However, regular behaviors have not been studied. We therefore introduce TKRES, a new approach to discover the top-k most regular episodes from sensor streams. The top-k approach allows us to control the size of the output, which keeps the supervisor from being overwhelmed when analyzing results. TKRES is based on the use of a simple top-k list and a k-tree structure for maintaining the top-k episodes and their occurrence information. We also investigate and report the performance of TKRES on two real-life smart home datasets.
Trade marketing analytics in consumer goods industry
Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Information Systems and Technologies Management. We address the transparency of trade spend in the consumer goods industry and propose a set of business performance indicators that follow the Pareto (80/20) rule, a popular concept in optimization problem solving. The discovery of power laws in the behaviors of travelling salespersons, the buying patterns of customers, the popularity of products, and market demand fluctuations leads to better-informed decisions among all those involved in planning, execution, and post-promotion evaluation. The practical result of our work is a prototype implementation of the proposed measures.
The most remarkable finding is the consistency of the travelling salesperson's journey between customer locations. Brand loyalty or brand market power (whatever drives field sales representatives to put at least one product of the market player of interest into nearly every market basket) fits a small-world model. This behavior not only varies from person to person, but also persists after reassignment to a different territory.
For the industrialization stage of this project, we outline key design considerations for an information system capable of handling a real-time workload scalable to petabytes. We built our analyses for collaborative processes of integrated planning that require the joint effort of a multidisciplinary team. Field tests demonstrate how insights from data can trigger business transformation. We therefore conclude with a recommendation that system integrators include Knowledge Discovery in information system deployment projects.
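As a rough illustration of the Pareto (80/20) indicators mentioned in this abstract, the following sketch computes the share of a total that comes from the top fraction of items. The function name `top_share` and the revenue figures are hypothetical and are not taken from the project work.

```python
import math


def top_share(values, top_fraction=0.2):
    """Return the share of the total contributed by the largest items.

    `values` holds one non-negative number per item (for example, revenue
    per customer); `top_fraction` is the fraction of items to keep.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    # Keep at least one item, rounding the cut-off up.
    k = max(1, math.ceil(top_fraction * len(ordered)))
    return sum(ordered[:k]) / total


# Hypothetical revenue per customer: one large account and four small ones.
revenue = [80, 5, 5, 5, 5]
share = top_share(revenue)  # share of revenue from the top 20% of customers
```

A share close to 0.8 for `top_fraction=0.2` would be consistent with the 80/20 rule; values far from it suggest the rule does not hold for that indicator.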
Context-aware Background Application Scheduling in Interactive Mobile Systems
Department of Computer Science and Engineering. Each individual's usage behavior on mobile devices depends on a variety of factors such as time, location, and previous actions. Hence, context-awareness provides great opportunities to make the networking and computing capabilities of mobile systems more personalized and more efficient in managing their resources. To this end, we first reveal new findings from our own Android user experiment: (i) the launching probabilities of applications follow Zipf's law, and (ii) inter-running and running times of applications conform to log-normal distributions. We also find contextual dependencies between application usage patterns, for which we classify contexts autonomously with unsupervised learning methods. Using the knowledge acquired, we develop a context-aware application scheduling framework, CAS, that adaptively unloads and preloads background applications for a joint optimization in which the energy saving is maximized and the user discomfort from the scheduling is minimized. Our trace-driven simulations with 96 user traces demonstrate that the context-aware design of CAS enables it to outperform existing process scheduling algorithms. Our implementation of CAS on Android platforms and its end-to-end evaluations verify that its human-involved design indeed provides substantial user-experience gains in both energy and application launching latency.
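One simple way to check whether launch counts look Zipf-like is to fit a straight line to log(frequency) against log(rank); a slope near -1 is consistent with the classic form of Zipf's law. The sketch below is an illustration only: the function name `zipf_slope` and the counts are hypothetical, and the abstract does not say how the authors performed their fit.

```python
import math


def zipf_slope(launch_counts):
    """Least-squares slope of log(frequency) against log(rank)."""
    # Rank applications from most to least frequently launched.
    freqs = sorted((c for c in launch_counts if c > 0), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two applications with non-zero counts")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Hypothetical counts generated as 1200 / rank, i.e. an ideal Zipf profile.
counts = [1200 / rank for rank in range(1, 11)]
slope = zipf_slope(counts)
```

A log-log regression is a quick diagnostic rather than a rigorous test; real traces would normally call for a proper goodness-of-fit procedure.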
Identifying road user classes based on repeated trip behaviour using Bluetooth data
© 2018 The Authors. Analysing the repeated trip behaviour of travellers, including trip frequency and intrapersonal variability, can provide insights into traveller needs, flexibility and knowledge of the network, as well as inputs for models including learning and/or behaviour change. Data from emerging data sources provide new opportunities to examine repeated trip making on the road network. Point-to-point sensor data, for example from Bluetooth detectors, is collected using fixed detectors installed next to roads which can record unique identifiers of passing vehicles or travellers which can then be matched across space and time. Such data is used in this research to segment road users based on their repeated trip making behaviour, as has been done in public transportation research using smart card data to understand different categories of users. Rather than deciding on traveller segmentation based on a priori assumptions, the method provides a data driven approach to cluster together travellers who have similar trip regularity and variability between days. Measures which account for the strengths and weaknesses of point-to-point sensor data are presented for (a) spatial variability, using Sequence Alignment, and (b) time of day variability, using Model Based Clustering. The proposed method is also applied to one year of data from 23 fixed Bluetooth detectors in a town in northwest England. The data consists of almost 7.5 million trips made by over 300,000 travellers. Applying the proposed methods allows three traveller user classes to be identified: infrequent, frequent, and very frequent. Interestingly, the spatial and time of day variability characteristics of each user class are distinct and are not linearly correlated with trip frequency. The frequent travellers are observed 1–5 times per week on average and make up 57% of the trips recorded during the year.
Focusing on these frequent travellers, it is shown that these can be further separated into those with high spatial and time of day variability and those with low spatial and time of day variability. Understanding the distribution of travellers and trips across these user classes, as well as the repeated trip characteristics of each user class, can inform further data collection and the development of policies targeting the needs of specific travellers.
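To give a feel for how sequence alignment can quantify spatial variability, the sketch below computes a plain edit (Levenshtein) distance between two trips, each written as the ordered list of detectors that observed the traveller. This is an assumed, simplified stand-in: the abstract does not specify the alignment scheme or costs used, and the detector labels are hypothetical.

```python
def alignment_distance(trip_a, trip_b):
    """Edit distance between two detector sequences (unit costs)."""
    # prev[j] holds the distance between the processed prefix of trip_a
    # and the first j detectors of trip_b.
    prev = list(range(len(trip_b) + 1))
    for i, det_a in enumerate(trip_a, start=1):
        curr = [i]
        for j, det_b in enumerate(trip_b, start=1):
            substitution = prev[j - 1] + (0 if det_a == det_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


# Hypothetical trips by the same traveller on two different days.
monday = ["D1", "D4", "D7"]
tuesday = ["D1", "D5", "D7"]
distance = alignment_distance(monday, tuesday)
```

Averaging such distances over pairs of a traveller's trips gives one possible per-traveller measure of route variability that could then be fed into a clustering step.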
Prediction assisted fast handovers for seamless IP mobility
Word processed copy. Includes bibliographical references (leaves 94-98). This research investigates the techniques used to improve the standard Mobile IP handover process and provide proactivity in network mobility management. Numerous fast handover proposals in the literature have recently adopted a cross-layer approach to enhance movement detection functionality and make terminal mobility more seamless. Such fast handover protocols are dependent on an anticipated link-layer trigger or pre-trigger to perform pre-handover service establishment operations. This research identifies the practical difficulties involved in implementing this type of trigger and proposes an alternative solution that integrates the concept of mobility prediction into a reactive fast handover scheme.
Identifying Hidden Visits from Sparse Call Detail Record Data
Despite a large body of literature on trip inference using call detail record
(CDR) data, a fundamental understanding of their limitations is lacking. In
particular, because of the sparse nature of CDR data, users may travel to a
location without being revealed in the data, which we refer to as a "hidden
visit". The existence of hidden visits hinders our ability to extract reliable
information about human mobility and travel behavior from CDR data. In this
study, we propose a data fusion approach to obtain labeled data for statistical
inference of hidden visits. In the absence of complementary data, this can be
accomplished by extracting labeled observations from more granular cellular
data access records, and extracting features from voice call and text messaging
records. The proposed approach is demonstrated using a real-world CDR dataset
of 3 million users from a large Chinese city. Logistic regression, support
vector machine, random forest, and gradient boosting are used to infer whether
a hidden visit exists during a displacement observed from CDR data. The test
results show significant improvement over the naive no-hidden-visit rule, which
is an implicit assumption adopted by most existing studies. Based on the
proposed model, we estimate that over 10% of the displacements extracted from
CDR data involve hidden visits. The proposed data fusion method offers a
systematic statistical approach to inferring individual mobility patterns based
on telecommunication records.
What Does The Crowd Say About You? Evaluating Aggregation-based Location Privacy
Information about people's movements and the
locations they visit enables an increasing number of mobility
analytics applications, e.g., in the context of urban and transportation
planning. In this setting, rather than collecting or
sharing raw data, entities often use aggregation as a privacy
protection mechanism, aiming to hide individual users' location
traces. Furthermore, to bound information leakage from
the aggregates, they can perturb the input of the aggregation
or its output to ensure that these are differentially private.
In this paper, we set out to evaluate the impact of releasing aggregate
location time-series on the privacy of individuals contributing
to the aggregation. We introduce a framework allowing
us to reason about privacy against an adversary attempting
to predict users' locations or recover their mobility patterns.
We formalize these attacks as inference problems, and
discuss a few strategies to model the adversary's prior knowledge
based on the information she may have access to. We
then use the framework to quantify the privacy loss stemming
from aggregate location data, with and without the protection
of differential privacy, using two real-world mobility datasets.
We find that aggregates do leak information about individuals'
punctual locations and mobility profiles. The density of
the observations, as well as timing, play important roles, e.g.,
regular patterns during peak hours are better protected than
sporadic movements. Finally, our evaluation shows that both
output and input perturbation offer little additional protection,
unless they introduce large amounts of noise ultimately destroying
the utility of the data.
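For readers unfamiliar with output perturbation, the following minimal sketch adds Laplace noise to an aggregate location time-series, which is the textbook way to make released counts differentially private. It is not the evaluation framework from the paper: the function name `perturb_counts`, the counts, and the assumption that each user changes any single count by at most `sensitivity` are all hypothetical.

```python
import random


def perturb_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Return the counts with Laplace(0, sensitivity / epsilon) noise added."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    noisy = []
    for count in counts:
        # The difference of two i.i.d. exponential variables with mean
        # `scale` follows a Laplace distribution with that scale.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy.append(count + noise)
    return noisy


# Hypothetical hourly counts of users seen at one location.
hourly_counts = [120, 95, 40, 310]
released = perturb_counts(hourly_counts, epsilon=0.5, rng=random.Random(42))
```

Smaller values of `epsilon` give a larger noise scale, which illustrates the trade-off the abstract describes: stronger protection comes at the cost of less useful aggregates.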
Protecting privacy of semantic trajectory
The growing ubiquity of GPS-enabled devices in everyday life has made large-scale collection of trajectories feasible, providing ever-growing opportunities for human movement analysis. However, publishing this vulnerable data is accompanied by increasing concerns about individuals' geoprivacy. This thesis has two objectives: (1) propose a privacy protection framework for semantic trajectories and (2) develop a Python toolbox in the ArcGIS Pro environment for non-expert users to enable them to anonymize trajectory data. The former aims to prevent users' re-identification when knowing the important locations or any random spatiotemporal points of users by swapping their important locations to new locations with the same semantics and unlinking the users from their trajectories. This is accomplished by converting GPS points into sequences of visited meaningful locations and moves and integrating several anonymization techniques. The second component of this thesis implements privacy protection in a way that even users without deep knowledge of anonymization and coding skills can anonymize their data by offering an all-in-one toolbox. By proposing and implementing this framework and toolbox, we hope that trajectory privacy is better protected in research.