
    Estimating Movement from Mobile Telephony Data

    Mobile enabled devices are ubiquitous in modern society. The information gathered by their normal service operations has become one of the primary data sources used in the understanding of human mobility, social connection and information transfer. This thesis investigates techniques that can extract useful information from anonymised call detail records (CDR). CDR consist of data about mobile subscribers and their interaction with the network operator: the nature of the communication activity (voice, SMS, data, etc.), its duration and starting time, and, when available, the identification numbers of the cells serving both the sender and the receiver. The main contributions of the research are a methodology for distance measurements which enables the identification of mobile subscriber travel paths and a methodology for population density estimation based on significant mobile subscriber regions of interest. In addition, insights are given into how a mobile network operator may use geographically located subscriber data to create new revenue streams and improve network performance. A range of novel algorithms and techniques underpin the development of these methodologies. These include, among others, techniques for CDR feature extraction, data visualisation and CDR data cleansing. The primary data source used in this body of work was the CDR of Meteor, a mobile network operator in the Republic of Ireland. The Meteor network under investigation has just over 1 million customers, which represents approximately a quarter of the country’s 4.6 million inhabitants, and operates using both 2G and 3G cellular telephony technologies. Results show that the steady state vector analysis of modified Markov chain mobility models can return population density estimates comparable to population estimates obtained through a census. Evaluated on a test dataset, the developed distance measurements classified the routes taken by CDR journey trajectories more accurately than traditional trajectory distance measurements. Results from subscriber segmentation indicate that subscribers who are perceived to have similar relationships to geographical features can be grouped based on weighted steady state mobility vectors. Overall, this thesis proposes novel algorithms and techniques for the estimation of movement from mobile telephony data, addressing practical issues related to sampling, privacy and spatial uncertainty.
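The steady state vector mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a plain first-order Markov chain over cells, not the thesis's modified model, and the function name `steady_state` and the transition counts are hypothetical. The long-run share of time spent in each cell serves here as a rough proxy for relative population density.

```python
def steady_state(counts, tol=1e-12, max_iter=10_000):
    """Stationary distribution of a Markov chain, found by power iteration.

    counts[i][j] is the number of observed moves from cell i to cell j.
    """
    n = len(counts)
    # Normalise each row of counts into transition probabilities.
    p = [[c / sum(row) for c in row] for row in counts]
    pi = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical counts of subscriber moves between three cells.
counts = [
    [80, 15, 5],
    [10, 70, 20],
    [5, 25, 70],
]
pi = steady_state(counts)
print([round(x, 3) for x in pi])
```

Power iteration is used only because it needs no external libraries; an eigenvector solver would give the same vector.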

    Privacy Leakage through Sensory Data on Smart Devices

    Mobile devices are becoming more and more indispensable in people’s daily lives and bring a variety of conveniences. However, many privacy issues arise with the ubiquitous use of smart devices. People now rely on smart devices for business and work, so much sensitive information is exposed. Although smart device manufacturers spend much effort on system-level strategies for privacy preservation, many studies have shown that these strategies are far from perfect. This dissertation explores several of these privacy risks. Smart devices are becoming more powerful as more sensors are embedded in them. The relationship between sensory data and a user’s location information is analyzed first. A novel inference model and a corresponding algorithm are proposed to infer a user’s location information solely from sensory data. The proposed approach is validated against real-world sensory data. Another privacy issue investigated is the inference of user behaviors from sensory data. Extensive experimental results show a strong correlation between sensory data and the tap position on a smart device’s screen. A sensory data collection app is developed to collect sensory data from more than 100 volunteers. A conventional neural network based method is proposed to infer a user’s input on a smartphone. The proposed inference model and algorithm are compared with several previous methods through extensive experiments. The results show that the proposed method has much better accuracy. Furthermore, based on this inference model, several possible ways to steal private information are illustrated.

    Trajectory Reconstruction and Mobility Pattern Analysis Based on Call Detail Record Data

    Technologies that use geographic data have become an important part of everyday life, and the large-scale recording and mining of location data has grown with them. Up until now, GPS has been the main means of collecting highly precise location data from moving objects, including humans. In parallel, mobile phone data, also known as CDR data, has become increasingly popular in the last few years for positioning and for analyzing movement patterns. The usage of CDR data has many benefits over the widely used GPS, but methods developed for GPS data, for example in trajectory reconstruction, need to be modified to be compatible with CDR data and to give correct results. The fact that telecommunication companies have started to cooperate more and to share CDR data with the public has also boosted its use in many fields. Processed and analyzed CDR data can give an overview of crowd movement at different scales, for example travel inside a city as opposed to between countries. Extracting trajectories from CDR data nevertheless has numerous complications compared to GPS data: the data might not be continuous, and determining the start and end positions is difficult, even more so when the object is in motion. The goal of this thesis is to use CDR data to reconstruct the trajectories made by an anonymous user and to validate the results with GPS data generated in parallel to the CDR data, which is needed to determine the accuracy of the reconstructed trajectories. Reconstructed trajectories can be used to analyze movement patterns and population displacement, and would help city planning by optimizing infrastructure. The outcomes of this thesis are the reconstructed trajectories based on CDR data and an analysis of the precision of the final paths. The frequency of CDR events and the distance distribution are also analyzed. Finally, the areas that the user visits most frequently are extracted, such as home, work and shop locations.

    An Overview of Moving Object Trajectory Compression Algorithms

    Compression technology is an efficient way to preserve useful and valuable data while removing redundant and inessential data from datasets. With the development of RFID and GPS devices, more and more moving objects can be traced and their trajectories can be recorded. However, the exponential increase in the amount of such trajectory data has caused a series of problems in the storage, processing, and analysis of data. Therefore, moving object trajectory compression has become one of the hotspots in moving object data mining. To provide an overview, we survey and summarize the development and trend of moving object compression and analyze typical moving object compression algorithms presented in recent years. In this paper, we first summarize the strategies and implementation processes of classical moving object compression algorithms. Second, the related definitions of moving objects and their trajectories are discussed. Third, the validation criteria for evaluating the performance and efficiency of compression algorithms are introduced. Finally, some application scenarios are summarized to point out potential future applications. It is hoped that this research will serve as a stepping stone for those interested in advancing moving object mining.

    Real-Time Prediction of Gamers Behavior Using Variable Order Markov and Big Data Technology: A Case of Study

    This paper presents the results and conclusions of predicting the behavior of gamers in commercial videogame datasets. In particular, it uses Variable-Order Markov (VOM) models to build a probabilistic model that uses the historic behavior of gamers to infer what their next actions will be. Being able to accurately predict a user’s next actions can be of special interest for learning from the behavior of gamers, making them more engaged and reducing churn rate. To support a high volume and velocity of data, the system is built on top of the Hadoop ecosystem, using HBase for real-time processing; the prediction tool is provided as a service (SaaS) and is accessible through a RESTful API. The prediction system is evaluated in a case study with two commercial videogames, attaining promising results with high prediction accuracies.
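To make the Variable-Order Markov idea concrete, here is a minimal sketch of a next-action predictor that backs off from the longest previously seen context to shorter ones. It is an illustration only: the class name `VariableOrderMarkov`, the `max_order` parameter and the action labels are hypothetical, and nothing here reflects the paper's Hadoop or HBase implementation.

```python
from collections import Counter, defaultdict


class VariableOrderMarkov:
    """Toy variable-order Markov predictor with longest-context back-off."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        # Maps a context (tuple of preceding actions) to next-action counts.
        self.counts = defaultdict(Counter)

    def fit(self, sequence):
        for i, action in enumerate(sequence):
            # Record the action under every context length from 0 to max_order.
            for k in range(self.max_order + 1):
                if i - k < 0:
                    break
                self.counts[tuple(sequence[i - k:i])][action] += 1
        return self

    def predict(self, history):
        # Try the longest usable context first, then back off to shorter ones.
        for k in range(min(self.max_order, len(history)), -1, -1):
            context = tuple(history[len(history) - k:])
            if context in self.counts:
                return self.counts[context].most_common(1)[0][0]
        return None  # the model has not seen any data


# Hypothetical session log of one gamer.
log = ["login", "play", "shop", "login", "play", "shop", "login", "play", "quit"]
model = VariableOrderMarkov(max_order=3).fit(log)
print(model.predict(["login", "play"]))
```

The order is "variable" in the sense that the context length used for each prediction depends on how much of the recent history has been observed before.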

    Methodologies for Estimating Traffic Flow on Freeways Using Probe Vehicle Trajectory Data

    Probe vehicle data are increasingly becoming the primary source of traffic data. As probe vehicle data become more widespread, it is imperative that methods are developed so that traffic state estimators such as flow, density, and speed can be derived from such data. In this dissertation, three different methodologies are proposed for predicting traffic flow or volume on a freeway. All of them exploit traffic flow theories in conjunction with probe vehicle data. The first methodology takes advantage of the fundamental diagram, or speed-flow relationship, which states that flow can be estimated when speed is known. In this case, flow is traffic volume and speed comes from probe vehicles. Flow is predicted for four different models of fundamental diagrams and is analyzed at different time aggregation intervals. Results show that of the four fundamental diagrams, Van Aerde’s Model is the best performing, with the lowest average percent error. It is also observed that flow prediction is more accurate during low speed (congestion) than during high speed (free-flow) conditions. The second methodology exploits shockwave theory, which pertains to the propagation of a change (discontinuity) in traffic flow. From probe vehicle trajectories, the shockwave is estimated as the boundary between the free-flow and congested regimes of traffic flow. After clustering the traffic regimes into free-flow and congested periods, the traffic flow during congestion is estimated using the Northwestern congested-regime fundamental diagram. From this estimation, the flow during free-flow is then predicted. Analyses show that the percent error of the predicted flow during free-flow ranges from -9% to 1%. The third methodology is the car-following approach, which relies on the spacing or distance between a leader and a follower, which can be directly measured from the trajectories. Based on a set of known probability distributions, the position of the follower vehicle with respect to the lead vehicle is estimated given that the spacing between two random probe vehicles is known. A framework is developed to automatically process probe trajectories to extract relevant probe data under stop-and-go traffic conditions. The model is tested on NGSIM datasets. The results show that when vehicle spacing is small, the prediction of follower position is very accurate. As spacing increases, the error in predicted follower position also increases. Though some estimation error exists, all three approaches can reasonably predict flow for freeways using probe vehicle data.
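The speed-flow relationship behind the first methodology can be sketched with the classic Greenshields model, chosen here only because it is the simplest fundamental diagram. The abstract does not say Greenshields is among the four models tested, and the free-flow speed and jam density below are assumed values, not figures from the dissertation. Under Greenshields, density is k = k_j (1 - v / v_f) and flow is q = k v.

```python
def greenshields_flow(speed_kmh, free_flow_speed_kmh=100.0, jam_density_veh_km=120.0):
    """Estimate flow (veh/h per lane) from probe speed using Greenshields' model.

    The default parameter values are illustrative assumptions.
    """
    if not 0.0 <= speed_kmh <= free_flow_speed_kmh:
        raise ValueError("speed must lie between 0 and the free-flow speed")
    # Linear speed-density relation: density falls to zero at free-flow speed.
    density = jam_density_veh_km * (1.0 - speed_kmh / free_flow_speed_kmh)
    # Fundamental relation of traffic flow: flow = density * speed.
    return density * speed_kmh


# Average probe speed of 50 km/h over one aggregation interval (hypothetical).
print(greenshields_flow(50.0))  # 3000.0
```

In practice the parameters would be calibrated per road segment, and a different diagram such as Van Aerde's would replace the linear speed-density relation.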

    Käyttäjien jäljittäminen ja kannusteiden hallinta älykkäissä liikennejärjestelmissä (Tracking Users and Managing Incentives in Intelligent Transport Systems)

    A system for offering incentives for ecological modes of transport is presented. The main focus is on the verification of claims of having taken a trip on such a mode of transport. Three components are presented for the task of travel mode identification: a system to select features, a means to measure a GPS (Global Positioning System) trace's similarity to a bus route, and a machine-learning approach to the actual identification. Feature selection is carried out by sorting the features according to statistical significance and eliminating correlating features. The novel features considered are skewnesses, kurtoses, auto- and cross-correlations, and spectral components of speed and acceleration. Of these, only spectral components are found to be particularly useful in classification. Bus route similarity is measured by using a novel indexing structure called MBR-tree, short for "Multiple Bounding Rectangle", to find the most similar bus traces. The MBR-tree is an expansion of the R-tree for sequences of bounding rectangles, based on an estimation method for longest common subsequence that uses such sequences. A second option, decomposing traces into sequences of direction-distance-duration triples and indexing them in an M-tree using edit distance with real penalty, is considered but shown to perform poorly. For machine learning, the methods considered are Bayes classification, random forest, and feedforward neural networks with and without autoencoders. Autoencoder neural networks are shown to perform perplexingly poorly, but the other methods perform close to the state of the art. Methods for obfuscating the user's location and constructing secure electronic coupons are also discussed.
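The longest common subsequence (LCSS) measure that the MBR-tree estimates can be shown in its exact, unindexed form. This sketch is not the thesis's MBR-tree or its estimation method: it is the standard dynamic-programming LCSS, where two points match if they lie within a tolerance `eps` in both coordinates. The function names, the tolerance and the sample coordinates are hypothetical.

```python
def lcss_length(a, b, eps):
    """Length of the longest common subsequence of two 2-D point sequences."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (ax, ay), (bx, by) = a[i - 1], b[j - 1]
            if abs(ax - bx) <= eps and abs(ay - by) <= eps:
                dp[i][j] = dp[i - 1][j - 1] + 1  # points match: extend
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])  # skip one point
    return dp[n][m]


def lcss_similarity(a, b, eps):
    """LCSS length normalised to [0, 1] by the shorter sequence."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, eps) / min(len(a), len(b))


# Hypothetical GPS trace compared against two bus routes (planar coordinates).
trace = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
route_near = [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1), (3.0, 0.1)]
route_far = [(10.0, 10.0), (11.0, 10.0)]
print(lcss_similarity(trace, route_near, eps=0.2))  # 1.0
print(lcss_similarity(trace, route_far, eps=0.2))   # 0.0
```

The exact computation costs O(nm) per pair of traces, which is the kind of cost an index over bounding rectangles is meant to avoid when searching many routes.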