96 research outputs found

    Multi-Output Gaussian Processes for Crowdsourced Traffic Data Imputation

    Traffic speed data imputation is a fundamental challenge for data-driven transport analysis. In recent years, with the ubiquity of GPS-enabled devices and the widespread use of crowdsourcing alternatives for the collection of traffic data, transportation professionals increasingly look to such user-generated data for many analysis, planning, and decision support applications. However, due to the mechanics of the data collection process, crowdsourced traffic data such as probe-vehicle data is highly prone to missing observations, making accurate imputation crucial for the success of any application that makes use of that type of data. In this article, we propose the use of multi-output Gaussian processes (GPs) to model the complex spatial and temporal patterns in crowdsourced traffic data. While the Bayesian nonparametric formalism of GPs allows us to model observation uncertainty, the multi-output extension based on convolution processes effectively enables us to capture complex spatial dependencies between nearby road segments. Using 6 months of crowdsourced traffic speed data or "probe vehicle data" for several locations in Copenhagen, the proposed approach is empirically shown to significantly outperform popular state-of-the-art imputation methods. Comment: 10 pages, IEEE Transactions on Intelligent Transportation Systems, 201

    Heteroscedastic Gaussian processes for uncertainty modeling in large-scale crowdsourced traffic data

    Accurately modeling traffic speeds is a fundamental part of efficient intelligent transportation systems. Nowadays, with the widespread deployment of GPS-enabled devices, it has become possible to crowdsource the collection of speed information to road users (e.g. through mobile applications or dedicated in-vehicle devices). Despite its rather wide spatial coverage, crowdsourced speed data also brings very important challenges, such as the highly variable measurement noise in the data due to a variety of driving behaviors and sample sizes. When not properly accounted for, this noise can severely compromise any application that relies on accurate traffic data. In this article, we propose the use of heteroscedastic Gaussian processes (HGP) to model the time-varying uncertainty in large-scale crowdsourced traffic data. Furthermore, we develop an HGP conditioned on sample size and traffic regime (SRC-HGP), which makes use of sample size information (probe vehicles per minute) as well as previously observed speeds, in order to more accurately model the uncertainty in observed speeds. Using 6 months of crowdsourced traffic data from Copenhagen, we empirically show that the proposed heteroscedastic models produce significantly better predictive distributions when compared to current state-of-the-art methods for both speed imputation and short-term forecasting tasks. Comment: 22 pages, Transportation Research Part C: Emerging Technologies (Elsevier

    Estimating Latent Demand of Shared Mobility through Censored Gaussian Processes

    Transport demand is highly dependent on supply, especially for shared transport services where availability is often limited. As observed demand cannot be higher than available supply, historical transport data typically represents a biased, or censored, version of the true underlying demand pattern. Without explicitly accounting for this inherent distinction, predictive models of demand would necessarily represent a biased version of true demand, thus less effectively predicting the needs of service users. To counter this problem, we propose a general method for censorship-aware demand modeling, for which we devise a censored likelihood function. We apply this method to the task of shared mobility demand prediction by incorporating the censored likelihood within a Gaussian Process model, which can flexibly approximate arbitrary functional forms. Experiments on artificial and real-world datasets show how taking into account the limiting effect of supply on demand is essential in the process of obtaining an unbiased predictive model of user demand behavior. Comment: 21 pages, 10 figure
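
    The abstract above does not spell out its censored likelihood, so the following is only a minimal sketch of one common form (a Tobit-style Gaussian likelihood), not the authors' formulation. The function names, the fixed noise scale `sigma`, and the per-observation `censored` flags are all assumptions made for illustration. An uncensored observation contributes the usual Gaussian density; a censored observation (demand capped by supply) contributes the probability that true demand was at least the observed value.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of N(mu, sigma^2) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_sf(y, mu, sigma):
    """Survival function P(Y >= y) for Y ~ N(mu, sigma^2)."""
    z = (y - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def censored_log_likelihood(observed, latent_mean, censored, sigma=1.0):
    """Tobit-style log-likelihood (hypothetical helper, assumed form).

    observed    : observed demand values
    latent_mean : model's estimate of true (latent) demand at each point
    censored    : True where observed demand hit the supply limit
    sigma       : assumed fixed observation noise scale
    """
    total = 0.0
    for y, mu, is_censored in zip(observed, latent_mean, censored):
        if is_censored:
            # True demand is only known to be >= y; floor avoids log(0).
            total += math.log(max(normal_sf(y, mu, sigma), 1e-300))
        else:
            total += normal_logpdf(y, mu, sigma)
    return total
```

    Under this assumed form, a censored observation does not penalize a latent mean that lies above the observed value, whereas an uncensored observation pulls the latent mean toward it. That asymmetry is what lets a model recover demand above the supply cap.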

    Enhanced Methods for Utilization of Data to Support Multi-Scenario Analysis and Multi-Resolution Modeling

    The success of analysis and simulation in transportation systems depends on the availability, quality, reliability, and consistency of real-world data and the methods for utilizing the data. Additional data and data requirements are needed to support advanced analysis and simulation strategies such as multi-resolution modeling (MRM) and multi-scenario analysis. This study has developed, demonstrated, and assessed a systematic approach for the use of data to support MRM and multi-scenario analysis. First, the study developed and examined approaches for selecting one or more representative days for the analysis, considering the variability in travel conditions throughout the year based on cluster analysis. Second, this study developed and analyzed methods for using crowdsourced data to estimate origin-destination demands and link-level volumes for use as part of an MRM with consideration of the modeling scenario(s). The assessment of the methods to select the representative day(s) utilizes statistical measures, in addition to measures and visualization techniques that are specific to traffic operations. The results of the assessment indicate that the utilization of the K-means clustering algorithm with four clusters and spatio-temporal segregation of the variables demonstrated superior performance over other tested approaches, such as the use of the Gaussian Mixture clustering algorithm and the use of different segregation levels. The study assessed methods for the use of third-party crowdsourced data from StreetLight (SL) as part of the Origin-Destination Matrix Estimation (ODME), in order to identify the method that yields origin-destination demands closest to the original seed matrices and real-world link counts. The results of the study indicate that Method 3(b), which utilized combined data from demand forecasting models, crowdsourced data, and traffic counts, produced the best performance.
    Additionally, this study examined regression models between crowdsourced data and count station data developed for link-level estimation of the volumes. This study also examined the accuracy and transferability of the link-level estimation of the volumes to determine if the crowdsourced data combined with available volume data at several locations can be used to predict missing or unavailable volumes in different locations on different days and times within the network. Regression models produced lower errors than the default SL estimates when hourly or daily traffic volumes were taken into account. For similar traffic conditions, the models predicted directional traffic volume close to the real-world value

    Modeling and Analysis of Permanent Magnet Spherical Motors by A Multi-task Gaussian Process Method and Finite Element Method for Output Torque

    Permanent magnet spherical motors (PMSMs) operate on the principle of the DC excitation of stator coils and three degrees of freedom of motion in the rotor. Each coil generates torque in a specific direction; collectively, the coils move the rotor in a direction of motion. Modeling and analysis of the output torque are of critical importance for precise position control applications. The control of these motors requires precise output torques from all coils at a specific rotor position, which is difficult to achieve in three-dimensional space. This article is the first to apply the Gaussian process to establish the relationship between the rotor position and the output torque for PMSMs. Traditional methods struggle to resolve such a complex three-dimensional problem with reasonable computational accuracy and time. This article utilizes a data-driven method using only input and output data validated by experiments. The multitask Gaussian process is developed to calculate the total torque produced by multiple coils over the full operational range. The training data and test data are obtained by the finite-element method. The effectiveness of the proposed method is validated and compared with existing data-driven approaches. The results exhibit superior performance of accuracy

    Performance Comparison Of Weak And Strong Learners In Detecting GPS Spoofing Attacks On Unmanned Aerial Vehicles (uavs)

    Unmanned Aerial Vehicle systems (UAVs) are widely used in civil and military applications. These systems rely on trustworthy connections with various nodes in their network to conduct their safe operations and return-to-home. These entities consist of other aircraft, ground control facilities, air traffic control facilities, and satellite navigation systems. Global positioning systems (GPS) play a significant role in a UAV's communication with different nodes, navigation, and positioning tasks. However, due to the unencrypted nature of the GPS signals, these vehicles are prone to several cyberattacks, including GPS meaconing, GPS spoofing, and jamming. Therefore, this thesis aims at conducting a detailed comparison of two widely used machine learning techniques, namely weak and strong learners, to investigate their performance in detecting GPS spoofing attacks that target UAVs. Real data are used to generate training datasets and test the effectiveness of machine learning techniques. Various features are derived from this data. To evaluate the performance of the models, seven different evaluation metrics, including accuracy, probabilities of detection and misdetection, probability of false alarm, processing time, prediction time per sample, and memory size, are implemented. The results show that both types of machine learning algorithms provide high detection and low false alarm probabilities. In addition, despite being structurally weaker than strong learners, weak learner classifiers also achieve a good detection rate. However, the strong learners slightly outperform the weak learner classifiers in terms of multiple evaluation metrics, including accuracy and probabilities of misdetection and false alarm, while weak learner classifiers outperform in terms of time performance metrics
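
    The abstract above names its classification metrics without defining them. The sketch below uses the conventional confusion-matrix definitions, which are an assumption here and may differ from the thesis: probability of detection is TP / (TP + FN), probability of misdetection is FN / (TP + FN), and probability of false alarm is FP / (FP + TN). The function name and the label encoding (1 for a spoofed sample, 0 for a genuine one) are hypothetical.

```python
def detection_metrics(y_true, y_pred):
    """Compute accuracy and detection probabilities from binary labels.

    Assumed encoding: 1 = spoofing attack, 0 = genuine signal.
    Returns NaN for a ratio whose denominator is zero.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    positives = tp + fn  # actual attacks
    negatives = fp + tn  # actual genuine samples
    return {
        "accuracy": ratio(tp + tn, positives + negatives),
        "p_detection": ratio(tp, positives),
        "p_misdetection": ratio(fn, positives),
        "p_false_alarm": ratio(fp, negatives),
    }


# Toy labels for illustration only (not data from the thesis).
metrics = detection_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                            [1, 1, 0, 0, 0, 0, 1, 0])
```

    With these definitions, the probabilities of detection and misdetection always sum to one, so reporting both is a convenience rather than independent evidence.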

    CROWDSOURCED DATA FOR MOBILITY ANALYSIS

    The importance of data in transportation research has been widely recognized since it plays a crucial role in understanding and analyzing the movement of people, identifying inefficiencies in transportation systems, and developing strategies to improve mobility services. This use of data, known as mobility analysis, involves collecting and analyzing data on transport infrastructure and services, traffic flows, demand, and travel behavior. However, traditional data sources have limitations. The widespread use of mobile devices, such as smartphones, has enabled the use of Information and Communications Technology (ICT) to improve data sources for mobility analysis. Mobile crowdsensing (MCS) is a paradigm that uses data from smart devices to provide researchers with more detailed and real-time insights into mobility patterns and behaviors. However, this new data also poses challenges, such as the need to fuse it with other types of information to obtain mobility insights. In this thesis, the primary source of data that is being examined and leveraged is the popularity index of local businesses and points of interest from Google Popular Times (GPT) data. This data has significant potential for mobility analysis as it overcomes limitations of traditional mobility data, such as data availability and lack of reflection of demand for secondary activities. The main objective of this thesis is to investigate how crowdsourced data can help reduce the limitations of traditional mobility datasets. This is achieved by developing new tools and methodologies to utilize crowdsourced data in mobility analysis. The thesis first examines the potential of GPT as a source to provide information on the attractiveness of secondary activities. A data-driven approach is used to identify features that impact the popularity of local businesses and classify their attractiveness based on these features.
    Secondly, the thesis evaluates the possible use of GPT as a source to estimate mobility patterns. A tool is created to use the crowdedness of a station to estimate transit demand information and map the precise volume and temporal dynamics of entrances and exits at the station level. Thirdly, the thesis investigates the possibility of leveraging the popularity of activities around stations to estimate flows in and out of stations. A method is proposed to profile stations based on the dynamic information of activities in catchment areas. Through this data, machine learning techniques are used to estimate transit flows at the station level. Finally, this study concludes by exploring the possibility of exploiting crowdsourced data not only for extracting mobility insights under normal conditions but also to extract mobility trends during anomalous events. To this end, we focused on analyzing the recovery of mobility during the first outbreak of COVID-19 for different cities in Europe