226 research outputs found

    Managing Traffic Data through Clustering and Radial Basis Functions

    Given the importance of road transport, adequately identifying the various road network levels is necessary for efficient and sustainable management of road infrastructure. Additionally, traffic values are key data for any pavement management system. In this work, 2019 traffic volume data from the Basque Autonomous Community (Spain) were analyzed and modeled. Because the sample was multidimensional, the average annual daily traffic (AADT), which is used in many areas of road network management, was taken as the main variable of interest. First, an exploratory analysis was performed to obtain descriptive statistics, followed by clustering on various variables in order to standardize its behavior by translation. In a second stage, the variable of interest was estimated over the entire road network of the studied region using linear-based radial basis functions (RBFs). The estimated model was compared statistically with the sample, the estimation was evaluated using cross-validation, and the highest-traffic sectors were identified. The analysis showed that clustering is useful for identifying the real importance of each road segment as a function of actual traffic volume rather than other criteria. It also showed that interpolation methods based on linear-type RBFs can be used as a preliminary method to estimate the AADT. This research was funded by The University of the Basque Country (UPV/EHU), Call for Innovation Projects “IKD i3 Laborategia” (Call 1-2020, 2019/20)
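    As a rough illustration of the interpolation step described in this abstract, the sketch below fits a linear radial basis function, phi(r) = r, to a handful of counting stations and scores it with leave-one-out cross-validation. The station coordinates, AADT values, and function names are hypothetical, and the plain kernel without a polynomial term is an assumption; the paper's actual formulation may differ.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear_rbf(points, values):
    """Weights w such that sum_j w_j * |p - p_j| reproduces the values."""
    kernel = [[math.dist(p, q) for q in points] for p in points]
    return solve(kernel, values)

def predict(points, weights, query):
    """Evaluate the fitted interpolant at a query location."""
    return sum(w * math.dist(query, p) for w, p in zip(weights, points))

def loo_rmse(points, values):
    """Leave-one-out CV: refit without each station, then predict it."""
    sq = 0.0
    for i in range(len(points)):
        pts = points[:i] + points[i + 1:]
        vals = values[:i] + values[i + 1:]
        w = fit_linear_rbf(pts, vals)
        sq += (predict(pts, w, points[i]) - values[i]) ** 2
    return math.sqrt(sq / len(points))

# Hypothetical stations (x, y in km) and AADT values (vehicles/day).
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 5.0)]
aadt = [12000.0, 8000.0, 15000.0, 9000.0, 11000.0]

weights = fit_linear_rbf(stations, aadt)
estimate = predict(stations, weights, (2.0, 7.0))  # unsampled location
error = loo_rmse(stations, aadt)
```

    The interpolant passes exactly through the observed stations, so the leave-one-out error is the more informative measure of how well it might generalize to unsampled road segments.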

    Hydrology in Water Resources Management

    This book is a collection of 12 papers describing the role of hydrology in water resources management. The papers can be divided according to their area of focus into 1) modeling of hydrological processes, 2) use of modern techniques in hydrological analysis, 3) impact of human pressure and climate change on water resources, and 4) hydrometeorological extremes. The first area includes the presentation of a new Muskingum flood routing model, a new tool for frequency analysis of maximum precipitation of a specified duration via the so-named PMAXΤP model (Precipitation MAXimum Time (duration) Probability), modeling of interception processes, and the use of the GR2M rainfall-runoff model to calculate monthly runoff. For the second area, groundwater potential was evaluated using a model of multi-influencing factors whose parameters were optimized with geoprocessing tools in a geographical information system (GIS), and satellite altimeter data were used in combination with the reanalysis of hydrological data to simulate overflow transport, using the Nordic Sea as an example. The third area presents a water balance model for comparing water resources with the needs of water users, the idea of adaptive water management, and the impacts of climate change and anthropogenic activities on runoff in a catchment located in the western Himalayas of Pakistan. The last area includes a spatiotemporal analysis of rainfall variability with regard to drought hazard and the use of the copula function for the meteorological analysis of drought

    Sulfur dioxide trends in Malta: A statistical computing approach

    A statistical investigation of data related to emissions and measurement of SO2 in the Maltese islands encompassing the period 2004 to 2012 was conducted. The purpose was to investigate whether SO2 levels were driven by the Marsa power station (MPS), which was considered to be the main source of SO2 on the island. In addition, the study sought to establish spatial and temporal trends in the SO2 concentrations measured throughout the islands. Data were obtained from the Malta Environment and Planning Authority (4 fixed monitoring stations and a diffusion tube network) and also from the Enemalta Corporation (emissions of MPS). These were analysed using the Inter Operability and Automated Mapping Project (IntaMap) and GIS for mapping purposes, as well as R and SPSS packages for statistical processing. The results showed that average yearly emissions from the MPS decreased from approximately 858 g/hr to 780 g/hr between 2009 and 2012. Diffusion tube and monitoring station data indicated overall decreases in SO2, with certain localised areas showing increases. It was also determined that there were only two occasions when the 350 µg/m³ hourly limit of Directive 2008/50/EC was exceeded. All the stations in the monitoring station network registered higher readings when the winds were Northerly or North-Westerly. The Kordin station was found to have the overall highest SO2 readings while Għarb had the lowest. Results suggested that emissions from the MPS had a more localised effect on SO2 levels compared to previous research. However, a 3-predictor statistical ANCOVA analysis determined that while emissions from the MPS were statistically significant in determining the amount of SO2 being measured in the monitoring stations, the results indicated that there were other contributors. These contributors could have included emissions from the Delimara power station and marine vessels. 
On the other hand, a 2-predictor model using only readings registered with wind originating from the MPS direction showed that MPS emissions were only statistically relevant for Kordin. Hence, it can be concluded that Kordin was the most likely area to be affected by MPS emissions while the effect on Msida, Żejtun and Għarb was negligible. The overall findings of the study indicated that, although the MPS was still found to be a contributor of SO2, other sources should now start to be monitored as well. It is recommended that the identification of new sources of SO2 be a focus of future research, including examination of the effects of the Delimara power station and marine vessels

    Redefine time series models for transportation planning use

    Time series models are used to model, simulate, and forecast the behaviour of a phenomenon over time based on data recorded over consistent intervals. The digital era has resulted in data being captured and archived in unprecedented amounts, such that vast amounts of information are available for analysis. Feature-rich time-series datasets are among those that have become available due to the worldwide expansion of data collection technologies. With time series analysis already applied to support financial and managerial decision-making, the development and advancement of time series models in the transportation domain are unavoidable. As a result, this thesis redefines time series models for transportation planning use with the following three aims: (1) to combine parametric and bootstrapping techniques within time series models; (2) to develop a time series model capable of modelling both temporal and spatial dependencies in time-series data; and (3) to leverage the hierarchical Bayesian modelling paradigm to accommodate flexible representations of heterogeneity in data. The first main chapter introduces an ensemble of ARIMA models and compares its performance against conventional ARIMA (a parametric method) and LSTM models (a non-parametric method) for short-term traffic volume prediction. The second main chapter introduces a copula time series model that describes correlations between variables through time and space. Temporal correlations are modelled by an ARMA-GARCH model, which enables a modeller to describe heteroscedastic data. The copula model has a flexible correlation structure and is used to model spatial correlations, including nonlinear, tailed and asymmetric correlations. The third main chapter provides a Bayesian modelling framework to raise awareness about using hierarchical Bayesian approaches for transport time series data. In addition, this chapter presents a Bayesian copula model. 
The combination of the two models provides a fully Bayesian approach to modelling both temporal and spatial correlations. Compared with frequentist models, the proposed modelling structures can incorporate prior knowledge. In the fourth main chapter, the fully Bayesian model is used to investigate mobility patterns before, during and after the COVID-19 pandemic using social media data. A more focused analysis is conducted on the mobility patterns of Twitter users from different zones and land use types

    Emerging Hydro-Climatic Patterns, Teleconnections and Extreme Events in Changing World at Different Timescales

    This Special Issue is expected to advance our understanding of these emerging patterns, teleconnections, and extreme events in a changing world for more accurate prediction or projection of their changes especially on different spatial–time scales

    IEEE Access Special Section Editorial: Big Data Technology and Applications in Intelligent Transportation

    During the last few years, the information technology and transportation industries, along with automotive manufacturers and academia, have been focusing on leveraging intelligent transportation systems (ITS) to improve services related to driver experience, connected cars, Internet data plans for vehicles, traffic infrastructure, urban transportation systems, traffic collaborative management, road traffic accident analysis, road traffic flow prediction, public transportation service plans, personal travel route plans, and the development of an effective ecosystem for vehicles, drivers, traffic controllers, city planners, and transportation applications. Moreover, the emerging technologies of the Internet of Things (IoT) and cloud computing have provided unprecedented opportunities for the development and realization of innovative intelligent transportation systems in which sensors and mobile devices gather information and cloud computing enables knowledge discovery, information sharing, and supported decision making. However, the development of such data-driven ITS requires the integration, processing, and analysis of plentiful information obtained from millions of vehicles, traffic infrastructures, smartphones, and other collaborative systems such as weather stations and road safety and early warning systems. The huge amount of data generated by ITS devices is only of value if it is used in data analytics for decision-making, such as accident prevention and detection, controlling road risks, reducing traffic carbon emissions, and other applications, which brings big data analytics into the picture

    Data Generation: From Anonymization to the Construction of Synthetic Populations (Génération de données : de l’anonymisation à la construction de populations synthétiques)

    The high costs of data collection often make it possible to sample only a subset of the population of interest. The data collected may also contain personal and sensitive information about the individuals it covers, such that it is protected by laws or strict data security and governance practices. In both cases, access to the data is restricted. Our work considers two research angles under which the generation of synthetic data can be used to design analysis models where the real data is inaccessible. In the first project, a synthetically generated population made up of individuals described by their attributes at the individual and household levels replaces census data. We propose copulas as a new approach to model a population of interest for which only the marginal distributions are known, when we have a sample from another population that shares similar interdimensional dependencies. We compare copulas to iterative proportional fitting, a technique widely used in the field of population synthesis, and also to modern machine learning approaches such as Bayesian networks, variational autoencoders, and generative adversarial networks, when the task is to generate populations of Maryland from US census data. 
Our experiments show that copulas outperform iterative proportional fitting in modeling interdimensional relationships and that the marginal distributions of the data they generate match those of the population of interest better than those of the data generated by the machine learning methods. The second project consists of generating data that preserves privacy. As the desensitization of data is inversely related to its usefulness, we study to what extent k-anonymity and generative modeling provide useful data relative to the sensitive data they replace. We find that it is indeed possible to use these privacy definitions to publish useful data, but the question of comparing their privacy guarantees remains open
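    For readers unfamiliar with the k-anonymity definition mentioned in this abstract, the following minimal sketch measures k for a small table: the size of the smallest group of records sharing the same quasi-identifier values. The records, column names, and the `k_anonymity` helper are hypothetical and are not taken from the thesis.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group sharing identical quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

# Hypothetical records; column names and values are illustrative only.
records = [
    {"age_band": "20-29", "zip3": "207", "income": 41000},
    {"age_band": "20-29", "zip3": "207", "income": 52000},
    {"age_band": "30-39", "zip3": "208", "income": 63000},
    {"age_band": "30-39", "zip3": "208", "income": 58000},
    {"age_band": "30-39", "zip3": "208", "income": 71000},
]

# Every combination of age band and 3-digit ZIP prefix occurs at least k times.
k = k_anonymity(records, ["age_band", "zip3"])
```

    Coarsening the quasi-identifiers (wider age bands, shorter ZIP prefixes) raises k but discards detail, which is the privacy and utility trade-off the abstract refers to.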

    A new flood estimation paradigm for the design of civil infrastructure systems

    Methods for quantifying flood risk of civil infrastructure systems such as road and rail networks require considerably more information compared to traditional methods that focus on flood risk at a point. These systems are characterised by multiple interconnected components, whereby a ‘failure’ of the overall system can arise because of complex combinations of failures in system subcomponents. For example, flooding of a single bridge along a railway may leave the entire railway inoperable, and the interest is often in the probability that one or more bridges along a stretch of railway will be flooded, rather than designing each bridge in isolation. Similarly, the viability of evacuation routes often requires an assessment of the probability that the route is flooded, conditional on an evacuation being necessary as a result of floods elsewhere in the system. Conventional design flood estimation processes are ill-equipped to deal with these complex problems. Whereas traditional flood estimation approaches focus on estimating flood risk at a single location, this thesis proposes a new estimation paradigm that focuses on estimating system-wide risk. The approach builds on the traditional intensity-duration-frequency (IDF) methods that are commonly used in engineering practice in Australia and internationally; however, this is implemented in such a way as to provide information on the spatial dependence of design storms. A particular innovation in this thesis is to estimate spatial rainfall dependence across multiple storm durations, allowing it to be used to estimate flood risk across multiple catchments with differing times of concentration. This enables the estimation of both conditional probabilities (e.g. probability of one part of a system being flooded conditional on another part of the system being flooded) and joint probabilities (e.g. the probability of multiple parts of a system experiencing floods simultaneously). 
Finally, whereas traditional IDF approaches consider conversion from point rainfall to spatial rainfall via areal reduction factors as a post-processing step, the approach proposed herein enables this conversion implicitly as part of the method. The proposed approach is based on two classes of extreme value model: max-stable process models, and inverted max-stable process models. These models differ in their assumption for how spatial dependence scales in the limit, as the rainfall events become increasingly extreme (referred to as “asymptotic dependence”). In particular, max-stable models assume asymptotic dependence (i.e. the spatial dependence converges to a non-zero limit), whereas inverted max-stable models assume asymptotic independence. This assumption has significant implications for very rare events (e.g. the 1% annual exceedance probability event), particularly when the estimates are based on relatively short observational records. Specifically, implementation focuses on the (inverted) Brown-Resnick family of models. This class of model was adjusted by accounting for spatial dependence across multiple storm burst durations. The adjustment used the theoretical pairwise extremal coefficient function as a function of both distance and duration. The integration of multiple durations into the modelling framework was tested on a 21,400 km2 spatial domain in the Greater Sydney region, with data on sub-daily rainfall from 25 stations. The updated model shows a reasonable fit between the observed pairwise extremal coefficients and the theoretical pairwise extremal coefficient function across all durations. The asymptotic dependence and comparison with empirically derived areal reduction factors was tested next, and it was shown that the observed data follow the behaviour of an asymptotically independent process, which leads to ARFs that decrease with an increasing return period. 
This demonstrates that inverted max-stable process models such as the inverted Brown-Resnick model are the most suitable method for simulating spatial rainfall in the study areas that were investigated. Finally, the outcomes of this research are demonstrated by implementing the spatially dependent IDF approach in a realistic case study that requires information on both conditional and joint dependence. The case study examines a highway upgrade project on the east coast of Australia, containing five bridge crossings with differing contributing catchment areas, and thus differing times of concentration. The results are used to show the differences between conditional-flood and conventional-flood estimates at each bridge, and the relationship between the overall failure of a system and the failure probability of an individual bridge. For example, if one were to design the highway section for a 1% probability of at least one bridge being flooded in any given year, it would be necessary to design each individual bridge to a % annual exceedance probability design flood. This research therefore is shown to enable a different paradigm for design flood risk estimation, which focuses attention on the risk of the entire system rather than considering individual system elements in isolation. Thesis (Ph.D.) -- University of Adelaide, School of Civil, Environmental and Mining Engineering, 201
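    To make the link between system-level and bridge-level failure probabilities concrete, the sketch below computes two simple bounding cases for a five-bridge system with a 1% system-wide target. It does not reproduce the thesis's result, which comes from a fitted spatial dependence model; the independence and full-dependence cases, and the function names, are assumptions used only for illustration.

```python
def per_bridge_aep_independent(system_aep, n_bridges):
    """Per-bridge AEP that gives the target system AEP if floods at the
    bridges were statistically independent (an idealised assumption)."""
    return 1.0 - (1.0 - system_aep) ** (1.0 / n_bridges)

def system_aep_independent(bridge_aep, n_bridges):
    """Probability that at least one of n independent bridges floods."""
    return 1.0 - (1.0 - bridge_aep) ** n_bridges

system_target = 0.01   # 1% chance per year of at least one bridge flooding
n = 5                  # five bridge crossings, as in the case study

# Independence: each bridge must be designed to a much rarer event.
lower = per_bridge_aep_independent(system_target, n)

# Full dependence: all bridges flood together, so the per-bridge AEP
# equals the system AEP.
upper = system_target

print(f"independent floods:    {lower:.4%} per bridge")
print(f"fully dependent floods: {upper:.4%} per bridge")
```

    For positively dependent floods, the required per-bridge design probability would be expected to fall between these two values, and locating it within that range is what a spatial dependence model of the kind described above is for.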

    WEIGH-IN-MOTION DATA-DRIVEN PAVEMENT PERFORMANCE PREDICTION MODELS

    The effective functioning of pavements as a critical component of the transportation system necessitates the implementation of ongoing maintenance programs to safeguard this significant and valuable infrastructure and guarantee its optimal performance. The maintenance, rehabilitation, and reconstruction (MRR) program of the pavement structure is dependent on a multidimensional decision-making process, which considers the existing pavement structural condition and the anticipated future performance. Pavement Performance Prediction Models (PPPMs) have become indispensable tools for the efficient implementation of the MRR program and the minimization of associated costs by providing precise predictions of distress and roughness based on inventory and monitoring data concerning the pavement structure's state, traffic load, and climatic conditions. The integration of PPPMs has become a vital component of Pavement Management Systems (PMSs), facilitating the optimization, prioritization, scheduling, and selection of maintenance strategies. Researchers have developed several PPPMs with differing objectives, and each PPPM has demonstrated distinct strengths and weaknesses regarding its applicability, implementation process, and data requirements for development. Traditional statistical models, such as linear regression, are inadequate in handling complex nonlinear relationships between variables and often generate less precise results. Machine Learning (ML)-based models have become increasingly popular due to their ability to manage vast amounts of data and identify meaningful relationships between them to generate informative insights for better predictions. To create ML models for pavement performance prediction, it is necessary to gather a significant amount of historical data on pavement and traffic loading conditions. 
The Long-Term Pavement Performance Program (LTPP) initiated by the Federal Highway Administration (FHWA) offers a comprehensive repository of data on the environment, traffic, inventory, monitoring, maintenance, and rehabilitation works that can be utilized to develop PPPMs. The LTPP also includes Weigh-In-Motion (WIM) data that provides information on traffic, such as truck traffic, total traffic, directional distribution, and the number of different axle types of vehicles. High-quality traffic loading data can play an essential role in improving the performance of PPPMs, as the Mechanistic-Empirical Pavement Design Guide (MEPDG) considers vehicle types and axle load characteristics to be critical inputs for pavement design. The collection of high-quality traffic loading data has been a challenge in developing Pavement Performance Prediction Models (PPPMs). The Weigh-In-Motion (WIM) system, which comprises WIM scales, has emerged as an innovative solution to address this issue. By leveraging computer vision and machine learning techniques, WIM systems can collect accurate data on vehicle type and axle load characteristics, which are critical factors affecting the performance of flexible pavements. Excessive dynamic loading caused by heavy vehicles can result in the early disintegration of the pavement structure. The Long-Term Pavement Performance Program (LTPP) provides an extensive repository of WIM data that can be utilized to develop accurate PPPMs for predicting pavement future behavior and tolerance. The incorporation of comprehensive WIM data collected from LTPP has the potential to significantly improve the accuracy and effectiveness of PPPMs. 
To develop artificial neural network (ANN) based pavement performance prediction models (PPPMs) for seven distinct performance indicators, including IRI, longitudinal crack, transverse crack, fatigue crack, potholes, polished aggregate, and patch failure, a total of 300 pavement sections with WIM data were selected from the United States of America. Data collection spanned 20 years, from 2001 to 2020, and included information on pavement age, material properties, climatic properties, structural properties, and traffic-related characteristics. The primary dataset was then divided into two distinct subsets: one which included WIM-generated traffic data and another which excluded WIM-generated traffic data. Data cleaning and normalization were meticulously performed using the Z-score normalization method. Each subset was further divided into two separate groups: the first containing 15 years of data for model training and the latter containing 5 years of data for testing purposes. Principal Component Analysis (PCA) was then employed to reduce the number of input variables for the model. Based on a cumulative Proportion of Variation (PoV) of 96%, 12 input variables were selected. Subsequently, a single hidden layer ANN model with 12 neurons was generated for each performance indicator. The study's results indicate that incorporating Weigh-In-Motion (WIM)-generated traffic loading data can significantly enhance the accuracy and efficacy of pavement performance prediction models (PPPMs). This improvement further supports the suitability of optimized pavement maintenance scheduling with minimal costs, while also ensuring timely repairs to promote acceptable serviceability and structural stability of the pavement. 
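The preprocessing described above (Z-score normalization and a 15-year/5-year chronological split) might be sketched as follows. The IRI series is synthetic and the helper names are hypothetical. Estimating the mean and standard deviation from the training years only is a design choice assumed here to avoid leaking test information; the abstract does not state how the statistics were computed.

```python
from statistics import mean, pstdev

def zscore_fit(column):
    """Mean and population standard deviation of the training values."""
    return mean(column), pstdev(column)

def zscore_apply(column, mu, sigma):
    """Standardise values using previously fitted statistics."""
    return [(x - mu) / sigma for x in column]

# Synthetic yearly records for one pavement section: (year, IRI in m/km).
records = [(year, 1.0 + 0.05 * (year - 2001)) for year in range(2001, 2021)]

# Chronological split: 2001-2015 for training, 2016-2020 for testing.
train = [value for year, value in records if year <= 2015]
test = [value for year, value in records if year > 2015]

mu, sigma = zscore_fit(train)
train_z = zscore_apply(train, mu, sigma)
test_z = zscore_apply(test, mu, sigma)   # reuses the training statistics
```

Because roughness tends to grow with pavement age, the standardised test values sit above the training range in this toy series, which is one reason a chronological split is a harder test than a random one.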
The contributions of this research are twofold: first, it provides an enhanced understanding of the positive impacts that high-quality traffic loading data has on pavement conditions; and second, it explores potential applications of WIM data within the Pavement Management System (PMS)

    Modeling, Predicting and Capturing Human Mobility

    Realistic models of human mobility are critical for modern day applications, specifically for recommendation systems, resource planning and process optimization domains. Given the rapid proliferation of mobile devices equipped with Internet connectivity and GPS functionality today, aggregating large sums of individual geolocation data is feasible. The thesis focuses on methodologies to facilitate data-driven mobility modeling by drawing parallels between the inherent nature of mobility trajectories, statistical physics and information theory. On the applied side, the thesis contributions lie in leveraging the formulated mobility models to construct prediction workflows by adopting a privacy-by-design perspective. This enables end users to derive utility from location-based services while preserving their location privacy. Finally, the thesis presents several approaches to generate large-scale synthetic mobility datasets by applying machine learning approaches to facilitate experimental reproducibility