
    IoT Data Imputation with Incremental Multiple Linear Regression

    In this paper, we address the problem of missing data imputation in the IoT domain. More specifically, we propose an Incremental Space-Time-based Model (ISTM) for repairing missing values in real-time IoT data streams. ISTM is based on incremental multiple linear regression and processes data as follows: upon data arrival, ISTM updates the model by re-reading an intermediary data matrix instead of accessing all historical information. If a missing value is detected, ISTM provides an estimate based on recent historical data and on observations from sensors neighboring the faulty one. Experiments conducted with real traffic data show the performance of ISTM in comparison with well-known techniques.
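The incremental update described here can be sketched with running sufficient statistics (X'X and X'y): the coefficients are re-derived from a small intermediary matrix rather than from all historical observations. This is an illustrative sketch, not the authors' ISTM implementation; the sensor layout and the linear relationship are invented.

```python
import numpy as np

class IncrementalOLS:
    """Incremental multiple linear regression via sufficient statistics.

    Rather than re-reading historical data, keep X'X and X'y up to date
    and solve for the coefficients on demand (illustrative sketch only).
    """

    def __init__(self, n_features):
        self.xtx = np.zeros((n_features, n_features))
        self.xty = np.zeros(n_features)

    def update(self, x, y):
        x = np.asarray(x, dtype=float)
        self.xtx += np.outer(x, x)   # constant-size intermediary matrix
        self.xty += x * y

    def predict(self, x):
        # Tiny ridge jitter keeps the solve stable before many updates.
        beta = np.linalg.solve(self.xtx + 1e-8 * np.eye(len(self.xty)),
                               self.xty)
        return float(np.asarray(x, dtype=float) @ beta)

# Impute a sensor's missing reading from its neighbours' readings
# (hypothetical speeds; the true relationship is an even average).
rng = np.random.default_rng(0)
model = IncrementalOLS(n_features=2)
for _ in range(200):
    neighbours = rng.uniform(0, 60, size=2)             # neighbour speeds
    target = 0.5 * neighbours[0] + 0.5 * neighbours[1]  # faulty sensor
    model.update(neighbours, target)

print(model.predict([40.0, 20.0]))   # ≈ 30.0
```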

    ChatGPT is on the Horizon: Could a Large Language Model be Suitable for Intelligent Traffic Safety Research and Applications?

    ChatGPT ushers in a new era of artificial intelligence and will change the way we approach intelligent traffic safety systems. This paper begins with a brief introduction to the development of large language models (LLMs). Next, we exemplify using ChatGPT to address key traffic safety issues. Furthermore, we discuss the controversies surrounding LLMs, raise critical questions for their deployment, and provide our solutions. Moreover, we propose an idea of multi-modality representation learning for smarter traffic safety decision-making and open more questions for application improvement. We believe that LLMs will both shape and potentially facilitate components of traffic safety research. Comment: Submitted to Nature Machine Intelligence (revised and extended).

    Quantifying Non-Recurrent Delay Using Probe-Vehicle Data

    Current practices for calculating delay from non-recurrent congestion, based on estimated volume and basic queuing theory, do not account for day-to-day fluctuations in traffic. To address this issue, probe GPS data are used to develop impact-zone boundaries and calculate Vehicle Hours of Delay (VHD) for incidents stored in the Traffic Response and Incident Management Assisting the River City (TRIMARC) incident log in Louisville, KY. Multiple linear regression with stepwise selection is used to generate models for the maximum queue length, the average queue length, and VHD, to explore the factors that explain the impact boundary and VHD. Models predicting queue length do not explain a significant amount of variance but can be useful in queue spillback studies. Models predicting VHD are only as effective as the data collected: models using cheaper-to-collect data sources explain less variance, while models built on more detailed data explain more. Models for VHD can be useful in incident-management after-action reviews and in predicting road user costs.
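Stepwise selection of the kind described can be sketched as a greedy forward search over predictors, adding the variable that most improves R² until the gain falls below a threshold. The incident variables, coefficients and threshold below are invented for illustration, not drawn from the TRIMARC data.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def forward_stepwise(X, y, min_gain=0.01):
    """Greedily add the predictor that most improves R^2."""
    selected, remaining, best = [], list(range(X.shape[1])), 0.0
    while remaining:
        score, j = max((r_squared(X[:, selected + [j]], y), j)
                       for j in remaining)
        if score - best < min_gain:
            break                      # no predictor adds enough
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best

# Synthetic incident data: VHD driven by duration and lanes blocked only.
rng = np.random.default_rng(1)
n = 300
duration = rng.uniform(10, 120, n)    # incident duration (min)
lanes = rng.integers(1, 4, n)         # lanes blocked
unrelated = rng.normal(size=n)        # irrelevant predictor
vhd = 3.0 * duration + 40.0 * lanes + rng.normal(scale=20.0, size=n)
X = np.column_stack([duration, lanes, unrelated])

cols, score = forward_stepwise(X, vhd)
print(cols)   # expected: [0, 1] — the two real drivers, noise excluded
```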

    Strikes, Scabs and Tread Separations: Labor Strife and the Production of Defective Bridgestone/Firestone Tires

    This paper provides a case study of the effect of labor relations on product quality. We consider whether a long, contentious strike and the hiring of permanent replacement workers by Bridgestone/Firestone in the mid-1990s contributed to the production of an excess number of defective tires. Using several independent data sources we find that labor strife in the Decatur plant closely coincided with lower product quality. Count data regression models based on two data sets of tire failures by plant, year and age show significantly higher failure rates for tires produced in Decatur during the labor dispute than before or after the dispute, or than at other plants. Also, an analysis of internal Firestone engineering tests indicates that P235 tires from Decatur performed less well if they were manufactured during the labor dispute compared with those produced after the dispute, or compared with those from other, non-striking plants. Monthly data suggest that the production of defective tires was particularly high around the time wage concessions were demanded by Firestone in early 1994 and when large numbers of replacement workers and permanent workers worked side by side in late 1995 and early 1996.
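A count-data regression of the general kind described can be sketched as a Poisson model with a production-volume offset, fitted by Newton's method. The data below are synthetic, with a deliberately elevated failure rate for "dispute-period" production; none of the numbers are the paper's Firestone estimates.

```python
import numpy as np

def fit_poisson(X, y, offset, iters=50):
    """Poisson regression by Newton's method: log E[y] = offset + X @ beta."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(iters):
        mu = np.exp(offset + X1 @ beta)
        grad = X1.T @ (y - mu)                 # score of the log-likelihood
        hess = X1.T @ (X1 * mu[:, None])       # Fisher information
        beta += np.linalg.solve(hess, grad)
    return beta

# Synthetic tire-failure counts: higher rate for dispute-period tires.
rng = np.random.default_rng(2)
n = 500
dispute = rng.integers(0, 2, n)            # produced during the dispute?
exposure = rng.uniform(1e3, 1e4, n)        # tires produced (offset)
rate = np.exp(-6.0 + 0.8 * dispute)        # failures per tire
failures = rng.poisson(exposure * rate)

beta = fit_poisson(dispute[:, None], failures, np.log(exposure))
print(beta[1])   # ≈ 0.8: log failure-rate ratio for dispute production
```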

    Predictive analytics applied to firefighter response, a practical approach

    Time is a crucial factor in the outcome of emergencies, especially those involving human lives. This paper looks at Lisbon's firefighter occurrences and presents a model, based on city characteristics and climatic data, to predict whether there will be an occurrence at a certain location, according to the weather forecasts. Three algorithms were considered: Logistic Regression, Decision Tree and Random Forest. Measured by the AUC, the best-performing model was a random forest with random under-sampling, at 0.68. This model was well adjusted across the city and showed that precipitation and the size of the subsection are the most relevant features in predicting firefighter occurrences. The work presented here has clear implications for firefighters' decision-making regarding vehicle allocation, as they can now make an informed decision considering the predicted occurrences.
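The two evaluation ingredients mentioned, random under-sampling for the rare positive class and AUC, can be sketched in a few lines. The precipitation feature and event rate below are invented, not Lisbon's data.

```python
import numpy as np

def random_undersample(X, y, rng):
    """Drop majority-class rows until both classes are the same size."""
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    if len(neg) > len(pos):
        neg = rng.choice(neg, size=len(pos), replace=False)
    keep = np.concatenate([pos, neg])
    return X[keep], y[keep]

def auc(y_true, scores):
    """AUC via the rank (Mann-Whitney) formula: the probability that a
    random positive is scored above a random negative."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) \
        / (n_pos * n_neg)

# Invented imbalanced data: occurrences become likely in heavy rain.
rng = np.random.default_rng(3)
n = 2000
precip = rng.exponential(2.0, n)                 # precipitation (mm)
p = 1 / (1 + np.exp(-(precip - 6.0)))            # rare-event probability
occurrence = (rng.uniform(size=n) < p).astype(int)

Xb, yb = random_undersample(precip[:, None], occurrence, rng)
print(yb.mean())                 # 0.5 after balancing
print(auc(occurrence, precip))   # well above 0.5: precipitation informs
```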

    A framework for smart traffic management using heterogeneous data sources

    A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy.

    Traffic congestion constitutes a social, economic and environmental issue for modern cities, as it can negatively impact travel times, fuel consumption and carbon emissions. Traffic forecasting and incident detection systems are fundamental areas of Intelligent Transportation Systems (ITS) that have been widely researched in the last decade. These systems provide real-time information about traffic congestion and other unexpected incidents, and can help traffic management agencies activate strategies and notify users accordingly. However, existing techniques suffer from high false alarm rates and incorrect traffic measurements. In recent years, there has been increasing interest in integrating different types of data sources to achieve higher precision in traffic forecasting and incident detection, and a considerable amount of literature has grown around the influence of integrating data from heterogeneous sources into existing traffic management systems. This thesis presents a Smart Traffic Management framework for future cities that fuses different data sources and technologies to improve traffic prediction and incident detection. It is composed of two components: a social media component and a simulator component. The social media component consists of a text classification algorithm that identifies traffic-related tweets, which are then geolocated using Natural Language Processing (NLP) techniques. Finally, to further analyse user emotions within the tweets, stress and relaxation strength detection is performed. The proposed text classification algorithm outperformed similar studies in the literature and was shown to be more accurate than other machine learning algorithms on the same dataset.
Results from the stress and relaxation analysis detected a significant amount of stress in 40% of the tweets, while the remaining portion showed no associated emotion. This information can potentially be used for policy making in transportation, to understand users' perception of the transportation network. The simulator component proposes an optimisation procedure for determining missing roundabout and urban road flow distributions using constrained optimisation. Existing imputation methodologies have been developed on straight sections of highways, and their applicability to more complex networks has not been validated. This task presented a solution for the unavailability of roadway sensors in specific parts of the network and was able to predict the missing values with very low percentage error. The proposed imputation methodology can serve as an aid for existing traffic forecasting and incident detection methodologies, as well as for the development of more realistic simulation networks.
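The constrained-optimisation idea for missing flows can be sketched as an equality-constrained least-squares adjustment: find the smallest change to prior flow estimates that satisfies flow conservation and the measured arms. The roundabout layout and all numbers below are invented.

```python
import numpy as np

def impute_flows(prior, A, b):
    """Smallest adjustment to prior flow estimates satisfying A x = b
    (equality-constrained least squares, closed form for full-rank A)."""
    A = np.atleast_2d(A)
    correction = A.T @ np.linalg.solve(A @ A.T, b - A @ prior)
    return prior + correction

# Four-arm roundabout: inflows x0, x1 must equal outflows x2, x3.
# x1 and x3 are measured; x0 and x2 come from rough prior estimates.
prior = np.array([500.0, 300.0, 450.0, 400.0])   # veh/h initial guesses
A = np.array([
    [1.0, 1.0, -1.0, -1.0],   # conservation: inflow - outflow = 0
    [0.0, 1.0,  0.0,  0.0],   # x1 is measured
    [0.0, 0.0,  0.0,  1.0],   # x3 is measured
])
b = np.array([0.0, 300.0, 400.0])

x = impute_flows(prior, A, b)
print(x)   # [525. 300. 425. 400.] — conservation now holds exactly
```

The unmeasured arms absorb the imbalance symmetrically because the objective penalises all adjustments equally; sensor-quality weights could be added the same way.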

    Imputation, modelling and optimal sampling design for digital camera data in recreational fisheries monitoring

    Digital camera monitoring has evolved into an active, application-oriented scheme that helps address questions in areas such as fisheries, ecology, computer vision, artificial intelligence, and criminology. In recreational fisheries research, digital camera monitoring has become a viable option for probability-based survey methods, and is also used for corroboration and validation purposes. In comparison to on-site surveys (e.g. boat ramp surveys), digital cameras provide a cost-effective method of monitoring boating activity and fishing effort, including night-time fishing activities. However, there are challenges in the use of digital camera monitoring that need to be resolved; notably, missing-data problems and the cost of data interpretation are among the most pertinent. This study provides statistical support to address these challenges and to improve the utility of digital camera monitoring of boating effort for recreational fisheries management in Western Australia and elsewhere, with capacity to extend to other areas of application. Digital cameras can provide continuous recordings of boating and other recreational fishing activities; however, interruptions of camera operations can lead to significant gaps within the data. To fill these gaps, climatic and other temporal classification variables were considered as predictors of boating effort (defined as the number of powerboat launches and retrievals). A generalized linear mixed-effects model built on a fully conditional specification multiple imputation framework was used to fill the gaps in the camera dataset. Specifically, the zero-inflated Poisson model was found to impute plausible values for missing observations across varied durations of outages in the digital camera monitoring data of recreational boating effort.
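A zero-inflated Poisson imputation of the kind described can be sketched as follows. The fit uses simple method-of-moments estimates as a stand-in for the thesis's mixed-model multiple-imputation machinery, and the launch counts and outage pattern are simulated.

```python
import numpy as np

rng = np.random.default_rng(4)

def zip_sample(pi, lam, size, rng):
    """Draw from a zero-inflated Poisson: zero w.p. pi, else Poisson(lam)."""
    active = rng.uniform(size=size) >= pi
    return active * rng.poisson(lam, size=size)

def zip_fit_moments(x):
    """Method-of-moments ZIP estimates, using
    E[X] = (1-pi)*lam and Var[X] = (1-pi)*lam*(1 + pi*lam)."""
    m, v = x.mean(), x.var()
    lam = m + v / m - 1.0
    pi = (v / m - 1.0) / lam
    return pi, lam

# Daily powerboat-launch counts with camera outages (NaN = footage gap).
observed = zip_sample(pi=0.3, lam=8.0, size=1000, rng=rng).astype(float)
observed[rng.choice(1000, size=100, replace=False)] = np.nan

complete = ~np.isnan(observed)
pi_hat, lam_hat = zip_fit_moments(observed[complete])

# Impute the gaps by drawing from the fitted ZIP distribution.
imputed = observed.copy()
imputed[~complete] = zip_sample(pi_hat, lam_hat, (~complete).sum(), rng)
print(pi_hat, lam_hat)   # ≈ 0.3, 8.0
```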
Additional modelling options were explored to guide both short- and long-term forecasting of boating activity and to support management decisions in monitoring recreational fisheries. Autoregressive conditional Poisson (ACP) and integer-valued autoregressive (INAR) models were identified as useful time series models for predicting the short-term behaviour of such data. In Western Australia, digital camera monitoring data that coincide with 12-month state-wide boat-based surveys (now conducted on a triennial basis) have been read, but the periods between the surveys have not. A Bayesian regression framework was applied to describe the temporal distribution of recreational boating effort using climatic and temporally classified variables, to help construct data for such missing periods. This can potentially provide a useful cost-saving alternative for obtaining continuous time series data on boating effort. Finally, data from digital camera monitoring are often manually interpreted, and the associated cost can be substantial, especially if multiple sites are involved. Empirical support for low-level monitoring schemes for digital cameras is provided: manual interpretation of camera footage for 40% of the days within a year was found to be an adequate level of sampling effort to obtain unbiased, precise and accurate estimates that meet broad management objectives. A well-balanced low-level monitoring scheme will ultimately reduce the cost of manual interpretation and produce unbiased estimates of recreational fishing indices from digital camera surveys.
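An INAR(1) process, one of the short-term count models mentioned, can be simulated via binomial thinning; the parameters below are invented, not fitted to the camera data.

```python
import numpy as np

def simulate_inar1(alpha, lam, n, rng):
    """INAR(1): X_t = alpha ∘ X_{t-1} + eps_t, where ∘ is binomial
    thinning (each of the X_{t-1} counts survives with probability
    alpha) and eps_t ~ Poisson(lam). With Poisson innovations the
    stationary mean is lam / (1 - alpha)."""
    x = np.zeros(n, dtype=int)
    x[0] = rng.poisson(lam / (1 - alpha))   # start near stationary mean
    for t in range(1, n):
        x[t] = rng.binomial(x[t - 1], alpha) + rng.poisson(lam)
    return x

rng = np.random.default_rng(5)
series = simulate_inar1(alpha=0.6, lam=4.0, n=5000, rng=rng)
print(series.mean())   # ≈ lam / (1 - alpha) = 10
```

The lag-1 autocorrelation of an INAR(1) series equals the thinning probability alpha, which is what makes it convenient for short-horizon forecasts of daily counts.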