4,821 research outputs found

    Dirichlet belief networks for topic structure learning

    Recently, considerable research effort has been devoted to developing deep architectures for topic models to learn topic structures. Although several deep models have been proposed to learn better topic proportions of documents, how to leverage the benefits of deep structures for learning word distributions of topics has not yet been rigorously studied. Here we propose a new multi-layer generative process on word distributions of topics, where each layer consists of a set of topics and each topic is drawn from a mixture of the topics of the layer above. As the topics in all layers can be directly interpreted by words, the proposed model is able to discover interpretable topic hierarchies. As a self-contained module, our model can be flexibly adapted to different kinds of topic models to improve their modelling accuracy and interpretability. Extensive experiments on text corpora demonstrate the advantages of the proposed model. Comment: accepted in NIPS 201
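The layered generative process in this abstract can be illustrated with a small simulation. The sketch below is only one plausible reading of "each topic is drawn from a mixture of the topics of the layer above". The function name `sample_topic_hierarchy`, the Dirichlet-distributed mixing weights, and the `scale` and `top_concentration` parameters are all assumptions made for illustration, not the authors' specification.

```python
import numpy as np

def sample_topic_hierarchy(vocab_size, layer_sizes, top_concentration=0.5,
                           scale=50.0, seed=0):
    """Sample word distributions for a stack of topic layers (illustrative only).

    layer_sizes[0] is the number of topics in the top layer; each later
    entry is the number of topics in the next layer down.
    """
    rng = np.random.default_rng(seed)
    # Top layer: topics drawn from a symmetric Dirichlet over the vocabulary.
    layers = [rng.dirichlet(np.full(vocab_size, top_concentration),
                            size=layer_sizes[0])]
    for n_topics in layer_sizes[1:]:
        above = layers[-1]
        # Hypothetical mixing weights: one weight vector over the topics
        # of the layer above for every topic in the current layer.
        weights = rng.dirichlet(np.ones(above.shape[0]), size=n_topics)
        mixture = weights @ above  # shape: (n_topics, vocab_size)
        # Each topic is a Dirichlet draw centred on its mixture; the small
        # constant keeps every concentration parameter strictly positive.
        current = np.vstack([rng.dirichlet(scale * m + 1e-3) for m in mixture])
        layers.append(current)
    return layers

layers = sample_topic_hierarchy(vocab_size=12, layer_sizes=[3, 5, 8])
for depth, topics in enumerate(layers):
    print(depth, topics.shape)
```

Every row in every layer is a distribution over the same vocabulary, which is why topics at all depths can be read directly as weighted word lists.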

    Predicting trend reversals using market instantaneous state

    Collective behaviours taking place in financial markets reveal strongly correlated states, especially during a crisis period. A natural hypothesis is that trend reversals are also driven by mutual influences between the different stock exchanges. Using a maximum entropy approach, we find coordinated behaviour during trend reversals dominated by the pairwise component. In particular, these events are predicted with high and significant accuracy by the ensemble's instantaneous state. Comment: 18 pages, 15 figures

    "Factors affecting hospital admission and recovery stay duration of in-patient motor victims in Spain"

    Hospital expenses are a major cost driver of healthcare systems in Europe, with motor injuries being the leading mechanism of hospitalizations. This paper investigates the injury characteristics which explain the hospitalization of victims of traffic accidents that took place in Spain. Using a motor insurance database with 16,081 observations, a generalized Tobit regression model is applied to analyse the factors that influence both the likelihood of being admitted to hospital after a motor collision and the length of hospital stay in the event of admission. The consistency of Tobit estimates relies on the normality of perturbation terms. Here a semi-parametric regression model was fitted to test the consistency of estimates, concluding that a normal distribution of errors cannot be rejected. Among other results, it was found that older men with fractures and injuries located in the head and lower torso are more likely to be hospitalized after the collision, and that they also have a longer expected length of hospital recovery stay. Keywords: Body injuries, Heckit estimator, semi-parametric estimator, Hausman test. JEL classification: C24, I10
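The two-part structure in this abstract (admission, then length of stay given admission) matches the textbook generalized Tobit, or sample-selection, setup. The sketch below uses standard notation. The symbols $z_i$, $x_i$, $\gamma$, $\beta$, $\rho$ and $\sigma$ are assumptions for illustration, and the paper's exact specification may differ.

```latex
\begin{aligned}
d_i^{*} &= z_i^{\top}\gamma + u_i, & d_i &= \mathbf{1}[\,d_i^{*} > 0\,] && \text{(admission)} \\
y_i &= x_i^{\top}\beta + \varepsilon_i, & & \text{observed only if } d_i = 1 && \text{(length of stay)} \\
\mathbb{E}[\,y_i \mid d_i = 1\,] &= x_i^{\top}\beta + \rho\,\sigma\,\lambda(z_i^{\top}\gamma), & \lambda(c) &= \frac{\phi(c)}{\Phi(c)}
\end{aligned}
```

Here $(u_i, \varepsilon_i)$ are assumed jointly normal with $\operatorname{Var}(u_i) = 1$, $\operatorname{Var}(\varepsilon_i) = \sigma^2$ and correlation $\rho$, and $\phi$ and $\Phi$ are the standard normal density and distribution function. The correction term $\lambda$ depends on that normality assumption, which is why the abstract tests it against a semi-parametric alternative.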

    Predicting chattering alarms: A machine learning approach

    Alarm floods represent a widespread issue for modern chemical plants. During these conditions, the number of alarms may be unmanageable, and the operator may miss safety-critical alarms. Chattering alarms, which repeatedly change between the active and non-active states, are responsible for most of the alarm records within a flood episode. Typically, chattering alarms are only addressed and removed retrospectively (e.g. during periodic performance assessments). This study proposes a machine-learning-based approach for alarm chattering prediction. Specifically, a method for dynamic chattering quantification has been developed, whose results have been used to train three different machine learning models – Linear, Deep, and Wide&Deep models. The algorithms have been employed to predict future chattering behavior based on actual plant conditions. Performance metrics have been calculated to assess the correctness of predictions and to compare the performance of the three models.
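The abstract does not say how chattering is quantified. As a rough illustration of the general idea only, the sketch below counts how often an alarm activates within a trailing time window and flags it once a threshold is reached. The function names, the 60-second window and the threshold of three activations are assumptions (a commonly quoted rule of thumb), not the method developed in the study.

```python
from collections import deque

def activation_counts(activation_times, window_seconds=60.0):
    """For each activation, count activations inside the trailing window.

    activation_times must be sorted timestamps (in seconds) at which the
    alarm switched from non-active to active.
    """
    window = deque()
    counts = []
    for t in activation_times:
        window.append(t)
        # Drop activations that are older than the window.
        while window[0] < t - window_seconds:
            window.popleft()
        counts.append(len(window))
    return counts

def is_chattering(activation_times, window_seconds=60.0, threshold=3):
    """Flag the alarm if any window holds at least `threshold` activations."""
    counts = activation_counts(sorted(activation_times), window_seconds)
    return any(c >= threshold for c in counts)

print(activation_counts([0, 10, 20, 200]))  # [1, 2, 3, 1]
print(is_chattering([0, 10, 20, 200]))      # True
```

A running count like this could act as a regression target or label for the predictive models the abstract mentions, although the study's own quantification may be defined differently.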

    ANALYSIS OF LARGE-SCALE TRAFFIC INCIDENTS AND EN ROUTE DIVERSIONS DUE TO CONGESTION ON FREEWAYS

    En route traffic diversions have been identified as one of the effective traffic operations strategies in traffic incident management. The employment of such traffic operations will help relieve congestion, save travel time, and reduce energy use and tailpipe emissions. However, little attention has been paid to quantifying the benefits of deploying such traffic operations under large-scale traffic incident-induced congestion on freeways, specifically under the connected vehicle environment. New Connected and Automated Vehicle technology, known as “CAV”, has the potential to further increase the benefits of deploying en route traffic diversions. This dissertation research is intended to study the benefits of en route traffic diversion by analyzing large-scale incident-related characteristics, as well as optimizing the signal plans under the diversion framework. The dissertation contributes to the art of traffic incident management by 1) understanding the characteristics of large-scale traffic incidents, and 2) developing a framework under the CAV environment to study the benefits of en route diversions. To this end, four studies are linked together in the dissertation. The first study focuses on the analysis of large-scale traffic incidents using traffic incident data collected on major roadways in East Tennessee. Specifically, incident classification, incident duration prediction, and sequential real-time prediction are studied in detail. The second study mainly focuses on truck-involved crashes. By incorporating injury severity information into the incident duration analysis, the second study developed a bivariate analysis framework using a unique dataset created by matching an incident database and a crash database. The third study then estimates and evaluates the benefit of deploying the en route traffic diversion strategy under large-scale traffic incident-induced congestion on freeways by using simulation models and incorporating the analysis outcomes from the other two studies. The last study optimizes the signal timing plans for two intersections, which has implications along the arterial corridor under a connected-vehicle environment for gaining further travel time savings on the studied network in Knoxville, Tennessee. The implications of the findings (e.g. faster response of agencies to large-scale incidents reduces the incident duration, and penetration of CAVs in the traffic diversion operations further reduces traffic network system delay), as well as the potential applications, are discussed in this dissertation.

    Proceedings of Abstracts Engineering and Computer Science Research Conference 2019

    © 2019 The Author(s). This is an open-access work distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. For further details please see https://creativecommons.org/licenses/by/4.0/. Note: Keynote: Fluorescence visualisation to evaluate effectiveness of personal protective equipment for infection control is © 2019 Crown copyright and so is licensed under the Open Government Licence v3.0. Under this licence users are permitted to copy, publish, distribute and transmit the Information; adapt the Information; exploit the Information commercially and non-commercially for example, by combining it with other Information, or by including it in your own product or application. Where you do any of the above you must acknowledge the source of the Information in your product or application by including or linking to any attribution statement specified by the Information Provider(s) and, where possible, provide a link to this licence: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ This book is the record of abstracts submitted and accepted for presentation at the Inaugural Engineering and Computer Science Research Conference held 17th April 2019 at the University of Hertfordshire, Hatfield, UK. This conference is a local event aiming at bringing together the research students, staff and eminent external guests to celebrate Engineering and Computer Science Research at the University of Hertfordshire. The ECS Research Conference aims to showcase the broad landscape of research taking place in the School of Engineering and Computer Science. The 2019 conference was articulated around three topical cross-disciplinary themes: Make and Preserve the Future; Connect the People and Cities; and Protect and Care.

    Use of Machine Learning and Natural Language Processing to Enhance Traffic Safety Analysis

    Despite significant advances in vehicle technologies, safety data collection and analysis, and engineering, tens of thousands of Americans die every year in motor vehicle crashes. Alarmingly, the trend of fatal and serious injury crashes appears to be heading in the wrong direction. In 2021, the actual rate of fatalities exceeded the predicted rate. This worrisome trend necessitates the development of advanced and holistic approaches to determining the causes of a crash (particularly those involving fatal and major injuries). These approaches include analyzing problems from multiple perspectives, utilizing available data sources, and employing the most suitable tools and technologies within and outside the traffic safety domain. The primary source for traffic safety analysis is the structured (also called tabular) data collected from crash reports. However, structured data may be insufficient because of missing information, incomplete sequences of events, and misclassified crash types, among many other issues. Crash narratives, a form of free text recorded by police officers to describe the unique aspects and circumstances of a crash, are commonly used by safety professionals to supplement structured data fields. Because narratives are unstructured, engineers have to manually review every crash narrative. Thanks to the rapid development of natural language processing (NLP) and machine learning (ML) techniques, text mining and analytics have become popular tools to accelerate information extraction and analysis for unstructured text data. The primary objective of this dissertation is to discover and develop the necessary tools, techniques, and algorithms to facilitate traffic safety analysis using crash narratives. 
The objectives are accomplished in three areas: enhancing data quality by recovering missed crashes through text classification, uncovering complex characteristics of collision generation through information extraction and pattern recognition, and facilitating crash narrative analysis by developing a web-based tool. First, a variety of NoisyOR classifiers were developed to identify and investigate work zone (WZ), distracted (DD), and inattentive (ID) crashes. In addition, various machine learning (ML) models, including multinomial naive Bayes (MNB), logistic regression (LGR), support vector machine (SVM), k-nearest neighbor (K-NN), random forest (RF), and gated recurrent unit (GRU), were developed and compared with NoisyOR. The comparison shows that NoisyOR is simple, computationally efficient, theoretically sound, and has one of the best model performances. Furthermore, a novel neural network architecture named Sentence-based Hierarchical Attention Network (SHAN) was developed to classify crashes, and its performance exceeds that of NoisyOR, GRU, Hierarchical Attention Network (HAN), and other ML models. SHAN handled noisy or irrelevant parts of narratives effectively, and the model results can be visualized by attention weight. Because a crash often comprises a series of actions and events, breaking the chain of events could prevent a crash from reaching its most dangerous stage. With the objectives of creating crash sequences, discovering patterns of crash events, and finding missing events, the Part-of-Speech tagging (PT), Pattern Matching with POS Tagging (PMPT), Dependency Parser (DP), and Hybrid Generalized (HGEN) algorithms were developed and thoroughly tested using crash narratives. The top performer, HGEN, uses predefined events and event-related action words from crash narratives to find new events not captured in the data fields. In addition, association analysis unravels the complex interrelations between events within a crash. 
Finally, the crash information extraction, analysis, and classification tool (CIEACT), a simple and flexible online web tool, was developed to analyze crash narratives using text mining techniques. The tool uses the Python-based Django web framework, HTML, and a relational database (PostgreSQL) that enables concurrent model development and analysis. The tool has built-in classifiers by default and can also train a model in real time given the data. The interface is user friendly, and the results can be displayed in a tabular format or on an interactive map. The tool also provides an option for users to download the words with their probability scores and the results in CSV files. The advantages and limitations of each proposed methodology are discussed, and several future research directions are outlined. In summary, the methodologies and tools developed as part of the dissertation can assist transportation engineers and safety professionals in extracting valuable information from narratives, recovering missed crashes, classifying a new crash, and expediting their review process on a large scale. Thus, this research can be used by transportation agencies to analyze crash records, identify appropriate safety solutions, and inform policy making to improve the safety of the highway transportation system.
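The NoisyOR classifiers in this abstract, and the per-word probability scores the tool exports, can be illustrated with a minimal noisy-OR text scorer. This sketch shows the general technique under simple assumptions: whitespace tokenization, per-word probabilities estimated as raw label frequencies, and hypothetical function names. It is not the dissertation's implementation.

```python
from collections import Counter

def fit_word_probabilities(narratives, labels, min_count=1):
    """Estimate P(target crash | word appears) from labelled narratives."""
    positive = Counter()
    total = Counter()
    for text, label in zip(narratives, labels):
        for word in set(text.lower().split()):
            total[word] += 1
            if label:
                positive[word] += 1
    return {w: positive[w] / n for w, n in total.items() if n >= min_count}

def noisy_or_score(text, word_probs):
    """Noisy-OR combination: 1 - product over words of (1 - p_word).

    Words not seen in training contribute nothing (probability 0).
    """
    prob_no_word_fires = 1.0
    for word in set(text.lower().split()):
        prob_no_word_fires *= 1.0 - word_probs.get(word, 0.0)
    return 1.0 - prob_no_word_fires

# Toy, made-up narratives; label 1 marks a distracted-driving crash.
narratives = ["driver texting on phone", "vehicle ran red light",
              "driver on phone hit car", "car hit deer"]
labels = [1, 0, 1, 0]
probs = fit_word_probabilities(narratives, labels)
print(noisy_or_score("car hit deer", probs))  # 0.75
```

Because the score is assembled from one probability per word, the word-level table is directly inspectable, which fits the abstract's description of the approach as simple and computationally efficient.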