Evaluating the reliability of automatically generated pedestrian and bicycle crash surrogates
Vulnerable road users (VRUs), such as pedestrians and bicyclists, are at a
higher risk of being involved in crashes with motor vehicles, and crashes
involving VRUs also are more likely to result in severe injuries or fatalities.
Signalized intersections are a major safety concern for VRUs due to their
complex and dynamic nature, highlighting the need to understand how these road
users interact with motor vehicles and deploy evidence-based countermeasures to
improve safety performance. Crashes involving VRUs are relatively infrequent,
making it difficult to understand the underlying contributing factors. An
alternative is to identify and use conflicts between VRUs and motorized
vehicles as a surrogate for safety performance. Automatically detecting these
conflicts using video-based systems is a crucial step in developing smart
infrastructure to enhance VRU safety. The Pennsylvania Department of
Transportation conducted a study using a video-based event monitoring system to
assess VRU and motor vehicle interactions at fifteen signalized intersections
across Pennsylvania to improve VRU safety performance. This research builds on
that study to assess the reliability of automatically generated surrogates in
predicting confirmed conflicts using advanced data-driven models. The surrogate
data used for analysis include automatically collectable variables such as
vehicular and VRU speeds, movements, and post-encroachment time, in addition to
manually collected variables like signal states, lighting, and weather
conditions. The findings highlight the varying importance of specific
surrogates in predicting true conflicts, some being more informative than
others. The findings can help transportation agencies collect the right
types of data to prioritize infrastructure investments, such as bike lanes
and crosswalks, and evaluate their effectiveness.
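As a rough illustration of one surrogate named in the abstract above, post-encroachment time (PET) is commonly defined as the gap between the moment the first road user leaves a conflict area and the moment the second road user enters it. The sketch below assumes that definition; the function name, the timestamps, and the 2-second threshold are hypothetical and are not taken from the study.

```python
def post_encroachment_time(first_exit_s, second_entry_s):
    """PET in seconds: second user's entry time minus first user's exit time.

    Assumes both timestamps refer to the same conflict area and share a clock.
    """
    return second_entry_s - first_exit_s


# Hypothetical timestamps (seconds): a pedestrian leaves the conflict area
# at t = 12.4 and a turning vehicle enters it at t = 13.9.
pet = post_encroachment_time(first_exit_s=12.4, second_entry_s=13.9)

# Illustrative threshold only; studies choose their own cut-offs.
is_candidate_conflict = pet < 2.0
```

Smaller PET values indicate a closer interaction; whether a given value counts as a conflict depends on the threshold an analyst chooses.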
Analyzing Granger causality in climate data with time series classification methods
Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of these studies adopt the concept of Granger causality to infer statistical cause-effect relationships while relying on traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.
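For readers unfamiliar with the traditional baseline mentioned in the abstract above, the following is a minimal sketch of a Granger-causality test built from two nested autoregressive fits. It is not the classification-based approach the article studies. The function names, the lag order, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def lagged(series, lags):
    """Stack the lag-1 .. lag-`lags` values of a 1-D series as columns."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])


def granger_f_stat(x, y, lags=2):
    """F-statistic for 'x Granger-causes y' from two nested least-squares fits."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[lags:]
    ones = np.ones((len(target), 1))
    restricted = np.hstack([ones, lagged(y, lags)])   # y's own past only
    full = np.hstack([restricted, lagged(x, lags)])   # plus x's past

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_f = rss(restricted), rss(full)
    dof = len(target) - full.shape[1]
    return ((rss_r - rss_f) / lags) / (rss_f / dof)


# Synthetic example (an assumption): y is driven by the previous value of x.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

f_x_to_y = granger_f_stat(x, y)  # expected to be large
f_y_to_x = granger_f_stat(y, x)  # expected to be small
```

A large F-statistic means that adding the past of `x` noticeably reduces the error in predicting `y`, which is the sense in which `x` is said to Granger-cause `y`.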
Multi-Rater Consensus Learning for Modeling Multiple Sparse Ratings of Affective Behaviour
The use of multiple raters to label datasets is an established practice in affective computing. The principal goal is to reduce unwanted subjective bias in the labelling process. Unfortunately, this leads to the key problem of identifying a ground truth for training the affect recognition system. This problem becomes more relevant in a sparsely-crossed annotation where each rater only labels a portion of the full dataset to ensure a manageable workload per rater. In this paper, we introduce a Multi-Rater Consensus Learning (MRCL) method which learns a representative affect recognition model that accounts for each rater's agreement with the other raters. MRCL combines a multitask learning (MTL) regularizer and a consensus loss. Unlike standard MTL, this approach allows the model to learn to predict each rater's label while explicitly accounting for the consensus among raters. We evaluated our approach on two different datasets based on spontaneous affective body movement expressions for pain behaviour detection and laughter type recognition, respectively. The two naturalistic datasets were chosen for the different forms of labelling (different in affect, observation stimuli, and raters) that they together offer for evaluating our approach. Empirical results demonstrate that MRCL is effective for modelling affect from datasets with sparsely-crossed multi-rater annotation.
Synergy of Physics-based Reasoning and Machine Learning in Biomedical Applications: Towards Unlimited Deep Learning with Limited Data
Technological advancements enable the collection of vast amounts of data, i.e., Big Data, in science and industry, including the biomedical field. Increased computational power allows expedient analysis of the collected data using statistical and machine-learning approaches. The historical data incompleteness problem and the curse of dimensionality diminish the practical value of purely data-driven approaches, especially in biomedicine. Advancements in deep learning (DL) frameworks based on deep neural networks (DNN) have improved accuracy in image recognition, natural language processing, and other applications, yet severe data limitations and/or the absence of transfer-learning-relevant problems drastically reduce the advantages of DNN-based DL. Our earlier works demonstrate that hierarchical data representation can alternatively be implemented without NN, using boosting-like algorithms to utilize existing domain knowledge, tolerate significant data incompleteness, and boost the accuracy of low-complexity models within the classifier ensemble, as illustrated in physiological-data analysis. Beyond their obvious use in initial-factor selection, existing simplified models are effectively employed to generate realistic synthetic data for later DNN pre-training. We review existing machine learning approaches, focusing on limitations caused by training-data incompleteness. We outline our hybrid framework, which leverages existing domain-expert models/knowledge, boosting-like model combination, DNN-based DL, and other machine learning algorithms to drastically reduce training-data requirements. Application of this framework is illustrated in the context of analyzing physiological data.
Understanding Electricity-Theft Behavior via Multi-Source Data
Electricity theft, the behavior that involves users conducting illegal
operations on electrical meters to avoid individual electricity bills, is a
common phenomenon in developing countries. Considering its harmfulness to
both power grids and the public, several mechanized methods have been developed
to automatically recognize electricity-theft behaviors. However, these methods,
which mainly assess users' electricity usage records, can be insufficient due
to the diversity of theft tactics and the irregularity of user behaviors.
In this paper, we propose to recognize electricity-theft behavior via
multi-source data. In addition to users' electricity usage records, we analyze
user behaviors by means of regional factors (non-technical loss) and climatic
factors (temperature) in the corresponding transformer area. By conducting
analytical experiments, we unearth several interesting patterns: for instance,
electricity thieves are likely to consume much more electrical power than
normal users, especially under extremely high or low temperatures. Motivated by
these empirical observations, we further design a novel hierarchical framework
for identifying electricity thieves. Experimental results based on a real-world
dataset demonstrate that our proposed model can achieve the best performance in
electricity-theft detection (e.g., at least +3.0% in terms of F0.5) compared
with several baselines. Last but not least, our work has been applied by the
State Grid of China and used to successfully catch electricity thieves in
Hangzhou with a precision of 15% (an improvement from 0% attained by several
other models the company employed) during monthly on-site investigation.
Comment: 11 pages, 8 figures, WWW'20 full paper
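The F0.5 score reported in the abstract above is an instance of the general F-beta measure, which weights precision more heavily than recall when beta is below 1. The sketch below shows the standard formula; the precision and recall values are made up for illustration and are not results from the paper.

```python
def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score: (1 + b^2) * P * R / (b^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # convention when both terms are zero
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical detector: 60% of flagged users are real thieves (precision),
# and 30% of all real thieves are flagged (recall).
f_half = f_beta(0.6, 0.3, beta=0.5)
f_one = f_beta(0.6, 0.3, beta=1.0)
```

With these illustrative inputs, `f_half` is higher than `f_one` because the precision-weighted score rewards a detector whose alarms are mostly correct, which matters when each alarm triggers a costly on-site investigation.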
Towards Informed Exploration for Deep Reinforcement Learning
In this thesis, we discuss various techniques for improving exploration in deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental exploration vs. exploitation trade-off. Then we review how deep RL has improved upon classical RL and summarize six categories of the latest exploration methods for deep RL, in order of increasing usage of prior information. We then explore representative works in three categories and discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based exploration via hashing, maps states to hash codes for counting and assigns more exploration to less-encountered states. The third category utilizes hierarchy and is represented by a modular architecture for RL agents that play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potential impact.
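The second category described in the abstract above, counting states through hash codes, can be sketched as follows. This is an illustrative sketch only: the use of a random-projection (SimHash-style) code, the class name `HashCounter`, and the 1/sqrt(count) bonus are assumptions, not details confirmed by the thesis abstract.

```python
from collections import defaultdict

import numpy as np


class HashCounter:
    """Counts visits per hash code and returns a bonus that decays with visits."""

    def __init__(self, state_dim, n_bits=16, bonus_scale=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Random projection: nearby states tend to share sign patterns.
        self.projection = rng.normal(size=(n_bits, state_dim))
        self.counts = defaultdict(int)
        self.bonus_scale = bonus_scale

    def code(self, state):
        """Map a state vector to a hashable code of sign bits."""
        bits = (self.projection @ np.asarray(state, dtype=float)) > 0
        return bits.tobytes()

    def bonus(self, state):
        """Record one visit and return bonus_scale / sqrt(visit count)."""
        key = self.code(state)
        self.counts[key] += 1
        return self.bonus_scale / np.sqrt(self.counts[key])


counter = HashCounter(state_dim=4)
state = [1.0, 0.0, -1.0, 2.0]
first_bonus = counter.bonus(state)   # first visit to this code
second_bonus = counter.bonus(state)  # same code, so the bonus shrinks
```

In a training loop, such a bonus would typically be added to the environment reward so that rarely visited regions of the state space look more attractive to the agent.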