A Comprehensive Survey on Rare Event Prediction
Rare event prediction involves identifying and forecasting events with a low
probability using machine learning and data analysis. Due to the imbalanced
data distributions, where the frequency of common events vastly outweighs that
of rare events, it requires using specialized methods within each step of the
machine learning pipeline, i.e., from data processing to algorithms to
evaluation protocols. Predicting the occurrences of rare events is important
for real-world applications, such as Industry 4.0, and is an active research
area in statistical and machine learning. This paper comprehensively reviews
the current approaches for rare event prediction along four dimensions: rare
event data, data processing, algorithmic approaches, and evaluation approaches.
Specifically, we consider 73 datasets from different modalities (i.e.,
numerical, image, text, and audio), four major categories of data processing,
five major algorithmic groupings, and two broader evaluation approaches. This
paper aims to identify gaps in the current literature and highlight the
challenges of predicting rare events. It also suggests potential research
directions, which can help guide practitioners and researchers.
Comment: 44 pages
A Benchmark Knowledge Graph of Driving Scenes for Knowledge Completion Tasks
Knowledge graph completion (KGC) is a problem of significant importance due to the inherent incompleteness in knowledge graphs (KGs). The current approaches for KGC using link prediction (LP) mostly rely on a common set of benchmark datasets that are quite different from real-world industrial KGs. Therefore, the adaptability of current LP methods for real-world KGs and domain-specific applications is questionable. To support the evaluation of current and future LP and KGC methods for industrial KGs, we introduce DSceneKG, a suite of real-world driving scene knowledge graphs that are currently being used across various industrial applications. DSceneKG is publicly available at: https://github.com/ruwantw/DSceneKG
An Evaluation of Knowledge Graph Embeddings for Autonomous Driving Data: Experience and Practice
The autonomous driving (AD) industry is exploring the use of knowledge graphs
(KGs) to manage the vast amount of heterogeneous data generated from vehicular
sensors. The various types of equipped sensors include video, LIDAR and RADAR.
Scene understanding is an important topic in AD which requires consideration of
various aspects of a scene, such as detected objects, events, time and
location. Recent work on knowledge graph embeddings (KGEs) - an approach that
facilitates neuro-symbolic fusion - has been shown to improve the predictive
performance of machine learning models. With the expectation that
neuro-symbolic fusion through KGEs will improve scene understanding, this
research explores the generation and evaluation of KGEs for autonomous driving
data. We also present an investigation of the relationship between the level of
informational detail in a KG and the quality of its derivative embeddings. By
systematically evaluating KGEs along four dimensions -- i.e. quality metrics,
KG informational detail, algorithms, and datasets -- we show that (1) higher
levels of informational detail in KGs lead to higher quality embeddings, (2)
type and relation semantics are better captured by the semantic transitional
distance-based TransE algorithm, and (3) some metrics, such as coherence
measure, may not be suitable for intrinsically evaluating KGEs in this domain.
Additionally, we present an (early) investigation of the usefulness of
KGEs for two use-cases in the AD domain.
Comment: 11 pages, to appear in AAAI 2020 Spring Symposium on Combining
Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020)
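For readers unfamiliar with TransE, the following is a minimal sketch of its scoring idea: a triple (head, relation, tail) is considered plausible when the head embedding plus the relation embedding lands close to the tail embedding. The entity names and the 2-dimensional vectors below are hypothetical values chosen for illustration; they are not taken from the paper or its datasets, and the L2 norm is assumed here (TransE can also be used with the L1 norm).

```python
import math

def transe_score(head, relation, tail):
    """Return the negative L2 distance ||head + relation - tail||.

    A score closer to zero means the triple is more plausible.
    """
    squared = sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    return -math.sqrt(squared)

# Hypothetical 2-d embeddings, made up for illustration only.
scene = [0.1, 0.2]
includes = [0.3, 0.1]
pedestrian = [0.4, 0.3]   # close to scene + includes
car = [0.9, -0.5]         # far from scene + includes

plausible = transe_score(scene, includes, pedestrian)
implausible = transe_score(scene, includes, car)
print(plausible > implausible)
```

In a trained model the embeddings are learned so that observed triples score higher than corrupted ones; the sketch only shows how a score is computed once embeddings exist.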
Towards Efficient Scoring of Student-generated Long-form Analogies in STEM
Switching from an analogy pedagogy based on comprehension to one based on production raises an impractical manual analogy scoring problem. Conventional symbol-matching approaches to computational analogy evaluation focus on positive cases and strain computational feasibility. This work presents the Discriminative Analogy Features (DAF) pipeline to identify the discriminative features of strong and weak long-form text analogies. We introduce four feature categories (semantic, syntactic, sentiment, and statistical) used with supervised vector-based learning methods to discriminate between strong and weak analogies. An SVM trained on a modestly sized vector of engineered features attains a 0.67 macro F1 score. While a semantic feature is the most discriminative, most of the top 15 discriminative features are syntactic. Combining these engineered features with an ELMo-generated embedding still improves classification relative to an embedding alone. While an unsupervised K-Means clustering-based approach falls short, similar hints of improvement appear when inputs include the engineered features used in supervised learning.
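As a reminder of what the macro F1 metric in the abstract above measures, here is a small sketch that computes it from scratch: F1 is calculated separately for each class and the per-class values are averaged with equal weight. The labels and predictions are toy values invented for illustration; they are not the paper's data and do not reproduce its reported score.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)

# Toy labels, made up for illustration only.
y_true = ["strong", "strong", "weak", "weak", "weak", "weak"]
y_pred = ["strong", "weak", "weak", "weak", "weak", "strong"]

score = macro_f1(y_true, y_pred)
print(score)  # per-class F1 is 0.5 (strong) and 0.75 (weak), so 0.625
```

Because each class counts equally, macro F1 penalizes poor performance on the smaller class more than accuracy would.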
Tutorial: Knowledge-infused Learning for Autonomous Driving (KL4AD)
Autonomous Driving (AD) is considered a testbed for tackling many hard AI problems. Despite the recent advancements in the field, AD is still far from achieving full autonomy due to core technical problems inherent in AD. The emerging field of neuro-symbolic AI and the methods for knowledge-infused learning are showing exciting ways of leveraging external knowledge within machine/deep learning solutions, with potential benefits for interpretability, explainability, robustness, and transferability. In this tutorial, we will examine the use of knowledge-infused learning for three core state-of-the-art technical achievements within the AD domain. With a collaborative team from both academia and industry, we will demonstrate recent innovations using real-world datasets.