Neurosymbolic AI for Reasoning on Graph Structures: A Survey
Neurosymbolic AI is an increasingly active area of research which aims to
combine symbolic reasoning methods with deep learning to generate models with
both high predictive performance and some degree of human-level
comprehensibility. As knowledge graphs are becoming a popular way to represent
heterogeneous and multi-relational data, methods for reasoning on graph
structures have attempted to follow this neurosymbolic paradigm. Traditionally,
such approaches have utilized either rule-based inference or generated
representative numerical embeddings from which patterns could be extracted.
However, several recent studies have attempted to bridge this dichotomy in ways
that facilitate interpretability, maintain performance, and integrate expert
knowledge. Within this article, we survey a breadth of methods that perform
neurosymbolic reasoning tasks on graph structures. To better compare the
various methods, we propose a novel taxonomy by which we can classify them.
Specifically, we propose three major categories: (1) logically-informed
embedding approaches, (2) embedding approaches with logical constraints, and
(3) rule-learning approaches. Alongside the taxonomy, we provide a tabular
overview of the approaches and links to their source code, if available, for
more direct comparison. Finally, we discuss the applications on which these
methods were primarily used and propose several prospective directions toward
which this new field of research could evolve.
Comment: 21 pages, 8 figures, 1 table, currently under review. Corresponding
GitHub page here: https://github.com/NeSymGraph
Text Classification: A Review, Empirical, and Experimental Evaluation
The explosive and widespread growth of data necessitates the use of text
classification to extract crucial information from vast amounts of data.
Consequently, there has been a surge of research in both classical and deep
learning text classification methods. Despite the numerous methods proposed in
the literature, there is still a pressing need for a comprehensive and
up-to-date survey. Existing survey papers categorize algorithms for text
classification into broad classes, which can lead to the misclassification of
unrelated algorithms and incorrect assessments of their qualities and behaviors
using the same metrics. To address these limitations, our paper introduces a
novel methodological taxonomy that classifies algorithms hierarchically into
fine-grained classes and specific techniques. The taxonomy includes methodology
categories, methodology techniques, and methodology sub-techniques. Our study
is the first survey to utilize this methodological taxonomy for classifying
algorithms for text classification. Furthermore, our study also conducts
empirical evaluation and experimental comparisons and rankings of different
algorithms that employ the same specific sub-technique, different
sub-techniques within the same technique, different techniques within the same
category, and categorie
Efficient Indexing for Structured and Unstructured Data
The collection of digital data is growing at an exponential rate. Data originates from a wide range of sources such as text feeds, biological sequencers, internet traffic over routers, sensors, and many others. To mine intelligent information from these sources, users have to query the data. Indexing techniques aim to reduce query time by preprocessing the data. The diversity of real-world data sources makes it imperative to develop application-specific indexing solutions based on the data to be queried. Data can be structured, i.e., relational tables, or unstructured, i.e., free text. Moreover, a growing number of applications need to seamlessly analyze both kinds of data, making data integration a central issue. Integrating text with structured data needs to account for missing values, errors in the data, etc. Probabilistic models have recently been proposed for this purpose. These models are also useful for applications where uncertainty is inherent in the data, e.g., sensor networks. This dissertation aims to propose efficient indexing solutions for several problems that lie at the intersection of databases and information retrieval, such as joining ranked inputs, full-text document searching, etc. Other well-known problems of ranked retrieval and pattern matching are also studied under probabilistic settings. For each problem, the worst-case theoretical bounds of the proposed solutions are established and/or their practicality is demonstrated by thorough experimentation.
Fuzzy expert systems in civil engineering
Representation Learning for Natural Language Processing
This open access book provides an overview of the recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It is divided into three parts. Part I presents the representation learning techniques for multiple language entries, including words, phrases, sentences and documents. Part II then introduces the representation techniques for those objects that are closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, networks, and cross-modal entries. Lastly, Part III provides open resource tools for representation learning techniques, and discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented can also benefit other related domains such as machine learning, social network analysis, semantic Web, information retrieval, data mining and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing.
Enhancing maritime defence and security through persistently autonomous operations and situation awareness systems
This thesis is concerned with autonomous operations with Autonomous Underwater Vehicles (AUVs) and maritime situation awareness in the context of enhancing maritime defence and security. The problem of autonomous operations with AUVs is one of persistence. That is, AUVs get stuck due to a lack of cognitive ability to deal with a situation and require intervention from a human operator. This thesis focuses on addressing vehicle subsystem failures and changes in high-level mission priorities in a manner that preserves autonomy during Mine Countermeasures (MCM) operations in unknown environments. This is not a trivial task. The approach followed utilizes ontologies for representing knowledge about the operational environment, the vehicle, and mission planning and execution. Reasoning about the vehicle capabilities, and consequently the actions it can execute, is continuous and occurs in real time. Vehicle component faults are incorporated into the reasoning process as a means of driving adaptive planning and execution. Adaptive planning is based on a Planning Domain Definition Language (PDDL) planner. Adaptive execution is prioritized over adaptive planning, as mission planning can be very demanding in terms of computational resources. Changes in high-level mission priorities are also addressed as part of the adaptive planning behaviour of the system. The main contribution of this thesis regarding persistently autonomous operations is an ontological framework that drives adaptive behaviour for increasing the persistent autonomy of AUVs in unexpected situations, that is, when vehicle component faults threaten to put the mission at risk and changes in high-level mission priorities should be incorporated as part of decision making. Building maritime situation awareness for maritime security is a very difficult task.
The most prominent challenges are the high volumes of information gathered from various sources, their efficient fusion taking into consideration any contradictions, and the requirement for reliable decision making and (re)action under potentially multiple interpretations of a situation. To address those challenges and help alleviate the burden on the humans who usually undertake such tasks, this thesis is concerned with maritime situation awareness built with Markov Logic Networks (MLNs) that support humans in their decision making. However, maritime situation awareness systems commonly rely on human experts to transfer their knowledge into the system before it can be deployed. In that respect, a promising alternative, training MLNs with data, is presented. In addition, an in-depth evaluation of their performance is provided, during which the significance of interpreting an unfolding situation in context is demonstrated. To the best of the author’s knowledge, it is the first time that MLNs are trained with data and evaluated using cross validation in the context of building maritime situation awareness for maritime security.
Relational clustering models for knowledge discovery and recommender systems
Cluster analysis is a fundamental research field in Knowledge Discovery and Data Mining
(KDD). It aims at partitioning a given dataset into some homogeneous clusters so as
to reflect the natural hidden data structure. Various heuristic or statistical approaches
have been developed for analyzing propositional datasets. Nevertheless, in relational
clustering the existence of multi-type relationships will greatly degrade the performance
of traditional clustering algorithms. This issue motivates us to find more effective algorithms
to conduct the cluster analysis upon relational datasets. In this thesis we
comprehensively study the idea of Representative Objects for approximating data distribution
and then design a multi-phase clustering framework for analyzing relational
datasets with high effectiveness and efficiency.
The second task considered in this thesis is to provide some better data models for
people as well as machines to browse and navigate a dataset. The hierarchical taxonomy
is widely used for this purpose. Compared with manually created taxonomies, automatically
derived ones are more appealing because of their low creation/maintenance cost
and high scalability. Up to now, taxonomy generation techniques have mainly been used
to organize document corpora. We investigate the possibility of utilizing them upon relational
datasets and then propose some algorithmic improvements. Another non-trivial
problem is how to assign suitable labels for the taxonomic nodes so as to credibly summarize
the content of each node. Unfortunately, this field has not been investigated
sufficiently to the best of our knowledge, and so we attempt to fill the gap by proposing
some novel approaches.
The final goal of our cluster analysis and taxonomy generation techniques is
to improve the scalability of recommender systems that are developed to tackle the
problem of information overload. Recent research in recommender systems integrates
the exploitation of domain knowledge to improve the recommendation quality, which
however reduces the scalability of the whole system at the same time. We address this
issue by applying the automatically derived taxonomy to preserve the pair-wise similarities
between items, and then modeling the user visits by another hierarchical structure.
Experimental results show that the computational complexity of the recommendation
procedure can be greatly reduced and thus the system scalability improved.