Visual Question Answering: A Survey of Methods and Datasets
Visual Question Answering (VQA) is a challenging task that has received
increasing attention from both the computer vision and the natural language
processing communities. Given an image and a question in natural language, it
requires reasoning over visual elements of the image and general knowledge to
infer the correct answer. In the first part of this survey, we examine the
state of the art by comparing modern approaches to the problem. We classify
methods by their mechanism to connect the visual and textual modalities. In
particular, we examine the common approach of combining convolutional and
recurrent neural networks to map images and questions to a common feature
space. We also discuss memory-augmented and modular architectures that
interface with structured knowledge bases. In the second part of this survey,
we review the datasets available for training and evaluating VQA systems. The
various datasets contain questions at different levels of complexity, which
require different capabilities and types of reasoning. We examine in depth the
question/answer pairs from the Visual Genome project, and evaluate the
relevance of the structured annotations of images with scene graphs for VQA.
Finally, we discuss promising future directions for the field, in particular
the connection to structured knowledge bases and the use of natural language
processing models.
Comment: 25 pages
Knowledge-based Biomedical Data Science 2019
Knowledge-based biomedical data science (KBDS) involves the design and
implementation of computer systems that act as if they knew about biomedicine.
Such systems depend on formally represented knowledge in computer systems,
often in the form of knowledge graphs. Here we survey the progress in the last
year in systems that use formally represented knowledge to address data science
problems in both clinical and biological domains, as well as on approaches for
creating knowledge graphs. Major themes include the relationships between
knowledge graphs and machine learning, the use of natural language processing,
and the expansion of knowledge-based approaches to novel domains, such as
Chinese Traditional Medicine and biodiversity.
Comment: Manuscript 43 pages with 3 tables; Supplemental material 43 pages
with 3 tables
Logic Programming Applications: What Are the Abstractions and Implementations?
This article presents an overview of applications of logic programming,
classifying them based on the abstractions and implementations of logic
languages that support the applications. The three key abstractions are join,
recursion, and constraint. Their essential implementations are for-loops, fixed
points, and backtracking, respectively. The corresponding kinds of applications
are database queries, inductive analysis, and combinatorial search,
respectively. We also discuss language extensions and programming paradigms,
summarize example application problems by application areas, and touch on
example systems that support variants of the abstractions with different
implementations.
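The three pairings named in this abstract (join with for-loops, recursion with fixed points, constraint with backtracking) can be illustrated with a small sketch. This is not taken from the article: the function names, relations, and the choice of N-queens as the constraint problem are assumptions made purely for illustration.

```python
# Illustrative sketch only: relation names and example problems are invented here.

def join(parent, lives_in):
    """Join as nested for-loops: pair each child with the city of its parent."""
    return {(child, city)
            for (p, child) in parent
            for (q, city) in lives_in
            if p == q}


def transitive_closure(edges):
    """Recursion as a fixed point: derive new facts until nothing new appears."""
    closure = set(edges)
    while True:
        derived = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if derived <= closure:  # fixed point reached
            return closure
        closure |= derived


def n_queens(n, placed=()):
    """Constraint solving as backtracking: extend a partial placement row by row.

    placed[r] is the column of the queen in row r.
    """
    row = len(placed)
    if row == n:
        return placed
    for col in range(n):
        # Constraint: no shared column and no shared diagonal with earlier queens.
        if all(col != c and abs(col - c) != row - r for r, c in enumerate(placed)):
            solution = n_queens(n, placed + (col,))
            if solution is not None:
                return solution
    return None  # dead end: the caller moves on to its next column
```

In a logic language each of these would be a few declarative rules; the Python versions only make the underlying implementation strategy explicit.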
Leveraging Large Language Models (LLMs) for Process Mining (Technical Report)
This technical report describes the intersection of process mining and large
language models (LLMs), specifically focusing on the abstraction of traditional
and object-centric process mining artifacts into textual format. We introduce
and explore various prompting strategies: direct answering, where the large
language model directly addresses user queries; multi-prompt answering, which
allows the model to incrementally build on the knowledge obtained through a
series of prompts; and the generation of database queries, facilitating the
validation of hypotheses against the original event log.
Our assessment considers two large language models, GPT-4 and Google's Bard,
under various contextual scenarios across all prompting strategies. Results
indicate that these models exhibit a robust understanding of key process mining
abstractions, with notable proficiency in interpreting both declarative and
procedural process models.
In addition, we find that both models demonstrate strong performance in the
object-centric setting, which could significantly propel the advancement of the
object-centric process mining discipline.
These models also display a noteworthy capacity to evaluate various
concepts of fairness in process mining. This opens the door to more rapid and
efficient assessments of the fairness of process mining event logs, which has
significant implications for the field.
The integration of these large language models into process mining
applications may open new avenues for exploration, innovation, and insight
generation in the field.
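As a rough illustration of what abstracting a process mining artifact into textual format could look like, the sketch below computes directly-follows counts from a toy event log and renders them as a plain-text prompt. The report does not specify this exact encoding: the function names, the log, and the prompt wording are all assumptions, and no LLM is called.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity a is immediately followed by b within a case."""
    counts = Counter()
    for trace in log.values():
        counts.update(zip(trace, trace[1:]))
    return counts


def to_prompt(counts, question):
    """Render the counts as text, most frequent first, and append the question."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    lines = [f"{a} -> {b} (frequency = {n})" for (a, b), n in ordered]
    return ("Directly-follows relations of the process:\n"
            + "\n".join(lines)
            + f"\n\nQuestion: {question}")


# Hypothetical event log: case identifier -> ordered list of activities.
log = {
    "case1": ["register", "check", "approve"],
    "case2": ["register", "check", "reject"],
}

prompt = to_prompt(directly_follows(log), "Where does the process diverge?")
```

The resulting string corresponds to the direct answering strategy; the multi-prompt strategy would send several such abstractions in sequence.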
From Text to Knowledge with Graphs: modelling, querying and exploiting textual content
This paper highlights the challenges, current trends, and open issues related
to the representation, querying and analytics of content extracted from texts.
The internet contains vast text-based information on various subjects,
including commercial documents, medical records, scientific experiments,
engineering tests, and events that impact urban and natural environments.
Extracting knowledge from this text involves understanding the nuances of
natural language and accurately representing the content without losing
information. This allows knowledge to be accessed, inferred, or discovered. To
achieve this, combining results from various fields, such as linguistics,
natural language processing, knowledge representation, data storage, querying,
and analytics, is necessary. The vision in this paper is that graphs can be a
well-suited text content representation once annotated and the right querying
and analytics techniques are applied. This paper discusses this hypothesis from
the perspective of linguistics, natural language processing, graph models and
databases and artificial intelligence provided by the panellists of the DOING
session in the MADICS Symposium 2022.
Bring Your Own KG: Self-Supervised Program Synthesis for Zero-Shot KGQA
We present BYOKG, a universal question-answering (QA) system that can operate
on any knowledge graph (KG), requires no human-annotated training data, and can
be ready to use within a day -- attributes that are out-of-scope for current
KGQA systems. BYOKG draws inspiration from the remarkable ability of humans to
comprehend information present in an unseen KG through exploration -- starting
at random nodes, inspecting the labels of adjacent nodes and edges, and
combining them with their prior world knowledge. In BYOKG, exploration
leverages an LLM-backed symbolic agent that generates a diverse set of
query-program exemplars, which are then used to ground a retrieval-augmented
reasoning procedure to predict programs for arbitrary questions. BYOKG is
effective over both small- and large-scale graphs, showing dramatic gains in QA
accuracy over a zero-shot baseline of 27.89 and 58.02 F1 on GrailQA and MetaQA,
respectively. On GrailQA, we further show that our unsupervised BYOKG
outperforms a supervised in-context learning method, demonstrating the
effectiveness of exploration. Lastly, we find that performance of BYOKG
reliably improves with continued exploration as well as improvements in the
base LLM, notably outperforming a state-of-the-art fine-tuned model by 7.08 F1
on a sub-sampled zero-shot split of GrailQA.
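The exploration idea described here (start at a random node, then inspect the labels of adjacent nodes and edges) can be sketched as a random walk over a toy graph. This is not the BYOKG implementation: the graph, the `explore` function, and its parameters are hypothetical, and the LLM-backed agent and program synthesis steps are omitted.

```python
import random

# Toy knowledge graph, invented for this sketch:
# node -> list of (edge_label, neighbour) pairs.
KG = {
    "Inception": [("directed_by", "Christopher Nolan"), ("released_in", "2010")],
    "Christopher Nolan": [("born_in", "London")],
    "London": [],
    "2010": [],
}


def explore(kg, max_hops, rng):
    """Start at a random node and follow random outgoing edges.

    Returns an alternating list [node, edge_label, node, ...] that records
    the labels seen along the walk.
    """
    node = rng.choice(sorted(kg))
    path = [node]
    for _ in range(max_hops):
        edges = kg[node]
        if not edges:  # no outgoing edges: the walk ends early
            break
        label, node = rng.choice(edges)
        path += [label, node]
    return path


walk = explore(KG, max_hops=2, rng=random.Random(0))
```

In a system like the one described, many such walks would presumably be collected and turned into query-program exemplars; here the walk is simply returned as a list of labels.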