Linguistic based matching of local ontologies
This paper describes an automatic meaning-negotiation algorithm that enables semantic interoperability between local, overlapping, and heterogeneous ontologies. Rather than reconciling differences between heterogeneous ontologies, the algorithm searches for mappings between concepts of different ontologies. It is composed of three main steps: (i) computing the linguistic meaning of the labels occurring in the ontologies via natural language processing; (ii) contextualizing this linguistic meaning by considering the context, i.e. the ontologies, in which a label occurs; (iii) comparing the contextualized linguistic meanings of two ontologies in order to find possible matchings between them.
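The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's algorithm: the toy synonym lexicon and the overlap-based contextualization rule are assumptions standing in for a full NLP pipeline.

```python
# Sketch of the three matching steps, using a toy synonym lexicon
# (hypothetical data) in place of a real lexical resource.
TOY_SYNSETS = {
    "image": {"image", "picture", "photo"},
    "picture": {"image", "picture", "photo"},
    "document": {"document", "file", "record"},
}

def linguistic_meaning(label):
    """Step (i): map a label to its set of candidate senses."""
    return TOY_SYNSETS.get(label.lower(), {label.lower()})

def contextualize(label, ontology_labels):
    """Step (ii): narrow the senses using the surrounding ontology
    context (here: keep senses that overlap any sibling label)."""
    senses = linguistic_meaning(label)
    context = set()
    for other in ontology_labels:
        if other != label:
            context |= linguistic_meaning(other)
    narrowed = senses & context
    return narrowed or senses

def match(label_a, ont_a, label_b, ont_b):
    """Step (iii): two concepts match if their contextualized
    meanings share at least one sense."""
    return bool(contextualize(label_a, ont_a) & contextualize(label_b, ont_b))
```

For example, `match("image", ["image", "document"], "picture", ["picture", "file"])` succeeds because "image" and "picture" share senses, while "image" and "document" do not match.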
QAKiS @ QALD-2
We present QAKiS, a system for Question Answering over linked data (in particular, DBpedia). The problem of question interpretation is addressed as the automatic identification of the set of relevant relations between entities in the natural language input question, matched against a repository of automatically collected relational patterns (i.e. the WikiFramework repository). Such patterns represent possible lexicalizations of ontological relations and are associated with a SPARQL query derived from the linked data relational patterns. Wikipedia is used as the source of free text for the automatic extraction of the relational patterns, and DBpedia as the linked data resource that provides relational patterns and is queried through a natural language interface.
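The pattern-to-SPARQL idea can be sketched as follows. The two patterns and the property names are illustrative stand-ins, not entries from the actual WikiFramework repository, and no query is sent to a live endpoint.

```python
import re

# Hypothetical mini-repository of relational patterns: each lexical
# pattern names a DBpedia property, and a matched question is mapped
# to a SPARQL query string.
PATTERNS = [
    (re.compile(r"who wrote (?P<ent>.+)\?", re.I), "dbo:author"),
    (re.compile(r"where was (?P<ent>.+) born\?", re.I), "dbo:birthPlace"),
]

def question_to_sparql(question):
    """Match the question against the pattern repository and build
    the corresponding SPARQL query, or return None on no match."""
    for pattern, prop in PATTERNS:
        m = pattern.match(question)
        if m:
            entity = m.group("ent").strip().replace(" ", "_")
            return f"SELECT ?x WHERE {{ dbr:{entity} {prop} ?x }}"
    return None
```

For instance, `question_to_sparql("Who wrote The Hobbit?")` yields a query over `dbr:The_Hobbit` and `dbo:author`.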
On the effects of combining Latent Semantic Analysis with other natural language processing techniques for the assessment of open-ended questions
This article presents the combination of Latent Semantic Analysis (LSA) with other natural language processing techniques (stemming, removal of closed-class words, and word sense disambiguation) to improve the automatic assessment of students' free-text answers. The combinational schema has been tested in the experimental framework provided by the free-text Computer Assisted Assessment (CAA) system called Atenea (Alfonseca & Pérez, 2004). This system is able to ask the student an open-ended question, chosen randomly or according to the student's profile, and then assign a score to the answer. The results prove that for all datasets, when the NLP techniques are combined with LSA, the Pearson correlation between the scores given by Atenea and the scores given by the teachers for the same set of questions improves. We believe that this is due to the complementarity between LSA, which works at a shallow semantic level, and the rest of the NLP techniques used in Atenea, which are more focused on the lexical and syntactic levels.
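The LSA side of such a schema can be sketched with plain numpy: build a term-document matrix, truncate its SVD, and compare answers by cosine similarity in the latent space. The toy documents are assumptions for illustration; in a real pipeline the preprocessing techniques mentioned above would be applied to the texts first.

```python
import numpy as np

# Toy corpus: a reference answer and two student answers.
docs = [
    "the algorithm sorts the list",   # reference answer
    "the method orders the list",     # paraphrased student answer
    "cats like warm places",          # unrelated student answer
]
vocab = sorted({w for d in docs for w in d.split()})
# Term-document count matrix (terms on rows, documents on columns).
X = np.array([[d.split().count(w) for d in docs] for w in vocab], float)

# Truncated SVD: keep the top-k latent dimensions.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T   # documents in latent space

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_good = cos(doc_vecs[0], doc_vecs[1])  # reference vs paraphrase
sim_bad = cos(doc_vecs[0], doc_vecs[2])   # reference vs unrelated
```

The paraphrased answer ends up closer to the reference than the unrelated one, which is the signal a grader like Atenea can turn into a score.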
Automatic assessment of students’ free-text answers underpinned by the combination of a BLEU-inspired algorithm and latent semantic analysis
This is an electronic version of the paper presented at the International Florida Artificial Intelligence Research Society Conference, FLAIRS 2005. In previous work we have proved that the BLEU algorithm (Papineni et al. 2001), originally devised for evaluating Machine Translation systems, can be applied to assessing short essays written by students. In this paper we present a comparative evaluation between this BLEU-inspired algorithm and a system based on Latent Semantic Analysis, and in addition we propose an effective combination schema for them. Despite the simplicity of these shallow NLP methods, they achieve state-of-the-art correlations with the teachers' scores while remaining language-independent and requiring no domain-specific knowledge. This work has been sponsored by the Spanish Ministry of Science and Technology, project number TIN2004-03140.
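The core of a BLEU-style scorer is modified n-gram precision combined with a brevity penalty. The sketch below is a minimal single-reference variant, not the authors' exact algorithm; the smoothing constant and `max_n=2` are assumptions.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    """BLEU-style score of a student answer against one reference:
    geometric mean of clipped n-gram precisions times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        c, r = ngrams(cand, n), ngrams(ref, n)
        overlap = sum(min(c[g], r[g]) for g in c)   # clipped counts
        total = max(sum(c.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # smoothed
    # Brevity penalty discourages answers shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An answer identical to the reference scores 1.0, a truncated answer is penalized by the brevity term, and an unrelated answer scores near zero.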
Simple is Better! Lightweight Data Augmentation for Low Resource Slot Filling and Intent Classification
Neural-based models have achieved outstanding performance on slot filling and intent classification when fairly large in-domain training data are available. However, as new domains are frequently added, creating sizeable data is expensive. We show that lightweight augmentation, a set of augmentation methods involving word-span and sentence-level operations, alleviates data scarcity problems. Our experiments on limited data settings show that lightweight augmentation yields significant performance improvement on slot filling on the ATIS and SNIPS datasets, and achieves competitive performance with respect to more complex, state-of-the-art augmentation approaches. Furthermore, lightweight augmentation is also beneficial when combined with pre-trained LM-based models, as it improves BERT-based joint intent and slot filling models.
Comment: Accepted at PACLIC 2020 - The 34th Pacific Asia Conference on Language, Information and Computation
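Two word-span operations in this spirit can be sketched as follows. These particular operations (O-token deletion and slot-value substitution) are illustrative examples of lightweight augmentation, not necessarily the paper's exact set; the key property shown is that token-level slot labels stay aligned.

```python
import random

def delete_span(tokens, labels, rng):
    """Drop one random token labelled O (outside any slot), keeping
    tokens and BIO labels aligned."""
    o_idx = [i for i, l in enumerate(labels) if l == "O"]
    if not o_idx:
        return tokens, labels
    i = rng.choice(o_idx)
    return tokens[:i] + tokens[i + 1:], labels[:i] + labels[i + 1:]

def substitute_slot(tokens, labels, slot, alternatives, rng):
    """Replace a slot's value with another value of the same type,
    re-emitting B-/I- labels for the new (possibly multi-word) value."""
    out_t, out_l = [], []
    replaced = False
    for t, l in zip(tokens, labels):
        if l == f"B-{slot}" and not replaced:
            new = rng.choice(alternatives).split()
            out_t += new
            out_l += [f"B-{slot}"] + [f"I-{slot}"] * (len(new) - 1)
            replaced = True
        elif l == f"I-{slot}" and replaced:
            continue  # old value's continuation tokens are dropped
        else:
            out_t.append(t)
            out_l.append(l)
    return out_t, out_l
```

For example, substituting the city slot of "book a flight to new york" with "boston" yields "book a flight to boston" with a single B-city label, still perfectly aligned.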
MedExpDial: Machine-to-Machine Generation of Explanatory Dialogues for Medical QA
We describe a pilot study on generating synthetic explanatory dialogues for the medical domain, based on a pre-existing medical dataset of multiple-choice questions with human-written explanations. We use an instruction-tuned large language model (LLM) to generate dialogues between a medical student and a teacher/doctor helping to answer questions about clinical cases. We inject varying degrees of background knowledge into the teacher prompt and analyze the effectiveness of these dialogues in terms of whether the student is able to get to the correct answer and in how many turns. This method has potential applications in developing and evaluating argument-based explanations.
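The experimental setup can be sketched as a prompt builder plus a turn-counting loop. Everything here is a hypothetical harness: the knowledge levels, the prompt wording, and the stubbed student function all stand in for the real LLM calls.

```python
# Illustrative levels of background knowledge injected into the
# teacher prompt (names and wording are assumptions).
KNOWLEDGE_LEVELS = {
    "none": "",
    "question_only": "You know the question and its answer options.",
    "full": "You know the question, options, and the written explanation.",
}

def teacher_prompt(case, level):
    """Build the teacher/doctor prompt with the chosen knowledge level."""
    return (f"You are a doctor helping a medical student.\n"
            f"{KNOWLEDGE_LEVELS[level]}\nCase: {case}")

def run_dialogue(student_step, case, level, max_turns=5):
    """Return the 1-based turn at which the student answers correctly,
    or None if the turn budget is exhausted. `student_step` stands in
    for one student/teacher LLM exchange."""
    prompt = teacher_prompt(case, level)
    for turn in range(1, max_turns + 1):
        if student_step(prompt, turn):
            return turn
    return None
```

Running this over the dataset with each knowledge level gives exactly the two measures the study analyzes: whether the student reaches the correct answer, and in how many turns.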
A Robust Data-Driven Approach for Dialogue State Tracking of Unseen Slot Values
A Dialogue State Tracker is a key component in dialogue systems which estimates the beliefs of possible user goals at each dialogue turn. Deep learning approaches using recurrent neural networks have shown state-of-the-art performance for the task of dialogue state tracking. Generally, these approaches assume a predefined candidate list and struggle to predict any new dialogue state values that are not seen during training. This makes extending the candidate list for a slot without model retraining infeasible, and it also limits modelling for low-resource domains where training data for slot values are expensive. In this paper, we propose a novel dialogue state tracker based on a copying mechanism that can effectively track such unseen slot values without compromising performance on slot values seen during training. The proposed model is also flexible in extending the candidate list without requiring any retraining or change in the model. We evaluate the proposed model on various benchmark datasets (DSTC2, DSTC3 and WoZ2.0) and show that our approach outperforms other end-to-end data-driven approaches in tracking unseen slot values and also provides significant advantages in modelling for DST.
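The copying intuition is that the tracker points at a token in the user utterance rather than classifying over a closed value list, so any value present in the input stays predictable. Below is a toy pointer sketch with random unit-normalized embeddings standing in for learned representations; it is not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["i", "want", "thai", "food", "please"]
# Random unit vectors as stand-ins for learned token representations.
emb = {w: v / np.linalg.norm(v)
       for w, v in ((w, rng.normal(size=8)) for w in vocab)}

def copy_value(utterance, slot_query):
    """Softmax-normalized dot-product attention over utterance tokens;
    the argmax token is 'copied' as the slot value."""
    tokens = utterance.split()
    scores = np.array([emb[t] @ slot_query for t in tokens])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return tokens[int(np.argmax(probs))]

# Querying with the representation of "thai" points at "thai" itself,
# even if "thai" was never in any candidate list.
value = copy_value("i want thai food please", emb["thai"])
```

Because the output is always a span of the input, extending the candidate list requires no retraining: new values are copied the moment they appear in an utterance.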
Scalable Neural Dialogue State Tracking
A Dialogue State Tracker (DST) is a key component in a dialogue system aiming at estimating the beliefs of possible user goals at each dialogue turn. Most current DST trackers make use of recurrent neural networks and are based on complex architectures that manage several aspects of a dialogue, including the user utterance, the system actions, and the slot-value pairs defined in a domain ontology. However, the complexity of such neural architectures incurs considerable latency in the dialogue state prediction, which limits the deployment of the models in real-world applications, particularly when task scalability (i.e. the number of slots) is a crucial factor. In this paper, we propose an innovative neural model for dialogue state tracking, named Global encoder and Slot-Attentive decoders (G-SAT), which can predict the dialogue state with a very low latency time while maintaining high-level performance. We report experiments on three different languages (English, Italian, and German) of the WoZ2.0 dataset, and show that the proposed approach provides competitive advantages over state-of-the-art DST systems, both in terms of accuracy and in terms of time complexity for predictions, being over 15 times faster than the other systems.
Comment: 8 pages, 3 figures, Accepted at ASRU 2019
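The structural idea behind such a design can be sketched in numpy: the utterance is encoded once, and each slot gets only a cheap attention-based decoder over that shared encoding, so adding slots does not re-run the encoder. All vectors here are random stand-ins, not the G-SAT model itself.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
# Shared encoder output: one vector per utterance token, computed once.
token_vecs = rng.normal(size=(5, d))

def slot_decoder(token_vecs, slot_attn, values):
    """Attend over the shared encoding with a slot-specific vector,
    then score candidate values against the pooled representation."""
    w = np.exp(token_vecs @ slot_attn)          # attention weights
    pooled = (w[:, None] * token_vecs).sum(0) / w.sum()
    scores = {v: float(vec @ pooled) for v, vec in values.items()}
    return max(scores, key=scores.get)

# Each slot only adds one attention vector and one value scorer;
# the expensive encoding above is shared by all of them.
food_values = {"thai": rng.normal(size=d), "italian": rng.normal(size=d)}
pred = slot_decoder(token_vecs, rng.normal(size=d), food_values)
```

This separation is what makes prediction latency grow only mildly with the number of slots: the per-slot work is a single attention pass, not a full re-encoding of the dialogue.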
Simulating Domain Changes in Conversational Agents Through Dialogue Adaptation
A major bottleneck for the large-scale diffusion of data-driven conversational agents is that conversational domains are subject to continuous changes, which soon make initial dialogue models inadequate for managing new situations. In the current context, updating training data is usually carried out manually, and, in addition, there are no tools for simulating the impact of a given domain change on the performance of the dialogue system. This position paper advocates that substantial progress in the capacity to simulate domain changes depends on the ability to automatically adapt training and test dialogues to those changes. We discuss the potential of a simulation framework for task-oriented dialogues, as well as the research challenges that need to be addressed.
Domain-Aware Dialogue State Tracker for Multi-Domain Dialogue Systems
In task-oriented dialogue systems the dialogue state tracker (DST) component is responsible for predicting the state of the dialogue based on the dialogue history. Current DST approaches rely on a predefined domain ontology, a fact that limits their effective usage for large-scale conversational agents, where the DST constantly needs to be interfaced with ever-increasing services and APIs. To overcome this drawback, we propose a domain-aware dialogue state tracker that is completely data-driven and is modeled to predict for dynamic service schemas. The proposed model utilizes domain and slot information to extract both domain- and slot-specific representations for a given dialogue, and then uses such representations to predict the value of the corresponding slot. Integrating this mechanism with a pretrained language model (i.e. BERT), our approach can effectively learn semantic relations
