296 research outputs found
Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation
Existing approaches to automatic VerbNet-style verb classification are
heavily dependent on feature engineering and therefore limited to languages
with mature NLP pipelines. In this work, we propose a novel cross-lingual
transfer method for inducing VerbNets for multiple languages. To the best of
our knowledge, this is the first study which demonstrates how the architectures
for learning word embeddings can be applied to this challenging
syntactic-semantic task. Our method uses cross-lingual translation pairs to tie
each of the six target languages into a bilingual vector space with English,
jointly specialising the representations to encode the relational information
from English VerbNet. A standard clustering algorithm is then run on top of the
VerbNet-specialised representations, using vector dimensions as features for
learning verb classes. Our results show that the proposed cross-lingual
transfer approach sets new state-of-the-art verb classification performance
across all six target languages explored in this work.Comment: EMNLP 2017 (long paper
Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation
Question Generation (QG) is fundamentally a simple syntactic transformation;
however, many aspects of semantics influence what questions are good to form.
We implement this observation by developing Syn-QG, a set of transparent
syntactic rules leveraging universal dependencies, shallow semantic parsing,
lexical resources, and custom rules which transform declarative sentences into
question-answer pairs. We utilize PropBank argument descriptions and VerbNet
state predicates to incorporate shallow semantic content, which helps generate
questions of a descriptive nature and produce inferential and semantically
richer questions than existing systems. In order to improve syntactic fluency
and eliminate grammatically incorrect questions, we employ back-translation
over the output of these syntactic rules. A set of crowd-sourced evaluations
shows that our system can generate a larger number of highly grammatical and
relevant questions than previous QG systems and that back-translation
drastically improves grammaticality at a slight cost of generating irrelevant
questions.Comment: Some of the results in the paper were incorrec
Construction of an ontology for intelligent Arabic QA systems leveraging the Conceptual Graphs representation
The last decade had known a great interest in Arabic Natural Language Processing (NLP) applications. This interest is
due to the prominent importance of this 6th most wide-spread language in the world with more than 350 million native speakers.
Currently, some basic Arabic language challenges related to the high inflection and derivation, Part-of-Speech (PoS) tagging,
and diacritical ambiguity of Arabic text are practically tamed to a great extent. However, the development of high level and
intelligent applications such as Question Answering (QA) systems is still obstructed by the lacks in terms of ontologies and other
semantic resources. In this paper, we present the construction of a new Arabic ontology leveraging the contents of Arabic WordNet
(AWN) and Arabic VerbNet (AVN). This new resource presents the advantage to combine the high lexical coverage and semantic
relations between words existing in AWN together with the formal representation of syntactic and semantic frames corresponding
to verbs in AVN. The Conceptual Graphs representation was adopted in the framework of a multi-layer platform dedicated to
the development of intelligent and multi-agents systems. The built ontology is used to represent key concepts in questions and
documents for further semantic comparison. Experiments conducted in the context of the QA task show a promising coverage
with respect to the processed questions and passages. The obtained results also highlight an improvement in the performance of
Arabic QA regarding the c@1 measure.The work of the last author was carried out in the framework of the WIQ-EI IRSES project (Grant No. 269180) within the FP 7 Marie Curie, the DIANA APPLICATIONS - Finding Hidden Knowledge in Texts: Applications (TIN2012-38603-C02-01) project, and the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems.Abouenour, L.; Nasri, M.; Bouzoubaa, K.; Kabbaj, A.; Rosso, P. (2014). Construction of an ontology for intelligent Arabic QA systems leveraging the Conceptual Graphs representation. Journal of Intelligent and Fuzzy Systems. 27(6):2869-2881. https://doi.org/10.3233/IFS-141248S2869288127
Knowledge-driven Natural Language Understanding of English Text and its Applications
Understanding the meaning of a text is a fundamental challenge of natural
language understanding (NLU) research. An ideal NLU system should process a
language in a way that is not exclusive to a single task or a dataset. Keeping
this in mind, we have introduced a novel knowledge driven semantic
representation approach for English text. By leveraging the VerbNet lexicon, we
are able to map syntax tree of the text to its commonsense meaning represented
using basic knowledge primitives. The general purpose knowledge represented
from our approach can be used to build any reasoning based NLU system that can
also provide justification. We applied this approach to construct two NLU
applications that we present here: SQuARE (Semantic-based Question Answering
and Reasoning Engine) and StaCACK (Stateful Conversational Agent using
Commonsense Knowledge). Both these systems work by "truly understanding" the
natural language text they process and both provide natural language
explanations for their responses while maintaining high accuracy.Comment: Preprint. Accepted by the 35th AAAI Conference (AAAI-21) Main Track
Improvement of VerbNet-like resources by frame typing
International audienceVerbenet is a French lexicon developed by " translation " of its English counterpart — VerbNet (Kipper-Schuler, 2005) — and treatment of the specificities of French syntax (Pradet et al., 2014; Danlos et al., 2016). One difficulty encountered in its development springs from the fact that the list of (potentially numerous) frames has no internal organization. This paper proposes a type system for frames that shows whether two frames are variants of a given alternation. Frame typing facilitates coherence checking of the resource in a " virtuous circle ". We present the principles underlying a program we developed and used to automatically type frames in Verbenet. We also show that our system is portable to other languages
Achieving High Quality Knowledge Acquisition using Controlled Natural Language
Controlled Natural Languages (CNLs) are efficient languages for knowledge acquisition and reasoning. They are designed as a subset of natural languages with restricted grammar while being highly expressive. CNLs are designed to be automatically translated into logical representations, which can be fed into rule engines for query and reasoning. In this work, we build a knowledge acquisition machine, called KAM, that extends Attempto Controlled English (ACE) and achieves three goals. First, KAM can identify CNL sentences that correspond to the same logical representation but expressed in various syntactical forms. Second, KAM provides a graphical user interface (GUI) that allows users to disambiguate the knowledge acquired from text and incorporates user feedback to improve knowledge acquisition quality. Third, KAM uses a paraconsistent logical framework to encode CNL sentences in order to achieve reasoning in the presence of inconsistent knowledge
Revisiting knowledge-based Semantic Role Labeling
International audienceSemantic role labeling has seen tremendous progress in the last years, both for supervised and unsupervised approaches. The knowledge-based approaches have been neglected while they have shown to bring the best results to the related word sense disambiguation task. We contribute a simple knowledge-based system with an easy to reproduce specification. We also present a novel approach to handle the passive voice in the context of semantic role labeling that reduces the error rate in F1 by 15.7%, showing that significant improvements can be brought while retaining the key advantages of the approach: a simple approach which facilitates analysis of individual errors, does not need any hand-annotated corpora and which is not domain-specific
- …