Construction of an ontology for intelligent Arabic QA systems leveraging the Conceptual Graphs representation
The last decade has seen great interest in Arabic Natural Language Processing (NLP) applications. This interest is
due to the importance of Arabic, the 6th most widespread language in the world, with more than 350 million native speakers.
Currently, some basic Arabic language challenges related to high inflection and derivation, Part-of-Speech (PoS) tagging,
and the diacritical ambiguity of Arabic text have been tamed to a great extent. However, the development of high-level
intelligent applications such as Question Answering (QA) systems is still obstructed by the lack of ontologies and other
semantic resources. In this paper, we present the construction of a new Arabic ontology leveraging the contents of Arabic WordNet
(AWN) and Arabic VerbNet (AVN). This new resource has the advantage of combining the high lexical coverage and semantic
relations between words in AWN with the formal representation of the syntactic and semantic frames corresponding
to verbs in AVN. The Conceptual Graphs representation was adopted in the framework of a multi-layer platform dedicated to
the development of intelligent and multi-agent systems. The resulting ontology is used to represent key concepts in questions and
documents for further semantic comparison. Experiments conducted in the context of the QA task show promising coverage
with respect to the processed questions and passages. The results also show an improvement in the performance of
Arabic QA with respect to the c@1 measure.
The work of the last author was carried out in the framework of the WIQ-EI IRSES project (Grant No. 269180) within the FP 7 Marie Curie, the DIANA APPLICATIONS - Finding Hidden Knowledge in Texts: Applications (TIN2012-38603-C02-01) project, and the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems.
Abouenour, L.; Nasri, M.; Bouzoubaa, K.; Kabbaj, A.; Rosso, P. (2014). Construction of an ontology for intelligent Arabic QA systems leveraging the Conceptual Graphs representation. Journal of Intelligent and Fuzzy Systems. 27(6):2869-2881. https://doi.org/10.3233/IFS-141248
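The c@1 measure named in the abstract above is an extension of accuracy that gives partial credit for questions a QA system leaves unanswered instead of answering wrongly. A minimal Python sketch of its standard definition follows; the function name `c_at_1` and its parameter names are illustrative assumptions, and the abstract does not state the counts the authors used.

```python
def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """Compute c@1 = (n_R + n_U * n_R / n) / n.

    n_correct    (n_R): questions answered correctly
    n_unanswered (n_U): questions the system chose not to answer
    n_total      (n):   all questions in the test set
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_correct < 0 or n_unanswered < 0 or n_correct + n_unanswered > n_total:
        raise ValueError("counts must be non-negative and sum to at most n_total")
    # Unanswered questions are credited at the system's overall accuracy.
    accuracy = n_correct / n_total
    return (n_correct + n_unanswered * accuracy) / n_total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(c_at_1(50, 0, 100))
    print(c_at_1(50, 20, 100))
```

With no unanswered questions the measure reduces to plain accuracy; withholding an answer is never scored worse than answering incorrectly.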
On Singles, Couples and Extended Families. Measuring Overlapping between Latin Vallex and Latin WordNet
Different lexical resources may pursue different views on lexical meaning. However, all of them deal with lexical items as common basic components, which are described according to criteria that may vary from one resource to another. In this paper, we present a method for measuring the degree of similarity between a valency-based lexical resource and a WordNet. This is motivated by both theoretical and practical reasons. As for the former, we ask whether there are lexical classes that "impose" themselves regardless of whether they are explicitly recorded as such in the source lexical resources. As for the latter, our work aims to contribute to research on merging lexical resources. In order to apply and evaluate our method, we propose a normalized coefficient of overlapping that measures the overlapping rate between a valency lexicon and a WordNet. In particular, in the context of the exploitation of the linguistic resources for ancient languages built over the last decade, we compute and evaluate the overlapping between a selection of homogeneous lexical subsets extracted from two lexical resources for Latin.
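The abstract above does not give the formula of its normalized coefficient of overlapping. As a rough illustration of what such a measure can look like, the sketch below uses the well-known Szymkiewicz-Simpson overlap coefficient as a stand-in; this is an assumption, not the authors' definition, and the function name and the sample lemma sets are invented for the example.

```python
from __future__ import annotations


def overlap_coefficient(a: set[str], b: set[str]) -> float:
    """Szymkiewicz-Simpson overlap: |A & B| / min(|A|, |B|), in [0, 1].

    Used here only as a stand-in for the paper's own (unstated) coefficient.
    """
    if not a or not b:
        # Define the overlap with an empty set as 0 to avoid division by zero.
        return 0.0
    return len(a & b) / min(len(a), len(b))


if __name__ == "__main__":
    # Hypothetical lemma sets: one valency-lexicon class, one WordNet synset group.
    valency_class = {"do", "dono", "mitto"}
    wordnet_group = {"do", "mitto", "fero", "porto"}
    print(overlap_coefficient(valency_class, wordnet_group))
```

Dividing by the size of the smaller set means a small class fully contained in a larger one scores 1, which is one way to compare subsets of very different sizes.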
Probabilistic Modeling of Verbnet Clusters
The objective of this research is to build automated models that emulate VerbNet, a semantic resource for English verbs. VerbNet has been built and expanded by linguists, forming a hierarchical clustering of verbs with common semantic and syntactic expressions, and is useful in semantic tasks. A major drawback is the difficulty of extending a manually curated resource, which leads to gaps in coverage. After over a decade of development, VerbNet is still missing verbs, senses of common verbs, and appropriate classes to contain at least some of them. Although there have been efforts to build VerbNet resources in other languages, none have received as much attention, so these coverage issues are often more glaring in resource-poor languages. Probabilistic models can emulate VerbNet by learning distributions from large corpora, addressing coverage by providing both a complete clustering of the observed data and a model to assign unseen sentences to clusters. The output of these models can aid the creation and expansion of VerbNet in English and other languages, especially if they align strongly with known VerbNet classes.
This work develops several improvements to the state-of-the-art system for verb sense induction and VerbNet-like clustering. The baseline is a two-step process for automatically inducing verb senses and producing a polysemy-aware clustering, which matched VerbNet more closely than any previous method. First, we show that a single-step process can produce better automatic senses and clusters. Second, we explore an alternative probabilistic model, which is successful on the verb clustering task. This model does not perform well on sense induction, so we analyze the limitations on its applicability. Third, we explore methods of supervising these probabilistic models with limited labeled data, which dramatically improves the recovery of correct clusters.
Together, these improvements suggest a line of research for practitioners to take advantage of probabilistic models in VerbNet annotation efforts.
Enhancing factoid question answering using frame semantic-based approaches
FrameNet is used to enhance the performance of semantic QA systems. FrameNet is a linguistic resource that encapsulates Frame Semantics and provides scenario-based generalizations over lexical items that share similar semantic backgrounds.
Doctor of Philosoph
From Parsed Corpora to Semantically Related Verbs
A comprehensive repository of semantic relations between verbs is of great importance in supporting a wide range of natural language applications. The aim of this paper is to automatically generate a repository of semantic relations between verb pairs using Distributional Memory (DM), a state-of-the-art framework for distributional semantics. The main idea of our method is to exploit relationships that are expressed through prepositions between a verbal and a nominal event in text to extract semantically related events. Then, using these prepositions, we derive relation types including causal, temporal, comparison, and expansion. Our study results in a resource for semantic relations, which consists of pairs of verbs associated with their probable arguments and significance scores based on our measures. Experimental evaluations show promising results on the task of extracting and categorising semantic relations between verbs.
Strategies to Address Data Sparseness in Implicit Semantic Role Labeling
Natural language texts frequently contain predicates whose complete understanding requires access to other parts of the discourse. Human readers can retrieve such information across sentence boundaries and infer the implicit piece of information. This capability enables us to understand complicated texts without needing to repeat the same information in every single sentence. However, for computational systems, resolving such information is problematic because computational approaches traditionally rely on sentence-level processing and rarely take into account the extra-sentential context.
In this dissertation, we investigate this omission phenomenon, called implicit semantic role labeling (ISRL). Implicit semantic role labeling involves the identification of predicate arguments that are not locally realized but are resolvable from the context. For example, in "What's the matter, Walters? asked Baynes sharply.", the ADDRESSEE of the predicate ask, Walters, is not mentioned as one of its syntactic arguments, but can be recovered from the previous sentence. In this thesis, we try to improve methods for the automatic processing of such predicate instances in order to improve natural language processing applications. Our main contribution is introducing approaches to solve the data sparseness problem of the task. We improve automatic identification of implicit roles by increasing the size of the training set without needing to annotate new instances. For this purpose, we propose two approaches. First, we use crowdsourcing to annotate instances of implicit semantic roles and show that, with an appropriate task design, reliable annotation of implicit semantic roles can be obtained from non-experts without the need to present them with precise linguistic definitions of the roles. Second, we combine seemingly incompatible corpora to solve the problem of data sparseness in ISRL by applying a domain adaptation technique. We show that out-of-domain data from a different genre can be successfully used to improve a baseline implicit semantic role labeling model when used with an appropriate domain adaptation technique. The results also show that the improvement occurs regardless of the predicate's part of speech; that is, identification of implicit roles relies more on semantic features than on syntactic ones. Therefore, annotating instances of nominal predicates, for instance, can help to improve the identification of verbal predicates' implicit roles as well.
Our findings also show that the variety of the additional data is more important than its size. That is, adding a large amount of data does not necessarily lead to a better model.
Similarity Reasoning over Semantic Context-Graphs
Similarity is a central cognitive mechanism for humans which enables a broad range of perceptual and abstraction processes, including recognizing and categorizing objects, drawing parallels, and predicting outcomes. It has been studied computationally through models designed to replicate human judgment. The work presented in this dissertation leverages general-purpose semantic networks to derive similarity measures in a problem-independent manner. We model both general and relational similarity using connectivity between concepts within semantic networks. Our first contribution is to model general similarity using concept connectivity, which we use to partition vocabularies into topics without the need for document corpora. We apply this model to derive topics from unstructured dialog, specifically enabling an early literacy primer application to support parents in having better conversations with their young children as they use the primer together. Second, we model relational similarity in proportional analogies. To do so, we derive relational parallelism by searching semantic networks for similar path pairs that connect either side of the analogy statement. We then derive human-readable explanations from the resulting similar path pair. We show that our model can answer broad-vocabulary analogy questions designed for human test takers with high confidence. The third contribution is to enable symbolic plan repair in robot planning through object substitution. When a failure occurs due to unforeseen changes in the environment, such as missing objects, we enable the planning domain to be extended with a number of alternative objects so that the plan can be repaired and execution can continue. To evaluate this type of similarity, we use both general and relational similarity. We demonstrate that the task context is essential in establishing which objects are interchangeable.
Italian VerbNet: A Construction-based Approach to Italian Verb Classification
This thesis proposes a new verb classification for Italian, based on the authoritative English model of VerbNet. The method, which is the central point of the research, was developed to allow the creation of classes that are compatible with the English model but at the same time autonomous and based on independent theoretical criteria. An explanatory part is followed by the presentation of the data, accompanied by comments.
RoboCup@Home: commanding a service robot by natural language.
It was in ancient Greece that myths were written and, already among them, one
could find the human desire for robotic servants. It was Hephaestus, god of technology,
blacksmiths, craftsmen and artisans, who is said to have built robots to help him in
his workshop. This shows how deep in our thoughts this desire was: one could find
stories and tales of human-shaped machines that danced in China, or of inanimate materials
like mud that gave shape to golems in Jewish tradition.
In the Renaissance, many automata began to appear. From Leonardo Da Vinci to
the artisans of China and Japan, mankind was trying to produce automatic machines,
sometimes for its own benefit, at other times for its delight and fascination.
But it wasn't until the digital era that the dream began to seem feasible. After millennia
of wondering about automated robots, computers showed that automatic calculation was
possible, and from this, ideas of an automated mind arose. Theories of cognitive architectures
have been proposed since the early stages of artificial intelligence, and such cognitive architectures
are now a reality.
Thanks to technological advances and knowledge about the mind, what once
was material for fictional tales is now feasible and only a matter of time. There is a lot of
research on robotics and cognition that is beginning to be combined into what are called
"service robots".
In this thesis, I present a system that participates in a competition designed for this kind of robot, a competition based on the same dream that humans have had
all around the world for centuries: the cohabitation of humans and service automatons.