Construction of an ontology for intelligent Arabic QA systems leveraging the Conceptual Graphs representation
The last decade has seen great interest in Arabic Natural Language Processing (NLP) applications. This interest is
due to the importance of Arabic, the 6th most widespread language in the world, with more than 350 million native speakers.
Currently, some basic Arabic language challenges related to high inflection and derivation, Part-of-Speech (PoS) tagging,
and the diacritical ambiguity of Arabic text have largely been tamed. However, the development of high-level,
intelligent applications such as Question Answering (QA) systems is still obstructed by the lack of ontologies and other
semantic resources. In this paper, we present the construction of a new Arabic ontology leveraging the contents of Arabic WordNet
(AWN) and Arabic VerbNet (AVN). This new resource has the advantage of combining the high lexical coverage and semantic
relations between words in AWN with the formal representation of the syntactic and semantic frames corresponding
to verbs in AVN. The Conceptual Graphs representation was adopted in the framework of a multi-layer platform dedicated to
the development of intelligent and multi-agent systems. The ontology is used to represent key concepts in questions and
documents for further semantic comparison. Experiments conducted in the context of the QA task show promising coverage
of the processed questions and passages. The results also show an improvement in the performance of
Arabic QA regarding the c@1 measure.
The work of the last author was carried out in the framework of the WIQ-EI IRSES project (Grant No. 269180) within the FP 7 Marie Curie, the DIANA APPLICATIONS - Finding Hidden Knowledge in Texts: Applications (TIN2012-38603-C02-01) project, and the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems.
Abouenour, L.; Nasri, M.; Bouzoubaa, K.; Kabbaj, A.; Rosso, P. (2014). Construction of an ontology for intelligent Arabic QA systems leveraging the Conceptual Graphs representation. Journal of Intelligent and Fuzzy Systems. 27(6):2869-2881. https://doi.org/10.3233/IFS-141248
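The abstract above reports results on the c@1 measure without defining it. As a minimal sketch, assuming the definition commonly used in QA evaluation campaigns (unanswered questions are credited at the system's overall accuracy rate), it could be computed as follows. The function and parameter names are illustrative and do not come from the paper.

```python
def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """Compute c@1 under the assumed definition:

        c@1 = (n_correct + n_unanswered * n_correct / n_total) / n_total

    n_correct:    questions answered correctly
    n_unanswered: questions the system chose to leave unanswered
    n_total:      all questions in the test set
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_correct < 0 or n_unanswered < 0:
        raise ValueError("counts must be non-negative")
    if n_correct + n_unanswered > n_total:
        raise ValueError("n_correct + n_unanswered cannot exceed n_total")
    # Each unanswered question earns partial credit equal to the accuracy rate.
    accuracy = n_correct / n_total
    return (n_correct + n_unanswered * accuracy) / n_total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not results from the paper).
    print(c_at_1(n_correct=5, n_unanswered=2, n_total=10))
```

With no unanswered questions the measure reduces to plain accuracy, so a system is rewarded for abstaining only when it would otherwise answer incorrectly.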
A Context-theoretic Framework for Compositionality in Distributional Semantics
Techniques in which words are represented as vectors have proved useful in
many applications in computational linguistics; however, there is currently no
general semantic formalism for representing meaning in terms of vectors. We
present a framework for natural language semantics in which words, phrases and
sentences are all represented as vectors, based on a theoretical analysis which
assumes that meaning is determined by context.
In the theoretical analysis, we define a corpus model as a mathematical
abstraction of a text corpus. The meaning of a string of words is assumed to be
a vector representing the contexts in which it occurs in the corpus model.
Based on this assumption, we can show that the vector representations of words
can be considered as elements of an algebra over a field. We note that in
applications of vector spaces to representing meanings of words there is an
underlying lattice structure; we interpret the partial ordering of the lattice
as describing entailment between meanings. We also define the context-theoretic
probability of a string, and, based on this and the lattice structure, a degree
of entailment between strings.
We relate the framework to existing methods of composing vector-based
representations of meaning, and show that our approach generalises many of
these, including vector addition, component-wise multiplication, and the tensor
product.
Comment: Submitted to Computational Linguistics on 20th January 2010 for revie
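The three composition methods named in the abstract above (vector addition, component-wise multiplication, and the tensor product) can be illustrated with a small sketch. This is not the paper's framework, only the elementary operations it is said to generalise; the function names and the toy vectors are assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def _check_same_length(u: Vector, v: Vector) -> None:
    # Addition and component-wise multiplication need vectors of equal dimension.
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")


def compose_add(u: Vector, v: Vector) -> Vector:
    """Vector addition: the phrase vector is the sum of the word vectors."""
    _check_same_length(u, v)
    return [a + b for a, b in zip(u, v)]


def compose_multiply(u: Vector, v: Vector) -> Vector:
    """Component-wise multiplication of the two word vectors."""
    _check_same_length(u, v)
    return [a * b for a, b in zip(u, v)]


def compose_tensor(u: Vector, v: Vector) -> List[Vector]:
    """Tensor (outer) product: a len(u) x len(v) matrix of pairwise products."""
    return [[a * b for b in v] for a in u]


if __name__ == "__main__":
    # Toy context-count vectors for two words (hypothetical values).
    red = [1.0, 2.0]
    car = [3.0, 4.0]
    print(compose_add(red, car))
    print(compose_multiply(red, car))
    print(compose_tensor(red, car))
```

Addition and component-wise multiplication keep the phrase in the same space as the words, whereas the tensor product moves it into a space whose dimension is the product of the two word dimensions.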
D7.4 Third evaluation report. Evaluation of PANACEA v3 and produced resources
D7.4 reports on the evaluation of the components integrated in the third PANACEA development cycle, as well as the final validation of the platform itself. All validation and evaluation experiments follow the evaluation criteria already described in D7.1. The main goal of the WP7 tasks was to test the (technical) functionalities and capabilities of the middleware that allows the integration of the various resource-creation components into an interoperable distributed environment (WP3) and to evaluate the quality of the components developed in WP5 and WP6. The content of this deliverable is thus complementary to D8.2 and D8.3, which address advantages and usability in industrial scenarios. Note that the third PANACEA development cycle addressed many components that are still under research. The main goal of this evaluation cycle is thus to assess the methods experimented with and their potential for becoming actual production tools that can be exploited outside research labs. For most of the technologies, an attempt was made to re-interpret standard evaluation measures, usually accuracy, precision and recall, as measures of cost reduction (time and human resources) relative to current practices based on the manual production of resources. To do so, the different tools had to be tuned and adapted to maximize precision, and for some tools an attempt was made to provide confidence measures that could separate out the resources still needing manual revision. Furthermore, the extension to languages other than English, also a PANACEA objective, has been evaluated. The main facts about the evaluation results are now summarized:
- …