Search CORE

4 research outputs found

Mapping Events and Abstract Entities from PAROLE-SIMPLE-CLIPS to ItalWordNet

Author: Roventini Adriana
Ruimy Nilda
Publication venue: European Language Resources Association (ELRA), Paris, France
Publication date
Field of study

In the few last years, due to the increasing importance of the web, both computational tools and resources need to be more and more visible and easily accessible to a vast community of scholars, students and researchers. Furthermore, high quality lexical resources are crucially required for a wide range of HLT-NLP applications, among which word sense disambiguation. Vast and consistent electronic lexical resources do exist which can be further enhanced and enriched through their linking and integration. An ILC project dealing with the link of two large lexical semantic resources for the Italian language, namely ItalWordNet and PAROLE-SIMPLE-CLIPS, fits this trend. Concrete entities were already linked and this paper addresses the semi-automatic mapping of events and abstract entities. The lexical models of the two resources, the mapping strategy and the tool that was implemented to this aim are briefly outlined. Special focus is put on the results of the linking process: figures are reported and examples are given which illustrate both the linking and harmonization of the resources but also cases of discrepancies, mainly due to the different underlying semantic models

PUblication MAnagement

D3.8 Lexical-semantic analytics for NLP

Author: Campagnano Cesare
Costa Rute
de Does Jesse
Dobrovoljc Kaja
Frontini Francesca
Gantar Polona
Kallas Jelena
Koppel Kristina
Krek Simon
Langemets Margit
Martelli Federico
Maru Marco
Munda Tina
Navigli Roberto
Nimb Sanni
Olsen Sussi
Quochi Valeria
Salgado Ana de Castro
Tempelaars Rob
Tiberius Carole
Ureña-Ruiz Rafael-J.
Velardi Paola
Čibej Jaka
Publication venue: ELEXIS - European Lexicographic Infrastructure
Publication date: 01/01/2022
Field of study

UIDB/03213/2020 UIDP/03213/2020The present document illustrates the work carried out in task 3.3 (work package 3) of ELEXIS project focused on lexical-semantic analytics for Natural Language Processing (NLP). This task aims at computing analytics for lexical-semantic information such as words, senses and domains in the available resources, investigating their role in NLP applications. Specifically, this task concentrates on three research directions, namely i) sense clustering, in which grouping senses based on their semantic similarity improves the performance of NLP tasks such as Word Sense Disambiguation (WSD), ii) domain labeling of text, in which the lexicographic resources made available by the ELEXIS project for research purposes allow better performances to be achieved, and finally iii) analysing the diachronic distribution of senses, for which a software package is made available.publishersversionpublishe

Repositório da Universidade Nova de Lisboa

Web 2.0, Language Resources and standards to automatically build a multilingual Named Entity Lexicon

Author: A. Lenci
Antonio Toral
Aristotle
D. Lenat
G. A. Miller
H. Alshawi
I. H. Witten
J. Giles
J. M. Wiebe
J. Pustejovsky
M. A. Hearst
Monica Monachini
O. Etzioni
P. Vossen
Rafael Muñoz
S. P. Ponzetto
Sergio Ferrández
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

Time, events and temporal relations: an empirical model for temporal processing of Italian texts

Author: CASELLI TOMMASO
Publication venue: 'Pisa University Press'
Publication date: 26/04/2009
Field of study

The aim of this work is the elaboration a computational model for the identification of temporal relations in text/discourse to be used as a component in more complex systems for Open-Domain Question-Answers, Information Extraction and Summarization. More specifically, the thesis will concentrate on the relationships between the various elements which signal temporal relations in Italian texts/discourses, on their roles and how they can be exploited. Time is a pervasive element of human life. It is the primary element thanks to which we are able to observe, describe and reason about what surrounds us and the world. The absence of a correct identification of the temporal ordering of what is narrated and/or described may result in a bad comprehension, which can lead to a misunderstanding. Normally, texts/discourses present situations standing in a particular temporal ordering. Whether these situations precede, or overlap or are included one within the other is inferred during the general process of reading and understanding. Nevertheless, to perform this seemingly easy task, we are taking into account a set of complex information involving different linguistic entities and sources of knowledge. A wide variety of devices is used in natural languages to convey temporal information. Verb tense, temporal prepositions, subordinate conjunctions, adjectival phrases are some of the most obvious. Nevertheless even these obvious devices have different degrees of temporal transparency, which may sometimes be not so obvious as it can appear at a quick and superficial analysis. One of the main shortcomings of previous research on temporal relations is represented by the fact that they concentrated only on a particular discourse segment, namely narrative discourse, disregarding the fact that a text/discourse is composed by different types of discourse segments and relations. A good theory or framework for temporal analysis must take into account all of them. In this work, we have concentrated on the elaboration of a framework which could be applied to all text/discourse segments, without paying too much attention to their type, since we claim that temporal relations can be recovered in every kind of discourse segments and not only in narrative ones. The model we propose is obtained by mixing together theoretical assumptions and empirical data, collected by means of two tests submitted to a total of 35 subjects with different backgrounds. The main results we have obtained from these empirical studies are: (i.) a general evaluation of the difficulty of the task of recovering temporal relations; (ii.) information on the level of granularity of temporal relations; (iii.) a saliency-based order of application of the linguistic devices used to express the temporal relations between two eventualities; (iv.) the proposal of tense temporal polysemy, as a device to identify the set of preferences which can assign unique values to possibly multiple temporal relations. On the basis of the empirical data, we propose to enlarge the set of classical finely grained interval relations (Allen, 1983) by including also coarse-grained temporal relations (Freska, 1992). Moreover, there could be cases in which we are not able to state in a reliable way if there exists a temporal relation or what the particular relation between two entities is. To overcome this issue we have adopted the proposal by Mani (2007) which allows the system to have differentiated levels of temporal representation on the basis of the temporal granularity associated with each discourse segment. The lack of an annotated corpus for eventualities, temporal expressions and temporal relations in Italian represents the biggest shortcomings of this work which has prevented the implementation of the model and its evaluation. Nevertheless, we have been able to conduct a series of experiments for the validation of procedures for the further realization of a working prototype. In addition to this, we have been able to implement and validate a working prototype for the spotting of temporal expressions in texts/discourses

Electronic Thesis and Dissertation Archive - Università di Pisa