128 research outputs found

    Linguistic based matching of local ontologies

    This paper describes an automatic meaning-negotiation algorithm that enables semantic interoperability between local, overlapping and heterogeneous ontologies. Rather than reconciling differences between heterogeneous ontologies, the algorithm searches for mappings between concepts of different ontologies. It is composed of three main steps: (i) computing the linguistic meaning of the labels occurring in the ontologies via natural language processing; (ii) contextualizing that linguistic meaning by considering the context, i.e. the ontology, in which a label occurs; (iii) comparing the contextualized linguistic meanings of two ontologies in order to find a possible matching between them.
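
    The three steps above can be illustrated with a toy sketch. It is not the paper's implementation: the lexicon `TOY_LEXICON`, the (lemma, domain) sense encoding, the parent-pointer representation of an ontology, and the function names are all assumptions made for illustration, and a real system would obtain senses from a lexical resource through natural language processing.

```python
from itertools import product

# Hypothetical toy lexicon: label -> set of candidate senses, each sense
# encoded as a (lemma, domain) pair. Invented for this sketch only.
TOY_LEXICON = {
    "bank": {("bank", "finance"), ("bank", "geography")},
    "credit institute": {("bank", "finance")},
    "finance": {("finance", "finance")},
    "economy": {("finance", "finance")},
    "river": {("river", "geography")},
}


def linguistic_meaning(label):
    """Step (i): candidate senses of a label, ignoring its context."""
    return set(TOY_LEXICON.get(label.lower(), set()))


def ancestors(ontology, label):
    """Labels on the path from `label` up to the root.

    An ontology is assumed to be a dict mapping label -> parent label,
    with None as the parent of the root.
    """
    path = []
    parent = ontology.get(label)
    while parent is not None:
        path.append(parent)
        parent = ontology.get(parent)
    return path


def contextualize(ontology, label):
    """Step (ii): keep the senses whose domain is supported by an ancestor."""
    senses = linguistic_meaning(label)
    context_domains = {
        domain
        for ancestor in ancestors(ontology, label)
        for _, domain in linguistic_meaning(ancestor)
    }
    filtered = {sense for sense in senses if sense[1] in context_domains}
    # With no usable context (e.g. at the root), fall back to all senses.
    return filtered or senses


def match(onto_a, onto_b):
    """Step (iii): pair concepts whose contextualized senses overlap."""
    return sorted(
        (a, b)
        for a, b in product(onto_a, onto_b)
        if contextualize(onto_a, a) & contextualize(onto_b, b)
    )


finance_onto = {"Finance": None, "Bank": "Finance"}
river_onto = {"River": None, "Bank": "River"}
economy_onto = {"Economy": None, "Credit institute": "Economy"}

print(match(finance_onto, economy_onto))
print(match(finance_onto, river_onto))
```

    In this toy setting the two "Bank" concepts share a label but are not mapped to each other, because their ancestors select different senses, while "Bank" under "Finance" is mapped to "Credit institute" under "Economy".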

    The Named Entity Recognition Task at EVALITA 2009

    The submissions of results to the Named Entity Recognition task at EVALITA 2009 by seven different teams (five working in Italy and two abroad) confirm the interest displayed in the 2007 evaluation campaign. With the same guidelines and evaluation metrics as in the previous edition, the average performance of the systems improved significantly: the average F-measure of the systems' best runs was close to 76% (compared with a 70% average in the 2007 evaluation), and three systems scored above 80%.

    Becoming JILDA

    The difficulty in finding useful dialogic data to train a conversational agent is an open issue even nowadays, when chatbots and spoken dialogue systems are widely used. For this reason we decided to build JILDA, a novel data collection of chat-based dialogues, produced by Italian native speakers and related to the job-offer domain. JILDA is the first dialogue collection related to this domain for the Italian language. Because of its collection modalities, we believe that JILDA can be a useful resource not only for the Italian research community, but also for the international one.

    Effective Communication without Verbs? Sure!

    Nominal utterances are very frequent, especially in social media texts, and play a crucial role because they are semantically very dense. In spite of this, their automatic identification has received little to no attention. We have thus developed a framework for the annotation of nominal utterances and created the manually annotated corpus COSMIANU (Corpus Of Social Media Italian Annotated with Nominal Utterances), which could be used to train automatic systems.

    Event Factuality in Italian: Annotation of News Stories from the Ita-TimeBank

    In this paper we present ongoing work devoted to the extension of the Ita-TimeBank (Caselli et al., 2011) with event factuality annotation on top of TimeML annotation, where event factuality is represented on three main axes: time, polarity and certainty. We describe the annotation schema proposed for Italian and report on the results of our corpus analysis.

    Multilingual extension of a temporal expression normalizer using annotated corpora

    This paper presents the automatic extension to other languages of TERSEO, a knowledge-based system for the recognition and normalization of temporal expressions originally developed for Spanish. TERSEO was first extended to English through the automatic translation of the temporal expressions. Then, an improved porting process was applied to Italian, where the automatic translation of the temporal expressions from English and from Spanish was combined with the extraction of new expressions from an Italian annotated corpus. Experimental results demonstrate how, while still adhering to the rule-based paradigm, the development of automatic rule translation procedures allowed us to minimize the effort required for porting to new languages. Relying on such procedures, and without any manual effort or previous knowledge of the target language, TERSEO recognizes and normalizes temporal expressions in Italian with good results (72% precision and 83% recall for recognition). This research was partially funded by the Spanish Government (contract TIC2003-07158-C04-01).
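
    The abstract reports precision and recall for recognition but no combined score. As a worked illustration only, the standard F-measure (the weighted harmonic mean of precision and recall) can be computed from those two figures; the function name `f_measure` and the choice of beta = 1 are assumptions of this sketch, and the resulting value is derived arithmetic, not a figure reported by the authors.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta)."""
    if precision == 0 and recall == 0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Precision and recall for Italian recognition, as stated in the abstract.
f1 = f_measure(0.72, 0.83)
print(f"Balanced F-measure: {f1:.3f}")
```

    With beta = 1 this gives a balanced F-measure of roughly 0.77 for the stated precision and recall.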

    FacTA: Evaluation of Event Factuality and Temporal Anchoring

    In this paper we describe FacTA, a new task connecting the evaluation of factuality profiling and temporal anchoring, two closely related aspects of event processing. The proposed task aims at providing a complete evaluation framework for factuality profiling, at taking the first steps in the direction of narrative container evaluation for Italian, and at making available benchmark data for high-level semantic tasks.
