Search CORE

16 research outputs found

OpenSubtitles2018: Statistical Rescoring of Sentence Alignments in Large, Noisy Parallel Corpora

Author: Kouylekov M.
Lison P.
Tiedemann J.
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2018

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Recognizing Textual Entailment with Tree Edit Distance: Application to Question Answering and Information Extraction

Author: M. O. Kouylekov
Publication date: 01/01/2006

This thesis addresses the problem of Recognizing Textual Entailment (i.e. recognizing that the meaning of one text entails the meaning of another) using a Tree Edit Distance algorithm over the syntactic trees of the two texts. A key aspect of the approach is the estimation of the cost of the editing operations (i.e. insertion, deletion, substitution) among words. Our aim is to compare the contribution of different resources providing entailment rules, including lexical rules from WordNet and the UniAlberta thesaurus, and syntactic rules automatically acquired by the DIRT and TEASE systems. We carried out a number of experiments over the PASCAL-RTE dataset in order to estimate the contribution of different combinations of the available resources. In addition, we have developed and evaluated an Answer Validation module for Question Answering and a Relation Extraction system, both of them based on textual entailment.
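
As an illustration of the idea in the abstract above, the following is a minimal sketch of an ordered tree edit distance with pluggable insertion, deletion, and substitution costs. It is not the thesis implementation: the tree encoding as `(label, children)` tuples, the function name `tree_edit_distance`, and the unit-cost defaults are all assumptions made for this example. In an entailment setting the cost functions would presumably be informed by resources such as those listed above.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), children being a tuple of trees.

def tree_edit_distance(t1, t2,
                       ins=lambda w: 1,
                       dele=lambda w: 1,
                       sub=lambda a, b: 0 if a == b else 1):
    """Edit distance between two ordered, labeled trees (textbook forest recursion)."""

    @lru_cache(maxsize=None)
    def forest_dist(f1, f2):
        # f1 and f2 are forests, i.e. tuples of trees.
        if not f1 and not f2:
            return 0
        if not f1:
            # Insert the root of the rightmost tree of f2; its children take its place.
            label, kids = f2[-1]
            return forest_dist(f1, f2[:-1] + kids) + ins(label)
        if not f2:
            # Delete the root of the rightmost tree of f1.
            label, kids = f1[-1]
            return forest_dist(f1[:-1] + kids, f2) + dele(label)
        (l1, k1), (l2, k2) = f1[-1], f2[-1]
        return min(
            forest_dist(f1[:-1] + k1, f2) + dele(l1),   # delete rightmost root of f1
            forest_dist(f1, f2[:-1] + k2) + ins(l2),    # insert rightmost root of f2
            # Map the two rightmost roots onto each other, then solve
            # their child forests and the remaining forests separately.
            forest_dist(k1, k2) + forest_dist(f1[:-1], f2[:-1]) + sub(l1, l2),
        )

    return forest_dist((t1,), (t2,))


# Toy dependency-like trees: the second drops one dependent of the first.
text = ("ate", (("cat", ()), ("fish", ())))
hypothesis = ("ate", (("cat", ()),))
distance = tree_edit_distance(text, hypothesis)
```

With the default unit costs, dropping the single node `fish` is the cheapest edit script. Passing a different `sub` function is where word-level knowledge (for example, a cheaper substitution for related words) could be plugged in.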

Archivio della ricerca - Fondazione Bruno Kessler

FBK_NK: a WordNet-based System for Multi-Way Classification of Semantic Relations

Author: M. Kouylekov
M. Negri

We describe a WordNet-based system for the extraction of semantic relations between pairs of nominals appearing in English texts. The system adopts a lightweight approach, based on training a Bayesian Network classifier using large sets of binary features. Our features consider: i) the context surrounding the annotated nominals, and ii) different types of knowledge extracted from WordNet, including direct and explicit relations between the annotated nominals, and more general and implicit evidence (e.g. semantic boundary collocations). The system achieved a Macro-averaged F1 of 68.02% on the “Multi-Way Classification of Semantic Relations Between Pairs of Nominals” task (Task #8) at SemEval-2010.

Archivio della ricerca - Fondazione Bruno Kessler

An Open-Source Package for Recognizing Textual Entailment

Author: M. Kouylekov
M. Negri
Publication venue: Association for Computational Linguistics

This paper presents a general-purpose open-source package for recognizing Textual Entailment. The system implements a collection of algorithms, providing a configurable framework to quickly set up a working environment to experiment with the RTE task. Its modular architecture can also be extended, which allows fast prototyping of new solutions. We present the tool as a useful resource to approach the Textual Entailment problem, as an instrument for didactic purposes, and as an opportunity to create a collaborative environment to promote research in the field.

Archivio della ricerca - Fondazione Bruno Kessler

Recognizing Textual Entailment with Tree Edit Distance: Application to Question Answering and Information Extraction

Author: M. O. Kouylekov

Archivio della ricerca - Fondazione Bruno Kessler

Document Filtering and Ranking Using Syntax and Statistics for Open Domain Question Answering

Author: H. Tanev
M. O. Kouylekov

This paper presents a strategy for syntax-based ranking of documents specifically oriented to Question Answering (QA). This strategy should limit the number of documents processed by the answer extraction module of a syntax-oriented QA system. Several measures for statistical scoring of expressions are presented and evaluated on 400 factoid questions from the TREC-12 competition. We prove that syntax-based document filtering can outperform classical inverse document frequency (idf) approaches.
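
For readers unfamiliar with the idf baseline mentioned in the abstract above, here is a minimal sketch of the common textbook definition, idf(t) = log(N / df(t)). The abstract does not say which idf variant the paper uses, so the formula, the function name `idf`, and the toy documents are assumptions for illustration only.

```python
import math

def idf(term, documents):
    """Inverse document frequency: log(N / df), or 0.0 if the term never occurs."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain the term.
    df = sum(1 for doc in documents if term in doc)
    return math.log(n_docs / df) if df else 0.0

# Toy collection (hypothetical): each document is a set of tokens.
docs = [{"who", "wrote", "hamlet"}, {"hamlet", "play"}, {"trec", "questions"}]
rare = idf("trec", docs)      # occurs in 1 of 3 documents
common = idf("hamlet", docs)  # occurs in 2 of 3 documents
```

Terms that occur in fewer documents receive a higher weight, which is why idf is a natural baseline for ranking documents against question keywords.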

Archivio della ricerca - Fondazione Bruno Kessler

Mining Wikipedia for Large-scale Repositories of Context-Sensitive Entailment Rules

Author: M. Kouylekov
M. Negri
Y. Mehdad

This paper focuses on the central role played by lexical information in the task of Recognizing Textual Entailment. In particular, the usefulness of lexical knowledge extracted from several widely used static resources, represented in the form of entailment rules, is compared with a method to extract lexical information from Wikipedia as a dynamic knowledge resource. The proposed acquisition method aims at maximizing two key features of the resulting entailment rules: coverage (i.e. the proportion of rules successfully applied over a dataset of TE pairs), and context sensitivity (i.e. the proportion of rules applied in appropriate contexts). Evaluation results show that Wikipedia can be effectively used as a source of lexical entailment rules, featuring both higher coverage and higher context sensitivity than the other resources.

Archivio della ricerca - Fondazione Bruno Kessler

Is it Worth Submitting this Run? Assess your RTE System with a Good Sparring Partner.

Author: M. Kouylekov
M. Negri
Y. Mehdad

We address two issues related to the development of systems for Recognizing Textual Entailment. The first is the impossibility of capitalizing on lessons learned over the different datasets available, due to the changing nature of traditional RTE evaluation settings. The second is the lack of simple ways to assess the results achieved by our system on a given training corpus, and figure out its real potential on unseen test data. Our contribution is the extension of an open-source RTE package with an automatic way to explore the large search space of possible configurations, in order to select the most promising one over a given dataset. From the developers’ point of view, the efficiency and ease of use of the system, together with the good results achieved on all previous RTE datasets, represent a useful support, providing an immediate term of comparison to position the results of their approach.

Archivio della ricerca - Fondazione Bruno Kessler

Reconstructing DIOGENE: ITC-irst at TREC 2006

Author: B. Coppola
B. Magnini
M. Negri
M. Ognianov Kouylekov

Archivio della ricerca - Fondazione Bruno Kessler

Multilingual Pattern Libraries for Question Answering: a Case Study for Definition Questions

Author: B. Coppola
B. Magnini
H. Tanev
M. Negri
M. O. Kouylekov
Publication date: 01/01/2004

In this paper we investigate the effectiveness of a novel resource for Multilingual Question Answering (QA). Such a resource consists of a set of multilingual pattern libraries for answer extraction and validation. In the spirit of the ongoing attempts to develop freely available resources for QA, we argue that the distribution and use of pattern libraries will contribute to making Multilingual QA a more feasible task.

CiteSeerX

Archivio della ricerca - Fondazione Bruno Kessler