437 research outputs found

Guidelines for annotating the LUNA corpus with frame information

Author: Riccardi Giuseppe
Tonelli Sara
Publication date: 01/02/2010

This document defines the annotation workflow for adding frame information to the LUNA corpus of conversational speech. In particular, it details both the corpus pre-processing steps and the annotation process proper, giving hints about how to choose the frame and the frame element labels. It also describes 20 new domain-specific and language-specific frames. To our knowledge, this is the first attempt to adapt the frame paradigm to dialogs and at the same time to define new frames and frame elements for the specific domain of software/hardware assistance. The technical report is structured as follows: Section 2 gives an overview of the FrameNet project, while Section 3 introduces the LUNA project and the annotation framework involving the Italian dialogs. Section 4 details the annotation workflow, including the format preparation of the dialog files and the annotation strategy. In Section 5 we discuss the main issues in annotating frame information in dialogs and describe how the standard annotation procedure was changed to address them. The 20 newly introduced frames are then reported in Section 6.

Unitn-eprints Research

Unsupervised Semantic Frame Induction using Triclustering

Author: Biemann Chris
Kutuzov Andrei
Panchenko Alexander
Ponzetto Simone Paolo
Ustalov Dmitry
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2018

We use dependency triples automatically extracted from a Web-scale corpus to perform unsupervised semantic frame induction. We cast the frame induction problem as a triclustering problem, which is a generalization of clustering for triadic data. Our replicable benchmarks demonstrate that the proposed graph-based approach, Triframes, shows state-of-the-art results on this task on a FrameNet-derived dataset and performs on par with competitive methods on a verb class clustering task. Comment: 8 pages, 1 figure, 4 tables, accepted at ACL 201

arXiv.org e-Print Archive

Crossref

Extracting Formal Models from Normative Texts

Author: Camilleri John J.
Grūzītis Normunds
Schneider Gerardo
Publication date: 01/01/2016

We are concerned with the analysis of normative texts - documents based on the deontic notions of obligation, permission, and prohibition. Our goal is to make queries about these notions and verify that a text satisfies certain properties concerning causality of actions and timing constraints. This requires taking the original text and building a representation (model) of it in a formal language, in our case the C-O Diagram formalism. We present an experimental, semi-automatic aid that helps to bridge the gap between a normative text in natural language and its C-O Diagram representation. Our approach uses dependency structures obtained from the state-of-the-art Stanford Parser and applies our own rules and heuristics to extract the relevant components. The result is a tabular data structure where each sentence is split into suitable fields, which can then be converted into a C-O Diagram. The process is not fully automatic, however, and some post-editing by the user is generally required. We apply our tool and perform experiments on documents from different domains, and report an initial evaluation of the accuracy and feasibility of our approach. Comment: Extended version of conference paper at the 21st International Conference on Applications of Natural Language to Information Systems (NLDB 2016). arXiv admin note: substantial text overlap with arXiv:1607.0148

arXiv.org e-Print Archive

Crossref

Chalmers Research

Unsupervised semantic frame induction using triclustering

Author: Biemann Chris
Kutuzov Andrei
Panchenko Alexander
Ponzetto Simone Paolo
Ustalov Dmitry
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2018

arXiv.org e-Print Archive

Crossref

MAnnheim DOCument Server

NORA - Norwegian Open Research Archives

Finding common ground: towards a surface realisation shared task

Author: Belz Anya
Hogan Deirdre
Stent Amanda
van Genabith Josef
White Mike
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2010

In many areas of NLP, reuse of utility tools such as parsers and POS taggers is now common, but this is still rare in NLG. The subfield of surface realisation has perhaps come closest, but at present we still lack a basis on which different surface realisers could be compared, chiefly because of the wide variety of input representations used by different realisers. This paper outlines an idea for a shared task in surface realisation, where inputs are provided in a common-ground representation formalism which participants map to the types of input required by their system. These inputs are derived from existing annotated corpora developed for language analysis (parsing etc.). Outputs (realisations) are evaluated by automatic comparison against the human-authored text in the corpora as well as by human assessors.

DCU Online Research Access Service

FinnFN 1.0: The Finnish frame semantic database

Author: Haltia Heidi
Laine Antti
Lindén Krister
Luukkonen Juha
Roivainen Hege
Väisänen Niina
Publication date: 14/08/2017

The article describes the process of creating a Finnish-language FrameNet, or FinnFN, based on the original English-language FrameNet hosted at the International Computer Science Institute in Berkeley, California. We outline the goals and results relating to the FinnFN project and especially to the creation of the FinnFrame corpus. The main aim of the project was to test the universal applicability of frame semantics by annotating real Finnish using the same frames and annotation conventions as in the original Berkeley FrameNet project. From Finnish newspaper corpora, 40,721 sentences were automatically retrieved and manually annotated as example sentences evoking certain frames. This became the FinnFrame corpus. Applying the Berkeley FrameNet annotation conventions to the Finnish language required some modifications due to Finnish morphology, and a convention for annotating individual morphemes within words was introduced for phenomena such as compounding, comparatives and case endings. Various questions about cultural salience across the two languages arose during the project, but problematic situations occurred only in a few examples, which we also discuss in the article. The article shows that, barring a few minor instances, the universality hypothesis of frames is largely confirmed for languages as different as Finnish and English. Peer reviewed

Helsingin yliopiston digitaalinen arkisto