Who, What, When, Where, Why? Comparing Multiple Approaches to the Cross-Lingual 5W Task
Cross-lingual tasks are especially difficult due to the compounding effect of errors in language processing and errors in machine translation (MT). In this paper, we present an error analysis of a new cross-lingual task: the 5W task, a sentence-level understanding task which seeks to return the English 5W's (Who, What, When, Where and Why) corresponding to a Chinese sentence. We analyze systems that we developed, identifying specific problems in language processing and MT that cause errors. The best cross-lingual 5W system was still 19% worse than the best monolingual 5W system, which shows that MT significantly degrades sentence-level understanding. Neither source-language nor target-language analysis was able to circumvent problems in MT, although each approach had advantages relative to the other. A detailed error analysis across multiple systems suggests directions for future research on the problem.
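The slot-filling step of a 5W system can be illustrated by mapping predicate-argument roles onto the five question slots. A minimal sketch, assuming PropBank-style role labels; the mapping and helper function below are illustrative assumptions, not the systems analyzed in the paper:

```python
# Illustrative mapping from PropBank-style semantic-role labels to 5W slots.
ROLE_TO_W = {
    "ARG0": "Who",
    "V": "What",
    "ARGM-TMP": "When",
    "ARGM-LOC": "Where",
    "ARGM-CAU": "Why",
}

def extract_5w(labeled_spans):
    """labeled_spans: list of (role, text) pairs for one predicate."""
    answer = {w: None for w in ("Who", "What", "When", "Where", "Why")}
    for role, text in labeled_spans:
        w = ROLE_TO_W.get(role)
        if w and answer[w] is None:  # keep the first span per slot
            answer[w] = text
    return answer

spans = [("ARG0", "the minister"), ("V", "announced"),
         ("ARGM-TMP", "on Tuesday"), ("ARGM-LOC", "in Beijing")]
print(extract_5w(spans))  # Who→'the minister', What→'announced', Why→None
```

In a cross-lingual setting the same mapping can be applied either before or after MT, which is exactly the source-language vs. target-language analysis choice the paper compares.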
Estonian football specific corpora automatic semantic role labeling with football specific Framenet
The goal of this work is to investigate and attempt to solve the problem of automatic frame-semantic annotation of Estonian text. A general Estonian Framenet is still in its early stages, but a complete football-specific frame resource is available, with which we try to prove the hypothesis that morphological and syntactic information alone is sufficient for semantic role labeling of football-related corpora. We could not confirm this hypothesis, because a sentence with the same meaning can be expressed in too many different ways. In addition, we supplemented Estonia's largest lexical-semantic database, Wordnet, with football-related words.
Semantic argument classification and semantic categorization of Turkish existential sentences using support vector learning
There are three types of sentences that form all existing natural languages: verbal sentences (e.g. “I read the book.”), copulative sentences (e.g. “The book is on the table.”), and existential sentences (e.g. “There is a book on the table.”). Syntactic and semantic recognition of these sentence types is crucially important in computational linguistics, although there has not been any significant work toward this end. This thesis, in an attempt to fill this evident gap, identifies and assigns semantic categories to Turkish existential sentences in print. Existential sentences in Turkish are minimally characterized by the two existential particles var, meaning there is/are, and yok, meaning there is/are no. In addition to these most basic meanings, other senses of the existential particles are possible, and these can be categorized into groups such as case existentials and possession existentials. Our system does shallow semantic parsing, defining the predicate-argument relationships in an existential sentence on a word-by-word basis via Support Vector Machines, after which it proceeds with the semantic categorization of the whole sentence. For both of these tasks, our system produces promising results, in terms of accuracy and precision/recall, respectively. Part of this research contributes to the annotation of the METU-Sabancı Turkish Treebank with semantic information.
Koca, Aylin (M.S. thesis)
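The word-by-word argument classification step described above can be sketched with a linear SVM over simple per-word features. This is a toy illustration assuming scikit-learn; the features and role labels below are invented for the example and are not the METU-Sabancı annotation scheme:

```python
# Sketch of word-by-word semantic argument classification with an SVM.
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# One feature dict per word: the word form, its case marking, and which
# existential particle (var/yok) governs the sentence.
train_X = [
    {"word": "masada", "case": "LOC", "particle": "var"},
    {"word": "kitap",  "case": "NOM", "particle": "var"},
    {"word": "evde",   "case": "LOC", "particle": "yok"},
    {"word": "kimse",  "case": "NOM", "particle": "yok"},
]
train_y = ["LOCATION", "THEME", "LOCATION", "THEME"]

clf = make_pipeline(DictVectorizer(), LinearSVC())
clf.fit(train_X, train_y)

# An unseen word form: the case feature still drives the decision.
print(clf.predict([{"word": "bahçede", "case": "LOC", "particle": "var"}]))
```

After each word is assigned an argument label, a second classifier over the whole label sequence can perform the sentence-level semantic categorization.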
Doctor of Philosophy dissertation
Events are one important type of information throughout text. Event extraction is an information extraction (IE) task that involves identifying entities and objects (mainly noun phrases) that represent important roles in events of a particular type. However, the extraction performance of current event extraction systems is limited because they mainly consider local context (mostly isolated sentences) when making each extraction decision. My research aims to improve both coverage and accuracy of event extraction performance by explicitly identifying event contexts before extracting individual facts. First, I introduce new event extraction architectures that incorporate discourse information across a document to seek out and validate pieces of event descriptions within the document. TIER is a multilayered event extraction architecture that performs text analysis at multiple granularities to progressively "zoom in" on relevant event information. LINKER is a unified discourse-guided approach that includes a structured sentence classifier to sequentially read a story and determine which sentences contain event information based on both the local and preceding contexts. Experimental results on two distinct event domains show that compared to previous event extraction systems, TIER can find more event information while maintaining a good extraction accuracy, and LINKER can further improve extraction accuracy. Finding documents that describe a specific type of event is also highly challenging because of the wide variety and ambiguity of event expressions. In this dissertation, I present the multifaceted event recognition approach that uses event-defining characteristics (facets), in addition to event expressions, to effectively resolve the complexity of event descriptions. I also present a novel bootstrapping algorithm to automatically learn event expressions as well as facets of events, which requires minimal human supervision. Experimental results show that the multifaceted event recognition approach can effectively identify documents that describe a particular type of event and make event extraction systems more precise.
Concept Mining: A Conceptual Understanding based Approach
Due to the rapid daily growth of information, there is a considerable need to extract and discover valuable knowledge from data sources such as the World Wide Web. Most common techniques in text mining are based on the statistical analysis of a term, either a word or a phrase. These techniques treat documents as bags of words and pay no attention to the meanings of the document content. In addition, statistical analysis of term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in their documents while one term contributes more to the meaning of its sentences than the other. Therefore, there is an intensive need for a model that captures the meaning of linguistic utterances in a formal structure. The underlying model should indicate terms that capture the semantics of text. In this case, the model can capture the terms that present the concepts of a sentence, which leads to discovering the topic of the document.
A new concept-based model is introduced that analyzes terms at the sentence, document, and corpus levels, rather than at the document level only as in traditional analysis. The concept-based model can effectively discriminate between terms that are unimportant to sentence semantics and terms that hold the concepts representing the sentence meaning.
The proposed model consists of a concept-based statistical analyzer, a conceptual ontological graph representation, a concept extractor, and a concept-based similarity measure. A term that contributes to the sentence semantics is assigned two different weights, one by the concept-based statistical analyzer and one by the conceptual ontological graph representation. These two weights are combined into a new weight, and the concept extractor selects the concepts with the maximum combined weights. The similarity between documents is calculated with a new concept-based similarity measure, which takes full advantage of the concept analysis measures at the sentence, document, and corpus levels.
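The selection step described above can be sketched as follows: each candidate term carries a statistical weight and a graph-based weight, and the concept extractor keeps the terms with the highest combined weight. This is a minimal sketch; the combination rule (a simple product) is an illustrative assumption, not the thesis's actual formula:

```python
# Sketch of the combined-weight concept-selection step.
def select_concepts(weights, top_k=2):
    """weights: {term: (statistical_weight, graph_weight)} -> top-k terms."""
    combined = {t: s * g for t, (s, g) in weights.items()}
    return sorted(combined, key=combined.get, reverse=True)[:top_k]

weights = {
    "market": (0.8, 0.9),   # strong in both analyses
    "growth": (0.6, 0.7),
    "the":    (0.9, 0.05),  # frequent, but semantically weak
}
print(select_concepts(weights))  # → ['market', 'growth']
```

Note how the product penalizes "the": high raw frequency alone is not enough, which is exactly the discrimination between frequent terms and concept-bearing terms that the model aims for.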
Large sets of experiments using the proposed concept-based model on different datasets in text clustering, categorization, and retrieval are conducted. The experiments provide an extensive comparison between traditional weighting and the concept-based weighting obtained by the concept-based model. Experimental results in text clustering, categorization, and retrieval demonstrate a substantial enhancement of quality using: (1) concept-based term frequency (tf), (2) conceptual term frequency (ctf), (3) the concept-based statistical analyzer, (4) the conceptual ontological graph, and (5) the concept-based combined model.
In text clustering, the evaluation of results relies on two quality measures: the F-measure and the Entropy. In text categorization, the evaluation relies on three quality measures: the Micro-averaged F1, the Macro-averaged F1, and the Error rate. In text retrieval, the evaluation relies on three quality measures: precision at 10 documents retrieved P(10), the binary preference measure (bpref), and the mean uninterpolated average precision (MAP). All of these quality measures improve when the newly developed concept-based model is used to enhance the quality of text clustering, categorization, and retrieval.
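The categorization metrics named above differ in how they aggregate per-class performance: micro-averaged F1 pools all classification decisions (so frequent classes dominate), while macro-averaged F1 averages per-class F1 scores (so every class counts equally). A short sketch using scikit-learn on an invented toy prediction:

```python
# Toy illustration of micro- vs. macro-averaged F1 and error rate.
from sklearn.metrics import f1_score

y_true = ["sports", "sports", "politics", "tech", "tech", "tech"]
y_pred = ["sports", "politics", "politics", "tech", "tech", "sports"]

micro = f1_score(y_true, y_pred, average="micro")  # pools all decisions
macro = f1_score(y_true, y_pred, average="macro")  # mean of per-class F1
error_rate = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

print(round(micro, 3), round(macro, 3), round(error_rate, 3))
```

For single-label classification, micro-averaged F1 equals accuracy, so the error rate is simply 1 minus the micro-averaged F1.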
Joint learning of syntactic and semantic dependencies
In this master's thesis we designed, implemented, and evaluated a novel joint syntactic and semantic parsing model.