1,892 research outputs found

Towards a constructional approach to discourse-level phenomena : the case of the Spanish interpersonal epistemic stance construction

Author: Enghels Renata
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2018

This study contributes to a better understanding of how constructional models can be applied to discourse-level phenomena, and how they constitute a valuable complement to previous grammaticalization accounts of pragmatic markers. The case study presented concerns the recent development of the interpersonal epistemic stance construction in Spanish. The central argument is that the expanding use of sabes as a pragmatic marker is best understood by taking into account the composite network of related expressions which Spanish speakers have at their disposal when performing a particular speech act. The diachronic analysis is documented with spoken corpus examples collected in recent decades, and is mainly informed by frequency data measuring the productivity, as well as the formal properties, of the construction and its instances.

Ghent University Academic Bibliography

Workshop on Extracting and Using Constructions in Computational Linguistics

Author: Knutsson Ola
Sahlgren Magnus
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2010

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Generating indicative-informative summaries with SumUM

Author: Benbrahim Mohamed
Guy Lapalme
Horacio Saggion
Jing Hongyan
Johnson Frances C
Jordan Michael P
Radev Dragomir R
Teufel S.
Tombros Anastasios
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2002

We present and evaluate SumUM, a text summarization system that takes a raw technical text as input and produces an indicative-informative summary. The indicative part of the summary identifies the topics of the document, and the informative part elaborates on some of these topics according to the reader's interest. SumUM motivates the topics, describes entities, and defines concepts. It is a first step for exploring the issue of dynamic summarization. This is accomplished through a process of shallow syntactic and semantic analysis, concept identification, and text regeneration. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated indicativeness, informativeness, and text acceptability of the automatic summaries. The results thus far indicate good performance when compared with other summarization technologies.

CiteSeerX

Crossref

White Rose Research Online

WikiSense: Supersense Tagging of Wikipedia Named Entities Based WordNet

Author: Chang Jason S.
Chang Joseph
Tsai Richard Tzong-Han
Publication venue: City University of Hong Kong
Publication date: 01/01/2009

PACLIC 23 / City University of Hong Kong / 3-5 December 200

Waseda University Repository

Serbo-Croat Clitics and Word Grammar

Author: Hudson Richard
Čamdžić Amela
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2007

Serbo-Croat has a complex system of clitics which raise interesting problems for any theory of the interface between syntax and morphology. After summarising the data we review previous analyses (mostly within the generative tradition), all of which are unsatisfactory in various ways. We then explain how Word Grammar handles clitics: as words whose form is an affix rather than the usual ‘word-form’. Like other affixes, clitics need a word to accommodate them, but in the case of clitics this is a special kind of word called a ‘hostword’. We present a detailed analysis of Serbo-Croat clitics within this theory, introducing a new distinction between two cases: where the clitics are attached to the verb or auxiliary, and where they are attached to some dependent of the verb.

Crossref

Biblioteka Nauki - repozytorium artykułów

Repozytorium Uniwersytetu Łódzkiego (University of Lodz Repository)

Narrative-based taxonomy distillation for effective indexing of text collections.

Author: Cataldi Mario
Sapino Maria Luisa
Selç
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

Institutional Research Information System University of Turin

Diacritic Restoration and the Development of a Part-of-Speech Tagset for the Māori Language

Author: Cocks John
Publication venue: 'University of Waikato'
Publication date: 16/03/2012

This thesis investigates two fundamental problems in natural language processing: diacritic restoration and part-of-speech tagging. Over the past three decades, statistical approaches to diacritic restoration and part-of-speech tagging have attracted growing interest as a consequence of the increasing availability of manually annotated training data in major languages such as English and French. However, these approaches are not practical for most minority languages, where appropriate training data is either non-existent or not publicly available. Furthermore, before developing a part-of-speech tagging system, a suitable tagset is required for that language. In this thesis, we make the following contributions to bridge this gap. Firstly, we propose a method for diacritic restoration based on naive Bayes classifiers that act at word level. Classifications are based on a rich set of features, extracted automatically from training data in the form of diacritically marked text. This method requires no additional resources, which makes it language independent. The algorithm was evaluated on one language, namely Māori, and an accuracy exceeding 99% was observed. Secondly, we present our work on creating one of the necessary resources for the development of a part-of-speech tagging system in Māori, that of a suitable tagset. The tagset described was developed in accordance with the EAGLES guidelines for morphosyntactic annotation of corpora, and was the result of in-depth analysis of Māori grammar.

Research Commons@Waikato