Semantic Types, Lexical Sorts and Classifiers

Authors: Mery Bruno, Retoré Christian
Publication date: 01/01/2013

We propose a cognitively and linguistically motivated set of sorts for lexical semantics in a compositional setting: the classifiers in languages that do have such pronouns. These sorts are needed to include lexical considerations in a semantical analyser such as Boxer or Grail. Indeed, all proposed lexical extensions of usual Montague semantics to model restriction of selection, felicitous and infelicitous copredication require a rich and refined type system whose base types are the lexical sorts, the basis of the many-sorted logic in which semantical representations of sentences are stated. However, none of those approaches defines precisely the actual base types or sorts to be used in the lexicon. In this article, we shall discuss some of the options commonly adopted by researchers in formal lexical semantics, and defend the view that classifiers in the languages which have such pronouns are an appealing solution, both linguistically and cognitively motivated.

arXiv.org e-Print Archive

Open Archive Toulouse Archive Ouverte

Probabilistic Linguistic Knowledge and Token-level Text Augmentation

Author: Wang Zhengxiang
Publication date: 28/06/2023

This paper investigates the effectiveness of token-level text augmentation and the role of probabilistic linguistic knowledge within a linguistically-motivated evaluation context. Two text augmentation programs, REDA and REDA_{NG}, were developed, both implementing five token-level text editing operations: Synonym Replacement (SR), Random Swap (RS), Random Insertion (RI), Random Deletion (RD), and Random Mix (RM). REDA_{NG} leverages pretrained n-gram language models to select the most likely augmented texts from REDA's output. Comprehensive and fine-grained experiments were conducted on a binary question matching classification task in both Chinese and English. The results strongly refute the general effectiveness of the five token-level text augmentation techniques under investigation, whether applied together or separately, and irrespective of various common classification model types used, including transformers. Furthermore, the role of probabilistic linguistic knowledge is found to be minimal.

Comment: 20 pages; 3 figures; 8 tables

arXiv.org e-Print Archive

Predicting Native Language from Gaze

Authors: Berzak Yevgeni, Flynn Suzanne, Katz Boris, Nakamura Chie
Publication date: 01/01/2017

A fundamental question in language learning concerns the role of a speaker's first language in second language acquisition. We present a novel methodology for studying this question: analysis of eye-movement patterns in second language reading of free-form text. Using this methodology, we demonstrate for the first time that the native language of English learners can be predicted from their gaze fixations when reading English. We provide analysis of classifier uncertainty and learned features, which indicates that differences in English reading are likely to be rooted in linguistic divergences across native languages. The presented framework complements production studies and offers new ground for advancing research on multilingualism.

Comment: ACL 201

arXiv.org e-Print Archive

DSpace@MIT

Crossref

A literature survey of methods for analysis of subjective language

Author: Täckström Oscar
Publication venue: Swedish Institute of Computer Science
Publication date: 01/01/2009

Subjective language is used to express attitudes and opinions towards things, ideas and people. While content- and topic-centred natural language processing is now part of everyday life, analysis of the subjective aspects of natural language has until recently been largely neglected by the research community. The explosive growth of personal blogs, consumer opinion sites and social network applications in recent years has, however, created increased interest in subjective language analysis. This paper provides an overview of recent research conducted in the area.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Rare and endangered - languages or features? An African perspective

Author: Lüpke Friederike
Publication date: 01/01/2010

SOAS Research Online