Finite-state morphological analysis of Persian

Author: Karine Megerdoomian
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2004

This paper describes a two-level morphological analyzer for Persian using a system based on the Xerox finite-state tools. The Persian language presents certain challenges to computational analysis: a complex verbal conjugation paradigm includes long-distance morphological dependencies; phonological alternations apply at morpheme boundaries; and word and noun phrase boundaries are difficult to define, since morphemes may be detached from their stems and distinct words can appear without an intervening space. In this work, we elaborate on these problems and provide solutions in a finite-state morphology system.

CiteSeerX

Crossref

Human language reveals a universal positivity bias

Authors: James P. Bagrow, Eric M. Clark, Christopher M. Danforth, Suma Desu, Peter Sheridan Dodds, Morgan R. Frank, Kameron Decker Harris, Isabel M. Kloumann, Matthew T. McMahon, Karine Megerdoomian, Lewis Mitchell, Andrew J. Reagan, Brian F. Tivnan, Jake Ryland Williams
Publication venue: UVM ScholarWorks
Publication date: 01/06/2014

Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (i) the words of natural human language possess a universal positivity bias, (ii) the estimated emotional content of words is consistent between languages under translation, and (iii) this positivity bias is strongly independent of frequency of word use. Alongside these general regularities, we describe interlanguage variations in the emotional spectrum of languages that allow us to rank corpora. We also show how our word evaluations can be used to construct physical-like instruments for both real-time and offline measurement of the emotional content of large-scale texts.

arXiv.org e-Print Archive

Reply to Garcia et al.: Common mistakes in measuring frequency-dependent word characteristics

Authors: James P. Bagrow, Eric M. Clark, Christopher M. Danforth, Suma Desu, Peter Sheridan Dodds, Morgan R. Frank, Kameron Decker Harris, Isabel M. Kloumann, Matthew T. McMahon, Karine Megerdoomian, Lewis Mitchell, Andrew J. Reagan, Brian F. Tivnan, Jake Ryland Williams
Publication venue: UVM ScholarWorks
Publication date: 01/01/2015

We demonstrate that the concerns expressed by Garcia et al. are misplaced, due to (1) a misreading of our findings in [1]; (2) a widespread failure to examine and present words in support of asserted summary quantities based on word usage frequencies; and (3) a range of misconceptions about word usage frequency, word rank, and expert-constructed word lists. In particular, we show that the English component of our study compares well statistically with two related surveys, that no survey design influence is apparent, and that estimates of measurement error do not explain the positivity biases reported in our work and that of others. We further demonstrate that, for the frequency dependence of positivity (of which we explored the nuances in great detail in [1]), Garcia et al. did not perform a reanalysis of our data; they instead carried out an analysis of a different, statistically improper data set and introduced a nonlinearity before performing linear regression.

Comment: 5 pages, 2 figures, 1 table. Expanded version of reply appearing in PNAS 201

arXiv.org e-Print Archive

Crossref

UVM ScholarWorks

Adelaide Research & Scholarship

PubMed Central

Unification-Based Persian Morphology

Author: Karine Megerdoomian

In this paper, we describe the implementation of an inflectional morphological analyzer for Persian, which is based on finite-state transducers and typed feature structures with unification. The analyzer was designed to provide an interface to the syntactic parser in the Shiraz Persian-English machine translation system (http://crl.nmsu.edu/shiraz) and was tested on online newspaper articles. The system includes a dictionary with 50,000 entries, which is used for lookup after morphological analysis has been performed.

CiteSeerX

Author: Ali Farghaly (ed
Karine Megerdoomian

CiteSeerX