Documenting Classification Systems: A Case Study and Considerations

Author: Straughn Christopher
Publication venue: NEIU Digital Commons
Publication date: 01/01/2023

There is little literature on documenting the correct application of classification systems. This paper seeks to remedy this gap by describing how Northeastern Illinois University created documentation for its implementation of a system that describes Illinois State publications. We recommend creating documentation that is flexible, accessible, and user-oriented. Flexible documentation not only facilitates changes to the documentation but also allows librarians to take advantage of other uses of this documentation. In our case, the process of documentation produced a near-complete listing of Illinois publications and provided the basis for a structural history of Illinois government. Documentation of classification systems not only improves library work but also assists in preserving artifacts of library history.

NEIU Digital Commons (Northeastern Illinois University)

Evidentiality in Uzbek and Kazakh

Author: Straughn Christopher
Publication venue: NEIU Digital Commons
Publication date: 01/12/2011

The purpose of this work is to describe and account for the broad range of phenomena referred to as “evidentiality” in two Turkic languages: Uzbek and Kazakh. Much previous work on the Turkic languages treats evidentiality as a distinct verbal category. However, morphemes that express evidential meaning also often express other meanings such as dubitativity and admirativity, or may even express rhetorical questions. This work follows Friedman (1978; 1981; 1988) and others in considering these meanings to be the result of an evidential-like strategy: the expression of non-confirmativity. In Uzbek and Kazakh, as well as in many other Eurasian languages, the past tense is the locus of evidential meaning. There are three items in the Uzbek and Kazakh past tense paradigm, and these differ in terms of markedness for confirmativity: one is marked as confirmative, one as non-confirmative, and one is unmarked for confirmativity. The unmarked item, often referred to as the perfect, exists in a copular form. As a copular form, it expresses marked non-confirmativity. When this copular form (in Uzbek: ekan, in Kazakh: eken) is employed to express non-confirmativity, this non-confirmativity is manifested either as non-firsthand information source or as admirativity. By employing the non-confirmative analysis, we are able to account for the broad range of phenomena considered “evidential” without resorting to postulating an evidential category. Rather, in Uzbek and Kazakh, evidential meaning is merely one effect of the expression of non-confirmativity, which is a subtype of the categories of status or modality.

NEIU Digital Commons (Northeastern Illinois University)

SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages

Author: Pimentel Tiago
Ryskina Maria
Straughn Christopher
Publication venue: NEIU Digital Commons
Publication date: 01/01/2021

This year’s iteration of the SIGMORPHON Shared Task on morphological reinflection focuses on typological diversity and cross-lingual variation of morphosyntactic features. In terms of the task, we enrich UniMorph with new data for 32 languages from 13 language families, with most of them being under-resourced: Kunwinjku, Classical Syriac, Arabic (Modern Standard, Egyptian, Gulf), Hebrew, Amharic, Aymara, Magahi, Braj, Kurdish (Central, Northern, Southern), Polish, Karelian, Livvi, Ludic, Veps, Võro, Evenki, Xibe, Tuvan, Sakha, Turkish, Indonesian, Kodi, Seneca, Asháninka, Yanesha, Chukchi, Itelmen, Eibela. We evaluate six systems on the new data and conduct an extensive error analysis of the systems’ predictions. Transformer-based models generally demonstrate superior performance on the majority of languages, achieving >90% accuracy on 65% of them. The languages on which systems yielded low accuracy are mainly under-resourced, with a limited amount of data. Most errors made by the systems are due to allomorphy, honorificity, and form variation. In addition, we observe that systems especially struggle to inflect multiword lemmas. The systems also produce misspelled forms or end up in repetitive loops (e.g., RNN-based models). Finally, we report a large drop in systems’ performance on previously unseen lemmas.

NEIU Digital Commons (Northeastern Illinois University)

Noun Incorporation and Case: Evidence from Sakha

Author: Straughn Christopher A.
Publication venue: 'Linguistic Society of America'
Publication date: 17/10/2006

Proceedings Published by the LSA (Linguistic Society of America)

Clumpy Galaxies in CANDELS. I. The Definition of UV Clumps and the Fraction of Clumpy Galaxies at 0.5<z<3

Author: Barro Guillermo
Bell Eric F.
Ceverino Daniel
Conselice Christopher J.
Dekel Avishai
Faber Sandra M.
Fang Jerome J.
Ferguson Henry C.
Giavalisco Mauro
Guo Yicheng
Kassin Susan
Koekemoer Anton M.
Koo David C.
Lu Yu
Lucas Ray
Mandelker Nir
McIntosh Daniel M.
Noeske Kai
Primack Joel R.
Rafelski Marc
Ravindranath Swara
Straughn Amber
Publication venue: 'IOP Publishing'
Publication date: 01/01/2015

Although giant clumps of stars are crucial to galaxy formation and evolution, the most basic demographics of clumps are still uncertain, mainly because the definition of clumps has not been thoroughly discussed. In this paper, we study the basic demographics of clumps in star-forming galaxies (SFGs) at 0.5<z<3, using our proposed physical definition that UV-bright clumps are discrete star-forming regions that individually contribute more than 8% of the rest-frame UV light of their galaxies. Clumps defined this way are significantly brighter than the HII regions of nearby large spiral galaxies, either individually or blended, when physical spatial resolution and cosmological dimming are considered. Under this definition, we measure the fraction of SFGs that contain at least one off-center clump (Fclumpy) and the contributions of clumps to the rest-frame UV light and star formation rate of SFGs in the CANDELS/GOODS-S and UDS fields, where our mass-complete sample consists of 3239 galaxies with axial ratio q>0.5. The redshift evolution of Fclumpy changes with the stellar mass (M*) of the galaxies. Low-mass (log(M*/Msun)<9.8) galaxies keep an almost constant Fclumpy of about 60% from z~3.0 to z~0.5. Intermediate-mass and massive galaxies drop their Fclumpy from 55% at z~3.0 to 40% and 15%, respectively, at z~0.5. We find that (1) the trend of disk stabilization predicted by violent disk instability matches the Fclumpy trend of massive galaxies; (2) minor mergers are a viable explanation of the Fclumpy trend of intermediate-mass galaxies at z<1.5, given a realistic observability timescale; and (3) major mergers are unlikely to be responsible for the Fclumpy trend in all masses at z<1.5. The clump contribution to the rest-frame UV light of SFGs shows a broad peak around galaxies with log(M*/Msun)~10.5 at all redshifts, possibly linked to the molecular gas fraction of the galaxies. (Abridged) Comment: 22 pages, 15 figures. Appeared in ApJ (2015, 800, 39).

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Biblos-e Archivo

Major merging history in CANDELS. I. Evolution of the incidence of massive galaxy–galaxy pairs from z = 3 to z ∼ 0

The rate of major galaxy–galaxy merging is theoretically predicted to steadily increase with redshift during the peak epoch of massive galaxy development (1 ≤ z ≤ 3). We use close-pair statistics to objectively study the incidence of massive galaxies (stellar M1 > 2 × 10^10 M⊙) hosting major companions (1 ≤ M1/M2 ≤ 4; i.e. 4:1) companions at z > 1. We show that these evolutionary trends are statistically robust to changes in companion proximity. We find disagreements between published results are resolved when selection criteria are closely matched. If we compute merger rates using constant fraction-to-rate conversion factors (Cmerg,pair = 0.6 and Tobs,pair = 0.65 Gyr), we find that MR rates disagree with theoretical predictions at z > 1.5. Instead, if we use an evolving Tobs,pair(z) ∝ (1 + z)^−2 from Snyder et al., our MR-based rates agree with theory at 0 < z < 3. Our analysis underscores the need for detailed calibration of Cmerg,pair and Tobs,pair as a function of redshift, mass, and companion selection criteria to better constrain the empirical major merger history.

Docta Complutense

HAL AMU

OA@INAF - Istituto Nazionale di Astrofisica

Leiden University Scholary Publications

MPG.PuRe

Nottingham ePrints

arXiv.org e-Print Archive

Nottingham eTheses

OPUS

Crossref

Repository@Nottingham

HAL-INSU

Lancaster E-Prints

SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages

Author: Aiton Grant
Ambridge Ben
Ataman Duygu
Ate Yustinus Ghanggo
Barta Botond
Bayyr-ool Aziyana
Bernardy Jean-Philippe
Chodroff Eleanor
Coler Matt
Cotterell Ryan
Ek Adam
El-Khaissi Charbel
Ganieva Sofya
Gasser Michael
Goldman Omer
Habash Nizar
Hatcher Richard J.
Hulden Mans
Ivanova Sardana
Khalifa Salam
Kieraś Witold
Klyachko Elena
Krizhanovsky Andrew
Krizhanovsky Natalia
Kumar Ritesh
Lakatos Dorina
Lane William
Leonard Brian
Liu Zoey
Mielke Sabrina J.
Montoya Samame Jaime Rafael
Nicolai Garett
Nuriah Zahroh
Oncevay Arturo
Pimentel Tiago
Plugaryov Matvey
Ponti Edoardo M.
Prud'hommeaux Emily
Raj Mohit
Ratan Shyam
Ryskina Maria
Salchak Aelita
Salehi Ali
Shcherbakov Andrey
Sheifer Karina
Silva Villegas Gema Celeste
Stoehr Niklas
Straughn Christopher
Suhardijanto Totok
Szolnok Gábor
Tyers Francis M.
Vania Clara
Vylomova Ekaterina
Washington Jonathan
Woliński Marcin
Wu Shijie
Yarowsky David
Ács Judit
Publication venue: The Association for Computational Linguistics
Publication date: 01/08/2021

This year's iteration of the SIGMORPHON Shared Task on morphological reinflection focuses on typological diversity and cross-lingual variation of morphosyntactic features. In terms of the task, we enrich UniMorph with new data for 32 languages from 13 language families, with most of them being under-resourced: Kunwinjku, Classical Syriac, Arabic (Modern Standard, Egyptian, Gulf), Hebrew, Amharic, Aymara, Magahi, Braj, Kurdish (Central, Northern, Southern), Polish, Karelian, Livvi, Ludic, Veps, Võro, Evenki, Xibe, Tuvan, Sakha, Turkish, Indonesian, Kodi, Seneca, Asháninka, Yanesha, Chukchi, Itelmen, Eibela. We evaluate six systems on the new data and conduct an extensive error analysis of the systems' predictions. Transformer-based models generally demonstrate superior performance on the majority of languages, achieving >90% accuracy on 65% of them. The languages on which systems yielded low accuracy are mainly under-resourced, with a limited amount of data. Most errors made by the systems are due to allomorphy, honorificity, and form variation. In addition, we observe that systems especially struggle to inflect multiword lemmas. The systems also produce misspelled forms or end up in repetitive loops (e.g., RNN-based models). Finally, we report a large drop in systems' performance on previously unseen lemmas.

Edinburgh Research Explorer

Helsingin yliopiston digitaalinen arkisto

UniMorph 4.0: Universal Morphology

Author: Aiton Grant
Anastasopoulos Antonios
Andrushko Taras
Angulo Candy
Arora Aryaman
Ataman Duygu
Ate Yustinus Ghanggo
Batsuren Khuyagbaatar
Bautista Juan López
Baxi Jatayu
Bayyr-ool Aziyana
Bella Gábor
Bernardy Jean-Philippe
Bhatt Brijesh
Budianskaya Elena
Camaiteri Delio Siticonatzi
Chodroff Eleanor
Coler Matt
Cotterell Ryan
Cruz Hilaria
Czarnowska Paula
Dirix Peter
Dolatian Hossep
Ek Adam
El-Khaissi Charbel
Francis Didier López
Ganieva Sofya
Gasser Michael
Giunchiglia Fausto
Goldman Omer
Gorman Kyle
Guriel David
Habash Nizar
Hatcher Richard J.
Hennigen Lucas Torroba
Hulden Mans
Ivanova Sardana
Karahóǧa Ritván
Khalifa Salam
Kieraś Witold
Klyachko Elena
Krizhanovskaya Natalia
Krizhanovsky Andrew
Kumar Ritesh
Lane William
Leonard Brian
Liu Zoey
Marchenko Igor
Markantonatou Stella
Mashkovtseva Polina
Maudslay Rowan Hall
McCarthy Arya D.
Mielke Sabrina J.
Nepomniashchaya Maria
Nicolai Garrett
Nikkarinen Irene
Nuriah Zahroh
Oncevay Arturo
Pavlidis George
Pimentel Tiago
Pinter Yuval
Plugaryov Matvey
Ponti Edoardo M.
Prud'hommeaux Emily
Raj Mohit
Ratan Shyam
Rodionova Daria
Rojas Esaú Zumaeta
Ryskina Maria
Salchak Aelita
Salehi Ali
Salesky Elizabeth
Samame Jaime Rafael Montoya
Scherbakov Andrey
Serova Alexandra
Sheifer Karina
Silfverberg Miikka
Stoehr Niklas
Straughn Christopher
Suhardijanto Totok
Tsarfaty Reut
Tyers Francis M.
Valvoda Josef
Vania Clara
Villegas Gema Celeste Silva
Vylomova Ekaterina
Washington Jonathan North
White Jennifer
Wolinski Marcin
Yablonskaya Anna
Yarowsky David
Yemelina Anastasia
Young Jeremiah
Zariquiey Roberto
Zmigrod Ran
Publication venue: 'Center for Open Science'
Publication date: 07/05/2022

The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema. This paper presents the expansions and improvements made on several fronts over the last couple of years (since McCarthy et al. (2020)). Collaborative efforts by numerous linguists have added 67 new languages, including 30 endangered languages. We have implemented several improvements to the extraction pipeline to tackle some issues, e.g. missing gender and macron information. We have also amended the schema to use a hierarchical structure that is needed for morphological phenomena like multiple-argument agreement and case stacking, while adding some missing morphological features to make the schema more inclusive. In light of the last UniMorph release, we also augmented the database with morpheme segmentation for 16 languages. Lastly, this new release makes a push towards inclusion of derivational morphology in UniMorph by enriching the data and annotation schema with instances representing derivational processes from MorphyNet.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen