Adpositional Supersenses for Mandarin Chinese
This study adapts the Semantic Network of Adposition and Case Supersenses (SNACS) annotation scheme to Mandarin Chinese and demonstrates that the same supersense categories are appropriate for Chinese adposition semantics. We annotated 20 chapters of The Little Prince, with high interannotator agreement. The parallel corpus substantiates the applicability of construal analysis in Chinese and gives insight into the differences in construals between adpositions in the two languages. The corpus can further support automatic disambiguation of adpositions in Chinese, and the common inventory of supersenses between the two languages can potentially serve cross-linguistic tasks such as machine translation.
SNACS Annotation of Case Markers and Adpositions in Hindi
We present in-progress annotation of semantic relations expressed through adpositions and case markers in a Hindi corpus. We used the multilingual SNACS annotation scheme, which has been applied to a variety of typologically diverse languages. Annotation problems in Hindi are examined and used to suggest changes to SNACS. We look towards finalizing the corpus and using it for future work in typology and semantic role-dependent tasks.
Lexical Semantic Recognition
In lexical semantics, full-sentence segmentation and segment labeling of
various phenomena are generally treated separately, despite their
interdependence. We hypothesize that a unified lexical semantic recognition
task is an effective way to encapsulate previously disparate styles of
annotation, including multiword expression identification / classification and
supersense tagging. Using the STREUSLE corpus, we train a neural CRF sequence
tagger and evaluate its performance along various axes of annotation. As the
label set generalizes that of previous tasks (PARSEME, DiMSUM), we additionally
evaluate how well the model generalizes to those test sets, finding that it
approaches or surpasses existing models despite training only on STREUSLE. Our
work also establishes baseline models and evaluation metrics for integrated and
accurate modeling of lexical semantics, facilitating future work in this area.
Comment: 11 pages, 3 figures; to appear at MWE 202
Reanalyzing L2 Preposition Learning with Bayesian Mixed Effects and a Large Language Model
We use both Bayesian and neural models to dissect a data set of Chinese
learners' pre- and post-interventional responses to two tests measuring their
understanding of English prepositions. The results mostly replicate previous
findings from frequentist analyses and newly reveal crucial interactions
between student ability, task type, and stimulus sentence. Given the sparsity
of the data as well as high diversity among learners, the Bayesian method
proves most useful; but we also see potential in using language model
probabilities as predictors of grammaticality and learnability.