
A Probabilistic Generative Model of Linguistic Typology

Abstract

In the principles-and-parameters framework, the structural features of languages depend on parameters that may be toggled on or off, with a single parameter often dictating the status of multiple features. The implied covariance between features inspires our probabilisation of this line of linguistic inquiry---we develop a generative model of language based on exponential-family matrix factorisation. By modelling all languages and features within the same architecture, we show how structural similarities between languages can be exploited to predict typological features with near-perfect accuracy, outperforming several baselines on the task of predicting held-out features. Furthermore, we show that language embeddings pre-trained on monolingual text allow for generalisation to unobserved languages. This finding has clear practical as well as theoretical implications: the results confirm what linguists have hypothesised, i.e. that there are significant correlations between typological features and languages.

Comment: NAACL 2019, 12 pages
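To make the core idea concrete, the sketch below shows one simple member of the exponential-family matrix factorisation family: a Bernoulli (logistic) factorisation over binary typological features, where language and feature embeddings are learned by gradient descent and held-out feature values are predicted from their inner products. This is an illustrative toy, not the paper's implementation; all data sizes, hyperparameters, and the training loop are assumptions made for demonstration.

```python
# Toy Bernoulli (logistic) matrix factorisation for binary typological features.
# NOT the authors' implementation; sizes and hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_languages, n_features, dim = 100, 50, 8   # assumed sizes

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Generate a toy binary feature matrix from a low-rank model so the
# factorisation has structure to recover, and mask ~30% of cells as held out.
true_U = rng.standard_normal((n_languages, dim))
true_V = rng.standard_normal((n_features, dim))
X = (sigmoid(true_U @ true_V.T) > rng.random((n_languages, n_features))).astype(float)
observed = rng.random((n_languages, n_features)) < 0.7

# Language embeddings U and feature embeddings V; their inner product
# parameterises the Bernoulli probability of each feature value via a sigmoid.
U = 0.1 * rng.standard_normal((n_languages, dim))
V = 0.1 * rng.standard_normal((n_features, dim))

lr, l2 = 0.05, 1e-3
for step in range(500):
    P = sigmoid(U @ V.T)                  # predicted feature probabilities
    G = np.where(observed, P - X, 0.0)    # gradient of the negative log-likelihood
                                          # restricted to observed cells
    U -= lr * (G @ V + l2 * U)
    V -= lr * (G.T @ U + l2 * V)

# Predict held-out features by thresholding the probabilities on masked cells.
preds = (sigmoid(U @ V.T) >= 0.5).astype(float)
acc = (preds[~observed] == X[~observed]).mean()
print(f"held-out accuracy on toy data: {acc:.2f}")
```

In this toy setting, predicting a held-out feature for one language leans on the other languages that share similar embeddings, which is the same covariance-between-features intuition the abstract attributes to principles and parameters.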
