Density Matching for Bilingual Word Embedding
Recent approaches to cross-lingual word embedding have generally been based
on linear transformations between the sets of embedding vectors in the two
languages. In this paper, we propose an approach that instead expresses the two
monolingual embedding spaces as probability densities defined by a Gaussian
mixture model, and matches the two densities using a method called normalizing
flow. The method requires no explicit supervision, and can be learned with only
a seed dictionary of words that have identical strings. We argue that this
formulation has several intuitively attractive properties, particularly with
respect to improving robustness and generalization to mappings between
difficult language pairs or word pairs. On a benchmark data set of bilingual
lexicon induction and cross-lingual word similarity, our approach can achieve
competitive or superior performance compared to state-of-the-art published
results, with particularly strong results being found on etymologically distant
and/or morphologically rich languages.

Comment: Accepted by NAACL-HLT 2019
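To make the density-matching idea concrete, here is a minimal sketch of fitting a linear normalizing flow so that transformed source embeddings are likely under a target-space density, via the change-of-variables objective. The random stand-in embeddings, the single full-covariance Gaussian target (rather than the paper's Gaussian mixture), and the plain linear map are illustrative assumptions, not the paper's actual implementation.

    # A minimal sketch (PyTorch) of density matching with a linear normalizing
    # flow. Assumptions: random stand-ins replace real monolingual embeddings,
    # and the target density is a single full-covariance Gaussian rather than
    # the paper's Gaussian mixture.
    import torch

    d = 300
    src_emb = torch.randn(5000, d)   # stand-in for source-language embeddings
    tgt_emb = torch.randn(5000, d)   # stand-in for target-language embeddings

    # Fit the target-space density (here K=1; the paper uses a mixture).
    mu = tgt_emb.mean(dim=0)
    cov = torch.cov(tgt_emb.T) + 1e-3 * torch.eye(d)
    target = torch.distributions.MultivariateNormal(mu, cov)

    # The flow is an invertible linear map W applied to source vectors.
    W = torch.nn.Parameter(torch.eye(d))
    opt = torch.optim.Adam([W], lr=1e-3)

    for step in range(200):
        opt.zero_grad()
        z = src_emb @ W.T
        # Change of variables: log p(x) = log p_tgt(Wx) + log|det W|.
        _, logabsdet = torch.slogdet(W)
        loss = -(target.log_prob(z).mean() + logabsdet)
        loss.backward()
        opt.step()

Maximizing the transformed log-likelihood plus the log-determinant term is what lets the map be learned without word-level supervision beyond a seed dictionary.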
Deep Clustering of Text Representations for Supervision-free Probing of Syntax
We explore deep clustering of text representations for unsupervised model
interpretation and induction of syntax. As these representations are
high-dimensional, out-of-the-box methods like KMeans do not work well. Thus,
our approach jointly transforms the representations into a lower-dimensional
cluster-friendly space and clusters them. We consider two notions of syntax:
part-of-speech induction (POSI) and constituency labelling (CoLab) in this
work. Interestingly, we find that Multilingual BERT (mBERT) contains a
surprising amount of syntactic knowledge of English; possibly even as much as English BERT
(EBERT). Our model can be used as a supervision-free probe which is arguably a
less-biased way of probing. We find that unsupervised probes benefit more
from higher layers than supervised probes do. We further note that our
unsupervised probe utilizes EBERT and mBERT representations differently,
especially for POSI. We validate the efficacy of our probe by demonstrating its
capabilities as an unsupervised syntax induction technique. Our probe works
well for both syntactic formalisms by simply adapting the input
representations. We report competitive performance of our probe on 45-tag
English POSI, state-of-the-art performance on 12-tag POSI across 10 languages,
and competitive results on CoLab. We also perform zero-shot syntax induction on
resource-impoverished languages and report strong results.
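To illustrate the "transform into a cluster-friendly space, then cluster" idea, the sketch below jointly trains a low-dimensional projection and soft cluster assignments in the style of Deep Embedded Clustering. The layer sizes, the reconstruction term, and the Student's-t assignment kernel are assumptions for illustration; the abstract does not specify the paper's exact architecture.

    # A minimal sketch (PyTorch) of jointly learning a lower-dimensional,
    # cluster-friendly space and soft cluster assignments, in the style of
    # Deep Embedded Clustering. Layer sizes, the reconstruction term, and the
    # Student's-t kernel are illustrative assumptions, not the paper's model.
    import torch
    import torch.nn.functional as F

    n, d_in, d_low, k = 10000, 768, 64, 45   # e.g. mBERT states, 45 POS tags
    reps = torch.randn(n, d_in)              # stand-in for extracted states

    encoder = torch.nn.Linear(d_in, d_low)   # cluster-friendly projection
    decoder = torch.nn.Linear(d_low, d_in)   # reconstruction preserves info
    centroids = torch.nn.Parameter(torch.randn(k, d_low))
    params = list(encoder.parameters()) + list(decoder.parameters()) + [centroids]
    opt = torch.optim.Adam(params, lr=1e-3)

    for step in range(500):
        opt.zero_grad()
        z = encoder(reps)
        # Soft assignments via a Student's-t kernel over point-centroid distances.
        q = (1.0 + torch.cdist(z, centroids).pow(2)).reciprocal()
        q = q / q.sum(dim=1, keepdim=True)
        # Sharpened target distribution favours confident, balanced clusters.
        p = q.pow(2) / q.sum(dim=0)
        p = (p / p.sum(dim=1, keepdim=True)).detach()
        cluster_loss = F.kl_div(q.log(), p, reduction="batchmean")
        recon_loss = F.mse_loss(decoder(z), reps)
        (cluster_loss + recon_loss).backward()
        opt.step()

    labels = q.argmax(dim=1)   # induced cluster ids, e.g. unsupervised POS tags

Learning the projection jointly with the clustering objective is what addresses the abstract's observation that out-of-the-box KMeans fails on high-dimensional representations.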