43 research outputs found
NEALT Proceedings Series
Proceedings of the Ninth International Workshop
on Treebanks and Linguistic Theories.
Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti.
NEALT Proceedings Series, Vol. 9 (2010), 257-258.
© 2010 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/15891
Preface
Proceedings of the 17th Nordic Conference of Computational Linguistics
NODALIDA 2009.
Editors: Kristiina Jokinen and Eckhard Bick.
NEALT Proceedings Series, Vol. 4 (2009), vii-viii.
© 2009 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/9206
Title Pages
Proceedings of the 17th Nordic Conference of Computational Linguistics
NODALIDA 2009.
Editors: Kristiina Jokinen and Eckhard Bick.
NEALT Proceedings Series, Vol. 4 (2009), i-ii.
© 2009 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/9206
Proceedings
Proceedings of the NODALIDA 2009 workshop
Nordic Perspectives on the CLARIN Infrastructure of Language Resources.
Editors: Rickard Domeij, Kimmo Koskenniemi, Steven Krauwer, Bente Maegaard,
Eiríkur Rögnvaldsson and Koenraad de Smedt.
NEALT Proceedings Series, Vol. 5 (2009), v+45 pp.
© 2009 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/9207
Insights into Analogy Completion from the Biomedical Domain
Analogy completion has been a popular task in recent years for evaluating the
semantic properties of word embeddings, but the standard methodology makes a
number of assumptions about analogies that do not always hold, either in recent
benchmark datasets or when expanding into other domains. Through an analysis of
analogies in the biomedical domain, we identify three assumptions: that of a
Single Answer for any given analogy, that the pairs involved describe the Same
Relationship, and that each pair is Informative with respect to the other. We
propose modifying the standard methodology to relax these assumptions by
allowing for multiple correct answers, reporting MAP and MRR in addition to
accuracy, and using multiple example pairs. We further present BMASS, a novel
dataset for evaluating linguistic regularities in biomedical embeddings, and
demonstrate that the relationships described in the dataset pose significant
semantic challenges to current word embedding methods.Comment: Accepted to BioNLP 2017. (10 pages
RUSSE'2018 : a shared task on word sense induction for the Russian language
The paper describes the results of the first shared task on word sense induction (WSI) for the Russian language. While similar shared tasks were conducted in the past for some Romance and Germanic languages, we explore the performance of sense induction and disambiguation methods for a Slavic language that shares many features with other Slavic languages, such as rich morphology and free word order. The participants were asked to group contexts with a given word in accordance with its senses that were not provided beforehand. For instance, given a word “bank” and a set of contexts with this word, e.g. “bank is a financial institution that accepts deposits” and “river bank is a slope beside a body of water”, a participant was asked to cluster such contexts in the unknown in advance number of clusters corresponding to, in this case, the “company” and the “area” senses of the word “bank”. For the purpose of this evaluation campaign, we developed three new evaluation datasets based on sense inventories that have different sense granularity. The contexts in these datasets were sampled from texts of Wikipedia, the academic corpus of Russian, and an explanatory dictionary of Russian. Overall 18 teams participated in the competition submitting 383 models. Multiple teams managed to substantially outperform competitive state-of-the-art baselines from the previous years based on sense embeddings
Proceedings
Proceedings of the NODALIDA 2009 workshop
Constraint Grammar and robust parsing.
Editors: Eckhard Bick, Kristin Hagen, Kaili Müürisep and
Trond Trosterud.
NEALT Proceedings Series, Vol. 8 (2009), 33 pages.
© 2009 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/14180
Proceedings
Proceedings of the NODALIDA 2011 Workshop
Constraint Grammar Applications.
Editors: Eckhard Bick, Kristin Hagen, Kaili Müürisep, Trond Trosterud.
NEALT Proceedings Series, Vol. 14 (2011), vi+69 pp.
© 2011 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/19231