23 research outputs found

Inducing Baseform Models from a Swedish Vocabulary Pool

Author: Forsbom Eva
Publication date: 21/05/2007

Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek and Mare Koit. University of Tartu, Tartu, 2007. ISBN 978-9985-4-0513-0 (online) ISBN 978-9985-4-0514-7 (CD-ROM) pp. 51-58

DSpace at Tartu University Library

Extending the View: Explorations in Bootstrapping a Swedish PoS Tagger

Author: Forsbom Eva
Publication date: 11/05/2009

Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kristiina Jokinen and Eckhard Bick. NEALT Proceedings Series, Vol. 4 (2009), 34-40. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9206

DSpace at Tartu University Library

Swedish CLARIN activities

Author: Andréasson Maia
Beskow Jonas
Borin Lars
Carlson Rolf
Edlund Jens
Elenius Kjell
Eriksson Anders
Forsberg Markus
Forsbom Eva
Hellmer Kahl
House David
Megyesi Beáta
Merkel Magnus
Strömqvist Sven
Publication date: 12/05/2009

Proceedings of the NODALIDA 2009 workshop Nordic Perspectives on the CLARIN Infrastructure of Language Resources. Editors: Rickard Domeij, Kimmo Koskenniemi, Steven Krauwer, Bente Maegaard, Eiríkur Rögnvaldsson and Koenraad de Smedt. NEALT Proceedings Series, Vol. 5 (2009), 1-5. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9207

DSpace at Tartu University Library

Training a Super Model Look-Alike: Featuring Edit Distance, N-Gram Occurrence, and One Reference Translation

Author: Eva Forsbom
Publication date: 01/01/2003

Two string comparison measures, edit distance and n-gram co-occurrence, are tested for automatic evaluation of translation quality, where the quality is compared to one or several reference translations. The measures are tested in combination for diagnostic evaluation on segments. Both measures have been used for evaluation of translation quality before, but for another evaluation purpose (performance) and with another granularity (system). Preliminary experiments showed that the measures are not portable without redefinitions, so two new measures are defined, WAFT and NEVA. The new measures could be applied for both purposes and granularities.
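
As a rough illustration of the two underlying string comparison measures named in this abstract, the following sketch computes a word-level edit distance and a clipped n-gram precision against a single reference. It is a generic textbook formulation, not the WAFT or NEVA definitions, which the abstract does not spell out; the function names and the example sentences are assumptions made for illustration.

```python
from collections import Counter


def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop a hypothesis word
                            curr[j - 1] + 1,      # add a reference word
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def ngram_precision(hyp, ref, n):
    """Share of hypothesis n-grams also found in the reference (clipped counts)."""
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(hyp_ngrams.values())
    if total == 0:
        # Hypothesis is shorter than n words: no n-grams to score.
        return 0.0
    matched = sum(min(count, ref_ngrams[gram]) for gram, count in hyp_ngrams.items())
    return matched / total


# Hypothetical segment pair, invented for this example.
hypothesis = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
print(edit_distance(hypothesis, reference))       # 1
print(ngram_precision(hypothesis, reference, 2))  # 0.6
```

Scoring each segment separately, as here, matches the segment-level granularity the abstract mentions; a system-level score would aggregate counts over all segments instead.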

CiteSeerX

Feature Combination for Genre Classification

Author: Eva Forsbom

In this paper, we describe an experiment on genre classification of Swedish texts, using as predictors the frequency of the top 50 most frequent words in the text collection Stockholm-Umeå Corpus (SUC). The purpose of this particular experiment was to find out if the combination of features in a fully-connected feedforward multi-layer perceptron (MLP) gives better classification than single features in a decision tree. The 1,040 text samples in SUC, classified into 9 major genres, were divided into 10 sets, and used for 10-fold cross-validation training of 10 MLPs (50-7-9), where the hidden layer is supposed to correspond to the 7 stylistic dimensions of Biber (1995). The result was better than for a previous experiment using a decision tree (48.6 vs. 58.8 % misclassification). Given the simplicity of the predictors, the sparse data and skewed distribution of genres in the text collection, the result is rather promising. In order to explain the knowledge learnt by the MLPs, we also extracted decision trees from the input and output of the MLPs. Extra input was generated by sampling from the feature space of the original training data. The resulting trees used finer distinctions (more branches) than the tree from the previous experiment, about the same features but with additional split points, and a few more features.
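
The experimental setup in this abstract (1,040 samples, 10 sets, a 50-7-9 network) can be sketched in a few lines. The abstract does not say how the samples were assigned to sets, so the shuffled, unstratified split below is an assumption, as are the seed, the function names, and the presence of one bias term per hidden and output unit.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Assumption: samples are shuffled and dealt round-robin into k folds;
    the paper may have used a different assignment.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        yield train, test


def mlp_parameter_count(layer_sizes):
    """Number of weights and biases in a fully connected feedforward network.

    Assumption: every non-input unit has one bias term.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# One MLP is trained per fold: 936 training samples, 104 held out.
for train, test in k_fold_indices(1040, 10):
    assert len(train) == 936 and len(test) == 104

print(mlp_parameter_count([50, 7, 9]))  # 429
```

Under these assumptions each of the 10 networks has 429 trainable parameters and is fitted on 936 samples, which gives a sense of the sparse-data setting the abstract refers to.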

CiteSeerX

Revision of Part-of-Speech Tagging in Stockholm Umeå Corpus 2.0

Author: Forsbom Eva
Wilhelmsson Kenneth
Publication date: 01/01/2010

Many parsers use a part-of-speech tagger as a first step in parsing. The accuracy of the tagger naturally affects the performance of the parser. In this experiment, we revise 1500+ proposed errors in SUC 2.0 that were mainly found during work with schema parsing, and evaluate tagger instances trained on the revised corpus. The revisions turned out to be beneficial also for the taggers. Collaboration with Eva Forsbom, Uppsala University.

University of Borås