31 research outputs found

FinnTreeBank: Creating a research resource and service for language researchers with Constraint Grammar

Author: Voutilainen Atro
Publication date: 17/11/2011

Proceedings of the NODALIDA 2011 Workshop Constraint Grammar Applications. Editors: Eckhard Bick, Kristin Hagen, Kaili Müürisep, Trond Trosterud. NEALT Proceedings Series, Vol. 14 (2011), 41–49. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/19231

DSpace at Tartu University Library

Inducing Constraint Grammars

Authors: Samuelsson Christer, Tapanainen Pasi, Voutilainen Atro
Publication date: 01/01/1996

Constraint Grammar rules are induced from corpora. A simple scheme based on local information, i.e., on lexical biases and next-neighbour contexts, extended through the use of barriers, reached 87.3 percent precision (1.12 tags/word) at 98.2 percent recall. The results compare favourably with other methods that are used for similar tasks, although they are by no means as good as the results achieved using the original hand-written rules developed over several years' time.

Comment: 10 pages, uuencoded, gzipped PostScript

arXiv.org e-Print Archive

CiteSeerX

A double-blind experiment on interannotator agreement: the case of dependency syntax and Finnish

Authors: Purtonen Tanja, Voutilainen Atro
Publication date: 10/05/2011

Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011. Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa. NEALT Proceedings Series, Vol. 11 (2011), 319-322. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/1695

DSpace at Tartu University Library

Compiling and Using Finite-State Syntactic Rules

Authors: Koskenniemi Kimmo, Tapanainen Pasi, Voutilainen Atro
Publication venue: The Association for Computational Linguistics
Publication date: 01/01/1992

Proceeding volume: 1

A language-independent framework for syntactic finite-state parsing is discussed. The article presents a framework, a formalism, a compiler and a parser for grammars written in this formalism. As a substantial example, fragments from a nontrivial finite-state grammar of English are discussed. The linguistic framework of the present approach is based on a surface syntactic tagging scheme by F. Karlsson. This representation is slightly less powerful than phrase structure tree notation, letting some ambiguous constructions be described more concisely. The finite-state rule compiler implements what was briefly sketched by Koskenniemi (1990). It is based on the calculus of finite-state machines. The compiler transforms rules into rule-automata. The run-time parser exploits one of certain alternative strategies in performing the effective intersection of the rule automata and the sentence automaton. Fragments of a fairly comprehensive finite-state grammar of English are presented here, including samples from non-finite constructions as a demonstration of the capacity of the present formalism, which goes far beyond plain disambiguation or part of speech tagging. The grammar itself is directly related to a parser and tagging system for English created as a part of project SIMPR using Karlsson's CG (Constraint Grammar) formalism.

Peer reviewed

CiteSeerX

Crossref

Helsingin yliopiston digitaalinen arkisto

Analysing Finnish with word lists : The DDI approach to morphology revisited

Authors: Palolahti Maria Johanna, Voutilainen Atro Tapio
Publication venue: The Association for Computational Linguistics
Publication date: 01/01/2018

Morphological lexicons for morphologically complex languages provide good text coverage at the cost of overgeneration, difficulty of modification, and sometimes performance issues. Use of simple, manageable lexicon forms – especially lists – for morphologically complex languages may appear unviable because the number of possible word-forms in a morphologically complex language can be prohibitively high. We created and experimented with a list-based lexicon for a morphologically complex language (Finnish), and compared its coverage with that of a mature morphological analyser on new text in two experimental settings. The observed smallish difference in coverage suggests the viability of using simple and easy-to-modify list-based lexicons as an initial part of morphological analysis, to increase developer control on the vast majority of input tokens.

Crossref

Helsingin yliopiston digitaalinen arkisto