Inducing Constraint Grammars
Constraint Grammar rules are induced from corpora. A simple scheme based on
local information, i.e., on lexical biases and next-neighbour contexts,
extended through the use of barriers, reached 87.3 percent precision (1.12
tags/word) at 98.2 percent recall. The results compare favourably with other
methods that are used for similar tasks although they are by no means as good
as the results achieved using the original hand-written rules developed over
several years' time.
Comment: 10 pages, uuencoded, gzipped PostScrip
Recognizing Text Genres with Simple Metrics Using Discriminant Analysis
A simple method for categorizing texts into predetermined text genre
categories using the standard statistical technique of discriminant analysis is
demonstrated with application to the Brown corpus. Discriminant analysis makes
it possible to use a large number of parameters that may be specific to a certain
corpus or information stream, and to combine them into a small number of
functions, with the parameters weighted on the basis of how useful they are for
discriminating text genres. An application to information retrieval is
discussed.
Comment: 6 pages, LaTeX, In proceedings of COLING 9
Use of Weighted Finite State Transducers in Part of Speech Tagging
This paper addresses issues in part of speech disambiguation using
finite-state transducers and presents two main contributions to the field. One
of them is the use of finite-state machines for part of speech tagging.
Linguistic and statistical information is represented in terms of weights on
transitions in weighted finite-state transducers. Another contribution is the
successful combination of techniques -- linguistic and statistical -- for word
disambiguation, compounded with the notion of word classes.
Comment: uses psfig, ipamac
FinnTreeBank: Creating a research resource and service for language researchers with Constraint Grammar
Proceedings of the NODALIDA 2011 Workshop Constraint Grammar Applications.
Editors: Eckhard Bick, Kristin Hagen, Kaili Müürisep, Trond Trosterud.
NEALT Proceedings Series, Vol. 14 (2011), 41–49.
© 2011 The editors and contributors.
Published by Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt
Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/19231