Sense Tagging: Semantic Tagging with a Lexicon
Sense tagging, the automatic assignment of the appropriate sense from some
lexicon to each of the words in a text, is a specialised instance of the
general problem of semantic tagging by category or type. We discuss which
recent word sense disambiguation algorithms are appropriate for sense tagging.
It is our belief that sense tagging can be carried out effectively by combining
several simple, independent methods, and we include the design of such a
tagger. A prototype of this system has been implemented, correctly tagging 86%
of polysemous word tokens in a small test set, providing evidence that our
hypothesis is correct.
Comment: 6 pages, uses aclap LaTeX style file. Also in Proceedings of the
SIGLEX Workshop "Tagging Text with Lexical Semantics
A comparative evaluation of deep and shallow approaches to the automatic detection of common grammatical errors
This paper compares a deep and a shallow processing approach to the problem of classifying a sentence as grammatically well-formed or ill-formed. The deep processing approach uses the XLE LFG parser and English grammar: two versions are presented, one which uses the XLE directly to perform the classification, and another which uses a decision tree trained on features consisting of the XLE's output statistics. The shallow processing approach predicts grammaticality based on n-gram frequency statistics: we present two versions, one which uses frequency thresholds and one which uses a decision tree trained on the frequencies of the rarest n-grams in the input sentence. We find that the use of a decision tree improves on the basic approach only for the deep parser-based approach. We also show that combining both the shallow and deep decision tree features is effective. Our evaluation is carried out using a large test set of grammatical and ungrammatical sentences. The ungrammatical test set is generated automatically by inserting grammatical errors into well-formed BNC sentences
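As a rough illustration of the frequency-threshold variant of the shallow approach described in this abstract, the sketch below flags a sentence as ill-formed when its rarest n-gram falls below a count threshold. It is not the authors' implementation: the function names, the toy corpus, the choice of bigrams, and the threshold value are all assumptions made for the example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams (as tuples) found in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_counts(corpus, n):
    """Count n-grams over a reference corpus of whitespace-tokenised sentences."""
    counts = Counter()
    for sentence in corpus:
        counts.update(ngrams(sentence.lower().split(), n))
    return counts


def rarest_ngram_frequency(sentence, counts, n):
    """Frequency of the least frequent n-gram in the sentence.

    Sentences shorter than n tokens have no n-grams; this sketch
    (an assumption, not taken from the paper) scores them as 0.
    """
    grams = ngrams(sentence.lower().split(), n)
    if not grams:
        return 0
    return min(counts[g] for g in grams)


def is_grammatical(sentence, counts, n=2, threshold=1):
    """Predict well-formedness: every n-gram must reach the threshold."""
    return rarest_ngram_frequency(sentence, counts, n) >= threshold


# Hypothetical toy reference corpus standing in for a large corpus.
TOY_CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat sat on a rug",
]
BIGRAM_COUNTS = build_counts(TOY_CORPUS, 2)

if __name__ == "__main__":
    for s in ["the cat sat on the rug", "the cat on sat the rug"]:
        print(s, "->", is_grammatical(s, BIGRAM_COUNTS))
```

The decision-tree variant mentioned in the abstract would instead pass the frequencies of the rarest n-grams to a trained classifier rather than comparing them against a fixed cut-off.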
A Lexicalized Tree-Adjoining Grammar for Vietnamese
In this paper, we present the first sizable grammar built for Vietnamese using LTAG, developed over the past two years, named vnLTAG. This grammar aims at modelling written language and is general enough to be both application- and domain-independent. It can be used for the morpho-syntactic tagging and syntactic parsing of Vietnamese texts, as well as text generation. We then present a robust parsing scheme using vnLTAG and a parser for the grammar. We finish with an evaluation using a test suite
Automatic Extraction of Subcategorization from Corpora
We describe a novel technique and implemented system for constructing a
subcategorization dictionary from textual corpora. Each dictionary entry
encodes the relative frequency of occurrence of a comprehensive set of
subcategorization classes for English. An initial experiment, on a sample of 14
verbs which exhibit multiple complementation patterns, demonstrates that the
technique achieves accuracy comparable to previous approaches, which are all
limited to a highly restricted set of subcategorization classes. We also
demonstrate that a subcategorization dictionary built with the system improves
the accuracy of a parser by an appreciable amount.
Comment: 8 pages; requires aclap.sty. To appear in ANLP-9
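To make the idea of a dictionary entry that "encodes the relative frequency of occurrence" of subcategorization classes concrete, here is a minimal sketch. It assumes the (verb, class) observations have already been extracted from a corpus, which is the hard part the paper addresses and which is not shown here; the class labels and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def build_subcat_dictionary(observations):
    """Map each verb to the relative frequency of its observed classes.

    `observations` is an iterable of (verb, subcat_class) pairs; how
    these pairs are obtained from text is outside this sketch.
    """
    counts = defaultdict(Counter)
    for verb, subcat_class in observations:
        counts[verb][subcat_class] += 1

    dictionary = {}
    for verb, class_counts in counts.items():
        total = sum(class_counts.values())
        dictionary[verb] = {c: k / total for c, k in class_counts.items()}
    return dictionary


# Hypothetical observations with made-up class labels.
TOY_OBSERVATIONS = [
    ("give", "NP_NP"),
    ("give", "NP_PP"),
    ("give", "NP_NP"),
    ("give", "NP_NP"),
    ("believe", "SCOMP"),
    ("believe", "NP"),
]
SUBCAT = build_subcat_dictionary(TOY_OBSERVATIONS)

if __name__ == "__main__":
    for verb, entry in SUBCAT.items():
        print(verb, entry)
```

A parser could consult such an entry to prefer the more frequent complementation pattern for a verb when several analyses compete.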