Extracting Noun Phrases from Large-Scale Texts: A Hybrid Approach and Its Automatic Evaluation
Acquiring noun phrases from running text is useful for many applications,
such as word grouping, terminology indexing, etc. The reported literature adopts
either a purely probabilistic approach or a purely rule-based noun phrase grammar to tackle
this problem. In this paper, we apply a probabilistic chunker to decide the
implicit boundaries of constituents and use linguistic knowledge to
extract the noun phrases by a finite-state mechanism. The test texts are from the
SUSANNE Corpus, and the results are evaluated automatically by comparing them with the parse field of
the SUSANNE Corpus. The results of this preliminary experiment are
encouraging.
Comment: 8 pages, Postscript file, Unix compressed, uuencode
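The finite-state extraction step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' system: it assumes the tokens are already part-of-speech tagged with simplified Penn-style tags, it omits the probabilistic chunker entirely, and the pattern `DT? JJ* NN+` and the function name `extract_noun_phrases` are hypothetical choices made only for illustration.

```python
from typing import List, Tuple

Tagged = Tuple[str, str]  # (word, POS tag); Penn-style tags are an assumption


def extract_noun_phrases(tagged: List[Tagged]) -> List[str]:
    """Scan tagged tokens with a small state machine for DT? JJ* NN+."""
    phrases: List[str] = []
    current: List[str] = []  # words collected for the phrase in progress
    seen_noun = False        # True once at least one noun has been consumed

    def flush() -> None:
        # Emit the phrase only if it reached the accepting state (has a noun).
        nonlocal current, seen_noun
        if seen_noun:
            phrases.append(" ".join(current))
        current, seen_noun = [], False

    for word, tag in tagged:
        if tag.startswith("NN"):
            current.append(word)
            seen_noun = True
        elif tag == "JJ" and not seen_noun:
            current.append(word)
        elif tag == "DT" and not current:
            current.append(word)
        else:
            flush()
            # A determiner or adjective may open a new phrase right away.
            if tag in ("DT", "JJ"):
                current.append(word)
    flush()
    return phrases


sentence = [
    ("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
    ("jumps", "VBZ"), ("over", "IN"),
    ("the", "DT"), ("lazy", "JJ"), ("dog", "NN"),
]
print(extract_noun_phrases(sentence))
```

A hand-written machine like this keeps the accepting condition explicit (at least one noun seen), which is the part a rule-based grammar would encode; a hybrid approach would additionally let a statistical chunker propose the phrase boundaries.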
Extracting Conceptual Terms from Medical Documents
Automated biomedical concept recognition is important for biomedical document retrieval and text mining research. In this paper, we describe a two-step concept extraction technique for documents in the biomedical domain. Step one is noun phrase extraction, which automatically extracts noun phrases from medical documents. The extracted noun phrases serve as concept term candidates and become the input to the next step. Step two is keyphrase extraction, which automatically identifies important topical terms among the candidate terms. Experiments were conducted to evaluate the results of both steps. The results show that our noun phrase extractor is effective in identifying noun phrases in medical documents, and that the keyphrase extractor is effective in identifying document conceptual terms.
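The second step described in this abstract, selecting topical terms from candidate noun phrases, might be sketched as follows. The abstract does not say how candidates are scored, so plain frequency ranking is used here as a stand-in; the function name `rank_keyphrases` and the parameter `top_k` are hypothetical, and the candidate list is made-up illustration data.

```python
from collections import Counter
from typing import List


def rank_keyphrases(candidates: List[str], top_k: int = 3) -> List[str]:
    """Rank candidate phrases by frequency (an assumed scoring rule)."""
    # Case-fold so that "Heart failure" and "heart failure" count together.
    counts = Counter(c.lower() for c in candidates)
    # Sort by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts, key=lambda phrase: (-counts[phrase], phrase))
    return ranked[:top_k]


# Candidates as they might come out of a noun phrase extractor (illustrative).
candidates = [
    "Heart failure", "heart failure", "patient",
    "beta blocker", "heart failure", "beta blocker",
]
print(rank_keyphrases(candidates, top_k=2))
```

Keeping the two steps as separate functions mirrors the pipeline in the abstract: the keyphrase step only needs a list of candidate strings, so the extractor that produces them can be replaced independently.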
Automatic identification of terms for the generation of students’ concept maps
Proceedings of the 4th International Conference on Multimedia and Information and Communication Technologies in Education, M-icte 2006, held in Seville (Spain) in November 2006.
Willow, an adaptive multilingual free-text Computer-Assisted Assessment system, automatically
evaluates students’ free-text answers given a set of correct ones. This paper presents an extension of the
system in order to generate the students’ concept maps while they are being assessed. To that aim, a new
module for the automatic identification of the terms of a particular knowledge field has been created. It
identifies and keeps track of the terms that are being used in the students’ answers, and calculates a confidence
score of the student's knowledge about each term. An empirical evaluation using the students' real
answers shows that it is robust enough to generate a good set of terms from a very small set of answers.
This work has been sponsored by the Spanish Ministry of Science and Technology, project number TIN2004-0314
Use of Weighted Finite State Transducers in Part of Speech Tagging
This paper addresses issues in part of speech disambiguation using
finite-state transducers and presents two main contributions to the field. One
of them is the use of finite-state machines for part of speech tagging.
Linguistic and statistical information is represented in terms of weights on
transitions in weighted finite-state transducers. Another contribution is the
successful combination of techniques -- linguistic and statistical -- for word
disambiguation, compounded with the notion of word classes.
Comment: uses psfig, ipamac
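The idea of placing linguistic and statistical information as weights on transitions can be illustrated with a small shortest-path sketch. It is not the paper's transducer construction: it assumes weights are costs (for example negative log probabilities) that add along a path, and the names `best_tag_path`, `lexicon`, and `transitions`, along with all numeric weights, are hypothetical.

```python
from typing import Dict, List, Tuple

INF = float("inf")


def best_tag_path(
    words: List[str],
    lexicon: Dict[str, Dict[str, float]],         # word -> {tag: cost}
    transitions: Dict[Tuple[str, str], float],    # (prev_tag, tag) -> cost
    start: str = "<s>",
) -> List[str]:
    """Return the lowest-cost tag sequence; costs add along the path."""
    # best[tag] holds the cheapest (cost, path) ending in that tag so far.
    best: Dict[str, Tuple[float, List[str]]] = {start: (0.0, [])}
    for word in words:
        nxt: Dict[str, Tuple[float, List[str]]] = {}
        for tag, word_cost in lexicon[word].items():
            for prev, (cost, path) in best.items():
                # Missing transitions are treated as impossible (infinite cost).
                total = cost + transitions.get((prev, tag), INF) + word_cost
                if total < nxt.get(tag, (INF, []))[0]:
                    nxt[tag] = (total, path + [tag])
        best = nxt
        if not best:
            raise ValueError(f"no valid tag path through {word!r}")
    return min(best.values(), key=lambda cost_path: cost_path[0])[1]


# Illustrative weights only: "can" is ambiguous between noun and modal.
lexicon = {"the": {"DT": 0.1}, "can": {"NN": 1.0, "MD": 0.5}}
transitions = {("<s>", "DT"): 0.2, ("DT", "NN"): 0.3, ("DT", "MD"): 3.0}
print(best_tag_path(["the", "can"], lexicon, transitions))
```

Here the lexical cost alone would prefer the modal reading of "can", but the high cost on the determiner-to-modal transition makes the noun reading cheaper overall, which is the kind of interaction between lexical and contextual weights that a weighted finite-state formulation expresses.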