
    Extracting Noun Phrases from Large-Scale Texts: A Hybrid Approach and Its Automatic Evaluation

    Acquiring noun phrases from running text is useful for many applications, such as word grouping and terminology indexing. Previously reported work adopts either a purely probabilistic approach or a purely rule-based noun phrase grammar to tackle this problem. In this paper, we apply a probabilistic chunker to decide the implicit boundaries of constituents and use linguistic knowledge to extract the noun phrases with a finite-state mechanism. The test texts are taken from the SUSANNE Corpus, and the results are evaluated automatically by comparison with the parse field of the SUSANNE Corpus. The results of this preliminary experiment are encouraging.
    Comment: 8 pages, Postscript file, Unix compressed, uuencode
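
As a rough illustration of extracting noun phrases with a finite-state mechanism, the sketch below runs a small automaton over part-of-speech tagged tokens. It is not the paper's grammar: the pattern (optional determiner, any adjectives, one or more nouns), the Penn-style tags, and every name in the code are assumptions made for the example, and the probabilistic chunking step is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical automaton for the pattern: (determiner)? (adjective)* (noun)+
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("start", "D"): "det",
    ("start", "A"): "mod",
    ("start", "N"): "noun",
    ("det", "A"): "mod",
    ("det", "N"): "noun",
    ("mod", "A"): "mod",
    ("mod", "N"): "noun",
    ("noun", "N"): "noun",
}
ACCEPTING = {"noun"}  # a phrase is valid only if it ends on a noun


def coarse(tag: str) -> str:
    """Map an assumed Penn-style tag to a coarse symbol used by the automaton."""
    if tag.startswith("NN"):
        return "N"
    if tag.startswith("JJ"):
        return "A"
    if tag == "DT":
        return "D"
    return "O"  # anything else ends the current phrase


def extract_noun_phrases(tagged: List[Tuple[str, str]]) -> List[str]:
    """Scan (word, tag) pairs left to right and collect accepted phrases."""
    phrases: List[str] = []
    buffer: List[str] = []
    state = "start"
    for word, tag in tagged:
        symbol = coarse(tag)
        nxt = TRANSITIONS.get((state, symbol))
        if nxt is None:
            # No transition: emit the buffered phrase if it is complete,
            # then restart and retry this token from the start state.
            if state in ACCEPTING:
                phrases.append(" ".join(buffer))
            buffer, state = [], "start"
            nxt = TRANSITIONS.get((state, symbol))
        if nxt is not None:
            buffer.append(word)
            state = nxt
    if state in ACCEPTING:
        phrases.append(" ".join(buffer))
    return phrases


sentence = [
    ("The", "DT"), ("probabilistic", "JJ"), ("chunker", "NN"),
    ("finds", "VBZ"), ("noun", "NN"), ("phrases", "NNS"),
    ("in", "IN"), ("running", "JJ"), ("text", "NN"),
]
print(extract_noun_phrases(sentence))
# ['The probabilistic chunker', 'noun phrases', 'running text']
```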

    Extracting Conceptual Terms from Medical Documents

    Automated biomedical concept recognition is important for biomedical document retrieval and text mining research. In this paper, we describe a two-step concept extraction technique for documents in the biomedical domain. Step one is noun phrase extraction, which automatically extracts noun phrases from medical documents. The extracted noun phrases serve as concept term candidates and become the input to the next step. Step two is keyphrase extraction, which automatically identifies important topical terms among the candidate terms. Experiments were conducted to evaluate the results of both steps. The experimental results show that our noun phrase extractor is effective in identifying noun phrases from medical documents, and that the keyphrase extractor is effective in identifying document conceptual terms.

    A Hungarian NP Chunker

    Automatic identification of terms for the generation of students’ concept maps

    Proceedings of the 4th International Conference on Multimedia and Information and Communication Technologies in Education, M-icte 2006, held in Seville (Spain) in November 2006.
    Willow, an adaptive multilingual free-text Computer-Assisted Assessment system, automatically evaluates students’ free-text answers given a set of correct ones. This paper presents an extension of the system that generates the students’ concept maps while they are being assessed. To that end, a new module for the automatic identification of the terms of a particular knowledge field has been created. It identifies and keeps track of the terms used in the students’ answers, and calculates a confidence score for the student's knowledge of each term. An empirical evaluation using the students' real answers shows that it is robust enough to generate a good set of terms from a very small set of answers.
    This work has been sponsored by the Spanish Ministry of Science and Technology, project number TIN2004-0314

    Use of Weighted Finite State Transducers in Part of Speech Tagging

    This paper addresses issues in part of speech disambiguation using finite-state transducers and presents two main contributions to the field. One of them is the use of finite-state machines for part of speech tagging. Linguistic and statistical information is represented in terms of weights on transitions in weighted finite-state transducers. Another contribution is the successful combination of techniques -- linguistic and statistical -- for word disambiguation, compounded with the notion of word classes.
    Comment: uses psfig, ipamac
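
To make the idea of weights on transitions concrete, the sketch below picks the cheapest tag sequence through a small weighted lattice, which is roughly what a shortest-path search over a composed weighted transducer would compute. It is not the paper's system: the cost tables, the tag names, the default penalty, and the function name are all invented for the example, and costs are assumed to behave like negative log probabilities (lower is better).

```python
from typing import Dict, List, Tuple

# Hypothetical lexical costs: how expensive it is to read a word as a tag.
LEXICAL_COST: Dict[str, Dict[str, float]] = {
    "the": {"DT": 0.1},
    "can": {"MD": 0.4, "NN": 1.6, "VB": 2.0},
    "rusts": {"VBZ": 0.7, "NNS": 1.2},
}

# Hypothetical transition costs between adjacent tags ("<s>" marks the start).
TRANSITION_COST: Dict[Tuple[str, str], float] = {
    ("<s>", "DT"): 0.2,
    ("DT", "NN"): 0.3, ("DT", "MD"): 3.0, ("DT", "VB"): 3.5,
    ("NN", "VBZ"): 0.5, ("NN", "NNS"): 2.0,
    ("MD", "VBZ"): 3.0, ("MD", "NNS"): 3.0,
    ("VB", "VBZ"): 3.0, ("VB", "NNS"): 1.0,
}
DEFAULT_COST = 5.0  # assumed penalty for tag pairs missing from the table


def best_tag_path(words: List[str]) -> Tuple[float, List[str]]:
    """Return (total cost, tags) of the cheapest path through the lattice."""
    # best maps the last tag of a partial path to (cost so far, tags so far).
    best: Dict[str, Tuple[float, List[str]]] = {"<s>": (0.0, [])}
    for word in words:
        new_best: Dict[str, Tuple[float, List[str]]] = {}
        for tag, lexical in LEXICAL_COST[word].items():
            candidates = [
                (cost + TRANSITION_COST.get((prev, tag), DEFAULT_COST) + lexical,
                 path + [tag])
                for prev, (cost, path) in best.items()
            ]
            # Keep only the cheapest way to reach this tag at this position.
            new_best[tag] = min(candidates, key=lambda c: c[0])
        best = new_best
    return min(best.values(), key=lambda c: c[0])


cost, tags = best_tag_path(["the", "can", "rusts"])
print(tags)  # ['DT', 'NN', 'VBZ']
```

With these made-up numbers, the lexical table alone would prefer the modal reading of "can", but the transition weights after a determiner make the noun reading cheaper overall, which is the kind of interaction between sources of evidence that the weights are meant to capture.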