2 research outputs found

    Automatically Finding Significant Topical Terms from Documents

    Get PDF
    With the pervasion of digital textual data, text mining is becoming more and more important to deriving competitive advantages. One factor for successful text mining applications is the ability of finding significant topical terms for discovering interesting patterns or relationships. Document keyphrases are phrases carrying the most important topical concepts for a given document. In many applications, keyphrases as textual elements are better suited for text mining and could provide more discriminating power than single words. This paper describes an automatic keyphrase identification program (KIP). KIP’s algorithm examines the composition of noun phrases and calculates their scores by looking up a domain-specific glossary database; the ones with higher scores are extracted as keyphrases. KIP’s learning function can enrich its glossary database by automatically adding new identified keyphrases. KIP’s personalization feature allows the user build a glossary database specifically suitable for the area of his/her interest

    Extracting Conceptual Terms from Medical Documents

    Get PDF
    Automated biomedical concept recognition is important for biomedical document retrieval and text mining research. In this paper, we describe a two-step concept extraction technique for documents in biomedical domain. Step one includes noun phrase extraction, which can automatically extract noun phrases from medical documents. Extracted noun phrases are used as concept term candidates which become inputs of next step. Step two includes keyphrase extraction, which can automatically identify important topical terms from candidate terms. Experiments were conducted to evaluate results of both steps. The experiment results show that our noun phrase extractor is effective in identifying noun phrases from medical documents, so is the keyphrase extractor in identifying document conceptual terms
    corecore