2 research outputs found

    Natural language processing

    Beginning with the basic issues of NLP, this chapter charts the major research activities in the area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

    Chinese Word Segmentation And its Effects on Chinese Information Retrieval

    This experiment tests the effectiveness of Chinese information retrieval using a segmenter built on a dictionary-based Maximum Forward Matching algorithm. IRTOOLS, an IR system developed at UNC Chapel Hill, serves as the platform. The study finds that less accurate segmentation does not necessarily yield worse retrieval results. In fact, allowing only two-character words in the dictionary produced the best retrieval results in terms of precision and recall. Allowing longer words in the dictionary causes index words to be missed, which is the problem of over-specification. However, long-word indexing can produce better results when the long word is also used in queries.
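
    The dictionary-based forward matching described in this abstract can be illustrated with a minimal sketch. It is not the study's segmenter or part of IRTOOLS: the function name `forward_maximum_match`, the `max_word_len` parameter, and the four-entry toy dictionary are all assumptions made for illustration. The idea shown is the general greedy technique: scan left to right and, at each position, take the longest dictionary entry that matches, falling back to a single character.

```python
def forward_maximum_match(text, dictionary, max_word_len):
    """Greedy left-to-right segmentation (illustrative sketch only).

    At each position, the longest dictionary word of at most
    `max_word_len` characters is taken; a single character is the fallback.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        longest = min(max_word_len, len(text) - i)
        for length in range(longest, 0, -1):
            candidate = text[i:i + length]
            # Length 1 is always accepted so the scan cannot stall.
            if length == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += length
                break
    return tokens


# Hypothetical toy dictionary, not taken from the study.
TOY_DICTIONARY = {"信息", "检索", "信息检索", "系统"}

# Long words allowed: "信息检索" is indexed as a single term.
long_word_tokens = forward_maximum_match("信息检索系统", TOY_DICTIONARY, 4)

# Two-character words only: the same text yields shorter index terms.
two_char_tokens = forward_maximum_match("信息检索系统", TOY_DICTIONARY, 2)
```

    In this toy case the long-word run indexes "信息检索" as one term, so a query for "检索" alone would not match it, whereas the two-character run indexes "信息" and "检索" separately. That is one way to picture the over-specification effect the abstract reports.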