    JAPANESE-MALAY E-DICTIONARY WITH VOICE RECOGNITION

    This report gives a brief introduction to a project in the field of Artificial Intelligence. The project is based on natural language processing: an electronic dictionary will be built for Japanese-Malay translation. The intelligent part of this e-dictionary is that it will be integrated with a voice recognition program to allow the user to interact with it. The e-dictionary will be built to resemble the e-dictionaries currently available on the market. The end product may not be a fully functioning device by itself, but it may be a prototype to be tested on a computer. The program for the e-dictionary will be written in Microsoft Visual Basic 6.0. Because a voice recognition system will be added, the speech recognition engine and text-to-speech engine available for Microsoft Visual Basic 6.0 will be used in order to take advantage of its library. Microsoft Office Access 2003 will be used for the system's database.

    Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences

    Given the lack of word delimiters in written Japanese, word segmentation is generally considered a crucial first step in processing Japanese texts. Typical Japanese segmentation algorithms rely either on a lexicon and syntactic analysis or on pre-segmented data; but these are labor-intensive, and the lexico-syntactic techniques are vulnerable to the unknown word problem. In contrast, we introduce a novel, more robust statistical method utilizing unsegmented training data. Despite its simplicity, the algorithm yields performance on long kanji sequences comparable to and sometimes surpassing that of state-of-the-art morphological analyzers over a variety of error metrics. The algorithm also outperforms another mostly-unsupervised statistical algorithm previously proposed for Chinese. Additionally, we present a two-level annotation scheme for Japanese to incorporate multiple segmentation granularities, and introduce two novel evaluation metrics, both based on the notion of a compatible bracket, that can account for multiple granularities simultaneously. Comment: 22 pages. To appear in Natural Language Engineering.

    Natural language processing

    Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

    Efficient deep processing of Japanese

    We present a broad-coverage Japanese grammar written in the HPSG formalism with MRS semantics. The grammar is created for use in real-world applications, so robustness and performance issues play an important role. It is connected to a POS tagging and word segmentation tool. This grammar is being developed in a multilingual context, requiring MRS structures that are easily comparable across languages.