15 research outputs found

    Laos Organization Name Using Cascaded Model Based on SVM and CRF

    No full text
    According to the characteristics of Lao organization names, this paper proposes a two-layer model based on conditional random fields (CRF) and support vector machines (SVM) for Lao organization name recognition. The first layer uses a CRF to recognize simple organization names, and its result supports the decision of the second layer. Based on the driving method, the second layer uses SVM and CRF to recognize complicated organization names. Finally, the results of the two layers are combined, and a post-processing step corrects low-confidence recognition results. In an open test on real linguistic data, the approach is effective at recognizing organization names: recall reaches 80.83% and precision reaches 82.75%.

    Event Relation Recognition by Multi Part of Speech Association Distribution Characteristics

    No full text
    Event relation recognition is a natural language processing task that detects relations between events in streams of text. By analyzing how words of different parts of speech influence the relevance of events, and by using lexical chains to extract and store the vocabulary that links events, this paper proposes a lexical-chain-based event relation recognition method that detects latent semantic relations between events, that is, whether or not events hold a logical relation. Compared with the method based on dependency cue inference, the proposed method achieves a 7.68% improvement.

    Method of Word Segmentation in Laos Based on Maximal Matching of Syllables

    No full text
    Word segmentation is an important foundation for semantic analysis, machine translation, question answering (QA), and knowledge mapping research, and it is used in information retrieval, text processing, data processing, and many other areas of natural language processing. Implementing word segmentation is therefore meaningful work. The method in this paper first segments the Lao text corpus into syllables and performs maximal matching of syllable sequences against a dictionary. The segmentation results are then matched against an error dictionary, which is used to correct some wrongly segmented words. Finally, regular expressions match the corresponding word strings in the segmentation results, and the remaining errors are corrected with manually formulated rules for letters, numbers, and similar tokens in Lao. The method can improve the efficiency and accuracy of Lao word segmentation.
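
The first two steps of this abstract (maximal matching of syllables against a dictionary, then correction with an error dictionary) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the `max_syllables` limit, the greedy forward direction, and the romanized placeholder syllables and dictionary entries are all assumptions made for the example, and the final regular-expression rule step is not shown.

```python
def maximal_match(syllables, dictionary, max_syllables=4):
    """Greedy forward maximal matching over a list of syllables.

    At each position, try the longest syllable span first and shrink it
    until the joined span is a dictionary word. A single syllable is
    always accepted so the loop is guaranteed to advance.
    """
    words = []
    i = 0
    while i < len(syllables):
        longest = min(max_syllables, len(syllables) - i)
        for size in range(longest, 0, -1):
            candidate = "".join(syllables[i:i + size])
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def apply_error_dictionary(words, error_dictionary):
    """Replace known wrong segmentations with their corrected word lists."""
    corrected = []
    for word in words:
        corrected.extend(error_dictionary.get(word, [word]))
    return corrected


# Hypothetical romanized placeholders, not real corpus data.
syllables = ["sa", "bai", "dee", "pa", "sa", "lao"]
dictionary = {"sabaidee", "pasa", "pasalao"}
error_dictionary = {"pasalao": ["pasa", "lao"]}

segmented = maximal_match(syllables, dictionary)
corrected = apply_error_dictionary(segmented, error_dictionary)
print(segmented)
print(corrected)
```

Trying the longest span first is what makes the matching "maximal"; the error dictionary then undoes the over-long matches that this greedy choice is known to produce.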

    Chinese-Lao Bilingual Named Entity Alignment Research

    No full text
    Chinese-Lao bilingual named entity (NE) alignment is of great significance. This paper proposes three entity alignment methods. First, it proposes a method based on fuzzy matching of bilingual entity similarity. Second, it uses the similarity of bilingual entity word sequence patterns to match Chinese entity patterns to Lao entities. Third, it builds a naïve Bayes bilingual NE alignment model that aligns Chinese and Lao named entities in a comparable corpus by mining the knowledge information words of Chinese entities. Finally, rules that combine the advantages of the three methods are proposed to achieve the best results.

    The Distribution of Words in Chinese and Laos Based on Cross Language Corpus

    No full text
    Word representation is a basic research topic in natural language processing. Distributed representations of monolingual words have shown satisfactory results in some neural probabilistic language (NPL) research, but there is little research, at home or abroad, on distributed representations of cross-lingual words. To address this problem, and given the distributional similarity of nouns and verbs in the two languages, we embed mutually translated words, synonyms, and superordinates into a Chinese corpus using a weakly supervised learning extension approach and other methods, and thereby learn Lao word distributions in a Chinese-Lao cross-lingual environment. We apply the learned cross-lingual distributed representations to compute the similarity of bilingual texts and to classify a mixed Chinese-Lao text corpus. Experimental results show that the proposal has a satisfactory effect on both tasks.
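
One common way to use a shared cross-lingual embedding space for bilingual text similarity is to average the word vectors of each text and compare the averages with cosine similarity. The abstract does not say how similarity is computed, so the sketch below is an assumption: the averaging scheme, the function names, and the toy two-dimensional vectors with `zh_`/`lo_` placeholder keys are all hypothetical.

```python
import math


def average_vector(tokens, embeddings):
    """Mean of the vectors of all tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def text_similarity(tokens_a, tokens_b, embeddings):
    """Cosine similarity between the averaged vectors of two token lists."""
    vec_a = average_vector(tokens_a, embeddings)
    vec_b = average_vector(tokens_b, embeddings)
    if vec_a is None or vec_b is None:
        return 0.0
    return cosine(vec_a, vec_b)


# Toy shared space: translation pairs are placed close together (assumed data).
embeddings = {
    "zh_water": [1.0, 0.0],
    "lo_water": [0.9, 0.1],
    "zh_run": [0.0, 1.0],
    "lo_run": [0.1, 0.9],
}

same_topic = text_similarity(["zh_water"], ["lo_water"], embeddings)
different_topic = text_similarity(["zh_water"], ["lo_run"], embeddings)
print(same_topic > different_topic)
```

Because both languages live in one vector space, the same score can serve as a feature for classifying a mixed Chinese-Lao corpus, which is the second task the abstract mentions.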
