
    Comprehensive Part-Of-Speech Tag Set and SVM Based POS Tagger for Sinhala

    This paper presents a new comprehensive multi-level Part-Of-Speech tag set and a Support Vector Machine based Part-Of-Speech tagger for the Sinhala language. The currently available tag set for Sinhala has two limitations: it has no tags for some word classes, and it lacks tags to capture inflection-based grammatical variations of words. The new tag set presented in this paper overcomes both of these limitations. The accuracy of available Sinhala Part-Of-Speech taggers, which are based on Hidden Markov Models, still falls far behind the state of the art. Our Support Vector Machine based tagger achieved an overall accuracy of 84.68%, with 59.86% accuracy for unknown words and 87.12% for known words, when 10% of the words in the test set are unknown.

    Proinflammatory Cytokine IL-17 Shows a Significant Association with Helicobacter pylori Infection and Disease Severity

    Background. Pro- and anti-inflammatory cytokines play an important role in the immune response against H. pylori infection. The proinflammatory cytokines of Th17 cells have been suggested to play a major role in H. pylori infection and the resulting gastric inflammation. Objective. The objective of this study was to compare the expression of selected inflammatory cytokines (IL-10, IL-17, IL-21, IL-23, and TNF-α) in H. pylori-infected patients and healthy controls and to understand their association with H. pylori infection and disease severity. Results. The expression levels of IL-17 and IL-23 were significantly higher in H. pylori-infected patients. The expression of IL-21 was also higher in H. pylori-positive patients, but there was no significant association with infection. IL-17 expression showed a significant increase with the severity of chronic gastritis. Conclusion. The proinflammatory cytokine IL-17 shows a significant association with H. pylori infection and disease severity in a Sri Lankan dyspeptic patient population.

    Automatic Creation of a Sentence Aligned Sinhala-Tamil Parallel Corpus

    A sentence-aligned parallel corpus is an important prerequisite in statistical machine translation. However, manual creation of such a parallel corpus is time-consuming and requires experts fluent in both languages. Automatic creation of a sentence-aligned parallel corpus using parallel text is the solution to this problem. In this paper, we present the first-ever empirical evaluation carried out to identify the best method to automatically create a sentence-aligned Sinhala-Tamil parallel corpus. Annual reports from Sri Lankan government institutions were used as the parallel text for aligning. Despite both Sinhala and Tamil being under-resourced languages, we were able to achieve an F-score of 0.791 using a hybrid approach that makes use of a bilingual dictionary.