
    Methods for Amharic part-of-speech tagging

    The paper describes a set of experiments involving the application of three state-of-the-art part-of-speech taggers to Ethiopian Amharic, using three different tagsets. The taggers showed worse performance than previously reported results for English, in particular having problems with unknown words. The best results were obtained using a Maximum Entropy approach, while HMM-based and SVM-based taggers got comparable results.

    A Machine Learning Approach For Opinion Holder Extraction In Arabic Language

    Opinion mining aims at extracting useful subjective information from reliable amounts of text. Opinion holder recognition is a task that has not yet been considered for the Arabic language. This task essentially requires a deep understanding of clause structures. Unfortunately, the lack of a robust, publicly available Arabic parser further complicates the research. This paper presents pioneering research on opinion holder extraction in Arabic news that is independent of any lexical parser. We investigate constructing a comprehensive feature set to compensate for the lack of structural parsing output. The proposed feature set is adapted from previous work on English and coupled with our proposed semantic field and named entity features. Our feature analysis is based on Conditional Random Fields (CRF) and semi-supervised pattern recognition techniques. Different research models are evaluated via cross-validation experiments, achieving an F-measure of 54.03. We publicly release our corpus and lexicon to the opinion mining community to encourage further research.

    Part of Speech Tagging of Marathi Text Using Trigram Method

    In this paper we present a Marathi part-of-speech tagger. Marathi is a morphologically rich language spoken by the native people of Maharashtra. The tagger is developed with a statistical approach based on the trigram method. The main idea of the trigram method is to find the most likely POS tag for a token given the previous two tags, by calculating probabilities to determine the best tag sequence. We describe the development of the tagger and report its evaluation.
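
    The trigram idea in the abstract above can be sketched roughly as follows. This is only an illustrative sketch, not the paper's implementation: the toy English corpus, the add-one smoothing, and the names `train`, `tag`, and `START` are all assumptions made for the example.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical padding tag for the sentence start


def train(tagged_sentences):
    """Count tag trigrams, their two-tag contexts, and tag-word emissions."""
    trigram, context, emit, tag_count = Counter(), Counter(), Counter(), Counter()
    vocab = set()
    for sent in tagged_sentences:
        tags = [START, START] + [t for _, t in sent]
        for word, t in sent:
            emit[(t, word)] += 1
            tag_count[t] += 1
            vocab.add(word)
        for i in range(2, len(tags)):
            trigram[(tags[i - 2], tags[i - 1], tags[i])] += 1
            context[(tags[i - 2], tags[i - 1])] += 1
    return trigram, context, emit, tag_count, len(vocab)


def tag(words, model):
    """Viterbi search over tag pairs, scoring each tag by the previous two."""
    trigram, context, emit, tag_count, vocab_size = model
    tagset = sorted(tag_count)
    best = {(START, START): 0.0}  # log-probability of the best path per state
    back = []                     # one backpointer table per word
    for word in words:
        new_best, pointers = {}, {}
        for (u, v), score in best.items():
            for t in tagset:
                # Add-one smoothing (an assumption, chosen for simplicity).
                p_trans = (trigram[(u, v, t)] + 1) / (context[(u, v)] + len(tagset))
                p_emit = (emit[(t, word)] + 1) / (tag_count[t] + vocab_size + 1)
                s = score + math.log(p_trans) + math.log(p_emit)
                if s > new_best.get((v, t), -math.inf):
                    new_best[(v, t)] = s
                    pointers[(v, t)] = (u, v)
        best = new_best
        back.append(pointers)
    if not words:
        return []
    state = max(best, key=best.get)
    tags = []
    for pointers in reversed(back):
        tags.append(state[1])
        state = pointers[state]
    return tags[::-1]


# Toy training data (hypothetical, English for readability).
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train(corpus)
print(tag(["the", "cat", "barks"], model))
```

    A real tagger for a morphologically rich language would likely need better smoothing and some handling of unknown words (for example suffix features), which this sketch leaves out.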

    Enriching ontological user profiles with tagging history for multi-domain recommendations

    Many advanced recommendation frameworks employ ontologies of various complexities to model individuals and items, providing a mechanism for the expression of user interests and the representation of item attributes. As a result, complex matching techniques can be applied to support individuals in the discovery of items according to explicit and implicit user preferences. Recently, the rapid adoption of Web 2.0 and the proliferation of social networking sites have resulted in more and more users providing an increasing amount of information about themselves that could be exploited for recommendation purposes. However, the unification of personal information with ontologies using the contemporary knowledge representation methods often associated with Web 2.0 applications, such as community tagging, is a non-trivial task. In this paper, we propose a method for the unification of tags with ontologies by grounding tags to a shared representation in the form of WordNet and Wikipedia. We incorporate individuals' tagging history into their ontological profiles by matching tags with ontology concepts. This approach is preliminarily evaluated by extending an existing news recommendation system with user tagging histories harvested from popular social networking sites.