
    A Case Study of Algorithms for Morphosyntactic Tagging of Polish Language

    The paper evaluates several part-of-speech taggers, representing the main tagging algorithms, applied to the corpus of the frequency dictionary of contemporary Polish. We report results for two tagging schemes: the IPI PAN positional tagset and its simplified version. Tagging accuracy is calculated for different training sets and broken down into subcategories (accuracy on known and unknown tokens, word segments, sentences, etc.). The results are compared with those for other inflecting and analytic languages. Performance aspects (time demands) of the tagging tools used are also discussed.
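
The breakdown of accuracy into known tokens, unknown tokens and whole sentences can be sketched as follows. This is only an illustration, not the evaluation code used in the paper: the function name `tagging_accuracy`, the data layout (lists of `(token, tag)` pairs) and the definition of "known" as "seen in the training vocabulary" are assumptions made for the example.

```python
def tagging_accuracy(gold_sentences, predicted_sentences, training_vocabulary):
    """Return overall, known-token, unknown-token and sentence accuracy.

    gold_sentences: list of sentences, each a list of (token, gold_tag) pairs.
    predicted_sentences: list of sentences, each a list of predicted tags.
    training_vocabulary: set of tokens seen during training (assumed
    definition of a "known" token).
    """
    # Each entry holds [number correct, number seen].
    counts = {"all": [0, 0], "known": [0, 0], "unknown": [0, 0], "sentence": [0, 0]}
    for gold, predicted in zip(gold_sentences, predicted_sentences):
        sentence_correct = True
        for (token, gold_tag), predicted_tag in zip(gold, predicted):
            bucket = "known" if token in training_vocabulary else "unknown"
            hit = gold_tag == predicted_tag
            for key in ("all", bucket):
                counts[key][0] += int(hit)
                counts[key][1] += 1
            sentence_correct = sentence_correct and hit
        # A sentence counts as correct only if every token in it is correct.
        counts["sentence"][0] += int(sentence_correct)
        counts["sentence"][1] += 1
    return {key: (c / n if n else None) for key, (c, n) in counts.items()}
```

Reporting the unknown-token figure separately matters for inflecting languages, where many word forms in the test data never occur in the training data.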

    Recommending Items in Social Tagging Systems Using Tag and Time Information

    In this work we present a novel item recommendation approach that aims to improve Collaborative Filtering (CF) in social tagging systems using information about tags and time. Our algorithm follows a two-step approach: in the first step, a potentially interesting candidate item set is found using user-based CF, and in the second step this candidate item set is ranked using item-based CF. Within this ranking step we integrate information on tag usage and time using the Base-Level Learning (BLL) equation from human memory theory, which determines the reuse probability of words and tags using a power-law forgetting function. As the results of our extensive evaluation on datasets gathered from three social tagging systems (BibSonomy, CiteULike and MovieLens) show, using tag-based and time information via the BLL equation helps to improve the ranking and recommendation of items and can thus be used to realize an effective item recommender that outperforms two alternative algorithms which also exploit time and tag-based information. Comment: 6 pages, 2 tables, 9 figures.
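
The power-law forgetting idea behind the BLL equation can be sketched as below. The abstract does not give the exact formula or parameters the authors use, so this example assumes the standard base-level activation form from the ACT-R memory model, B = ln(sum over past uses of (now - t)^(-d)), with the conventional decay d = 0.5. The function name and arguments are hypothetical.

```python
import math


def base_level_activation(use_times, now, decay=0.5):
    """Base-level activation of a tag from the timestamps of its past uses.

    Assumed form (standard ACT-R base-level learning):
        B = ln( sum_j (now - t_j) ** -decay )
    Each past use contributes less the longer ago it happened (power-law
    forgetting), and more uses add up to a higher activation.
    """
    # Ignore uses that are not strictly in the past to avoid 0 ** -decay.
    ages = [now - t for t in use_times if now > t]
    if not ages:
        return float("-inf")  # never used before: no activation
    return math.log(sum(age ** -decay for age in ages))


# A tag used recently scores higher than one used long ago.
recent = base_level_activation([9], now=10)
old = base_level_activation([1], now=10)
```

In a ranking step of the kind the abstract describes, such activations would be computed per tag and then combined with the item-based CF score; how that combination is done is not specified here and is left out of the sketch.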

    Ontology-based Interoperation of Linguistic Tools for an Improved Lemma Annotation in Spanish

    In this paper, we present an ontology-based methodology and architecture for the comparison, assessment, combination (and, to some extent, also contrastive evaluation) of the results of different linguistic tools. More specifically, we describe an experiment aimed at improving the correctness of lemma tagging for Spanish. This improvement was achieved by standardising and combining the results of three different linguistic annotation tools (Bitext’s DataLexica, Connexor’s FDG Parser and LACELL’s POS tagger), using (1) ontologies, (2) a set of lemma tagging correction rules, determined empirically during the experiment, and (3) W3C standard languages, such as XML, RDF(S) and OWL. As the results of the experiment show, the interoperation of these tools by means of ontologies, together with the correction rules applied in the experiment, significantly improved the quality of the resulting lemma tagging (compared to the separate lemma tagging performed by each of the tools that we made interoperate).

    The use of tagging to support the authoring of personalisable learning content

    This research project concerns personalised and adaptable learning, in particular within an e-learning context. Brusilovsky (1996) and Santally (2005) stress the importance of adaptive systems within e-learning. Karagiannikis and Sampson et al. (2004) argue that personalised learning systems can be defined by their capability to adapt automatically to the changing attitudes of the “learning experience” which can, in turn, be defined by the individual learner characteristics, for example the type of learning material. The project evolved to cover areas including personalised learning, e-learning environments, authoring tools, tagging, learning objects, learning theories and learning styles. The main focus at the start of the project was to provide a personalised and adaptable learning environment for students based on their learning style. During the research, this led to a specific interest in how an academic can create, tag and author learning objects to provide personalised, adaptable e-learning for a learner. The research undertaken was designed to gain an understanding of personalised and adaptive learning techniques, e-learning tools and learning styles. Important findings of this research showed that e-learning platforms offer little in the way of a personalised learning experience for a learner. Additionally, the research showed that general adaptive systems and adaptive systems incorporating learning styles are not commonly used or available due to issues with flexibility, reuse and integration. The concept of tagging was investigated during the research, and it was found that tagging is underused within e-learning, although the research shows that it could be a good ‘fit’ within e-learning. This led to the decision to create a general-purpose discriminatory tagging methodology to allow authors to tag learning objects for personalisation and reuse.
    The main focus for the evaluation of this tagging methodology was the authoring side of the tagging. It was found that other research projects have evaluated the personalisation of learning content based on a learner’s learning style (see Graf and Kinshuk (2007)). It was therefore felt that there was a sufficient body of existing evidence in this area, whereas there was limited research available on the authoring side. The evaluation of the discriminatory tagging methodology demonstrated that the methodology could allow for any discrimination between learners to be used. The example demonstrated within this thesis includes discriminating according to a learner’s learning style and accessibility type. This type of platform-independent, flexible discriminatory methodology does not exist within current e-learning platforms or other e-learning systems. The main contribution of this thesis is therefore a platform-independent, general-purpose discriminatory tagging methodology.

    Collaborative Filtering in Social Tagging Systems Based on Joint Item-Tag Recommendations

    Tapping into the wisdom of the crowd, social tagging can be considered an alternative mechanism, as opposed to Web search, for organizing and discovering information on the Web. Effective tag-based recommendation of information items, such as Web resources, is a critical aspect of this social information discovery mechanism. A precise understanding of the information structure of social tagging systems lies at the core of an effective tag-based recommendation method. While most existing research either implicitly or explicitly assumes a simple tripartite graph structure for this purpose, we propose a comprehensive information structure to capture all types of co-occurrence information in the tagging data. Based on the proposed information structure, we further propose a unified user profiling scheme to make full use of all available information. Finally, supported by our proposed user profile, we propose a novel framework for collaborative filtering in social tagging systems. In our proposed framework, we first generate joint item-tag recommendations, with tags indicating topical interests of users in target items. These joint recommendations are then refined by the wisdom of the crowd and projected to the item space for final item recommendations. Evaluation using three real-world datasets shows that our proposed recommendation approach significantly outperformed state-of-the-art approaches.

    Learning morphology with Morfette

    Morfette is a modular, data-driven, probabilistic system which learns to perform joint morphological tagging and lemmatization from morphologically annotated corpora. The system is composed of two learning modules which are trained to predict morphological tags and lemmas using a Maximum Entropy classifier. A third module dynamically combines the predictions of the Maximum Entropy models and outputs a probability distribution over tag-lemma pair sequences. The lemmatization module exploits the idea of recasting lemmatization as a classification task by using class labels which encode mappings from wordforms to lemmas. Experimental evaluation results and error analysis on three morphologically rich languages show that the system achieves high accuracy with no language-specific feature engineering or additional resources.
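
The idea of class labels that encode wordform-to-lemma mappings can be illustrated with a deliberately simplified suffix rule. This is not Morfette's actual label encoding, which the abstract does not specify. The sketch assumes a label of the form (number of characters to strip from the end, suffix to append), and both function names are hypothetical.

```python
def lemma_class(wordform, lemma):
    """Derive a class label that maps `wordform` to `lemma`.

    Simplified stand-in for a real encoding: the label is
    (characters to strip from the end of the wordform, suffix to append).
    """
    # Length of the longest common prefix of wordform and lemma.
    prefix_len = 0
    for a, b in zip(wordform, lemma):
        if a != b:
            break
        prefix_len += 1
    return (len(wordform) - prefix_len, lemma[prefix_len:])


def apply_lemma_class(wordform, label):
    """Apply a (strip, suffix) label to a wordform to obtain its lemma."""
    strip, suffix = label
    return wordform[:len(wordform) - strip] + suffix


# Different wordforms share one label, so a classifier can predict it.
label = lemma_class("cities", "city")
lemma = apply_lemma_class("parties", label)
```

Because many wordforms share the same label, the set of classes stays small, and a classifier trained on such labels can lemmatize wordforms it has never seen.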