1,846 research outputs found

    Clustering Arabic Tweets for Sentiment Analysis

    This study evaluates the impact of linguistic preprocessing and similarity functions on clustering Arabic tweets. The experiments apply an optimized version of the standard K-Means algorithm to assign tweets to positive and negative categories. The results show that root-based stemming has a significant advantage over light stemming in all settings. The Averaged Kullback-Leibler Divergence similarity function clearly outperforms the Cosine, Pearson Correlation, Jaccard Coefficient and Euclidean functions. The combination of the Averaged Kullback-Leibler Divergence and root-based stemming achieved the highest purity of 0.764, while the second-best purity was 0.719. These results are important because they run contrary to findings for normal-sized documents, where, in many information retrieval applications, light stemming performs better than root-based stemming and the Cosine function is commonly used.
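    As a rough illustration of the two quantities this abstract reports, the sketch below computes cluster purity and an averaged Kullback-Leibler divergence between two term-weight vectors. The purity definition is the standard one (the share of items carrying the majority label of their cluster). The divergence follows one formulation found in the text-clustering literature and is an assumption here, since the abstract does not give the paper's exact definition. The function names and toy inputs are hypothetical.

```python
import math
from collections import Counter


def purity(cluster_ids, true_labels):
    """Fraction of items that carry the majority label of their cluster."""
    by_cluster = {}
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)


def averaged_kl(p, q):
    """Averaged KL divergence between two non-negative term-weight vectors.

    Assumed formulation (not confirmed by the abstract): for each term,
    weight the two one-sided divergences by each vector's share of that term.
    """
    total = 0.0
    for a, b in zip(p, q):
        if a + b == 0:
            continue  # term absent from both vectors contributes nothing
        pi1, pi2 = a / (a + b), b / (a + b)
        mean = pi1 * a + pi2 * b
        if a > 0:
            total += pi1 * a * math.log(a / mean)
        if b > 0:
            total += pi2 * b * math.log(b / mean)
    return total


# Toy example: five tweets, two clusters, gold sentiment labels.
clusters = [0, 0, 0, 1, 1]
labels = ["pos", "pos", "neg", "neg", "neg"]
print(purity(clusters, labels))
print(averaged_kl([0.5, 0.3, 0.2], [0.2, 0.3, 0.5]))
```

    Unlike the plain KL divergence, this averaged form is symmetric in its two arguments and is zero for identical vectors, which is what lets it serve as a distance inside K-Means.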

    Encouraging Inactive Users towards Effective Recommendation

    Disagreement amongst users in a social network might occur when some of them have different opinions or preferences towards certain items (e.g. topics). Some users might have dynamic preferences due to certain situations. Given these differences in opinion, some users might become less active or inactive in providing the opinions on items that recommendation processes need to be possible or effective. This leads to a cold-start problem in which the recommender system cannot find accurate preference information about the users and so cannot recommend new items to them. It is also difficult to identify these inactive or less-active users within a group so that items can be recommended effectively. Several researchers have attempted to reduce the cold-start problem using the singular value decomposition (SVD) algorithm, but the disagreement problem amongst users still occurs because of the users' dynamic preferences towards items. This thesis hypothesized that influence-based preference modelling could resolve the disagreement problem. It is possible to encourage less-active or inactive users to become active only if they have been identified with a group of their trustworthy neighbours. A suitable clustering technique that does not require pre-specified parameters (e.g. the number of clusters or the number of cluster members) was needed to accurately identify trustworthy users with groups (i.e. clusters) and to identify exemplars (i.e. cluster representatives) from each group. Several existing clustering techniques, such as highly connected subgraphs (HCS), Markov clustering and Affinity Propagation (AP) clustering, were explored in this thesis to check whether they can produce these required outputs. The technique among these that is able to identify exemplars in each cluster could be validated using pattern information of past social activities, estimated trust values or familiarity values. The proposed method for estimating these values was based on psychological theories such as the theory of interpersonal behaviour (TIB) and rational choice theory, as it was necessary to predict the trustworthiness behaviour of social users. The thesis also reveals that users with high trust values (i.e. trustworthy users) are not necessarily exemplars of their clusters, but they are more likely to encourage less-active users to accept recommended items preferred by the exemplar of their respective cluster.

    User modeling for exploratory search on the Social Web. Exploiting social bookmarking systems for user model extraction, evaluation and integration

    Exploratory search is an information seeking strategy that extends beyond the query-and-response paradigm of traditional Information Retrieval models. Users browse through information to discover novel content and to learn more about the newly discovered things. Social bookmarking systems integrate well with exploratory search, because they allow one to search, browse, and filter social bookmarks. Our contribution is an exploratory tag search engine that merges social bookmarking with exploratory search. For this purpose, we have applied collaborative filtering to recommend tags to users. User models are an important prerequisite for recommender systems. We have produced a method to algorithmically extract user models from folksonomies, and an evaluation method to measure the viability of these user models for exploratory search. According to our evaluation, web-scale user modeling, which integrates user models from various services across the Social Web, can improve exploratory search. Within this thesis we also provide a method for user model integration. Our exploratory tag search engine implements the findings of our user model extraction, evaluation, and integration methods. It facilitates exploratory search on social bookmarks from Delicious and Connotea and publishes extracted user models as Linked Data.

    Lightweight Adaptation of Classifiers to Users and Contexts: Trends of the Emerging Domain

    Intelligent computer applications need to adapt their behaviour to contexts and users, but conventional classifier adaptation methods require long data collection and/or training times. Classifier adaptation is therefore often performed as follows: at design time, application developers define typical usage contexts and provide reasoning models for each of these contexts; at runtime, an appropriate model is selected from the available ones. Typically, the definition of usage contexts and reasoning models relies heavily on domain knowledge. In practice, however, many applications are used in such diverse situations that no developer can predict them all and collect adequate training and test databases for each situation. Such applications have to adapt to a new user or unknown context at runtime just from interaction with the user, preferably in fairly lightweight ways, that is, requiring limited user effort to collect training data and limited time to perform the adaptation. This paper analyses adaptation trends in several emerging domains and outlines promising ideas proposed for making multimodal classifiers user-specific and context-specific without significant user effort, detailed domain knowledge, and/or complete retraining of the classifiers. Based on this analysis, this paper identifies important application characteristics and presents guidelines for considering these characteristics in adaptation design.