22 research outputs found

    Two-layer classification and distinguished representations of users and documents for grouping and authorship identification

    Most studies on authorship identification report a drop in identification performance when the number of authors exceeds 20-25. In this paper, we introduce a new user representation to address this problem and split classification across two layers. The paper makes at least three novel contributions. First, the two-layer approach allows authorship identification to be applied over a larger number of authors (tested over 100 authors), and it is extendable. The authors are divided into groups that each contain a smaller number of authors. Given an anonymous document, the primary layer detects the group to which the document belongs. Then, the secondary layer determines the particular author inside the selected group. To extract groups that link similar authors, clustering is applied over users rather than documents. Hence, the second novelty of this paper is a new user representation that differs from the document representation. Without the proposed user representation, clustering over documents scatters each author's documents across several clusters instead of assigning each author to a single cluster. Third, the extracted clusters describe their users meaningfully, because the dimensions have a psychological basis. For authorship identification, the documents are labelled with the extracted groups and fed into machine learning to build classification models that predict the group and author of a given document. The results show that the documents are highly correlated with their corresponding extracted groups, and that the proposed model can be accurately trained to determine the group and the author identity.
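
The two-layer routing described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the classifiers or the clustering method, so a nearest-centroid rule stands in for both layers, the author-to-group assignment is supplied by hand instead of being produced by clustering over user representations, and all names (`train`, `predict`, `author_group`, the toy feature vectors) are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest(label_to_centroid, x):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(label_to_centroid, key=lambda lab: math.dist(label_to_centroid[lab], x))


def train(docs, author_group):
    """Build both layers from (feature_vector, author) pairs.

    author_group maps each author to a group; in the paper this mapping
    comes from clustering users, here it is assumed to be given.
    """
    by_author, by_group = {}, {}
    for vec, author in docs:
        by_author.setdefault(author, []).append(vec)
        by_group.setdefault(author_group[author], []).append(vec)
    # Primary layer: one centroid per group.
    group_centroids = {g: centroid(v) for g, v in by_group.items()}
    # Secondary layer: per group, one centroid per author in that group.
    author_centroids = {}
    for author, vecs in by_author.items():
        author_centroids.setdefault(author_group[author], {})[author] = centroid(vecs)
    return group_centroids, author_centroids


def predict(model, x):
    """Route a document to a group first, then to an author inside that group."""
    group_centroids, author_centroids = model
    group = nearest(group_centroids, x)
    author = nearest(author_centroids[group], x)
    return group, author


# Toy data: four hypothetical authors in two hand-assigned groups.
author_group = {"ann": "g1", "bob": "g1", "cat": "g2", "dan": "g2"}
docs = [
    ([0.0, 0.0], "ann"), ([0.0, 1.0], "ann"),
    ([2.0, 0.0], "bob"), ([2.0, 1.0], "bob"),
    ([10.0, 10.0], "cat"), ([10.0, 11.0], "cat"),
    ([12.0, 10.0], "dan"), ([12.0, 11.0], "dan"),
]
model = train(docs, author_group)
```

Because the secondary layer only compares authors inside the selected group, each classifier faces far fewer candidates than a single flat classifier over all authors would.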

    Sentiment analysis tools should take account of the number of exclamation marks!!!

    Various factors affect the sentiment level expressed in textual comments. Capitalization of letters tends to mark something for attention, and repetition of letters tends to strengthen the emotion. Emoticons are used to help visualize facial expressions, which can affect understanding of text. In this paper, we show the effect of the number of exclamation marks used, by testing twelve online sentiment tools. We present opinions gathered from 500 respondents on “like” and “dislike” values, with a varying number of exclamation marks. Results show that only 20% of the online sentiment tools tested considered the number of exclamation marks in their returned scores. However, results from our human raters show that the more exclamation marks are used in positive comments, the higher their “like” values are compared with the same comments with fewer exclamation marks. Similarly, adding more exclamation marks to negative comments results in a higher “dislike”

    A topic model for building fine-grained domain-specific emotion lexicon

    Emotion lexicons play a crucial role in sentiment analysis and opinion mining. In this paper, we propose a novel Emotion-aware LDA (EaLDA) model to build a domain-specific lexicon for predefined emotions that include anger, disgust, fear, joy, sadness, and surprise. The model uses a minimal set of domain-independent seed words as prior knowledge to discover a domain-specific lexicon, learning a fine-grained emotion lexicon that is much richer and more adaptive to a specific domain. Through comprehensive experiments, we show that our model can generate a high-quality fine-grained domain-specific emotion lexicon. © 2014 Association for Computational Linguistics.

    Consistency of online consumers' perceptions of posted comments: An analysis of TripAdvisor reviews

    Ratings and comments play a dominant role in online reviews. The question, thus, arises as to whether or not there is any consistency in consumer perception of the reviews, and how future choices might be influenced. We analysed 2000 comments of 20 different hotels posted on TripAdvisor to determine if the comments posted by previous guests of a hotel influence the decisions of potential guests. Two hundred human raters were asked to consider 20 reviews and to rate a hotel based on the reviews. The Cohen Kappa coefficient was used to evaluate the degree of agreement on the hotel quality as determined by the human raters and the star rating given by the original reviewer. The results showed a high consistency between the human raters’ evaluation and the reviewers’ star rating. This research reveals the importance of website feedback such as TripAdvisor in influencing consumer choice.
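
The agreement measure named in the abstract above, Cohen's kappa, compares observed agreement with the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a minimal unweighted version for two raters; the abstract does not say whether a weighted variant was used, and the star ratings shown are made-up illustrative values, not data from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # p_o: fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # p_e: chance agreement from each rater's marginal label frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical star ratings for six hotels (illustrative values only).
human_rater = [5, 4, 4, 3, 5, 2]
original_reviewer = [5, 4, 3, 3, 5, 2]
kappa = cohen_kappa(human_rater, original_reviewer)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which is why kappa is preferred over raw percentage agreement when some star ratings are much more common than others.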

    A framework for handling the affective lexicon in texts made available in a virtual learning environment

    Virtual teaching and learning environments considered "affectively sensitive" should be able to identify affective aspects of the participants as they interact. Most systems use affective computing technologies to capture emotions from physiological, behavioral, and gestural information and from speech modulation. Techniques for subjectivity analysis have been developed to classify emotional content in texts. This article presents the AWM (Affect Word Mining) framework for mining words with affective connotation and examines preliminary results on the recognition of affectivity in texts recorded in the Forum of the ROODA virtual learning environment (AVA).

    How to Identify Tomorrow's Most Active Social Commerce Contributors? Inviting Starlets to the Reviewer Hall of Fame

    Social commerce contributors share their experiences of products and services, which is appreciated by consumers and online retailers. Since such user-generated content is especially valuable for online retailers, they incentivize the most active contributors to provide further product reviews. Our paper explores which user characteristics can be used to identify contributors of valuable content. This is especially relevant for newly registered users who have not yet contributed extensively. Drawing upon the literature on social information processing, signaling and communication theory, we explore how individual user characteristics published in personal user profiles are associated with actual contribution activity. To this end, we analyze more than 30,000 user profiles from amazon.com. We find that information disclosure, emotiveness and problem-orientation are related to contribution activity. Consequently, our results advance the understanding of who the most active contributors are and provide new implications for theory and practice.

    Common Emotion Modeling in Distinct Medium Analysis and Matching

    With the ever-growing amount of digital information and multimedia on the World Wide Web and the current trend towards personalizing technology, users find themselves wanting a more intuitive way of finding related information, and not just any information but relevant information that is personal to them. One way to personalize and filter the information is by extracting the mood affectation, allowing the user to search based on current mood. The artificial intelligence field has done extensive research and continues to discover and improve mood extraction techniques for each distinct medium. This paper explores how to link and integrate the mood extraction of several distinct mediums (audio, image, and text) by utilizing a common emotion model that is customizable to the user. This project will allow the user to provide an input medium and find a matching output of a different medium based on default settings or user customization.

    How epidemic psychology works on Twitter: evolution of responses to the COVID-19 pandemic in the U.S.

    Disruptions resulting from an epidemic might often appear to amount to chaos but, in reality, can be understood in a systematic way through the lens of "epidemic psychology". According to Philip Strong, the founder of the sociological study of epidemic infectious diseases, not only is an epidemic biological; there is also the potential for three psycho-social epidemics: of fear, moralization, and action. This work empirically tests Strong's model at scale by studying the use of language in 122M tweets related to the COVID-19 pandemic posted in the U.S. during the whole year of 2020. On Twitter, we identified three distinct phases, each characterized by different regimes of the three psycho-social epidemics. In the refusal phase, users refused to accept reality despite the increasing number of deaths in other countries. In the anger phase (which started after the announcement of the first death in the country), users' fear translated into anger about the looming feeling that things were about to change. Finally, in the acceptance phase, which began after the authorities imposed physical-distancing measures, users settled into a "new normal" for their daily activities. Overall, refusal to accept reality gradually died off as the year went on, while acceptance increasingly took hold. During 2020, as cases surged in waves, so did anger, re-emerging cyclically at each wave. Our real-time operationalization of Strong's model is designed in a way that makes it possible to embed epidemic psychology into real-time models (e.g., epidemiological and mobility models). Comment: Humanities and Social Sciences Communications. 24 pages, 7 figures, 4 tables

    Linguistic markers of secrets and sensitive self-disclosure in Twitter
