4 research outputs found

    #REval: a semantic evaluation framework for hashtag recommendation

    Automatic evaluation of hashtag recommendation models is a fundamental task in many online social network systems. In the traditional evaluation method, the hashtags recommended by an algorithm are first compared with the ground truth hashtags for exact correspondences. The number of exact matches is then used to calculate the hit rate, hit ratio, precision, recall, or F1-score. This way of evaluating hashtag similarity is inadequate because it ignores the semantic correlation between the recommended and ground truth hashtags. To tackle this problem, we propose a novel semantic evaluation framework for hashtag recommendation, called #REval. This framework includes an internal module referred to as BERTag, which automatically learns the hashtag embeddings. Using our proposed #REval-hit-ratio measure, we investigate how the #REval framework performs under different word embedding methods and different numbers of synonyms and hashtags in the recommendation. Our experiments with the proposed framework on three large datasets show that #REval yields more meaningful hashtag synonyms for hashtag recommendation evaluation. Our analysis also highlights the sensitivity of the framework to the word embedding technique, with #REval based on BERTag superior to #REval based on FastText and Word2Vec.
    Comment: 18 pages, 4 figures
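    The traditional exact-match evaluation that the abstract criticizes can be sketched as follows. This is a minimal illustration only, not the #REval measure itself: the function name `exact_match_metrics`, the case-insensitive matching, and the sample hashtags are assumptions made for the example.

```python
def exact_match_metrics(recommended, ground_truth):
    """Score recommended hashtags by exact string match only.

    Assumption for this sketch: hashtags are compared case-insensitively
    and duplicates are ignored.
    """
    rec = {h.lower() for h in recommended}
    truth = {h.lower() for h in ground_truth}
    matches = rec & truth

    precision = len(matches) / len(rec) if rec else 0.0
    recall = len(matches) / len(truth) if truth else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0

    return {
        "hit": 1 if matches else 0,  # 1 if at least one exact match
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# A semantically close recommendation earns no credit under exact matching.
print(exact_match_metrics(["#machinelearning"], ["#ml"]))
print(exact_match_metrics(["#ai", "#machinelearning"], ["#ml", "#ai"]))
```

    In the first call the recommendation is arguably a synonym of the ground truth, yet every score is zero, which is the gap a semantic evaluation is meant to close.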

    Corporate image or social engagement: Twitter discourse on corporate social responsibility (CSR) in public relations strategies in the energy sector

    Social media have opened up new opportunities for the creation of innovative public relations strategies focused on establishing and cultivating relationships with stakeholders on the basis of meaningful dialogue. Consideration of the interrelation between corporate social responsibility (CSR) and public relations highlights new areas for exploration and engagement. Both the dialogical and semantic perspectives reveal the performative and conversational aspects of social media. In general, both the linguistic panorama of CSR and digital media as part of a PR strategy open new possibilities for a dialogical, interactive, meaningful relationship strategy for corporate image management. Based on the linguistic approach to CSR and the Communication Management Approach, this paper explores the linguistic use of Twitter as a primary dialogical strategy to effectively enhance interactive dialogue-based relationships with the stakeholders of the top 50 companies in the energy sector based on tweet data from 2016. Semantic analysis was conducted by advanced text mining and clustering techniques on 3042 tweets monitored in 2017 that contained the leading CSR-related hashtags and keywords. The results demonstrated that the top energy companies apply a defensive and symbolic perspective, mainly for branding purposes. The corporate discourse dominates over a meaningful conversational strategy to foster interaction with stakeholders around sustainability issues on Twitter. The study reveals a homogenized interrelation between CSR, social media, and public relations. The results reveal a tendency for isomorphy in the communication models applied by the companies in the energy sector. Furthermore, similarities in semantics and thus strong tendencies to mutually mimic dialogical strategies are also observed. The semantic narrative built around the brand indicates a limited orientation towards CSR and sustainability. As such, it does not contribute to the creation of a dialogical interaction and meaningful relationships with multiple stakeholders on Twitter, in the high-risk sector represented by the energy industry.

    Cluster Analysis of Time Series Data with Application to Hydrological Events and Serious Illness Conversations

    Cluster analysis explores the underlying structure of data and organizes it into groups (i.e., clusters) such that observations within the same group are more similar than those in different groups. Quantifying the "similarity" between observations, choosing the optimal number of clusters, and interpreting the results all require careful consideration of the research question at hand, the model parameters, and the amount of data and their attributes. In this dissertation, the first manuscript explores the impact of design choices and the variability in clustering performance on different datasets. This is demonstrated through a benchmark study consisting of 128 datasets from the University of California, Riverside time series classification archive. Next, a multivariate event time series clustering approach is applied to hydrological storm events in watershed science. Specifically, river discharge and suspended sediment data from six watersheds in Vermont are clustered, yielding four types of hydrological water quality events to help inform conservation and management efforts. In a second application, a novel and computationally efficient clustering algorithm called SOMTimeS (Self-organizing Map for Time Series) is designed for large time series analysis using dynamic time warping (DTW). The algorithm scales linearly with increasing data, making SOMTimeS, to the best of our knowledge, the fastest DTW-based clustering algorithm to date. For proof of concept, it is applied to conversational features from a Palliative Care Communication Research Initiative study with the goal of understanding and motivating high quality communication in serious illness health care settings.
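    For readers unfamiliar with dynamic time warping, the distance it computes can be sketched with the textbook dynamic-programming recurrence. This is a minimal, unoptimized illustration and not the SOMTimeS algorithm; the function name `dtw_distance`, the absolute-difference local cost, and the sample series are assumptions made for the example.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Assumption for this sketch: both inputs are non-empty 1-D numeric
    sequences, and the local cost is the absolute difference.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays on the same point
                cost[i][j - 1],      # b advances, a stays on the same point
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# The same shape traced at a different pace has zero DTW distance.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

    Because warping lets one point align with several points in the other series, two events with the same shape but different durations can be judged similar, which is why DTW is a common similarity measure when clustering time series.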