The applications of social media in sports marketing
In the era of big data, sports consumers' activities on social media have become valuable assets to sports marketers. In this paper, the authors review the extant literature on how to use social media effectively to promote sports and how to analyze social media data to support business decisions. Methods: Literature review. Results: Our findings suggest that sports marketers can use social media to facilitate marketing communication campaigns, add value to sports products and services, create two-way communication between sports brands and consumers, support sports sponsorship programs, and forge brand communities. As to how to analyze social media data effectively, the extant literature suggests that sports marketers undertake traffic and engagement analysis on their social media sites and conduct sentiment analysis to probe customers' opinions. These insights can support various business decisions, such as marketing communication management, probing of the consumer's voice, and sales prediction. Conclusion: Social media are ubiquitous in sports marketing and consumption practices. In the era of big data, these "footprints" can now be analyzed effectively to generate insights that support business decisions. Recommendations for both sports marketing practice and research are also addressed.
Adaptive Semi-supervised Learning for Cross-domain Sentiment Classification
We consider the cross-domain sentiment classification problem, where a
sentiment classifier is to be learned from a source domain and to be
generalized to a target domain. Our approach explicitly minimizes the distance
between the source and the target instances in an embedded feature space. With
the difference between source and target minimized, we then exploit additional
information from the target domain by consolidating the idea of semi-supervised
learning, for which, we jointly employ two regularizations -- entropy
minimization and self-ensemble bootstrapping -- to incorporate the unlabeled
target data for classifier refinement. Our experimental results demonstrate
that the proposed approach can better leverage unlabeled data from the target
domain and achieve substantial improvements over baseline methods in various
experimental settings. Comment: Accepted to EMNLP 2018
Wikipedias: Collaborative web-based encyclopedias as complex networks
Wikipedia is a popular web-based encyclopedia edited freely and
collaboratively by its users. In this paper we present an analysis of
Wikipedias in several languages as complex networks. The hyperlinks pointing
from one Wikipedia article to another are treated as directed links while the
articles represent the nodes of the network. We show that many network
characteristics are common to different language versions of Wikipedia, such as
their degree distributions, growth, topology, reciprocity, clustering,
assortativity, path lengths and triad significance profiles. These
regularities, found in the ensemble of Wikipedias in different languages and of
different sizes, point to the existence of a unique growth process. We also
compare Wikipedias to other previously studied networks. Comment: v3: 9 pages, 12 figures; change of title, a few paragraphs, and two figures. Accepted for publication in Phys. Rev. E
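To make the network construction concrete, here is a toy sketch (plain Python; the article names are hypothetical) of two of the quantities studied, out-degree distribution and link reciprocity, on a directed hyperlink graph:

```python
from collections import Counter

# Hypothetical hyperlinks: (source article, target article).
links = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "B"), ("B", "C")]

def degree_distribution(edges):
    """Fraction of nodes having each out-degree."""
    out = Counter(src for src, _ in edges)
    nodes = {n for e in edges for n in e}
    counts = Counter(out.get(n, 0) for n in nodes)
    return {k: v / len(nodes) for k, v in counts.items()}

def reciprocity(edges):
    """Fraction of directed links whose reverse link also exists."""
    edge_set = set(edges)
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)
```

On the toy graph above, A→B/B→A and B→C/C→B are mutual while A→C is not, so four of the five links are reciprocated.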
Inferring Networks of Substitutable and Complementary Products
In a modern recommender system, it is important to understand how products
relate to each other. For example, while a user is looking for mobile phones,
it might make sense to recommend other phones, but once they buy a phone, we
might instead want to recommend batteries, cases, or chargers. These two types
of recommendations are referred to as substitutes and complements: substitutes
are products that can be purchased instead of each other, while complements are
products that can be purchased in addition to each other.
Here we develop a method to infer networks of substitutable and complementary
products. We formulate this as a supervised link prediction task, where we
learn the semantics of substitutes and complements from data associated with
products. The primary source of data we use is the text of product reviews,
though our method also makes use of features such as ratings, specifications,
prices, and brands. Methodologically, we build topic models that are trained to
automatically discover topics from text that are successful at predicting and
explaining such relationships. Experimentally, we evaluate our system on the
Amazon product catalog, a large dataset consisting of 9 million products, 237
million links, and 144 million reviews. Comment: 12 pages, 6 figures
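The supervised link-prediction formulation can be sketched roughly as a binary classifier over pairwise features of the products' topic vectors (a simplified illustration using an element-wise product feature map; the paper's actual features and training are more elaborate, and all vectors and weights below are hypothetical):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def link_score(topic_i, topic_j, w, b=0.0):
    """Probability-like score that a candidate link (e.g. 'substitute')
    holds between two products, from their learned topic vectors."""
    features = topic_i * topic_j  # symmetric pairwise features
    return float(sigmoid(features @ w + b))

# Toy topic vectors: two phones share topics; a charger does not.
phone_a = np.array([0.9, 0.1])
phone_b = np.array([0.8, 0.2])
charger = np.array([0.1, 0.9])
w = np.array([4.0, 4.0])
```

Here two phones, whose topic vectors overlap, would score higher as substitutes than a phone paired with a charger.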
Deep Emotions Across Languages: A Novel Approach for Sentiment Propagation in Multilingual WordNets
WordNets enriched with emotional metadata are valuable resources for sentiment analysis. However, manual annotation is time-consuming and expensive, so only a few WordNet lexical units have been annotated. This
paper introduces two new techniques for automatically propagating sentiment
annotations from a partially annotated WordNet to its entirety and to a WordNet
in a different language: Multilingual Structured Synset Embeddings (MSSE) and
Cross-Lingual Deep Neural Sentiment Propagation (CLDNS). We evaluated the
proposed MSSE+CLDNS method extensively using Princeton WordNet and Polish
WordNet, which have many inter-lingual relations. Our results show that the
MSSE+CLDNS method outperforms existing propagation methods, indicating its
effectiveness in enriching WordNets with emotional metadata across multiple
languages. This work provides a solid foundation for large-scale, multilingual
sentiment analysis and is valuable for academic research and practical
applications. Comment: 6 pages, 1 figure, presented at ICDM Workshop: SENTIRE 202
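For intuition, a much simpler label-propagation baseline over a synset graph (plain Python; this is not the MSSE+CLDNS method, only a naive sketch of what propagating annotations from a partially annotated WordNet means) could look like:

```python
from collections import defaultdict

def propagate_sentiment(edges, seeds, iterations=20):
    """Naive sentiment propagation over an undirected synset graph.

    Each unannotated node takes the mean polarity of its already-scored
    neighbours; seed annotations stay fixed throughout.
    """
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    scores = dict(seeds)
    for _ in range(iterations):
        new = dict(scores)
        for node in nbrs:
            if node in seeds:
                continue
            vals = [scores[n] for n in nbrs[node] if n in scores]
            if vals:
                new[node] = sum(vals) / len(vals)
        scores = new
    return scores
```

On a chain a(+1) — b — c(−1), the unannotated middle node converges to a neutral score of 0.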
Efficient Language Adaptive Pre-training: Extending State-of-the-Art Large Language Models for Polish
This study explores the potential of fine-tuning foundational English Large
Language Models (LLMs) for generating Polish text. The first step involves
Language Adaptive Pre-training (LAPT) on a high-quality dataset of 3.11 GB,
consisting of 276 million Polish tokens. The LAPT is followed by additional
fine-tuning aimed at solving nine KLEJ challenges. Our trained model
Curie-7B-v1 not only generates Polish text with the lowest perplexity of 3.02
among decoder-based Polish models but also closely rivals the performance of
the best Polish encoder-decoder models with a less than 2% gap on 8 out of 9
tasks. Curie-7B-v1 used approximately 2-3% of a typical dataset size to learn
Polish. The LAPT was completed in less than five days using a consumer GPU,
highlighting the method's efficiency. The proficiency of the model in Polish
was significantly enhanced, demonstrating the viability of this approach for
adding new languages to existing LLMs by training just 1.2% of its parameters.
To contribute to the community's collaborative progress, the model has been
released as open-source. Comment: 10 pages
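For reference, perplexity, the metric on which Curie-7B-v1 scores 3.02, is simply the exponential of the mean per-token negative log-likelihood; a minimal sketch:

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token).

    `token_nlls` is a list of per-token negative log-likelihoods
    produced by a language model on held-out text.
    """
    return math.exp(sum(token_nlls) / len(token_nlls))
```

Lower values indicate that the model assigns higher probability to the observed text; a perplexity of 3.02 means the model is, on average, as uncertain as a uniform choice among about three tokens.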
The stylometry of film dialogue : pros and pitfalls
We examine film dialogue with quantitative textual analysis (stylometry, sentiment analysis, distant reading). Working with transcribed dialogue from almost 300 productions, we explore the complex ways in which most-frequent-words-based stylometry and lexicon-based sentiment analysis produce patterns of similarity and difference between screenwriters and/or a priori IMDB-defined genres. In fact, some of our results show that counting and comparing very frequent word lists reveals further similarities: of theme, implied audience, and stylistic patterning. The results are encouraging enough to suggest that such a quantitative approach to film dialogue may become a welcome addition to the arsenal of film studies methodology.
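A bare-bones version of most-frequent-words stylometry, building relative-frequency profiles over a fixed function-word list and comparing them with cosine distance, might look like this (a toy sketch, not the authors' pipeline; Burrows' Delta is the more common choice of distance in stylometry):

```python
from collections import Counter
import math

def mfw_profile(text, vocab):
    """Relative frequencies of the chosen most-frequent words."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[w] / total for w in vocab]

def cosine_distance(a, b):
    """1 - cosine similarity; 1.0 when either profile is empty."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb) if na and nb else 1.0
```

Two dialogue transcripts with similar function-word usage yield profiles that are close under this distance, which is the signal such stylometry exploits.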
Cross-Lingual and Low-Resource Sentiment Analysis
Identifying sentiment in a low-resource language is essential for understanding opinions internationally and for responding to the urgent needs of locals affected by disaster incidents in different world regions. While tools and resources for recognizing sentiment in high-resource languages are plentiful, determining the most effective methods for achieving this task in a low-resource language which lacks annotated data is still an open research question. Most existing approaches for cross-lingual sentiment analysis to date have relied on high-resource machine translation systems, large amounts of parallel data, or resources only available for Indo-European languages.
This work presents methods, resources, and strategies for identifying sentiment cross-lingually in a low-resource language. We introduce a cross-lingual sentiment model which can be trained on a high-resource language and applied directly to a low-resource language. The model offers the feature of lexicalizing the training data using a bilingual dictionary, but can perform well without any translation into the target language.
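The lexicalization step described above, replacing training tokens with their bilingual-dictionary translations where an entry exists, can be sketched as follows (a toy illustration; the dictionary and tokens are hypothetical):

```python
def lexicalize(tokens, bilingual_dict):
    """Replace source-language tokens with target-language translations
    where the bilingual dictionary has an entry; keep the rest as-is."""
    return [bilingual_dict.get(t, t) for t in tokens]
```

Out-of-dictionary tokens pass through unchanged, which is why the model can still perform without any translation into the target language.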
Through an extensive experimental analysis, evaluated on 17 target languages, we show that the model performs well with bilingual word vectors pre-trained on an appropriate translation corpus. We compare in-genre and in-domain parallel corpora, out-of-domain parallel corpora, in-domain comparable corpora, and monolingual corpora, and show that a relatively small, in-domain parallel corpus works best as a transfer medium if it is available. We describe the conditions under which other resources and embedding generation methods are successful, and these include our strategies for leveraging in-domain comparable corpora for cross-lingual sentiment analysis.
To enhance the ability of the cross-lingual model to identify sentiment in the target language, we present new feature representations for sentiment analysis that are incorporated in the cross-lingual model: bilingual sentiment embeddings that are used to create bilingual sentiment scores, and a method for updating the sentiment embeddings during training by lexicalization of the target language. This feature configuration works best for the largest number of target languages in both untargeted and targeted cross-lingual sentiment experiments.
The cross-lingual model is studied further by evaluating the role of the source language, which has traditionally been assumed to be English. We build cross-lingual models using 15 source languages, including two non-European and non-Indo-European source languages: Arabic and Chinese. We show that language families play an important role in the performance of the model, as does the morphological complexity of the source language.
In the last part of the work, we focus on sentiment analysis towards targets. We study Arabic as a representative morphologically complex language and develop models and morphological representation features for identifying entity targets and sentiment expressed towards them in Arabic open-domain text. Finally, we adapt our cross-lingual sentiment models for the detection of sentiment towards targets. Through cross-lingual experiments on Arabic and English, we demonstrate that our findings regarding resources, features, and language also hold true for the transfer of targeted sentiment.