A reliability-based approach for influence maximization using the evidence theory
Influence maximization is the problem of finding a set of social network
users, called influencers, that can trigger a large cascade of propagation.
Influencers are very useful for making a marketing campaign go viral through
social networks, for example. In this paper, we propose an influence measure
that combines many influence indicators. Besides, we consider the reliability
of each influence indicator and we present a distance-based process that
estimates the reliability of each indicator. The proposed measure is defined
under the framework of the theory of belief functions. Furthermore, the
reliability-based influence measure is used with an influence maximization
model to select a set of users that are able to maximize the influence in the
network. Finally, we present a set of experiments on a dataset collected from
Twitter. These experiments show the performance of the proposed solution in
detecting social influencers with good quality.
Comment: 14 pages, 8 figures, DaWak 2017 conference
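To illustrate how indicator reliability can enter a belief-function measure, here is a minimal Python sketch. It assumes classical Shafer discounting followed by Dempster's rule of combination; the abstract does not say which operators the authors use, and the indicator names, mass values, and reliability coefficients below are invented for illustration.

```python
# Hypothetical sketch: reliability discounting of two influence indicators,
# then fusion with Dempster's rule. Not the authors' implementation.

FRAME = frozenset({"influencer", "passive"})  # assumed frame of discernment


def discount(mass, reliability, frame=FRAME):
    """Shafer discounting: keep a share `reliability` of each focal mass
    and transfer the remaining 1 - reliability to the whole frame."""
    out = {}
    for focal, m in mass.items():
        key = frozenset(focal)
        out[key] = out.get(key, 0.0) + reliability * m
    out[frame] = out.get(frame, 0.0) + (1.0 - reliability)
    return out


def combine(m1, m2):
    """Dempster's rule: conjunctive combination, normalized by 1 - conflict."""
    joint = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                joint[inter] = joint.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    return {k: v / (1.0 - conflict) for k, v in joint.items()}


# Invented mass functions for two indicators describing one user.
followers = {frozenset({"influencer"}): 0.8, FRAME: 0.2}
retweets = {frozenset({"passive"}): 0.6, FRAME: 0.4}

# Invented reliabilities; in the paper these come from a distance-based process.
fused = combine(discount(followers, 0.9), discount(retweets, 0.5))

for focal, m in fused.items():
    print(sorted(focal), round(m, 3))
```

A less reliable indicator pushes more of its mass onto the whole frame, so it weighs less in the fused result.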
Classification of Message Spreading in a Heterogeneous Social Network
Nowadays, social networks such as Twitter, Facebook and LinkedIn have become
increasingly popular. They have introduced new habits and new ways of
communication, and every day they collect a large amount of information from
different sources. Most existing research works focus on the analysis of
homogeneous social networks, i.e., networks with a single type of node and
link. However, in the real world, social networks offer several types of
nodes and links. Hence, in order to preserve as much information as
possible, it is important to consider social networks as heterogeneous and
uncertain. The goal of our paper is to classify social messages based on their
spreading in the network and the theory of belief functions. The proposed
classifier interprets the spread of messages on the network, crossed paths and
types of links. We tested our classifier on a real-world network that we
collected from Twitter, and our experiments show the performance of our belief
classifier.
Maximizing positive opinion influence using an evidential approach
In this paper, we propose a new data-based model for influence maximization in online social networks. We use the theory of belief functions to overcome the data imperfection problem. In addition, the proposed model seeks to detect influencers who hold a positive opinion about the product, idea, etc., to be propagated. Moreover, we present some experiments to show the performance of our model.
Median evidential c-means algorithm and its application to community detection
Median clustering is of great value for partitioning relational data. In this
paper, a new prototype-based clustering method, called Median Evidential
C-Means (MECM), which is an extension of median c-means and median fuzzy
c-means on the theoretical framework of belief functions is proposed. The
median variant relaxes the restriction of a metric space embedding for the
objects but constrains the prototypes to be in the original data set. Due to
these properties, MECM could be applied to graph clustering problems. A
community detection scheme for social networks based on MECM is investigated
and the obtained credal partitions of graphs, which are more refined than crisp
and fuzzy ones, enable us to have a better understanding of the graph
structures. An initial prototype-selection scheme based on evidential
semi-centrality is presented to avoid local premature convergence and an
evidential modularity function is defined to choose the optimal number of
communities. Finally, experiments on synthetic and real data sets illustrate
the performance of MECM and show how it differs from other methods.
Diffusion of Lexical Change in Social Media
Computer-mediated communication is driving fundamental changes in the nature
of written language. We investigate these changes by statistical analysis of a
dataset comprising 107 million Twitter messages (authored by 2.7 million unique
user accounts). Using a latent vector autoregressive model to aggregate across
thousands of words, we identify high-level patterns in diffusion of linguistic
change over the United States. Our model is robust to unpredictable changes in
Twitter's sampling rate, and provides a probabilistic characterization of the
relationship of macro-scale linguistic influence to a set of demographic and
geographic predictors. The results of this analysis offer support for prior
arguments that focus on geographical proximity and population size. However,
demographic similarity -- especially with regard to race -- plays an even more
central role, as cities with similar racial demographics are far more likely to
share linguistic influence. Rather than moving towards a single unified
"netspeak" dialect, language evolution in computer-mediated communication
reproduces existing fault lines in spoken American English.
Comment: preprint of PLOS-ONE paper from November 2014; PLoS ONE 9(11) e11311
A multicriteria optimization framework for the definition of the spatial granularity of urban social media analytics
The spatial analysis of social media data has recently emerged as a significant source of knowledge for urban studies. Most of these analyses are based on an areal unit that is chosen without the support of clear criteria to ensure representativeness with regard to an observed phenomenon. However, the results and conclusions that can be drawn from a social media analysis depend to a great extent on the areal unit chosen, since they are subject to the well-known Modifiable Areal Unit Problem. To address this problem, this article adopts a data-driven approach to determine the most suitable areal unit for the analysis of social media data. Our multicriteria optimization framework relies on Pareto optimality to assess candidate areal units based on a set of user-defined criteria. We examine a case study that investigates rainfall-related tweets and determines the areal units that optimize spatial autocorrelation patterns through the combined use of indicators of global spatial autocorrelation and the variance of local spatial autocorrelation. The results show that the optimal areal units (30 km² and 50 km²) provide more consistent spatial patterns than the other areal units and are thus likely to produce more reliable analytical results.
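As an illustration of how Pareto optimality can rank candidate areal units, the following Python sketch keeps only the candidates that no other candidate dominates. The candidate names and scores are made up, and the sketch assumes every criterion is oriented so that higher is better; the abstract does not give the actual criteria values or their orientation.

```python
# Hypothetical sketch of Pareto-front selection over candidate areal units.
# Scores are invented; each tuple holds one value per user-defined criterion,
# all assumed to be "higher is better".


def dominates(a, b):
    """True if `a` is at least as good as `b` on every criterion
    and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the candidates that are not dominated by any other candidate."""
    return {
        name: score
        for name, score in candidates.items()
        if not any(
            dominates(other, score)
            for other_name, other in candidates.items()
            if other_name != name
        )
    }


candidates = {
    "unit_a": (0.20, 0.30),
    "unit_b": (0.45, 0.50),
    "unit_c": (0.50, 0.40),
    "unit_d": (0.30, 0.35),
}

front = pareto_front(candidates)
print(sorted(front))
```

Several candidates can survive at once, because two candidates that each win on a different criterion do not dominate one another.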
Effects of Investor Sentiment Using Social Media on Corporate Financial Distress
The mainstream quantitative models in the finance literature were ineffective in detecting possible bankruptcies during the 2007 to 2009 financial crisis. During the same period, various researchers suggested that sentiments in social media can predict future events. The purpose of the study was to examine the relationship between investor sentiment within social media and the financial distress of firms. Grounded in the social amplification of risk framework, which shows the media as an amplifying channel for risk events, the central hypothesis of the study was that investor sentiments in social media could predict the level of financial distress of firms. Third quarter 2014 financial data and 66,038 public postings on the social media website Twitter were collected for 5,787 publicly held firms in the United States for this study. The Spearman rank correlation was applied, using the Altman Z-Score to measure financial distress levels in corporate firms and the Stanford natural language processing algorithm to detect sentiment levels in social media. The findings from the study suggested a non-significant relationship between investor sentiments in social media and corporate financial distress and, hence, did not support the research hypothesis. However, the model developed in this study for analyzing investor sentiments and corporate distress in firms is both original and extensible for future research and is also accessible as a low-cost solution for financial market sentiment analysis.
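The statistical step described above can be sketched in plain Python. This sketch assumes the original 1968 Altman Z-Score for publicly traded manufacturing firms; the abstract does not state which variant was used. The financial ratios and sentiment scores are invented, and the helper names are hypothetical.

```python
# Hypothetical sketch: Altman Z-Score per firm, then Spearman rank correlation
# between Z-Scores and sentiment scores. All input numbers are invented.


def altman_z(wc_ta, re_ta, ebit_ta, mve_tl, sales_ta):
    """Original Altman Z-Score (assumed variant).
    Inputs are ratios: working capital / total assets, retained earnings /
    total assets, EBIT / total assets, market value of equity / total
    liabilities, and sales / total assets."""
    return 1.2 * wc_ta + 1.4 * re_ta + 3.3 * ebit_ta + 0.6 * mve_tl + 1.0 * sales_ta


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Invented ratios for four firms and invented sentiment scores.
firms = [
    (0.10, 0.20, 0.10, 1.00, 1.00),
    (0.05, 0.10, 0.02, 0.40, 0.80),
    (0.30, 0.40, 0.15, 2.00, 1.20),
    (-0.10, -0.05, -0.03, 0.20, 0.50),
]
z_scores = [altman_z(*ratios) for ratios in firms]
sentiment = [0.1, 0.4, -0.2, 0.3]

print(round(spearman(z_scores, sentiment), 3))
```

In practice a library routine that also reports a p-value would be used to judge significance; the hand-written version only shows what the coefficient measures.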
Data-driven Computational Social Science: A Survey
Social science concerns issues of individuals, relationships, and society as
a whole. The complexity of research topics in social science makes it an
amalgamation of multiple disciplines, such as economics, political science, and
sociology. For centuries, scientists have conducted many studies to
understand the mechanisms of society. However, due to the limitations of
traditional research methods, many critical social issues remain to be
explored. To address those issues, computational social science has emerged,
driven by rapid advances in computing technologies and by in-depth studies in
social science. With the aid of advanced research techniques, various
kinds of data from diverse areas can now be acquired, and they can help us
look into social problems from a new perspective. As a result, using various
data to reveal issues in the computational social science area has attracted
more and more attention. In this paper, we present what is, to the best of our
knowledge, the first survey on data-driven computational social science,
which primarily focuses on reviewing application domains involving human
dynamics. The state-of-the-art research on human dynamics is reviewed from
three aspects: individuals, relationships, and collectives. Specifically, the
research methodologies used to address research challenges in the
aforementioned application domains are summarized. In addition, some important
open challenges with respect to both emerging research topics and research
methods are discussed.
Comment: 28 pages, 8 figures