Can We 'Feel' the Temperature of Knowledge? Modelling Scientific Popularity Dynamics via Thermodynamics
Just like everything in nature, scientific topics flourish and perish.
While the existing literature captures an article's life cycle well via citation
patterns, little is known about how scientific popularity and impact evolve
for a specific topic. It would be most intuitive if we could 'feel' a topic's
activity just as we perceive the weather by temperature. Here, we conceive
knowledge temperature to quantify a topic's overall popularity and impact through
citation network dynamics. Knowledge temperature has two parts. One part
depicts lasting impact by assessing knowledge accumulation with an analogy
between topic evolution and isobaric expansion. The other part gauges temporal
changes in knowledge structure, an embodiment of short-term popularity, through
the rate of entropy change with internal energy, two thermodynamic variables
approximated via node degree and edge number. Our analysis of representative
topics ranging in size from 1,000 to over 30,000 articles reveals that the key
to flourishing is a topic's ability to accumulate useful information for future
knowledge generation. Topics experience temperature surges in particular when
their knowledge structure is altered by influential articles. The spike is
especially obvious when there appears a single non-trivial novel research focus
or merging in topic structure. Overall, knowledge temperature manifests topics'
distinct evolutionary cycles.
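The abstract does not give its formulas, so the block below only recalls the two textbook thermodynamic relations that the analogy names: the definition of temperature as the rate of entropy change with internal energy, and the volume-temperature proportionality of an ideal gas under isobaric expansion. How the paper maps node degree and edge number onto $S$ and $U$, or article counts onto $V$, is not stated here, so no such mapping is assumed.

```latex
% Thermodynamic definition of temperature (S: entropy, U: internal energy),
% taken at fixed volume V and particle number N:
\[
  \frac{1}{T} = \left( \frac{\partial S}{\partial U} \right)_{V,N}
\]
% Isobaric (constant pressure P) expansion of an ideal gas with n moles:
\[
  V = \frac{nR}{P}\, T
  \quad\Longrightarrow\quad
  \frac{V_1}{T_1} = \frac{V_2}{T_2}
\]
```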
The role of bot squads in the political propaganda on Twitter
Social media are nowadays the privileged channel for information spreading
and news checking. Unexpectedly for most users, automated accounts, also
known as social bots, contribute more and more to this process of news
spreading. Using Twitter as a benchmark, we consider the traffic exchanged,
over one month of observation, on a specific topic, namely the migration flux
from Northern Africa to Italy. We measure the significant traffic of tweets
only, by implementing an entropy-based null model that discounts the activity
of users and the virality of tweets. Results show that social bots play a
central role in the exchange of significant content. Indeed, not only do the
strongest hubs have more bots among their followers than
expected, but a group of them, which can be assigned to the same
political tendency, also share a common set of bots as followers. The retweeting
activity of such automated accounts amplifies the presence on the platform of
the hubs' messages.
Toward enhancement of deep learning techniques using fuzzy logic: a survey
Deep learning has emerged recently as a type of artificial intelligence (AI) and machine learning (ML); it usually imitates the way humans gain a particular type of knowledge. Deep learning is considered an essential element of data science, which comprises predictive modeling and statistics. Deep learning makes the processes of collecting, interpreting, and analyzing big data easier and faster. Deep neural networks are a kind of ML model in which non-linear processing units are layered to extract particular features from the inputs. However, training such networks is very expensive and depends on the optimization method used, so optimal results may not be obtained. Deep learning techniques are also vulnerable to data noise. For these reasons, fuzzy systems are used to improve the performance of deep learning algorithms, especially in combination with neural networks. Fuzzy systems are also used to improve the representation accuracy of deep learning models. This survey paper reviews some of the deep learning based fuzzy logic models and techniques proposed in previous studies, where fuzzy logic is used to improve deep learning performance. The approaches are divided into two categories based on how the two are combined. Furthermore, the models' practicality in the real world is discussed
A text-mining based model to detect unethical biases in online reviews: a case-study of Amazon.com
The rapid growth of social media in recent decades has led e-commerce into a new era of value co-creation between the seller and the consumer. Since there is no contact with the product, people have to rely on the seller's description, knowing that it may sometimes be biased and not entirely true. Therefore, reviewing systems emerged to provide more trustworthy sources of information, since customer opinions may be less biased. The problem was that, once sellers realized the importance of reviews and their direct impact on sales, the need to control this key factor arose. One of the methods developed was to offer customers a certain product in exchange for an honest review. However, in light of the results of some studies, these "honest" reviews were shown to be biased and to skew the overall rating of the product.
The purpose of this work is to find patterns in these incentivized reviews and create a model that may predict whether a new review is biased or not. To study this subject, besides the sentiment analysis performed on the data, some other characteristics were taken into account, such as the overall rating, helpfulness rate, review length and the timestamp when the review was written.
Results show that some of the most significant characteristics when predicting an incentivized review are the length of the review and its helpfulness rate, with the overall polarity score, calculated through the VADER algorithm, as the most important sentiment-related factor
Breadth analysis of Online Social Networks
This thesis is mainly motivated by the analysis, understanding, and prediction of human behaviour
by means of the study of people's digital fingerprints. Unlike a classical PhD thesis, where
you choose a topic and carry out a deep analysis of it, we carried out a breadth
analysis on the research topic of complex networks, such as those that humans create themselves
with their relationships and interactions. These kinds of digital communities where humans interact
and create relationships are commonly called Online Social Networks. First, (i) we
collected their interactions, as text messages they share with each other, in order to analyze the
sentiment and topic of such messages. We applied state-of-the-art techniques
for Natural Language Processing, widely developed and tested on English texts, to a collection
of Spanish Tweets and compared the results. Next, (ii) we focused on Topic Detection, creating
our own classifier and applying it to the former Tweets dataset. There are two breakthroughs:
our classifier relies on text-graphs from the input text and we achieved a figure of 70% accuracy,
outperforming previous results. After that, (iii) we moved to analyze the network structure (or
topology) and their data values to detect outliers. We hypothesize that in social networks there
is a large mass of users that behaves similarly, while a reduced set of them behave in a different
way. However, especially among this last group, we try to separate those with high activity, or
low activity, or any other parameter/feature that makes them belong to different kinds of outliers.
We aim to detect influential users in one of these outlier sets. We propose a new unsupervised
method, Massive Unsupervised Outlier Detection (MUOD), labeling the outliers detected as of
shape, magnitude, amplitude or a combination of those. We applied this method to a subset of
roughly 400 million Google+ users, identifying and discriminating automatically sets of outlier
users. Finally, (iv) we found it interesting to address the monitoring of real complex networks.
We created a framework to dynamically adapt the temporality of large-scale dynamic networks,
reducing compute overhead by at least 76%, data volume by 60% and overall cloud costs by at
least 54%, while always maintaining accuracy above 88%.
Published. Doctoral Programme in Mathematical Engineering, Universidad Carlos III de Madrid.
President: Rosa María Benito Zafrilla. Secretary: Ángel Cuevas Rumín. Member: José Ernesto Jiménez Merin
Deep Neural Networks for Bot Detection
The problem of detecting bots, automated social media accounts governed by
software but disguised as human users, has strong implications. For example,
bots have been used to sway political elections by distorting online discourse,
to manipulate the stock market, or to push anti-vaccine conspiracy theories
that caused health epidemics. Most techniques proposed to date detect bots at
the account level, by processing large amounts of social media posts, and
leveraging information from network structure, temporal dynamics, sentiment
analysis, etc.
In this paper, we propose a deep neural network based on contextual long
short-term memory (LSTM) architecture that exploits both content and metadata
to detect bots at the tweet level: contextual features are extracted from user
metadata and fed as auxiliary input to LSTM deep nets processing the tweet
text.
Another contribution that we make is proposing a technique based on synthetic
minority oversampling to generate a large labeled dataset, suitable for deep
nets training, from a minimal amount of labeled data (roughly 3,000 examples of
sophisticated Twitter bots). We demonstrate that, from just one single tweet,
our architecture can achieve high classification accuracy (AUC > 96%) in
separating bots from humans.
We apply the same architecture to account-level bot detection, achieving
nearly perfect classification accuracy (AUC > 99%). Our system outperforms
previous state of the art while leveraging a small and interpretable set of
features and requiring minimal training data
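As a rough illustration of the synthetic minority oversampling idea mentioned in this abstract, the sketch below implements the generic SMOTE interpolation step in plain Python: pick a minority example, pick one of its k nearest minority neighbours, and create a new point on the segment between them. It is not the authors' pipeline; the function names (`smote`, `squared_distance`), the parameters `k` and `seed`, and the toy feature vectors are assumptions made for this example.

```python
import random
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority: List[Sequence[float]], n_synthetic: int,
          k: int = 3, seed: int = 0) -> List[List[float]]:
    """Generate n_synthetic points by interpolating between minority
    examples and their nearest minority neighbours (generic SMOTE step;
    hypothetical helper, not taken from the paper)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority examples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the chosen point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: squared_distance(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


if __name__ == "__main__":
    # Toy 2-D feature vectors standing in for labeled bot examples.
    bots = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote(bots, n_synthetic=5, k=2, seed=42):
        print(point)
```

Because each synthetic point lies between two real minority examples, it stays inside the region the labeled bots already occupy, which is why this family of methods can enlarge a small labeled set without inventing points far from the observed data.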
Trend Prediction Based on Multi-Modal Affective Analysis from Social Networking Posts
This paper proposes a method to predict the stage of buzz-trend generation by analyzing the emotional information in multimodal content posted on social networking services, such as posted text and attached images. The proposed method can analyze the diffusion scale from various angles, using only the information available at the time of posting when predicting in advance, and the information of time error when used for posterior analysis. Specifically, tweets and reply tweets were converted into vectors using a pre-trained BERT general-purpose language model, and the attached images were converted into feature vectors using a trained neural network model for image recognition. In addition, to analyze the emotional information of the posted content, we used a proprietary emotional analysis model to estimate emotions from tweets, reply tweets, and image features, which were then added to the input as emotional features. The results of the evaluation experiments showed that the proposed method, which added linguistic features (BERT vectors) and image features to tweets, achieved higher performance than the method using only a single feature. Although we could not observe the effectiveness of the emotional features, the more closely the emotions of a tweet and its replies matched, the more empathy actions occurred and the larger the like and RT values tended to be, which could ultimately increase the likelihood of a tweet going viral