The Argentine economy on Twitter
We propose and implement a methodology for collecting and analyzing Twitter discussions linked to the Argentine economy. Starting with a list of “seed users”, later expanded based on following-follower relationships, we build a network of interactions and fetch the users' tweet timelines. Then, we use a community detection model to compress the structure of underlying relationships and a standard topic model to represent the latent issues discussed in each community. Results suggest that this strategy is able to learn a useful organization and to summarize the contents of social media exchanges in the Argentine economic tweetosphere. Potential applications include characterizing the links between different economic sectors and constructing community-level indicators of opinion.
Sociedad Argentina de Informática e Investigación Operativa
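The two-stage idea in this abstract (group users by their links, then summarize what each group talks about) can be illustrated with a minimal sketch. It rests on loud assumptions: a toy label propagation routine stands in for the unnamed community detection model, plain word counts stand in for the topic model, and the function names, edge list, and tweets are all hypothetical.

```python
from collections import Counter, defaultdict


def label_propagation(edges, max_rounds=20):
    """Toy stand-in for community detection (an assumption, not the paper's model).

    Each node repeatedly adopts the most common label among its neighbours.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    labels = {node: node for node in neighbours}  # every node starts alone
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbours):
            counts = Counter(labels[n] for n in neighbours[node])
            best = max(counts.values())
            # Break ties deterministically by taking the largest label.
            new_label = max(l for l, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


def top_terms(tweets_by_user, labels, k=3):
    """Word-count stand-in for a topic model: most frequent terms per community."""
    counts = defaultdict(Counter)
    for user, tweets in tweets_by_user.items():
        for tweet in tweets:
            words = (w for w in tweet.lower().split() if len(w) > 3)
            counts[labels[user]].update(words)
    return {com: [w for w, _ in c.most_common(k)] for com, c in counts.items()}


# Hypothetical interaction network: two triangles joined by one bridge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
tweets = {"a": ["inflation rises again"], "b": ["inflation and peso"],
          "d": ["soybean exports grow"], "e": ["soybean harvest"]}

labels = label_propagation(edges)
summary = top_terms(tweets, labels, k=1)
```

On real data, a library implementation of community detection and a fitted topic model would replace both stand-ins; the sketch only shows how the community assignment feeds the per-community summary.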
"When and Where?": Behavior Dominant Location Forecasting with Micro-blog Streams
The proliferation of smartphones and wearable devices has increased the
availability of large geospatial streams, enabling significant automated
knowledge discovery in pervasive environments, but the most prominent
information about users' changing interests has not yet been adequately exploited.
In this paper, we provide a novel algorithm to exploit the dynamic fluctuations
in a user's points of interest while forecasting the future place of visit with
fine granularity. Our proposed algorithm is based on the dynamic formation of
collective personality communities using different languages, opinions,
geographical and temporal distributions for finding out optimized equivalent
content. We performed extensive empirical experiments on real-time
streams derived from 0.6 million micro-blog stream tuples comprising 1945
social person fusion, with a graph algorithm and a feed-forward neural network
as the predictive classification model. Lastly, the framework achieves 62.10%
mean average precision on 120,000 embeddings of unlabeled users and,
surprisingly, an 85.92% increment over the state-of-the-art approach.
Comment: Accepted as a full paper in the 2nd International Workshop on Social
Computing co-located with ICDM, 2018 Singapore
POISED: Spotting Twitter Spam Off the Beaten Paths
Cybercriminals have found in online social networks a propitious medium to
spread spam and malicious content. Existing techniques for detecting spam
include predicting the trustworthiness of accounts and analyzing the content of
these messages. However, advanced attackers can still successfully evade these
defenses.
Online social networks bring people who have personal connections or share
common interests to form communities. In this paper, we first show that users
within a networked community share some topics of interest. Moreover, content
shared on these social networks tends to propagate according to the interests of
people. Dissemination paths may emerge where some communities post similar
messages, based on the interests of those communities. Spam and other malicious
content, on the other hand, follow different spreading patterns.
In this paper, we follow this insight and present POISED, a system that
leverages the differences in propagation between benign and malicious messages
on social networks to identify spam and other unwanted content. We test our
system on a dataset of 1.3M tweets collected from 64K users, and we show that
our approach is effective in detecting malicious messages, reaching 91%
precision and 93% recall. We also show that POISED's detection is more
comprehensive than previous systems, by comparing it to three state-of-the-art
spam detection systems that have been proposed by the research community in the
past. POISED significantly outperforms each of these systems. Moreover, through
simulations, we show how POISED is effective in the early detection of spam
messages and how it is resilient against two well-known adversarial machine
learning attacks.
Exploiting the wisdom of the crowds for characterizing and connecting heterogeneous resources
Heterogeneous content is an inherent problem for cross-system search, recommendation and personalization. In this paper we investigate differences in topic coverage and the impact of topics in different kinds of Web services. We use entity extraction and categorization to create ‘fingerprints’ that allow for meaningful comparison. As a basis taxonomy, we use the 23 main categories of the Wikipedia Category Graph, which has been assembled over the years by the wisdom of the crowds. Following a proof of concept of our approach, we analyze differences in topic coverage and topic impact. The results show many differences between Web services like Twitter, Flickr and Delicious, which reflect users' behavior and the usage of each system. The paper concludes with a user study that demonstrates the benefits of fingerprints over traditional textual methods for recommendations of heterogeneous resources.
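One way to picture a category ‘fingerprint’ is as a normalized distribution over a fixed set of categories, which makes resources from different services directly comparable. The sketch below is an illustration under assumptions only: the three category names are placeholders rather than the 23 Wikipedia main categories, cosine similarity is an assumed comparison measure, and the function names are hypothetical.

```python
import math
from collections import Counter


def fingerprint(category_labels, categories):
    """Share of items falling in each category, in a fixed category order."""
    counts = Counter(category_labels)
    total = sum(counts[c] for c in categories)
    if total == 0:
        return [0.0] * len(categories)
    return [counts[c] / total for c in categories]


def cosine_similarity(a, b):
    """Cosine of the angle between two fingerprints (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Placeholder taxonomy and hypothetical category labels of extracted entities.
categories = ["Arts", "Science", "Politics"]
service_a = fingerprint(["Arts", "Arts", "Science"], categories)
service_b = fingerprint(["Politics", "Politics"], categories)

similarity = cosine_similarity(service_a, service_b)
```

Because every fingerprint lives in the same category space, a tweet, a photo, and a bookmark can be compared without sharing any vocabulary, which is the property the abstract contrasts with purely textual methods.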
Finding twitter communities with common interests using following links of celebrities
One important problem in target advertising and viral marketing on online social networking sites is the efficient identification of communities with common interests in large social networks. Existing methods involve large scale community detection on the entire social network before determining the interests of individuals within these communities. This approach is computationally intensive and may result in communities without a common interest. We propose an efficient approach for detecting communities that share common interests on Twitter. Our approach involves first identifying celebrities that are representative of an interest category before detecting communities based on linkages among followers of these celebrities. We also study the characteristics of these communities and the effects of deepening or specialization of interest.
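The celebrity-first ordering described above can be sketched in a few lines: restrict attention to followers of the chosen celebrities, then look for groups among those followers only. This is a minimal illustration under assumptions: connected components stand in for the paper's community detection step, the `min_celebrities` threshold is a hypothetical parameter, and all account names are invented.

```python
from collections import defaultdict


def interest_communities(follows, celebrities, min_celebrities=1):
    """Group followers of the given celebrities by the links among them.

    follows: dict mapping each user to the set of accounts that user follows.
    celebrities: set of accounts taken to represent one interest category.
    """
    # Step 1: keep only users who follow enough of the celebrities.
    fans = {u for u, followed in follows.items()
            if u not in celebrities
            and len(followed & celebrities) >= min_celebrities}

    # Step 2: undirected follow links restricted to those fans.
    links = defaultdict(set)
    for u in fans:
        for v in follows[u] & fans:
            links[u].add(v)
            links[v].add(u)

    # Step 3: connected components as a simple stand-in for communities.
    seen, communities = set(), []
    for start in sorted(fans):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(links[node] - seen)
        communities.append(component)
    return communities


# Hypothetical follow data: u4 follows no celebrity and is filtered out.
follows = {
    "u1": {"celebA", "u2"},
    "u2": {"celebA", "u1"},
    "u3": {"celebA"},
    "u4": {"u1"},
}
communities = interest_communities(follows, {"celebA"})
```

The filtering in step 1 is what keeps the cost down: the grouping step only ever touches the follower subgraph, never the entire social network.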