6 research outputs found

    The Argentine economy on Twitter

    We propose and implement a methodology for collecting and analyzing Twitter discussions linked to the Argentine economy. Starting with a list of “seed users”, later expanded based on following-follower relationships, we build a network of interactions and fetch the users' tweet timelines. Then, we use a community detection model to compress the structure of underlying relationships and a standard topic model to represent the latent issues discussed in each community. Results suggest that this strategy is able to learn a useful organization and to summarize the contents of social media exchanges in the Argentine economic tweetosphere. Potential applications include characterizing the links between different economic sectors and constructing community-level indicators of opinions. Sociedad Argentina de Informática e Investigación Operativa
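
The two-stage pipeline in this abstract (community detection, then a per-community summary of content) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name its community detection or topic models, so a simple synchronous label propagation and plain term counts are swapped in here, and all user handles and tweets are hypothetical.

```python
from collections import Counter, defaultdict


def detect_communities(edges, max_rounds=20):
    """Synchronous label propagation (a stand-in, not the paper's model).

    Every node adopts the most common label among its neighbors and itself;
    ties go to the alphabetically smallest label so runs are deterministic.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    labels = {node: node for node in neighbors}
    for _ in range(max_rounds):
        updated = {}
        for node in neighbors:
            votes = Counter(labels[n] for n in neighbors[node])
            votes[labels[node]] += 1  # the node also votes for its own label
            best = max(votes.values())
            updated[node] = min(l for l, c in votes.items() if c == best)
        if updated == labels:  # no label changed, so stop early
            break
        labels = updated
    return labels


def top_terms(tweets_by_user, labels, k=1):
    """Most frequent terms per community (a crude proxy for a topic model)."""
    counts = defaultdict(Counter)
    for user, tweets in tweets_by_user.items():
        for tweet in tweets:
            counts[labels[user]].update(tweet.lower().split())
    return {c: [t for t, _ in cnt.most_common(k)] for c, cnt in counts.items()}


# Hypothetical interaction network: two triangles joined by a single edge.
edges = [("ana", "beto"), ("ana", "caro"), ("beto", "caro"),
         ("caro", "dani"),
         ("dani", "eva"), ("dani", "fede"), ("eva", "fede")]
# Hypothetical timelines, one short tweet per user.
tweets = {"ana": ["inflacion dolar"], "beto": ["dolar reservas"],
          "caro": ["dolar"], "dani": ["soja cosecha"],
          "eva": ["soja exportaciones"], "fede": ["soja"]}

labels = detect_communities(edges)
summary = top_terms(tweets, labels)
```

On real data, a modularity-based detector and a topic model such as LDA would be more typical choices; the sketch only shows how the two stages connect.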

    "When and Where?": Behavior Dominant Location Forecasting with Micro-blog Streams

    The proliferation of smartphones and wearable devices has increased the availability of large geospatial streams that enable automated knowledge discovery in pervasive environments, but the most prominent information related to changing interests has not yet been adequately exploited. In this paper, we provide a novel algorithm that exploits the dynamic fluctuations in a user's points of interest while forecasting the future place of visit with fine granularity. Our proposed algorithm is based on the dynamic formation of collective personality communities using different languages, opinions, and geographical and temporal distributions to find optimized equivalent content. We performed extensive empirical experiments involving real-time streams derived from 0.6 million micro-blog stream tuples comprising 1945 social person fusion, with a graph algorithm and a feed-forward neural network as the predictive classification model. Lastly, the framework achieves 62.10% mean average precision on 120,000 embeddings of unlabeled users and, surprisingly, an 85.92% increment over the state-of-the-art approach. Comment: Accepted as a full paper in the 2nd International Workshop on Social Computing co-located with ICDM, 2018 Singapore

    POISED: Spotting Twitter Spam Off the Beaten Paths

    Cybercriminals have found in online social networks a propitious medium to spread spam and malicious content. Existing techniques for detecting spam include predicting the trustworthiness of accounts and analyzing the content of these messages. However, advanced attackers can still successfully evade these defenses. Online social networks bring people who have personal connections or share common interests to form communities. In this paper, we first show that users within a networked community share some topics of interest. Moreover, content shared on these social networks tends to propagate according to the interests of people. Dissemination paths may emerge where some communities post similar messages, based on the interests of those communities. Spam and other malicious content, on the other hand, follow different spreading patterns. In this paper, we follow this insight and present POISED, a system that leverages the differences in propagation between benign and malicious messages on social networks to identify spam and other unwanted content. We test our system on a dataset of 1.3M tweets collected from 64K users, and we show that our approach is effective in detecting malicious messages, reaching 91% precision and 93% recall. We also show that POISED's detection is more comprehensive than previous systems, by comparing it to three state-of-the-art spam detection systems that have been proposed by the research community in the past. POISED significantly outperforms each of these systems. Moreover, through simulations, we show how POISED is effective in the early detection of spam messages and how it is resilient against two well-known adversarial machine learning attacks.

    Exploiting the wisdom of the crowds for characterizing and connecting heterogeneous resources

    Heterogeneous content is an inherent problem for cross-system search, recommendation and personalization. In this paper we investigate differences in topic coverage and the impact of topics in different kinds of Web services. We use entity extraction and categorization to create ‘fingerprints’ that allow for meaningful comparison. As a basis taxonomy, we use the 23 main categories of the Wikipedia Category Graph, which has been assembled over the years by the wisdom of the crowds. Following a proof of concept of our approach, we analyze differences in topic coverage and topic impact. The results show many differences between Web services like Twitter, Flickr and Delicious, which reflect users' behavior and the usage of each system. The paper concludes with a user study that demonstrates the benefits of fingerprints over traditional textual methods for recommendations of heterogeneous resources.
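
One plausible reading of the fingerprint idea is a normalized distribution over taxonomy categories per service, compared with a similarity measure. The sketch below assumes exactly that. The category names, the per-service data, and the choice of cosine similarity are all illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def fingerprint(categories):
    """Share of extracted entities falling into each taxonomy category."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def cosine(fp_a, fp_b):
    """Cosine similarity between two fingerprints (assumed comparison measure)."""
    dot = sum(w * fp_b.get(c, 0.0) for c, w in fp_a.items())
    norm_a = math.sqrt(sum(w * w for w in fp_a.values()))
    norm_b = math.sqrt(sum(w * w for w in fp_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical categories assigned to entities extracted from each service.
service_a = fingerprint(["Politics", "Politics", "Technology"])
service_b = fingerprint(["Arts", "Nature", "Nature"])

same = cosine(service_a, service_a)       # identical fingerprints
different = cosine(service_a, service_b)  # no shared categories
```

Because both services are mapped onto the same category axes, resources of different types (tweets, photos, bookmarks) become comparable without any shared vocabulary.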

    Finding Twitter communities with common interests using following links of celebrities

    One important problem in targeted advertising and viral marketing on online social networking sites is the efficient identification of communities with common interests in large social networks. Existing methods involve large scale community detection on the entire social network before determining the interests of individuals within these communities. This approach is both computationally intensive and may result in communities without a common interest. We propose an efficient approach for detecting communities that share common interests on Twitter. Our approach involves first identifying celebrities that are representative of an interest category before detecting communities based on linkages among followers of these celebrities. We also study the characteristics of these communities and the effects of deepening or specialization of interest.
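
The celebrity-first approach can be illustrated with a minimal sketch: restrict attention to users who follow the celebrities of one interest category, then group those users by the links among them. The abstract does not specify the community detection step, so connected components stand in for it here. The account names and the `min_followed` threshold are hypothetical.

```python
def interest_communities(follows, celebrities, min_followed=1):
    """Group followers of the given celebrities by links among themselves.

    follows: dict mapping each user to the set of accounts they follow.
    Connected components are an assumed stand-in for community detection.
    """
    # Keep only non-celebrity users following enough celebrities of the category.
    candidates = {
        user for user, followed in follows.items()
        if user not in celebrities and len(followed & celebrities) >= min_followed
    }
    # Treat follow links between candidates as undirected edges.
    adjacency = {user: set() for user in candidates}
    for user in candidates:
        for other in follows[user] & candidates:
            adjacency[user].add(other)
            adjacency[other].add(user)
    # Depth-first search to collect each connected component.
    seen, communities = set(), []
    for start in sorted(candidates):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        communities.append(component)
    return communities


# Hypothetical follow graph for a "cooking" interest category.
follows = {
    "u1": {"chef_a", "u2"},
    "u2": {"chef_b"},
    "u3": {"chef_a"},
    "u4": {"u1"},  # follows no celebrity, so it is filtered out
}
communities = interest_communities(follows, celebrities={"chef_a", "chef_b"})
```

Filtering before clustering is what keeps the cost down: the grouping step only ever sees the followers of the chosen celebrities rather than the entire network.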