351 research outputs found

    Using Text Similarity to Detect Social Interactions not Captured by Formal Reply Mechanisms

    In modeling social interaction online, it is important to understand when people are reacting to each other. Many systems have explicit indicators of replies, such as threading in discussion forums or replies and retweets in Twitter. However, these explicit indicators likely capture only part of people's reactions to each other, so computational social science approaches that use them to infer relationships or influence are likely to miss the mark. This paper explores the problem of detecting non-explicit responses, presenting a new approach that uses tf-idf similarity between a user's own tweets and recent tweets by people they follow. Based on a month's worth of posting data from 449 ego networks in Twitter, this method demonstrates that at least 11% of reactions are likely not captured by the explicit reply and retweet mechanisms. Further, these uncaptured reactions are not evenly distributed between users: some users, who create replies and retweets without using the official interface mechanisms, are much more responsive to followees than they appear. This suggests that detecting non-explicit responses is an important consideration in mitigating biases and building more accurate models when using these markers to study social interaction and information diffusion. Comment: A final version of this work was published in the 2015 IEEE 11th International Conference on e-Science (e-Science
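The abstract above describes comparing a user's own tweets with recent tweets by people they follow using tf-idf similarity. The following is a minimal sketch of that general idea, not the authors' implementation: the tokenizer, the smoothed idf formula, the `threshold` value of 0.3, and the function names are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase words, hashtags and mentions.
    return re.findall(r"[a-z0-9#@']+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse tf-idf vector (dict: term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed idf (an assumption; the paper's weighting is not given here).
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


def likely_reactions(user_tweet, followee_tweets, threshold=0.3):
    # Return (index, score) for followee tweets similar enough to the
    # user's tweet to be flagged as a possible non-explicit response.
    vecs = tfidf_vectors([user_tweet] + followee_tweets)
    scores = [cosine(vecs[0], v) for v in vecs[1:]]
    return [(i, s) for i, s in enumerate(scores) if s >= threshold]


flagged = likely_reactions(
    "great paper on detecting hidden replies in twitter",
    ["our paper on detecting hidden replies is out", "lunch was good today"],
)
```

In this sketch only the first followee tweet shares vocabulary with the user's tweet, so only it is flagged. A real analysis would also restrict candidates to a recent time window and calibrate the threshold against known replies and retweets.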

    Social software development: Insights and solutions

    The ISIS Twitter census: defining and describing the population of ISIS supporters on Twitter

    Presents a demographic snapshot of ISIS supporters on Twitter by analysing a sample of 20,000 ISIS-supporting Twitter accounts, mapping the locations, preferred languages, and the number and type of followers of these accounts. Overview: Although much ink has been spilled on ISIS’s activity on Twitter, very basic questions about the group’s social media strategy remain unanswered. In a new analysis paper, J.M. Berger and Jonathon Morgan answer fundamental questions about how many Twitter users support ISIS, who and where they are, and how they participate in its highly organized online activities. Previous analyses of ISIS’s Twitter reach have relied on limited segments of the overall ISIS social network. The small, cellular nature of that network, and the focus on particular subsets within the network such as foreign fighters, may create misleading conclusions. This information vacuum extends to discussions of how the West should respond to the group’s online campaigns. Berger and Morgan present a demographic snapshot of ISIS supporters on Twitter by analyzing a sample of 20,000 ISIS-supporting Twitter accounts. Using a sophisticated and innovative methodology, the authors map the locations, preferred languages, and the number and type of followers of these accounts. Among the key findings: From September through December 2014, the authors estimate that at least 46,000 Twitter accounts were used by ISIS supporters, although not all of them were active at the same time. Typical ISIS supporters were located within the organization’s territories in Syria and Iraq, as well as in regions contested by ISIS. Hundreds of ISIS-supporting accounts sent tweets with location metadata embedded. Almost one in five ISIS supporters selected English as their primary language when using Twitter. Three quarters selected Arabic. ISIS-supporting accounts had an average of about 1,000 followers each, considerably more than an ordinary Twitter user has. ISIS-supporting accounts were also considerably more active than non-supporting users. A minimum of 1,000 ISIS-supporting accounts were suspended by Twitter between September and December 2014. Accounts that tweeted most often and had the most followers were most likely to be suspended. Much of ISIS’s social media success can be attributed to a relatively small group of hyperactive users, numbering between 500 and 2,000 accounts, which tweet in concentrated bursts of high volume. Based on their key findings, the authors recommend that social media companies and the U.S. government work together to devise appropriate responses to extremism on social media. Approaches to the problem of extremist use of social media, Berger and Morgan contend, are most likely to succeed when they are mainstreamed into wider dialogues among the broad range of community, private, and public stakeholders.

    Temporal models for mining, ranking and recommendation in the Web

    Due to their first-hand, diverse and evolution-aware reflection of nearly all areas of life, heterogeneous temporal datasets, i.e., the Web, collaborative knowledge bases and social networks, have emerged as gold mines for content analytics of many sorts. In those collections, time plays an essential role in many crucial information retrieval and data mining tasks, ranging from user intent understanding and document ranking to advanced recommendations. There are two semantically close and important constituents when modeling along the time dimension, i.e., entity and event. Time crucially serves as the context for changes driven by happenings and phenomena (events) that relate to people, organizations or places (so-called entities) in our social lives. Thus, determining what users expect, or in other words, resolving the uncertainty confounded by temporal changes, is a compelling task to support consistent user satisfaction. In this thesis, we address the aforementioned issues and propose temporal models that capture the temporal dynamics of such entities and events to serve the end tasks. Specifically, we make the following contributions in this thesis: (1) Query recommendation and document ranking in the Web - we address the issues of suggesting entity-centric queries and of ranking effectiveness surrounding the time period of an associated event. In particular, we propose a multi-criteria optimization framework that facilitates the combination of multiple temporal models to smooth out the abrupt changes when transitioning between event phases for the former, and a probabilistic approach for search result diversification of temporally ambiguous queries for the latter. (2) Entity relatedness in Wikipedia - we study the long-term dynamics of Wikipedia as a global memory place for high-impact events, specifically the reviving memories of past events. Additionally, we propose a neural network-based approach to measure the temporal relatedness of entities and events. The model engages different latent representations of an entity (i.e., from time, link-based graph and content) and uses the collective attention from user navigation as the supervision. (3) Graph-based ranking and temporal anchor-text mining in Web Archives - we tackle the problem of discovering important documents along the time-span of Web Archives, leveraging the link graph. Specifically, we combine the problems of relevance, temporal authority, diversity and time in a unified framework. The model accounts for the incomplete link structure and natural time lagging in Web Archives in mining the temporal authority. (4) Methods for enhancing predictive models at an early stage in the social media and clinical domains - we investigate several methods to control model instability and enrich contexts of predictive models in the “cold-start” period. We demonstrate their effectiveness for the rumor detection and blood glucose prediction cases respectively. Overall, the findings presented in this thesis demonstrate the importance of tracking the temporal dynamics surrounding salient events and entities for IR applications. We show that determining such changes in time-based patterns and trends in prevalent temporal collections can better satisfy user expectations, and boost ranking and recommendation effectiveness over time.

    Enhancing topology adaptation in information-sharing social networks

    The advent of the Internet and the World Wide Web has led to unprecedented growth of the information available. People usually cope with information overload by following a limited number of sources which best fit their interests. It has thus become important to address issues like who gets followed and how to allow people to discover new and better information sources. In this paper we conduct an empirical analysis of different on-line social networking sites, and draw inspiration from its results to present different source selection strategies in an adaptive model for social recommendation. We show that local search rules which enhance the typical topological features of real social communities give rise to network configurations that are globally optimal. These rules create networks which are effective in information diffusion and resemble structures resulting from real social systems.

    Event-Based User Classification in Weibo Media

    PREDICTION IN SOCIAL MEDIA FOR MONITORING AND RECOMMENDATION

    Social media, including blogs and microblogs, provide a rich window into user online activity. Monitoring social media datasets can be expensive due to the scale and inherent noise in such data streams. Monitoring and prediction can provide significant benefit for many applications including brand monitoring and making recommendations. Consider a focal topic and posts on multiple blog channels on this topic. Being able to target a few potentially influential blog channels which will contain relevant posts is valuable. Once these channels have been identified, a user can proactively join the conversation themselves to encourage positive word-of-mouth and to mitigate negative word-of-mouth. Links between different blog channels, and retweets and mentions between different microblog users, are a proxy of information flow and influence. When trying to monitor where information will flow and who will be influenced by a focal user, it is valuable to predict future links, retweets and mentions. Predictions of users who will post on a focal topic or who will be influenced by a focal user can yield valuable recommendations. In this thesis we address the problem of prediction in social media to select social media channels for monitoring and recommendation. Our analysis focuses on individual authors and linkers. We address a series of prediction problems including the future author prediction problem and the future link prediction problem in the blogosphere, as well as prediction in microblogs such as Twitter. For future author prediction in the blogosphere, where there are network properties and content properties, we develop prediction methods inspired by information retrieval approaches that use historical posts in the blog channel for prediction. We also train a ranking support vector machine (SVM) to solve the problem, considering both network properties and content properties. We identify a number of features which have an impact on prediction accuracy. For future link prediction in the blogosphere, we compare multiple link prediction methods, and show that our proposed solution, which combines the network properties of the blog with content properties, does better than methods which examine network properties or content properties in isolation. Most of the previous work has looked at only one or the other. For prediction in microblogs, where there are a follower network, a retweet network, and a mention network, we propose a prediction model that utilizes the hybrid network for prediction. In this model, we define a potential function that reflects the likelihood of a candidate user having a specific type of link to a focal user in the future, and formulate an optimization problem by the principle of maximum likelihood to determine the parameters in the model. We propose different approximate approaches based on the prediction model. Our approaches are demonstrated to outperform the baseline methods, which only consider one network or utilize hybrid networks in a naive way. The prediction model can be applied to other similar problems where hybrid networks exist.

    Estimating attention flow in online video networks

    © 2019 Association for Computing Machinery. Online videos have shown tremendous increase in Internet traffic. Most video hosting sites implement recommender systems, which connect the videos into a directed network and conceptually act as a source of pathways for users to navigate. At present, little is known about how human attention is allocated over such large-scale networks, and about the impacts of the recommender systems. In this paper, we first construct the Vevo network — a YouTube video network with 60,740 music videos interconnected by the recommendation links, and we collect their associated viewing dynamics. This results in a total of 310 million views every day over a period of 9 weeks. Next, we present large-scale measurements that connect the structure of the recommendation network and the video attention dynamics. We use the bow-tie structure to characterize the Vevo network and we find that its core component (23.1% of the videos), which occupies most of the attention (82.6% of the views), is made out of videos that are mainly recommended among themselves. This is indicative of the links between video recommendation and the inequality of attention allocation. Finally, we address the task of estimating the attention flow in the video recommendation network. We propose a model that accounts for the network effects for predicting video popularity, and we show it consistently outperforms the baselines. This model also identifies a group of artists gaining attention because of the recommendation network. Altogether, our observations and our models provide a new set of tools to better understand the impacts of recommender systems on collective social attention
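The abstract above characterizes the recommendation network by its bow-tie structure and reports the share of views that falls in the core. As a rough illustration of that kind of measurement, the sketch below treats the largest strongly connected component as the core, nodes that can reach it as IN, and nodes reachable from it as OUT. The toy edge list, the view counts, and the function names are hypothetical, and the simple reachability-based component search is chosen for readability, not for graphs of the size studied in the paper.

```python
from collections import defaultdict


def reachable(start_nodes, adj):
    # All nodes reachable from start_nodes by following edges in adj.
    seen = set(start_nodes)
    stack = list(start_nodes)
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def bow_tie(edges):
    # Split a directed graph into core / in / out / other components.
    fwd, rev = defaultdict(set), defaultdict(set)
    nodes = set()
    for u, v in edges:
        fwd[u].add(v)
        rev[v].add(u)
        nodes.update((u, v))

    # The strongly connected component of v is the intersection of the
    # nodes reachable from v and the nodes that can reach v.
    core, unassigned = set(), set(nodes)
    while unassigned:
        v = next(iter(unassigned))
        scc = reachable([v], fwd) & reachable([v], rev)
        if len(scc) > len(core):
            core = scc
        unassigned -= scc

    out_part = reachable(core, fwd) - core
    in_part = reachable(core, rev) - core
    other = nodes - core - in_part - out_part
    return {"core": core, "in": in_part, "out": out_part, "other": other}


def attention_share(component, views):
    # Fraction of all views received by the nodes in component.
    total = sum(views.values())
    if total == 0:
        return 0.0
    return sum(views.get(n, 0) for n in component) / total


# Hypothetical toy network: a, b, c recommend each other in a cycle.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("x", "a"), ("c", "y"), ("p", "q")]
views = {"a": 50, "b": 30, "c": 10, "x": 5, "y": 5, "p": 0, "q": 0}
parts = bow_tie(edges)
core_share = attention_share(parts["core"], views)
```

On this toy input the core is the three-video cycle, and it holds most of the views, which mirrors in miniature the kind of concentration the abstract reports for the Vevo network.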