
    The text and images of the gopdebate: what the public is "talking" about on instagram

    The richness of Instagram data makes it possible to tell a compelling story about the public’s “talk” on Instagram. Our work focuses on the use of Instagram by citizens to express their thoughts on the 2016 GOP presidential election. We collected Instagram posts with the hashtags #GOPDebate and #GOPDebates for five months. Using topic-modeling and sentiment analysis techniques to analyze both textual (post captions) and visual (images) attributes, we are able to illustrate the topical network of political discussions and the actors whom the public talks about on each topic. Our work contributes to the literature on the role of social media, and specifically Instagram, in the political domain. The methodology also demonstrates how textual and visual attributes can be used together to categorize photo content.

    Event detection, tracking, and visualization in Twitter: a mention-anomaly-based approach

    The ever-growing number of people using Twitter makes it a valuable source of timely information. However, detecting events in Twitter is a difficult task, because tweets that report interesting events are overwhelmed by a large volume of tweets on unrelated topics. Existing methods focus on the textual content of tweets and ignore the social aspect of Twitter. In this paper we propose MABED (i.e. mention-anomaly-based event detection), a novel statistical method that relies solely on tweets and leverages the creation frequency of dynamic links (i.e. mentions) that users insert in tweets to detect significant events and estimate the magnitude of their impact over the crowd. MABED also differs from the literature in that it dynamically estimates the period of time during which each event is discussed, rather than assuming a predefined fixed duration for all events. The experiments we conducted on both English and French Twitter data show that the mention-anomaly-based approach leads to more accurate event detection and improved robustness in the presence of noisy Twitter content. Qualitatively speaking, we find that MABED helps with the interpretation of detected events by providing clear textual descriptions and precise temporal descriptions. We also show how MABED can help in understanding users' interest. Furthermore, we describe three visualizations designed to favor an efficient exploration of the detected events. Comment: 17 page

    Public scientific communication on Twitter: visual analytic approach

    Purpose - The purpose of this paper is to assess high-dimensional visualisation, combined with pattern matching, as an approach to observing dynamic changes in the ways people tweet about science topics. Design/methodology/approach - The high-dimensional visualisation approach was applied to three scientific topics to test its effectiveness for longitudinal analysis of message framing on Twitter over two disjoint periods in time. The paper uses coding frames to drive categorisation and visual analytics of tweets discussing the science topics. Findings - The findings point to the potential of this mixed methods approach, as it allows sufficiently high sensitivity to recognise and support the analysis of non-trending as well as trending topics on Twitter. Research limitations/implications - Three topics are studied and these illustrate a range of frames, but results may not be representative of all scientific topics. Social implications - Funding bodies increasingly encourage scientists to participate in public engagement. As social media provides an avenue actively utilised for public communication, understanding the nature of the dialog on this medium is important for the scientific community and the public at large. Originality/value - This study differs from standard approaches to the analysis of microblog data, which tend to focus on machine-driven analysis of large-scale datasets. It provides evidence that this approach enables practical and effective analysis of the content of midsize to large collections of microposts.

    What Were the Tweets About? Topical Associations between Public Events and Twitter Feeds

    Social media channels such as Twitter have emerged as platforms for crowds to respond to public and televised events such as speeches and debates. However, the very large volume of responses presents challenges for attempts to extract sense from them. In this work, we present an analytical method based on joint statistical modeling of topical influences from the events and associated Twitter feeds. The model enables the auto-segmentation of the events and the characterization of tweets into two categories: (1) episodic tweets that respond specifically to the content in the segments of the events, and (2) steady tweets that respond generally to the events. By applying our method to two large sets of tweets in response to President Obama's speech on the Middle East in May 2011 and a Republican Primary debate in September 2011, we present what these tweets were about. We also reveal the nature and magnitude of the influences of the event on the tweets over its timeline. In a user study, we further show that users find the topics and the episodic tweets discovered by our method to be of higher quality and more interesting as compared to the state-of-the-art, with improvements in the range of 18-41%.

    LEARNING ON GRAPHS: ALGORITHMS FOR CLASSIFICATION AND SEQUENTIAL DECISIONS

    In recent years, networked data have become widespread due to the increasing importance of social networks and other web-related applications. This growing interest is driving researchers to design new algorithms for solving important problems that involve networked data. In this thesis we present a few practical yet principled algorithms for learning and sequential decision-making on graphs. Classification of networked data is an important problem that has recently received a great deal of attention from the machine learning community. This is due to its many important practical applications: computer vision, bioinformatics, spam detection and text categorization, just to cite a few of the more conspicuous examples. We focus our attention on the task called "node classification", often studied in the semi-supervised (transductive) setting. We present two algorithms, motivated by different theoretical frameworks. The first algorithm is studied in the well-known online adversarial setting, within which it enjoys an optimal mistake bound (up to logarithmic factors). The second algorithm is based on a game-theoretic approach, where each node of the network is maximizing its own payoff. The setting corresponds to a Graph Transduction Game in which the graph is a tree. For this special case, we show that the Nash Equilibrium of the game can be reached in linear time. We complement our theoretical findings with an extensive set of experiments using datasets from many different domains. In the second part of the thesis, we present a rapidly emerging theme in the analysis of networked data: signed networks, graphs whose edges carry a label encoding the positive or negative nature of the relationship between the connected nodes. 
For example, social networks and e-commerce offer several examples of signed relationships: Slashdot users can tag other users as friends or foes, Epinions users can rate each other positively or negatively, eBay users develop trust and distrust towards sellers in the network. More generally, two individuals that are related because they rate similar products in a recommendation website may agree or disagree in their ratings. Many heuristics for link classification in social networks are based on a form of social balance summarized by the motto “the enemy of my enemy is my friend”. This is equivalent to saying that the signs on the edges of a social graph tend to be consistent with some two-clustering structure of the nodes, where edges connecting nodes from the same cluster are positive and edges connecting nodes from different clusters are negative. We present algorithms for the batch transductive active learning setting, where the topology of the graph is known in advance and our algorithms can ask for the label of some specific edges during the training phase (before starting with the predictions). These algorithms can achieve different tradeoffs between the number of mistakes during the test phase and the number of labels required during the training phase. We also present an experimental comparison against some state-of-the-art spectral heuristics presented in a previous work, where we show that the simplest of our algorithms is already competitive with the best of these heuristics. In the last chapter we present another way to exploit relational information for sequential predictions: the networks of bandits. Contextual bandits adequately formalize the exploration-exploitation trade-offs arising in several industrially relevant applications, such as online advertising and recommendation systems. 
Many practical applications have a strong social component whose integration in the bandit algorithm could lead to a significant performance improvement: for example, since friends often have similar tastes, we may want to serve content to a group of users by taking advantage of an underlying network of social relationships among them. We introduce a novel algorithmic approach to a particular networked bandit problem. More specifically, we run a bandit algorithm on each network node (e.g., user), allowing it to "share" feedback signals with the other nodes by employing the multi-task kernel. We derive the regret analysis of this algorithm and, finally, we report on the results of an experimental comparison between our approach and the state-of-the-art techniques, on both artificial and real-world social networks.
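
The social-balance property mentioned in the abstract above (edge signs consistent with a two-clustering of the nodes) can be illustrated with a small sketch. This is not the thesis's algorithm: it is only a minimal breadth-first check, under the assumption that all observed signs are noise-free, and the names `two_cluster_labels` and `predict_sign` are hypothetical.

```python
from collections import deque


def two_cluster_labels(nodes, signed_edges):
    """Assign each node to cluster 0 or 1 so that positive edges stay inside
    a cluster and negative edges cross clusters.

    signed_edges is an iterable of (u, v, sign) with sign in {+1, -1}.
    Returns a dict node -> cluster, or None if no such two-clustering exists
    (i.e. the observed signs are not balanced).
    """
    adjacency = {n: [] for n in nodes}
    for u, v, sign in signed_edges:
        adjacency[u].append((v, sign))
        adjacency[v].append((u, sign))

    cluster = {}
    for start in nodes:
        if start in cluster:
            continue
        cluster[start] = 0  # arbitrary choice for each connected component
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adjacency[u]:
                # A positive edge keeps the cluster, a negative edge flips it.
                expected = cluster[u] if sign > 0 else 1 - cluster[u]
                if v not in cluster:
                    cluster[v] = expected
                    queue.append(v)
                elif cluster[v] != expected:
                    return None  # contradiction: signs are not balanced
    return cluster


def predict_sign(cluster, u, v):
    """Predict the sign of an unobserved edge from the two-clustering."""
    return 1 if cluster[u] == cluster[v] else -1


# "The enemy of my enemy is my friend": a-b and b-c are negative,
# so the unobserved edge a-c is predicted to be positive.
labels = two_cluster_labels(["a", "b", "c"], [("a", "b", -1), ("b", "c", -1)])
friend_sign = predict_sign(labels, "a", "c")
```

Real signed networks are rarely perfectly balanced, which is why the abstract says the signs only "tend to be" consistent with a two-clustering; a practical method has to tolerate the contradictions that this sketch simply rejects.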

    Toward Effective Knowledge Discovery in Social Media Streams

    The last few decades have seen an unprecedented growth in the amount of new data. New computing and communications resources, such as cloud data platforms and mobile devices, have enabled individuals to contribute new ideas, share points of view and exchange newsworthy bits with each other at a previously unfathomable rate. While there are many ways a modern person can communicate digitally with others, social media outlets, such as Twitter or Facebook, have been occupying much of the focus of inter-person social networking in recent years. The millions of pieces of content published on social media sites have been both a blessing and a curse for those trying to make sense of the discourse. On one hand, the sheer amount of easily available, real time, contextually relevant content has been a cause of much excitement in academia and the industry. On the other hand, however, the amount of new diverse content that is being continuously published on social sites makes the discourse difficult for researchers and industry participants to grasp effectively. Therefore, the goal of this thesis is to discover a set of approaches and techniques that would help enable data miners to quickly develop intuitions regarding the happenings in the social media space. To that aim, I concentrate on effectively visualizing social media streams as hierarchical structures, as such structures have been shown to be useful in human sense making. Ph.D., Information Studies -- Drexel University, 201

    Opinion Mining of Sociopolitical Comments from Social Media


    Technologically Mediated Discourse and Information Exchange through Medium Specific Syntactical Features: The 2012 Presidential Election on Twitter

    Political discourse has been historically constrained by geographic proximity of participants. The introduction of the Internet and specifically social media has altered these geographic constraints and political discourse is now one of the most prevalent activities in social media. The increasing use of technology to acquire political information and participate in the political process in the United States creates a gap between what is understood about political activity in a democratic society and the specific technological features people use. As more individuals begin to use technology for political activity, understanding how the technology is used becomes increasingly important. Previous research exploring political discourse on social media has focused on one discrete event or a narrow time period. This narrow focus limits the understanding of the complex environment that comprises an election. This study takes a longitudinal approach and uses network analysis, co-occurrence analysis and temporal frequency analysis to examine a 53 million Twitter message (tweet) corpus collected during the 2012 Presidential Election (August 20, 2012 - November 13, 2012) to understand how individuals use Twitter to engage in political discourse. The queries used to compose the dataset were theoretically informed based on democratic theory and previous socio-technical research. This study makes three contributions to the existing literature. First, this study identifies that individuals use syntactical features differently in the context of an acute event such as a debate. Second, this study indicates that, although candidates and media are the most talked about and talked to, these interactions elicit no response. Third, this study reveals that information shared through URLs was predominantly user-generated content from Twitter and mass media information suggesting a reflexive information-sharing environment. 
This study illustrates that even with the availability of the numerous technological and syntactical features to facilitate interactions and share information, there is still a limited realization of the promise that technologies such as Twitter afford. Instead of fundamentally changing the political discourse process by having individuals use it for two-way communication, Twitter amplifies the existing political environment where there is limited cohesive discourse and communication is one-way. Ph.D., Information Studies -- Drexel University, 201