Clustering Memes in Social Media
The increasing pervasiveness of social media creates new opportunities to
study human social behavior, while challenging our capability to analyze their
massive data streams. One of the emerging tasks is to distinguish between
different kinds of activities, for example engineered misinformation campaigns
versus spontaneous communication. Such detection problems require a formal
definition of meme, or unit of information that can spread from person to
person through the social network. Once a meme is identified, supervised
learning methods can be applied to classify different types of communication.
The appropriate granularity of a meme, however, is hardly captured from
existing entities such as tags and keywords. Here we present a framework for
the novel task of detecting memes by clustering messages from large streams of
social data. We evaluate various similarity measures that leverage content,
metadata, network features, and their combinations. We also explore the idea of
pre-clustering on the basis of existing entities. A systematic evaluation is
carried out using a manually curated dataset as ground truth. Our analysis
shows that pre-clustering and a combination of heterogeneous features yield the
best trade-off between number of clusters and their quality, demonstrating that
a simple combination based on pairwise maximization of similarity is as
effective as a non-trivial optimization of parameters. Our approach is fully
automatic, unsupervised, and scalable for real-time detection of memes in
streaming data. Comment: Proceedings of the 2013 IEEE/ACM International Conference on Advances
in Social Networks Analysis and Mining (ASONAM'13), 201
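The "pairwise maximization of similarity" mentioned in the abstract above can be illustrated with a minimal Python sketch. The abstract does not specify the individual measures, so the use of Jaccard similarity and the field names `words`, `tags`, and `users` are assumptions made only for illustration; this is not the authors' implementation.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(m1, m2):
    """Combine per-feature similarities by taking their maximum.

    The three feature families (content, metadata, network) follow the
    abstract; the concrete fields and the Jaccard measure are assumptions.
    """
    content = jaccard(m1["words"], m2["words"])
    metadata = jaccard(m1["tags"], m2["tags"])
    network = jaccard(m1["users"], m2["users"])
    return max(content, metadata, network)


# Two hypothetical messages that share a hashtag but no words or users.
msg_a = {"words": {"vote", "today"}, "tags": {"election"}, "users": {"u1", "u2"}}
msg_b = {"words": {"polls", "open"}, "tags": {"election"}, "users": {"u3"}}
score = combined_similarity(msg_a, msg_b)
```

Taking the maximum means two messages are considered similar as soon as any one feature family links them strongly, which avoids having to tune per-feature weights.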
From Tweet to Graph: Social Network Analysis for Semantic Information Extraction
This paper studies online social networks in relation to the content communicated among users. Twitter data are selected around a fixed hashtag in order to study the specified content in relation to the other content that users connect to it. A separate network of hashtags that co-occur in tweets is constructed for different days; the networks are analyzed with the Gephi package, which provides several measures (degree, betweenness centrality, communities, and the longest path) by which the evolution of communication around the specified concepts is quantified. The study follows the current trend in the analysis of online social networks that goes beyond mere topology to reveal relevant linguistic and social categories and their dynamics
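As a rough illustration of how a hashtag network of this kind could be built, the following Python sketch links two hashtags whenever they appear in the same tweet and then computes node degree. The paper itself uses Gephi; the functions `cooccurrence_graph` and `degree` and the sample tweets are hypothetical and serve only to make the construction concrete.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(tweets):
    """Return {hashtag: {neighbor: weight}} for hashtags sharing a tweet.

    `tweets` is a list of hashtag lists, one list per tweet (assumed input
    format). The weight counts how many tweets contain both hashtags.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for tags in tweets:
        for a, b in combinations(sorted(set(tags)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def degree(graph):
    """Number of distinct neighbors of each hashtag."""
    return {tag: len(neighbors) for tag, neighbors in graph.items()}


# Hypothetical tweets from a single day.
tweets = [["#ai", "#ml"], ["#ai", "#data"], ["#ai", "#ml", "#data"]]
graph = cooccurrence_graph(tweets)
degrees = degree(graph)
```

Building one such graph per day and comparing the resulting measures is one way to follow how the discussion around a hashtag evolves.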
Online Popularity and Topical Interests through the Lens of Instagram
Online socio-technical systems can be studied as proxy of the real world to
investigate human behavior and social interactions at scale. Here we focus on
Instagram, a media-sharing online platform whose popularity has grown to
hundreds of millions of users. Instagram exhibits a mixture of features
including social structure, social tagging and media sharing. The network of
social interactions among users models various dynamics including
follower/followee relations and users' communication by means of
posts/comments. Users can upload and tag media such as photos and pictures, and
they can "like" and comment on each piece of information on the platform. In this
work we investigate three major aspects of our Instagram dataset: (i) the
structural characteristics of its network of heterogeneous interactions, to
unveil the emergence of self-organization and topically-induced community
structure; (ii) the dynamics of content production and consumption, to
understand how global trends and popular users emerge; (iii) the behavior of
users labeling media with tags, to determine how they devote their attention
and to explore the variety of their topical interests. Our analysis provides
clues to understand human behavior dynamics on socio-technical systems,
specifically users and content popularity, the mechanisms of users'
interactions in online environments and how collective trends emerge from
individuals' topical interests. Comment: 11 pages, 11 figures, Proceedings of ACM Hypertext 201
Tag disambiguation based on social network information
Within 20 years the Web has grown from a tool for scientists at CERN into a global information space. While returning to its roots as a read/write tool, it is entering a more social and participatory phase: a new, improved version called the Social Web, where users are responsible for generating and sharing content on the global information space and are also accountable for replicating that information. This collaborative activity can be observed in two of the most widely used Social Web services, social network sites and social tagging systems. Users annotate their interests and inclinations with free-form keywords while they share them with their social connections. Although these keywords (tags) assist information organization and retrieval, they suffer from polysemy. In this study we use social network sites to address the issue of ambiguity in social tagging. We also propose that homophily in social network sites can be a useful aspect in disambiguating tags. We extracted the ‘Likes’ of 20 Facebook users and employed them to disambiguate tags on Flickr. Classifiers are generated on the clusters retrieved from Flickr using the K-Nearest-Neighbour algorithm, and their degree of similarity with user keywords is then calculated. As tag disambiguation techniques lack gold standards for evaluation, we asked the users to indicate the contexts and used them as ground truth while examining the results. We analyse the performance of our approach by quantitative methods and report successful results. Our proposed method is able to classify images with an accuracy of 6 out of 10 (on average). Qualitative analysis reveals some factors that affect the findings and that, if addressed, can produce more precise results
Identifying experts and authoritative documents in social bookmarking systems
Social bookmarking systems allow people to create pointers to Web resources in a shared, Web-based environment. These services allow users to add free-text labels, or “tags”, to their bookmarks as a way to organize resources for later recall. Ease of use, low cognitive barriers, and a lack of controlled vocabulary have allowed social bookmarking systems to grow exponentially over time. However, these same characteristics also raise concerns. Tags lack the formality of traditional classificatory metadata and suffer from the same vocabulary problems as full-text search engines. It is unclear how many valuable resources are untagged or tagged with noisy, irrelevant tags. With few restrictions to entry, annotation spamming adds noise to public social bookmarking systems. Furthermore, many algorithms for discovering semantic relations among tags do not scale to the Web.
Recognizing these problems, we develop a novel graph-based Expert and Authoritative Resource Location (EARL) algorithm to find the most authoritative documents and expert users on a given topic in a social bookmarking system. In EARL’s first phase, we reduce noise in a Delicious dataset by isolating a smaller sub-network of “candidate experts”, users whose tagging behavior shows potential domain and classification expertise. In the second phase, a HITS-based graph analysis is performed on the candidate experts’ data to rank the top experts and authoritative documents by topic. To identify topics of interest in Delicious, we develop a distributed method to find subsets of frequently co-occurring tags shared by many candidate experts.
We evaluated EARL’s ability to locate authoritative resources and domain experts in Delicious by conducting two independent experiments. The first experiment relies on human judges’ n-point scale ratings of resources suggested by three graph-based algorithms and Google. The second experiment evaluated the proposed approach’s ability to identify classification expertise through human judges’ n-point scale ratings of classification terms versus expert-generated data
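The HITS-based ranking step described above can be sketched in plain Python. The abstract only states that a HITS-based graph analysis is run on the candidate experts' data; treating users as hubs and documents as authorities on a bipartite bookmark graph, as well as the function name `hits` and the sample data, are assumptions for illustration and not the EARL algorithm itself.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length (no-op if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {key: v / norm for key, v in scores.items()}


def hits(bookmarks, iterations=50):
    """Run HITS on (user, document) bookmark pairs.

    Returns (hub, auth): hub scores for users and authority scores for
    documents. A user is a good hub if they bookmark authoritative
    documents; a document is authoritative if good hubs bookmark it.
    """
    users = {u for u, _ in bookmarks}
    docs = {d for _, d in bookmarks}
    hub = {u: 1.0 for u in users}
    auth = {d: 1.0 for d in docs}
    for _ in range(iterations):
        # Authority update: sum the hub scores of users who bookmarked the doc.
        auth = {d: 0.0 for d in docs}
        for u, d in bookmarks:
            auth[d] += hub[u]
        auth = _normalize(auth)
        # Hub update: sum the authority scores of the docs the user bookmarked.
        hub = {u: 0.0 for u in users}
        for u, d in bookmarks:
            hub[u] += auth[d]
        hub = _normalize(hub)
    return hub, auth


# Hypothetical bookmarks on a single topic.
bookmarks = [("alice", "doc1"), ("bob", "doc1"), ("bob", "doc2"), ("carol", "doc1")]
hub, auth = hits(bookmarks)
```

In this toy example `doc1` receives the highest authority score because three users bookmarked it, and `bob` receives the highest hub score because he bookmarked both documents.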
Network Analysis on Incomplete Structures.
Over the past decade, networks have become an increasingly popular abstraction for problems in the physical, life, social and information sciences. Network analysis can be used to extract insights into an underlying system from the structure of its network representation. One of the challenges of applying network analysis is the fact that networks do not always have an observed and complete structure. This dissertation focuses on the problem of imputation and/or inference in the presence of incomplete network structures. I propose four novel systems, each of which contains a module that infers or imputes an incomplete network as needed to complete the end task.
I first propose EdgeBoost, a meta-algorithm and framework that repeatedly applies a non-deterministic link predictor to improve the efficacy of community detection algorithms on networks with missing edges. On average, EdgeBoost improves the performance of existing algorithms by 7% on artificial data and 17% on ego networks collected from Facebook. The second system, Butterworth, identifies a social network user's topic(s) of interest and automatically generates a set of social feed "rankers" that enable the user to see topic-specific sub-feeds. Butterworth uses link prediction to infer the missing semantics between members of a user's social network in order to detect topical clusters embedded in the network structure. For automatically generated topic lists, Butterworth achieves an average top-10 precision of 78%, as compared to a time-ordered baseline of 45%. Next, I propose Dobby, a system for constructing a knowledge graph of user-defined keyword tags. Leveraging a sparse set of labeled edges, Dobby trains a supervised learning algorithm to infer the hypernym relationships between keyword tags. Dobby was evaluated by constructing a knowledge graph of LinkedIn's skills dataset, achieving an average precision of 85% on a set of human-labeled hypernym edges between skills. Lastly, I propose Lobbyback, a system that automatically identifies clusters of documents that exhibit text reuse and generates "prototypes" that represent a canonical version of text shared between the documents. Lobbyback infers a network structure in a corpus of documents and uses community detection in order to extract the document clusters.
PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/133443/1/mattburg_1.pd
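The "top-10 precision" figure reported for Butterworth is a standard ranking metric. A minimal Python sketch of how such a value is commonly computed follows; the function name `precision_at_k`, the choice to divide by `k`, and the sample data are assumptions, since the abstract does not describe the exact evaluation procedure.

```python
def precision_at_k(ranked_items, relevant, k=10):
    """Fraction of the top-k ranked items that are relevant.

    Divides by k, so a ranking shorter than k is penalized for the
    missing positions (one common convention; others exist).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_items[:k]
    hits = sum(1 for item in top if item in relevant)
    return hits / k


# Hypothetical feed ranking: 8 of the first 10 posts are on topic.
ranking = ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8", "p9", "p10", "p11"]
on_topic = {"p1", "p2", "p3", "p5", "p6", "p8", "p9", "p10", "p11"}
p10 = precision_at_k(ranking, on_topic, k=10)
```

Averaging this value over many generated topic lists gives an average top-10 precision of the kind quoted in the abstract.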
Tagging amongst friends: an exploration of social media exchange on mobile devices
Mobile social software tools have great potential in transforming the way users communicate
on the move, by augmenting their everyday environment with pertinent information from
their online social networks. A fundamental aspect to the success of these tools is in
developing an understanding of their emergent real-world use and also the aspirations of
users; this thesis focuses on investigating one facet of this: the exchange of social media. To
facilitate this investigation, three mobile social tools have been developed for use on location-aware
smartphone handsets. The first is an exploratory social game, 'Gophers', which utilises
task-oriented gameplay, social agents and GSM cell positioning to create an engaging
ecosystem in which users create and exchange geotagged social media. Supplementing this is
a pair of social awareness and tagging services that integrate with a user's existing online
social network; the 'ItchyFeet' service uses GPS positioning to allow the user and their social
network peers to collaboratively build a landscape of socially important geotagged locations,
which are used as indicators of a user's context on their Facebook profile; likewise
'MobiClouds' revisits this concept by exploring the novel concept of Bluetooth 'people
tagging' to facilitate the creation of tags that are more indicative of users' social surroundings.
The thesis reports on findings from formal trials of these technologies, using groups of
volunteer social network users based around the city of Lincoln, UK, where the incorporation
of daily diaries, interviews and automated logging precisely monitored application use.
Through analysis of trial data, a guide for designers of future mobile social tools has been
devised and the factors that typically influence users when creating tags are identified. The
thesis makes a number of further contributions to the area. Firstly, it identifies the natural
desire of users to update their status whilst mobile; a practice recently popularised by
commercial 'check in' services. It also explores the overarching narratives that developed over
time, which formed an integral part of the tagging process and augmented social media with a
higher level meaning. Finally, it reveals how social media is affected by the tag positioning
method selected and also by personal circumstances, such as the proximity of social peers