24,807 research outputs found
Latent Space Model for Multi-Modal Social Data
With the emergence of social networking services, researchers enjoy the
increasing availability of large-scale heterogenous datasets capturing online
user interactions and behaviors. Traditional analysis of techno-social systems
data has focused mainly on describing either the dynamics of social
interactions, or the attributes and behaviors of the users. However,
overwhelming empirical evidence suggests that the two dimensions affect one
another, and therefore they should be jointly modeled and analyzed in a
multi-modal framework. The benefits of such an approach include the ability to
build better predictive models, leveraging social network information as well
as user behavioral signals. To this purpose, here we propose the Constrained
Latent Space Model (CLSM), a generalized framework that combines Mixed
Membership Stochastic Blockmodels (MMSB) and Latent Dirichlet Allocation (LDA)
incorporating a constraint that forces the latent space to concurrently
describe the multiple data modalities. We derive an efficient inference
algorithm based on Variational Expectation Maximization that has a
computational cost linear in the size of the network, thus making it feasible
to analyze massive social datasets. We validate the proposed framework on two
problems: prediction of social interactions from user attributes and behaviors,
and behavior prediction exploiting network information. We perform experiments
with a variety of multi-modal social systems, spanning location-based social
networks (Gowalla), social media services (Instagram, Orkut), e-commerce and
review sites (Amazon, Ciao), and finally citation networks (Cora). The results
indicate significant improvement in prediction accuracy over state of the art
methods, and demonstrate the flexibility of the proposed approach for
addressing a variety of different learning problems commonly occurring with
multi-modal social data.Comment: 12 pages, 7 figures, 2 table
Finding co-solvers on Twitter, with a little help from Linked Data
In this paper we propose a method for suggesting potential collaborators for solving innovation challenges online, based on their competence, similarity of interests and social proximity with the user. We rely on Linked Data to derive a measure of semantic relatedness that we use to enrich both user profiles and innovation problems with additional relevant topics, thereby improving the performance of co-solver recommendation. We evaluate this approach against state of the art methods for query enrichment based on the distribution of topics in user profiles, and demonstrate its usefulness in recommending collaborators that are both complementary in competence and compatible with the user. Our experiments are grounded using data from the social networking service Twitter.com
#mytweet via Instagram: Exploring User Behaviour across Multiple Social Networks
We study how users of multiple online social networks (OSNs) employ and share
information by studying a common user pool that use six OSNs - Flickr, Google+,
Instagram, Tumblr, Twitter, and YouTube. We analyze the temporal and topical
signature of users' sharing behaviour, showing how they exhibit distinct
behaviorial patterns on different networks. We also examine cross-sharing
(i.e., the act of user broadcasting their activity to multiple OSNs
near-simultaneously), a previously-unstudied behaviour and demonstrate how
certain OSNs play the roles of originating source and destination sinks.Comment: IEEE/ACM International Conference on Advances in Social Networks
Analysis and Mining, 2015. This is the pre-peer reviewed version and the
final version is available at
http://wing.comp.nus.edu.sg/publications/2015/lim-et-al-15.pd
LODE: Linking Digital Humanities Content to the Web of Data
Numerous digital humanities projects maintain their data collections in the
form of text, images, and metadata. While data may be stored in many formats,
from plain text to XML to relational databases, the use of the resource
description framework (RDF) as a standardized representation has gained
considerable traction during the last five years. Almost every digital
humanities meeting has at least one session concerned with the topic of digital
humanities, RDF, and linked data. While most existing work in linked data has
focused on improving algorithms for entity matching, the aim of the
LinkedHumanities project is to build digital humanities tools that work "out of
the box," enabling their use by humanities scholars, computer scientists,
librarians, and information scientists alike. With this paper, we report on the
Linked Open Data Enhancer (LODE) framework developed as part of the
LinkedHumanities project. With LODE we support non-technical users to enrich a
local RDF repository with high-quality data from the Linked Open Data cloud.
LODE links and enhances the local RDF repository without compromising the
quality of the data. In particular, LODE supports the user in the enhancement
and linking process by providing intuitive user-interfaces and by suggesting
high-quality linking candidates using tailored matching algorithms. We hope
that the LODE framework will be useful to digital humanities scholars
complementing other digital humanities tools
A large multilingual and multi-domain dataset for recommender systems
This paper presents a multi-domain interests dataset to train and test Recommender Systems, and the methodology to create the dataset
from Twitter messages in English and Italian. The English dataset includes an average of 90 preferences per user on music, books,
movies, celebrities, sport, politics and much more, for about half million users. Preferences are either extracted from messages of
users who use Spotify, Goodreads and other similar content sharing platforms, or induced from their ”topical” friends, i.e., followees
representing an interest rather than a social relation between peers. In addition, preferred items are matched with Wikipedia articles
describing them. This unique feature of our dataset provides a mean to derive a semantic categorization of the preferred items, exploiting
available semantic resources linked to Wikipedia such as the Wikipedia Category Graph, DBpedia, BabelNet and others
Drag it together with Groupie: making RDF data authoring easy and fun for anyone
One of the foremost challenges towards realizing a “Read-write Web of Data” [3] is making it possible for everyday computer users to easily find, manipulate, create, and publish data back to the Web so that it can be made available for others to use. However, many aspects of Linked Data make authoring and manipulation difficult for “normal” (ie non-coder) end-users. First, data can be high-dimensional, having arbitrary many properties per “instance”, and interlinked to arbitrary many other instances in a many different ways. Second, collections of Linked Data tend to be vastly more heterogeneous than in typical structured databases, where instances are kept in uniform collections (e.g., database tables). Third, while highly flexible, the problem of having all structures reduced as a graph is verbosity: even simple structures can appear complex. Finally, many of the concepts involved in linked data authoring - for example, terms used to define ontologies are highly abstract and foreign to regular citizen-users.To counter this complexity we have devised a drag-and-drop direct manipulation interface that makes authoring Linked Data easy, fun, and accessible to a wide audience. Groupie allows users to author data simply by dragging blobs representing entities into other entities to compose relationships, establishing one relational link at a time. Since the underlying representation is RDF, Groupie facilitates the inclusion of references to entities and properties defined elsewhere on the Web through integration with popular Linked Data indexing services. Finally, to make it easy for new users to build upon others’ work, Groupie provides a communal space where all data sets created by users can be shared, cloned and modified, allowing individual users to help each other model complex domains thereby leveraging collective intelligence
- …