131 research outputs found
Latent Space Model for Multi-Modal Social Data
With the emergence of social networking services, researchers enjoy the
increasing availability of large-scale heterogeneous datasets capturing online
user interactions and behaviors. Traditional analysis of techno-social systems
data has focused mainly on describing either the dynamics of social
interactions, or the attributes and behaviors of the users. However,
overwhelming empirical evidence suggests that the two dimensions affect one
another, and therefore they should be jointly modeled and analyzed in a
multi-modal framework. The benefits of such an approach include the ability to
build better predictive models, leveraging social network information as well
as user behavioral signals. To this end, here we propose the Constrained
Latent Space Model (CLSM), a generalized framework that combines Mixed
Membership Stochastic Blockmodels (MMSB) and Latent Dirichlet Allocation (LDA)
incorporating a constraint that forces the latent space to concurrently
describe the multiple data modalities. We derive an efficient inference
algorithm based on Variational Expectation Maximization that has a
computational cost linear in the size of the network, thus making it feasible
to analyze massive social datasets. We validate the proposed framework on two
problems: prediction of social interactions from user attributes and behaviors,
and behavior prediction exploiting network information. We perform experiments
with a variety of multi-modal social systems, spanning location-based social
networks (Gowalla), social media services (Instagram, Orkut), e-commerce and
review sites (Amazon, Ciao), and finally citation networks (Cora). The results
indicate significant improvements in prediction accuracy over state-of-the-art
methods, and demonstrate the flexibility of the proposed approach for
addressing a variety of different learning problems commonly occurring with
multi-modal social data. Comment: 12 pages, 7 figures, 2 tables
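The abstract's key scalability claim is inference cost linear in the network size, i.e. updates that touch only observed edges. The sketch below is not the authors' CLSM; it is a hedged toy EM loop for a generic assortative mixed-membership link model, included only to illustrate what "cost linear in the number of edges" looks like in code (all names and the update rule are illustrative assumptions).

```python
import random

def fit_mixed_membership(edges, n_nodes, n_topics, n_iter=50, seed=0):
    """Toy EM for an assortative mixed-membership link model.

    Each iteration loops only over the observed edges, so per-iteration
    cost is linear in the number of edges -- the property the abstract
    highlights for the real CLSM inference algorithm.
    """
    rng = random.Random(seed)
    # theta[i][k]: non-negative affinity of node i for latent group k
    theta = [[rng.random() for _ in range(n_topics)] for _ in range(n_nodes)]
    for _ in range(n_iter):
        new = [[1e-12] * n_topics for _ in range(n_nodes)]
        for i, j in edges:
            # E-step: responsibility of each latent group for edge (i, j)
            scores = [theta[i][k] * theta[j][k] for k in range(n_topics)]
            z = sum(scores) or 1e-12
            for k in range(n_topics):
                r = scores[k] / z
                # M-step accumulation: credit both endpoints
                new[i][k] += r
                new[j][k] += r
        theta = new
    return theta
```

One consequence of this update scheme is that each node's affinities sum to its degree, since every incident edge contributes a total responsibility of one.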
Predicting Semantic Relations using Global Graph Properties
Semantic graphs, such as WordNet, are resources which curate natural language
on two distinguishable layers. On the local level, individual relations between
synsets (semantic building blocks) such as hypernymy and meronymy enhance our
understanding of the words used to express their meanings. Globally, analysis
of graph-theoretic properties of the entire net sheds light on the structure of
human language as a whole. In this paper, we combine global and local
properties of semantic graphs through the framework of Max-Margin Markov Graph
Models (M3GM), a novel extension of Exponential Random Graph Model (ERGM) that
scales to large multi-relational graphs. We demonstrate how such global
modeling improves performance on the local task of predicting semantic
relations between synsets, yielding new state-of-the-art results on the WN18RR
dataset, a challenging version of WordNet link prediction in which "easy"
reciprocal cases are removed. In addition, the M3GM model identifies
multi-relational motifs that are characteristic of well-formed lexical semantic
ontologies. Comment: EMNLP 2018
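The core ERGM-style idea the abstract describes is scoring a candidate relation by how it changes weighted global graph statistics (motif counts), rather than by local features alone. The sketch below is not the M3GM implementation; it is a minimal illustration of that scoring pattern, with two made-up statistics (per-relation edge counts and a crude "shared target" motif) standing in for the learned motif features.

```python
from collections import defaultdict

def graph_stats(edges):
    """Global statistics of a multi-relational graph, given as
    (source, relation, target) triples: per-relation edge counts and
    per-relation pairs of sources sharing a target (a toy motif)."""
    counts = defaultdict(int)
    by_target = defaultdict(set)
    for src, rel, tgt in edges:
        counts[("edges", rel)] += 1
        by_target[(rel, tgt)].add(src)
    for (rel, _), srcs in by_target.items():
        n = len(srcs)
        counts[("shared_target", rel)] += n * (n - 1) // 2
    return counts

def delta_score(edges, candidate, weights):
    """ERGM-style score of adding `candidate`: the weighted change each
    global statistic undergoes when the candidate edge is inserted."""
    before = graph_stats(edges)
    after = graph_stats(edges + [candidate])
    return sum(w * (after[k] - before[k]) for k, w in weights.items())
```

A candidate that completes many instances of a motif with positive weight scores highly; in M3GM the weights are learned with a max-margin objective, which this sketch does not attempt.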
Interactive visualization of heterogeneous social networks using glyphs
There is a growing need for visualizing heterogeneous social networks as new data sets become available. However, existing visualization tools do not address the challenge of reading the topological information introduced by heterogeneous node and link types. To resolve this issue, we introduce glyphs into node-link diagrams to conveniently represent the multivariate nature of heterogeneous node and link types. This provides the opportunity to visually reorganize the topological information of heterogeneous social networks without losing connectivity information. Moreover, a set of interaction techniques is provided to give the analyst full control over the reorganization process. Finally, a case study using the InfoVis 2008 data set is presented to illustrate the exploration process.
Neural RELAGGS
Multi-relational databases are the basis of most consolidated data
collections in science and industry today. Most learning and mining algorithms,
however, require data to be represented in a propositional form. While there is
a variety of specialized machine learning algorithms that can operate directly
on multi-relational data sets, propositionalization algorithms transform
multi-relational databases into propositional data sets, thereby allowing the
application of traditional machine learning and data mining algorithms without
their modification. One prominent propositionalization algorithm is RELAGGS by
Krogel and Wrobel, which transforms the data by nested aggregations. We propose
a new neural network based algorithm in the spirit of RELAGGS that employs
trainable composite aggregate functions instead of the static aggregate
functions used in the original approach. In this way, we can jointly train the
propositionalization with the prediction model, or, alternatively, use the
learned aggregations as embeddings in other algorithms. We demonstrate the
increased predictive performance by comparing N-RELAGGS with RELAGGS and
multiple other state-of-the-art algorithms. Comment: Submitted to the Machine Learning Journal
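The RELAGGS idea the abstract builds on is propositionalization by aggregation: each row of a main table is extended with static aggregates (count, mean, max, ...) computed over its related rows, flattening a multi-relational database into a single propositional table. The sketch below illustrates that static baseline (N-RELAGGS would replace these fixed functions with trainable composite aggregates); the table layout and column names are illustrative assumptions, not the original algorithm's interface.

```python
from statistics import mean

def propositionalize(main_rows, related_rows, key):
    """RELAGGS-style flattening: for each main-table row, aggregate the
    numeric 'value' column of its related rows with static functions,
    producing one fixed-width feature row per main-table entity."""
    groups = {}
    for row in related_rows:
        groups.setdefault(row[key], []).append(row["value"])
    out = []
    for row in main_rows:
        vals = groups.get(row[key], [])
        out.append({
            **row,
            "rel_count": len(vals),
            "rel_mean": mean(vals) if vals else 0.0,
            "rel_max": max(vals) if vals else 0.0,
        })
    return out
```

The resulting flat table can then be fed to any standard propositional learner, which is the whole point of the transformation.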