Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.
The role of emotional variables in the classification and prediction of collective social dynamics
We demonstrate the power of data mining techniques for the analysis of
collective social dynamics within British Tweets during the Olympic Games 2012.
The classification accuracy of online activities related to the successes of
British athletes significantly improved when emotional components of tweets
were taken into account, but employing emotional variables for activity
prediction decreased the classifiers' quality. The approach could be easily
adopted for any prediction or classification study with a set of
problem-specific variables.
Comment: 16 pages, 9 figures, 2 tables and 1 appendix
Human gait recognition with matrix representation
Human gait is an important biometric feature. It can be perceived from a great distance and has recently attracted greater attention in video-surveillance-related applications, such as closed-circuit television. We explore gait recognition based on a matrix representation in this paper. First, binary silhouettes over one gait cycle are averaged. As a result, each gait video sequence, containing a number of gait cycles, is represented by a series of gray-level averaged images. Then, a matrix-based unsupervised algorithm, namely coupled subspace analysis (CSA), is employed as a preprocessing step to remove noise and retain the most representative information. Finally, a supervised algorithm, namely discriminant analysis with tensor representation, is applied to further improve classification ability. This matrix-based scheme demonstrates a much better gait recognition performance than state-of-the-art algorithms on the standard USF HumanID Gait database.
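The first step described above, averaging binary silhouettes over each gait cycle, can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `average_silhouettes`, the nested-list image format, and the assumption of a known, constant `cycle_length` are all hypothetical, and the CSA and discriminant analysis stages are not shown.

```python
def average_silhouettes(frames, cycle_length):
    """Average binary silhouettes over each complete gait cycle.

    frames: list of images, each a list of rows of 0/1 pixel values.
    cycle_length: frames per gait cycle (assumed known and constant here).
    Returns one gray-level image (values in [0, 1]) per complete cycle.
    """
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    num_cycles = len(frames) // cycle_length  # trailing partial cycle is dropped
    averaged = []
    for c in range(num_cycles):
        cycle = frames[c * cycle_length:(c + 1) * cycle_length]
        height, width = len(cycle[0]), len(cycle[0][0])
        # Pixel-wise mean of the binary frames in this cycle.
        image = [
            [sum(frame[row][col] for frame in cycle) / cycle_length
             for col in range(width)]
            for row in range(height)
        ]
        averaged.append(image)
    return averaged


# Usage: five 1x2 frames with two frames per cycle give two averaged images.
frames = [[[1, 0]], [[0, 0]], [[1, 1]], [[1, 1]], [[0, 1]]]
averaged = average_silhouettes(frames, cycle_length=2)
```

Each averaged image is a gray-level matrix, which is the form a matrix-based method would take as input.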
Latent Space Model for Multi-Modal Social Data
With the emergence of social networking services, researchers enjoy the
increasing availability of large-scale heterogeneous datasets capturing online
user interactions and behaviors. Traditional analysis of techno-social systems
data has focused mainly on describing either the dynamics of social
interactions, or the attributes and behaviors of the users. However,
overwhelming empirical evidence suggests that the two dimensions affect one
another, and therefore they should be jointly modeled and analyzed in a
multi-modal framework. The benefits of such an approach include the ability to
build better predictive models, leveraging social network information as well
as user behavioral signals. To this purpose, here we propose the Constrained
Latent Space Model (CLSM), a generalized framework that combines Mixed
Membership Stochastic Blockmodels (MMSB) and Latent Dirichlet Allocation (LDA)
incorporating a constraint that forces the latent space to concurrently
describe the multiple data modalities. We derive an efficient inference
algorithm based on Variational Expectation Maximization that has a
computational cost linear in the size of the network, thus making it feasible
to analyze massive social datasets. We validate the proposed framework on two
problems: prediction of social interactions from user attributes and behaviors,
and behavior prediction exploiting network information. We perform experiments
with a variety of multi-modal social systems, spanning location-based social
networks (Gowalla), social media services (Instagram, Orkut), e-commerce and
review sites (Amazon, Ciao), and finally citation networks (Cora). The results
indicate significant improvement in prediction accuracy over state-of-the-art
methods, and demonstrate the flexibility of the proposed approach for
addressing a variety of different learning problems commonly occurring with
multi-modal social data.
Comment: 12 pages, 7 figures, 2 tables
Ask the GRU: Multi-Task Learning for Deep Text Recommendations
In a variety of application domains the content to be recommended to users is
associated with text. This includes research papers, movies with associated
plot summaries, news articles, blog posts, etc. Recommendation approaches based
on latent factor models can be extended naturally to leverage text by employing
an explicit mapping from text to factors. This enables recommendations for new,
unseen content, and may generalize better, since the factors for all items are
produced by a compactly-parametrized model. Previous work has used topic models
or averages of word embeddings for this mapping. In this paper we present a
method leveraging deep recurrent neural networks to encode the text sequence
into a latent vector, specifically gated recurrent units (GRUs) trained
end-to-end on the collaborative filtering task. For the task of scientific
paper recommendation, this yields models with significantly higher accuracy. In
cold-start scenarios, we beat the previous state-of-the-art methods, all of which
ignore word order. Performance is further improved by multi-task learning,
where the text encoder network is trained for a combination of content
recommendation and item metadata prediction. This regularizes the collaborative
filtering model, ameliorating the problem of sparsity of the observed rating
matrix.
Comment: 8 pages
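The idea of mapping an item's text to latent factors with a GRU, then scoring it against a user's factors, can be illustrated with a forward pass alone. This is a rough sketch under stated assumptions, not the paper's model: the weights are random and untrained, one common GRU update convention is used, and all names (`init_gru`, `gru_step`, `encode_text`, `score`) and dimensions are hypothetical. End-to-end training and the multi-task metadata objective are omitted.

```python
import math
import random


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_gru(input_dim, hidden_dim, seed=0):
    """Random, untrained GRU parameters (an assumption for illustration)."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in ("z", "r", "h"):
        params["W" + gate] = rand(hidden_dim, input_dim)
        params["U" + gate] = rand(hidden_dim, hidden_dim)
        params["b" + gate] = [0.0] * hidden_dim
    return params


def gru_step(p, x, h):
    """One GRU update: update gate z, reset gate r, candidate state."""
    z = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    r = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    gated = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b + c)
            for a, b, c in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated), p["bh"])]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode_text(p, word_vectors, hidden_dim):
    """Run the GRU over word vectors in order; the final state is the item factor."""
    h = [0.0] * hidden_dim
    for x in word_vectors:
        h = gru_step(p, x, h)
    return h


def score(user_factors, item_factors):
    """Latent factor model score: dot product of user and item factors."""
    return sum(u * v for u, v in zip(user_factors, item_factors))


# Usage: encode a two-word text (one-hot word vectors) and score it for a user.
params = init_gru(input_dim=3, hidden_dim=4)
item = encode_text(params, [[1, 0, 0], [0, 1, 0]], hidden_dim=4)
rating = score([0.1, -0.2, 0.3, 0.0], item)
```

Because the hidden state is updated word by word, the encoder can in principle be sensitive to word order, unlike an average of word embeddings; a new item needs only its text to obtain factors, which is what makes cold-start recommendation possible.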