6,704 research outputs found
Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking
This paper proposes a new neural architecture for collaborative ranking with
implicit feedback. Our model, LRML (\textit{Latent Relational Metric Learning})
is a novel metric learning approach for recommendation. More specifically,
instead of simple push-pull mechanisms between user and item pairs, we propose
to learn latent relations that describe each user item interaction. This helps
to alleviate the potential geometric inflexibility of existing metric learing
approaches. This enables not only better performance but also a greater extent
of modeling capability, allowing our model to scale to a larger number of
interactions. In order to do so, we employ a augmented memory module and learn
to attend over these memory blocks to construct latent relations. The
memory-based attention module is controlled by the user-item interaction,
making the learned relation vector specific to each user-item pair. Hence, this
can be interpreted as learning an exclusive and optimal relational translation
for each user-item interaction. The proposed architecture demonstrates the
state-of-the-art performance across multiple recommendation benchmarks. LRML
outperforms other metric learning models by in terms of Hits@10 and
nDCG@10 on large datasets such as Netflix and MovieLens20M. Moreover,
qualitative studies also demonstrate evidence that our proposed model is able
to infer and encode explicit sentiment, temporal and attribute information
despite being only trained on implicit feedback. As such, this ascertains the
ability of LRML to uncover hidden relational structure within implicit
datasets.Comment: WWW 201
Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data
Due to its causal semantics, Bayesian networks (BN) have been widely employed
to discover the underlying data relationship in exploratory studies, such as
brain research. Despite its success in modeling the probability distribution of
variables, BN is naturally a generative model, which is not necessarily
discriminative. This may cause the ignorance of subtle but critical network
changes that are of investigation values across populations. In this paper, we
propose to improve the discriminative power of BN models for continuous
variables from two different perspectives. This brings two general
discriminative learning frameworks for Gaussian Bayesian networks (GBN). In the
first framework, we employ Fisher kernel to bridge the generative models of GBN
and the discriminative classifiers of SVMs, and convert the GBN parameter
learning to Fisher kernel learning via minimizing a generalization error bound
of SVMs. In the second framework, we employ the max-margin criterion and build
it directly upon GBN models to explicitly optimize the classification
performance of the GBNs. The advantages and disadvantages of the two frameworks
are discussed and experimentally compared. Both of them demonstrate strong
power in learning discriminative parameters of GBNs for neuroimaging based
brain network analysis, as well as maintaining reasonable representation
capacity. The contributions of this paper also include a new Directed Acyclic
Graph (DAG) constraint with theoretical guarantee to ensure the graph validity
of GBN.Comment: 16 pages and 5 figures for the article (excluding appendix
High-Dimensional Dependency Structure Learning for Physical Processes
In this paper, we consider the use of structure learning methods for
probabilistic graphical models to identify statistical dependencies in
high-dimensional physical processes. Such processes are often synthetically
characterized using PDEs (partial differential equations) and are observed in a
variety of natural phenomena, including geoscience data capturing atmospheric
and hydrological phenomena. Classical structure learning approaches such as the
PC algorithm and variants are challenging to apply due to their high
computational and sample requirements. Modern approaches, often based on sparse
regression and variants, do come with finite sample guarantees, but are usually
highly sensitive to the choice of hyper-parameters, e.g., parameter
for sparsity inducing constraint or regularization. In this paper, we present
ACLIME-ADMM, an efficient two-step algorithm for adaptive structure learning,
which estimates an edge specific parameter in the first step,
and uses these parameters to learn the structure in the second step. Both steps
of our algorithm use (inexact) ADMM to solve suitable linear programs, and all
iterations can be done in closed form in an efficient block parallel manner. We
compare ACLIME-ADMM with baselines on both synthetic data simulated by partial
differential equations (PDEs) that model advection-diffusion processes, and
real data (50 years) of daily global geopotential heights to study information
flow in the atmosphere. ACLIME-ADMM is shown to be efficient, stable, and
competitive, usually better than the baselines especially on difficult
problems. On real data, ACLIME-ADMM recovers the underlying structure of global
atmospheric circulation, including switches in wind directions at the equator
and tropics entirely from the data.Comment: 21 pages, 8 figures, International Conference on Data Mining 201
Causal Discovery from Temporal Data: An Overview and New Perspectives
Temporal data, representing chronological observations of complex systems,
has always been a typical data structure that can be widely generated by many
domains, such as industry, medicine and finance. Analyzing this type of data is
extremely valuable for various applications. Thus, different temporal data
analysis tasks, eg, classification, clustering and prediction, have been
proposed in the past decades. Among them, causal discovery, learning the causal
relations from temporal data, is considered an interesting yet critical task
and has attracted much research attention. Existing casual discovery works can
be divided into two highly correlated categories according to whether the
temporal data is calibrated, ie, multivariate time series casual discovery, and
event sequence casual discovery. However, most previous surveys are only
focused on the time series casual discovery and ignore the second category. In
this paper, we specify the correlation between the two categories and provide a
systematical overview of existing solutions. Furthermore, we provide public
datasets, evaluation metrics and new perspectives for temporal data casual
discovery.Comment: 52 pages, 6 figure
- …