On Defining SPARQL with Boolean Tensor Algebra
The Resource Description Framework (RDF) represents information as
subject-predicate-object triples. These triples are commonly interpreted as a
directed labelled graph. We propose an alternative approach, interpreting the
data as a 3-way Boolean tensor. We show how SPARQL queries, the standard
queries for RDF, can be expressed as elementary operations in Boolean algebra,
giving us a complete re-interpretation of RDF and SPARQL. We show how the
Boolean tensor interpretation allows for new optimizations and analyses of the
complexity of SPARQL queries. For example, estimating the size of the results
for different join queries becomes much simpler.
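
To make the tensor view concrete, here is a minimal sketch, not the paper's own formalism. It stores a toy set of triples as a 3-way Boolean array and evaluates a two-pattern join as a Boolean matrix product. The entity and predicate names and the helpers `pattern` and `join` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data; the vocabularies below are made up for illustration.
entities = ["alice", "bob", "carol", "paper1"]
predicates = ["knows", "authorOf"]
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "authorOf", "paper1"),
]

e_idx = {e: i for i, e in enumerate(entities)}
p_idx = {p: i for i, p in enumerate(predicates)}

# T[s, p, o] is True exactly when the triple (s, p, o) is in the data.
T = np.zeros((len(entities), len(predicates), len(entities)), dtype=bool)
for s, p, o in triples:
    T[e_idx[s], p_idx[p], e_idx[o]] = True


def pattern(pred):
    """Slice for the triple pattern `?x pred ?y`: a subject-by-object Boolean matrix."""
    return T[:, p_idx[pred], :]


def join(a, b):
    """Boolean matrix product: out[x, z] is True if some y has a[x, y] and b[y, z]."""
    return (a.astype(np.int64) @ b.astype(np.int64)) > 0


# ?x knows ?y . ?y authorOf ?z, projected onto (?x, ?z)
result = join(pattern("knows"), pattern("authorOf"))
matches = [(entities[x], entities[z]) for x, z in np.argwhere(result)]
print(matches)
```

In this representation the number of answers to the projected join is simply the count of True entries in `result`, which hints at why result-size reasoning can be phrased in terms of the tensor's slices.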
Scalable Bayesian Non-Negative Tensor Factorization for Massive Count Data
We present a Bayesian non-negative tensor factorization model for
count-valued tensor data, and develop scalable inference algorithms (both batch
and online) for dealing with massive tensors. Our generative model can handle
overdispersed counts as well as infer the rank of the decomposition. Moreover,
leveraging a reparameterization of the Poisson distribution as a multinomial
facilitates conjugacy in the model and enables simple and efficient Gibbs
sampling and variational Bayes (VB) inference updates, with a computational
cost that depends only on the number of nonzeros in the tensor. The model also
yields interpretable factors; in our model, each factor
corresponds to a "topic". We develop a set of online inference algorithms that
allow further scaling up the model to massive tensors, for which batch
inference methods may be infeasible. We apply our framework to diverse
real-world applications, such as multiway topic modeling on a scientific
publications database, analyzing a political science data set, and analyzing a
massive household transactions data set.
Comment: ECML PKDD 201
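
The Poisson-to-multinomial step mentioned in the abstract rests on a standard fact: if independent Poisson counts with rates λ_1, ..., λ_R sum to n, then conditioned on n they are multinomial with probabilities λ_r / Σ λ_r. The sketch below illustrates only that allocation step under assumed shapes; it is not the paper's full sampler. The function `allocate_counts`, the gamma-initialised factor matrices, and the toy indices are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def allocate_counts(indices, counts, factors, weights):
    """Split each nonzero count across R latent components.

    indices: (nnz, K) integer coordinates of the nonzero tensor entries
    counts:  (nnz,) observed counts at those coordinates
    factors: list of K non-negative arrays, factors[k] has shape (dim_k, R)
    weights: (R,) non-negative component weights
    Returns an (nnz, R) integer array whose rows sum to `counts`.
    """
    # Per-entry, per-component rate: weight_r * prod_k factors[k][i_k, r].
    rates = np.tile(np.asarray(weights, dtype=float), (len(counts), 1))
    for mode, U in enumerate(factors):
        rates *= U[indices[:, mode], :]
    probs = rates / rates.sum(axis=1, keepdims=True)
    # One multinomial draw per nonzero entry; zero entries are never visited.
    return np.stack([rng.multinomial(n, p) for n, p in zip(counts, probs)])


# Toy 3-way example with R = 2 components (all values are illustrative).
dims, R = (4, 3, 5), 2
factors = [rng.gamma(1.0, 1.0, size=(d, R)) for d in dims]
weights = rng.gamma(1.0, 1.0, size=R)
indices = np.array([[0, 1, 2], [3, 0, 4], [1, 2, 0]])
counts = np.array([5, 1, 12])

alloc = allocate_counts(indices, counts, factors, weights)
print(alloc)
```

Because the loop runs over the nonzero entries only, the work in this step scales with the number of nonzeros rather than with the full tensor size, which matches the cost claim in the abstract.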