Zero-Truncated Poisson Tensor Factorization for Massive Binary Tensors
We present a scalable Bayesian model for low-rank factorization of massive
tensors with binary observations. The proposed model has the following key
properties: (1) in contrast to models based on the logistic or probit
likelihood, using a zero-truncated Poisson likelihood for binary data allows
our model to scale up in the number of \emph{ones} in the tensor, which is
especially appealing for massive but sparse binary tensors; (2)
side-information in the form of binary pairwise relationships (e.g., an adjacency
network) between objects in any tensor mode can also be leveraged, which can be
especially useful in "cold-start" settings; and (3) the model admits simple
Bayesian inference via batch, as well as \emph{online} MCMC; the latter allows
scaling up even for \emph{dense} binary data (i.e., when the number of ones in
the tensor/network is also massive). In addition, non-negative factor matrices
in our model provide easy interpretability, and the tensor rank can be inferred
from the data. We evaluate our model on several large-scale real-world binary
tensors, achieving excellent computational scalability, and also demonstrate
its usefulness in leveraging side-information provided in the form of
mode-network(s). Comment: UAI (Uncertainty in Artificial Intelligence) 201
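The scaling claim in (1) can be made concrete: under the zero-truncated Poisson link P(y=1) = 1 - exp(-lambda), the zeros contribute a term that is linear in lambda and collapses to a closed form when lambda has a nonnegative low-rank CP structure, so evaluating the log-likelihood costs time proportional to the number of ones. A minimal NumPy sketch for a 3-way tensor; the function name and rank-R parameterization are illustrative, not the paper's code:

```python
import numpy as np

def zt_poisson_loglik(ones_idx, U, V, W):
    """Log-likelihood of a binary 3-way tensor under the zero-truncated
    Poisson link P(y=1) = 1 - exp(-lam), with a nonnegative CP rate
    lam_ijk = sum_r U[i,r] * V[j,r] * W[k,r].

    Illustrative sketch: the cost scales with the number of ones, because
    the zeros' contribution (-lam per zero entry) reduces to a product of
    per-mode factor column sums minus the rates at the observed ones.
    """
    i, j, k = ones_idx  # index arrays of the observed ones
    # rates at the observed ones only: O(nnz * R)
    lam_ones = np.einsum('nr,nr,nr->n', U[i], V[j], W[k])
    # contribution of the ones: log P(y=1) = log(1 - exp(-lam))
    ll_ones = np.sum(np.log1p(-np.exp(-lam_ones)))
    # sum of lam over ALL entries, in closed form (no dense tensor built)
    total_rate = np.sum(U.sum(axis=0) * V.sum(axis=0) * W.sum(axis=0))
    # zeros contribute -lam each: subtract the total rate, then add back
    # the rates at the ones, whose zero-term does not apply
    return ll_ones - total_rate + lam_ones.sum()
```

The `total_rate` identity is what makes sparse binary data cheap: no sum over the (possibly enormous) set of zeros is ever materialized.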
Knowledge Graph Completion via Complex Tensor Factorization
In statistical relational learning, knowledge graph completion deals with automatically
understanding the structure of large knowledge graphs (labeled directed graphs) and predicting missing relationships (labeled edges). State-of-the-art embedding models
propose different trade-offs between modeling expressiveness, and time and space complexity.
We reconcile both expressiveness and complexity through the use of complex-valued
embeddings and explore the link between such complex-valued embeddings and unitary
diagonalization. We corroborate our approach theoretically and show that all real square
matrices (thus all possible relation/adjacency matrices) are the real part of some unitarily
diagonalizable matrix. This result opens the door to many other applications of square-matrix
factorization. Our approach, based on complex embeddings, is arguably simple,
as it only involves a Hermitian dot product, the complex counterpart of the standard dot
product between real vectors, whereas other methods resort to increasingly complicated
composition functions to increase their expressiveness. The proposed complex embeddings
are scalable to large data sets, as they remain linear in both space and time, while consistently
outperforming alternative approaches on standard link-prediction benchmarks.
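The Hermitian dot product mentioned above is easy to write down. A minimal sketch of a ComplEx-style trilinear score (the function and variable names are illustrative); because the object embedding enters conjugated, the score is asymmetric in subject and object whenever the relation embedding has a nonzero imaginary part, which is what lets a simple dot product model antisymmetric relations:

```python
import numpy as np

def complex_score(e_s, w_r, e_o):
    """ComplEx-style score for a (subject, relation, object) triple:
    Re( sum_k e_s[k] * w_r[k] * conj(e_o[k]) ).

    Linear in the embedding dimension in both time and space. With a
    purely real w_r the score is symmetric in (e_s, e_o); a complex
    w_r breaks the symmetry, capturing antisymmetric relations.
    """
    return float(np.real(np.sum(e_s * w_r * np.conj(e_o))))
```

In practice the score is fed through a logistic link to give the probability that the edge exists; only the raw score is sketched here.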
Let's Make Block Coordinate Descent Go Fast: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence
Block coordinate descent (BCD) methods are widely-used for large-scale
numerical optimization because of their cheap iteration costs, low memory
requirements, amenability to parallelization, and ability to exploit problem
structure. Three main algorithmic choices influence the performance of BCD
methods: the block partitioning strategy, the block selection rule, and the
block update rule. In this paper we explore all three of these building blocks
and propose variations for each that can lead to significantly faster BCD
methods. We (i) propose new greedy block-selection strategies that guarantee
more progress per iteration than the Gauss-Southwell rule; (ii) explore
practical issues like how to implement the new rules when using "variable"
blocks; (iii) explore the use of message-passing to compute matrix or Newton
updates efficiently on huge blocks for problems with a sparse dependency
between variables; and (iv) consider optimal active manifold identification,
which leads to bounds on the "active set complexity" of BCD methods and leads
to superlinear convergence for certain problems with sparse solutions (and in
some cases finite termination at an optimal solution). We support all of our
findings with numerical results for the classic machine learning problems of
least squares, logistic regression, multi-class logistic regression, label
propagation, and L1-regularization.
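As a point of reference for the block-selection discussion, the classical Gauss-Southwell rule that the paper's new greedy rules improve upon can be sketched on a convex quadratic: pick the block whose gradient is largest, then minimize exactly over that block. Everything below is an illustrative toy, not the paper's implementation:

```python
import numpy as np

def greedy_bcd_quadratic(A, b, blocks, iters=200):
    """Block coordinate descent on f(x) = 0.5 x^T A x - b^T x (A SPD)
    with the Gauss-Southwell rule: at each iteration select the block
    whose partial gradient has the largest norm, then apply an exact
    (Newton) update on that block.

    `blocks` is a list of index arrays partitioning the coordinates.
    """
    x = np.zeros(len(b))
    for _ in range(iters):
        g = A @ x - b  # full gradient of f
        # Gauss-Southwell: greedily pick the block with max gradient norm
        scores = [np.linalg.norm(g[blk]) for blk in blocks]
        blk = blocks[int(np.argmax(scores))]
        # exact block update: solve the block subproblem with the
        # block-diagonal piece of the Hessian
        x[blk] -= np.linalg.solve(A[np.ix_(blk, blk)], g[blk])
    return x
```

The rules proposed in the paper guarantee at least as much progress per iteration as this baseline while addressing its practical costs (e.g., with "variable" blocks).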
Performance Portable Solid Mechanics via Matrix-Free p-Multigrid
Finite element analysis of solid mechanics is a foundational tool of modern
engineering, with low-order finite element methods and assembled sparse
matrices representing the industry standard for implicit analysis. We use
performance models and numerical experiments to demonstrate that high-order
methods greatly reduce the costs to reach engineering tolerances while enabling
effective use of GPUs. We demonstrate the reliability, efficiency, and
scalability of matrix-free p-multigrid methods with algebraic multigrid
coarse solvers through large deformation hyperelastic simulations of multiscale
structures. We investigate accuracy, cost, and execution time on multi-node CPU
and GPU systems for moderate to large models using AMD MI250X (OLCF Crusher),
NVIDIA A100 (NERSC Perlmutter), and V100 (LLNL Lassen and OLCF Summit),
resulting in order of magnitude efficiency improvements over a broad range of
model properties and scales. We discuss efficient matrix-free representation of
Jacobians and demonstrate how automatic differentiation enables rapid
development of nonlinear material models without impacting debuggability and
workflows targeting GPUs.
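The matrix-free idea underlying the abstract above can be illustrated on a toy operator: the action of A on a vector is computed on the fly instead of from an assembled sparse matrix, so storage drops from O(nnz) to O(n) and the kernel maps well to GPUs. This low-order 1-D finite-difference Laplacian is only a stand-in for the high-order finite-element operators in the paper:

```python
import numpy as np

def apply_laplacian_1d(u):
    """Matrix-free action of the 1-D finite-difference Laplacian with
    Dirichlet boundaries: returns A @ u for the tridiagonal stencil
    (-1, 2, -1) without ever assembling A.

    Only O(n) memory is touched; an assembled sparse matrix would store
    roughly 3n nonzeros plus index arrays.
    """
    v = 2.0 * u          # diagonal contribution
    v[:-1] -= u[1:]      # superdiagonal contribution
    v[1:] -= u[:-1]      # subdiagonal contribution
    return v
```

Iterative solvers such as conjugate gradients or multigrid smoothers need only this operator action, which is why matrix-free representations pair naturally with the p-multigrid methods the abstract describes.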