SamBaTen: Sampling-based Batch Incremental Tensor Decomposition
Tensor decompositions are invaluable tools in analyzing multimodal datasets.
In many real-world scenarios, such datasets are far from static; on the contrary, they tend to grow over time. For instance, in an online social network
setting, as we observe new interactions over time, our dataset gets updated in
its "time" mode. How can we maintain a valid and accurate tensor decomposition
of such a dynamically evolving multimodal dataset, without having to re-compute
the entire decomposition after every single update? In this paper we introduce
SamBaTen, a Sampling-based Batch Incremental Tensor Decomposition algorithm,
which incrementally maintains the decomposition given new updates to the tensor
dataset. SamBaTen is able to scale to datasets that the state-of-the-art in
incremental tensor decomposition is unable to operate on, due to its ability to
effectively summarize the existing tensor and the incoming updates, and perform
all computations in the reduced summary space. We extensively evaluate SamBaTen
using synthetic and real datasets. For instance, SamBaTen achieves comparable
accuracy to state-of-the-art incremental and non-incremental techniques, while
being 25-30 times faster. Furthermore, SamBaTen scales to very large sparse and
dense dynamically evolving tensors of dimensions up to 100K x 100K x 100K where
state-of-the-art incremental approaches were not able to operate.
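The incremental setting described above can be illustrated with a minimal sketch. This is not SamBaTen itself (which additionally summarizes the existing tensor and the incoming updates via sampling and works in that reduced space); it only shows the baseline idea of absorbing a new "time" slice without recomputing the full decomposition: a plain CP-ALS fits the existing tensor once, and each incoming slice is handled by one small least-squares solve for a new row of the temporal factor. All helper names are illustrative.

```python
import numpy as np

def kr(P, Q):
    """Column-wise Khatri-Rao product of P (J x R) and Q (K x R) -> (J*K x R)."""
    return (P[:, None, :] * Q[None, :, :]).reshape(-1, P.shape[1])

def unfold(T, mode):
    """Mode-n unfolding consistent with C-order reshaping."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def cp_als(T, rank, iters=100, seed=0):
    """Plain (batch) CP-ALS for a 3-way tensor -- the non-incremental baseline."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((T.shape[0], rank))
    B = rng.standard_normal((T.shape[1], rank))
    C = rng.standard_normal((T.shape[2], rank))
    for _ in range(iters):
        A = unfold(T, 0) @ kr(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(T, 1) @ kr(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(T, 2) @ kr(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def append_time_slice(A, B, C, X_new):
    """Incremental update: keep A and B fixed, solve a single least-squares
    problem for the new row of the time-mode factor C."""
    x = X_new.reshape(1, -1)                    # row-major vec: index i*J + j
    G = np.linalg.pinv((A.T @ A) * (B.T @ B))   # R x R normal-equations inverse
    c_new = x @ kr(A, B) @ G                    # 1 x R
    return np.vstack([C, c_new])

# Synthetic, exactly rank-3 tensor; the third mode plays the role of "time".
rng = np.random.default_rng(1)
R = 3
A_true = rng.standard_normal((10, R))
B_true = rng.standard_normal((12, R))
C_true = rng.standard_normal((20, R))
T = np.einsum('ir,jr,kr->ijk', A_true, B_true, C_true)

A, B, C = cp_als(T, R)

# A new time slice arrives; update C without re-running the full ALS.
c_next = rng.standard_normal(R)
X_new = np.einsum('ir,jr,r->ij', A_true, B_true, c_next)
C = append_time_slice(A, B, C, X_new)

recon = np.einsum('ir,jr,r->ij', A, B, C[-1])
err = np.linalg.norm(recon - X_new) / np.linalg.norm(X_new)
```

On exactly low-rank data the appended slice is reconstructed almost exactly; the point of sampling-based methods such as SamBaTen is that even this per-update solve can be carried out on a small sampled summary rather than on the full-size factors.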
Streaming Tensor Train Approximation
Tensor trains are a versatile tool to compress and work with high-dimensional
data and functions. In this work we introduce the Streaming Tensor Train
Approximation (STTA), a new class of algorithms for approximating a given
tensor in the tensor train format. STTA accesses the tensor
exclusively via two-sided random sketches of the original data, making it
streamable and easy to implement in parallel -- unlike existing deterministic
and randomized tensor train approximations. This property also allows STTA to
conveniently leverage structure in the tensor, such as sparsity and various
low-rank tensor formats, as well as linear combinations thereof. When Gaussian
random matrices are used for sketching, STTA is amenable to an analysis that
builds and extends upon existing results on the generalized Nystr\"om
approximation for matrices. Our results show that STTA can be expected to
attain a nearly optimal approximation error if the sizes of the sketches are
suitably chosen. A range of numerical experiments illustrates the performance
of STTA compared to existing deterministic and randomized approaches.
Comment: 21 pages, code available at https://github.com/RikVoorhaar/tt-sketc
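The generalized Nyström approximation that the analysis builds upon can be sketched in the matrix case (a minimal illustration of the two-sided-sketch idea, not the tensor-train algorithm itself; all variable names are ours): the matrix is accessed only through two products with random sketch matrices, which is what makes this family of approximations streamable.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 100, 80, 5
# exactly rank-r test matrix
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# two-sided Gaussian sketches, oversampled relative to the target rank
k, l = 8, 12
Omega = rng.standard_normal((n, k))   # right sketch
Psi   = rng.standard_normal((m, l))   # left sketch

# the ONLY accesses to A are these sketch products -- one streaming pass
AOm  = A @ Omega          # m x k
PsA  = Psi.T @ A          # l x n
core = Psi.T @ AOm        # l x k

# generalized Nystrom: A ~ (A Omega) (Psi^T A Omega)^+ (Psi^T A)
A_hat = AOm @ np.linalg.pinv(core) @ PsA
err = np.linalg.norm(A - A_hat) / np.linalg.norm(A)  # near zero for exact rank r
```

Because A enters only through `A @ Omega` and `Psi.T @ A`, a linear update `A += E` can be absorbed by updating the three sketch products in place, without revisiting old data; STTA applies this two-sided-sketch construction to each core of a tensor train.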