SamBaTen: Sampling-based Batch Incremental Tensor Decomposition
Tensor decompositions are invaluable tools in analyzing multimodal datasets.
In many real-world scenarios, such datasets are far from static; on the
contrary, they tend to grow over time. For instance, in an online social network
setting, as we observe new interactions over time, our dataset gets updated in
its "time" mode. How can we maintain a valid and accurate tensor decomposition
of such a dynamically evolving multimodal dataset, without having to re-compute
the entire decomposition after every single update? In this paper we introduce
SamBaTen, a Sampling-based Batch Incremental Tensor Decomposition algorithm,
which incrementally maintains the decomposition given new updates to the tensor
dataset. SamBaTen is able to scale to datasets that the state-of-the-art in
incremental tensor decomposition is unable to operate on, due to its ability to
effectively summarize the existing tensor and the incoming updates, and perform
all computations in the reduced summary space. We extensively evaluate SamBaTen
using synthetic and real datasets. Indicatively, SamBaTen achieves comparable
accuracy to state-of-the-art incremental and non-incremental techniques, while
being 25-30 times faster. Furthermore, SamBaTen scales to very large sparse and
dense dynamically evolving tensors of dimensions up to 100K x 100K x 100K where
state-of-the-art incremental approaches were not able to operate.
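The abstract's idea of summarizing the existing tensor and the incoming update, then working in the reduced summary space, can be sketched as follows. This is only an illustration under stated assumptions, not the SamBaTen algorithm: the tensor is stored as a list of time slices, rows and columns are sampled uniformly (the abstract does not say how indices are chosen), and the function names `sample_indices`, `summarize`, and `summarize_update` are hypothetical. The decomposition of the summary is left out.

```python
import random
from typing import List, Sequence, Tuple

Slice = List[List[float]]  # one I x J slice of the tensor for a single time step


def sample_indices(size: int, keep: int, rng: random.Random) -> List[int]:
    """Pick `keep` distinct indices out of range(size), returned in sorted order."""
    return sorted(rng.sample(range(size), keep))


def summarize(slices: Sequence[Slice], rows: Sequence[int], cols: Sequence[int]) -> List[Slice]:
    """Keep only the sampled rows and columns of every time slice."""
    return [[[s[i][j] for j in cols] for i in rows] for s in slices]


def summarize_update(
    old_slices: Sequence[Slice],
    new_slices: Sequence[Slice],
    keep_rows: int,
    keep_cols: int,
    seed: int = 0,
) -> Tuple[List[int], List[int], List[Slice], List[Slice]]:
    """Sample one shared set of rows and columns (uniformly, as an assumption)
    and apply it to both the existing tensor and the incoming batch, so that
    both summaries live in the same reduced index space."""
    rng = random.Random(seed)
    n_rows = len(new_slices[0])
    n_cols = len(new_slices[0][0])
    rows = sample_indices(n_rows, keep_rows, rng)
    cols = sample_indices(n_cols, keep_cols, rng)
    return rows, cols, summarize(old_slices, rows, cols), summarize(new_slices, rows, cols)
```

Any decomposition routine would then be run on the two small summaries rather than on the full tensor, and the returned `rows` and `cols` record where the summary entries sit in the original index space.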
A Unified Optimization Approach for Sparse Tensor Operations on GPUs
Sparse tensors appear in many large-scale applications with multidimensional
and sparse data. While multidimensional sparse data often need to be processed
on manycore processors, attempts to develop highly-optimized GPU-based
implementations of sparse tensor operations are rare. The irregular computation
patterns and sparsity structures as well as the large memory footprints of
sparse tensor operations make such implementations challenging. We leverage the
fact that sparse tensor operations share similar computation patterns to
propose a unified tensor representation called F-COO. Combined with
GPU-specific optimizations, F-COO provides highly-optimized implementations of
sparse tensor computations on GPUs. The performance of the proposed unified
approach is demonstrated for tensor-based kernels such as the Sparse Matricized
Tensor-Times-Khatri-Rao Product (SpMTTKRP) and the Sparse Tensor-Times-Matrix
Multiply (SpTTM) and is used in tensor decomposition algorithms. Compared to
state-of-the-art work, we improve the performance of SpTTM and SpMTTKRP by up to
3.7 and 30.6 times, respectively, on NVIDIA Titan-X GPUs. We implement a
CANDECOMP/PARAFAC (CP) decomposition and achieve up to 14.9 times speedup using
the unified method over state-of-the-art libraries on NVIDIA Titan-X GPUs.
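For readers unfamiliar with the SpMTTKRP kernel named above, a minimal CPU sketch over a plain coordinate (COO) sparse tensor may help. It assumes a 3-mode tensor and computes the mode-0 product only. It does not reproduce the F-COO layout or any GPU optimization, since the abstract gives no details of either, and the function name `mttkrp_mode0` is hypothetical.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[int, int, int]  # (i, j, k) index of one nonzero entry


def mttkrp_mode0(
    coords: Sequence[Coord],
    vals: Sequence[float],
    B: Sequence[Sequence[float]],
    C: Sequence[Sequence[float]],
    num_rows: int,
) -> List[List[float]]:
    """Mode-0 MTTKRP for a 3-mode sparse tensor stored in COO form.

    For every nonzero X[i, j, k] = v, accumulate
        out[i][r] += v * B[j][r] * C[k][r]
    which equals the unfolded tensor times the Khatri-Rao product of the
    other two factor matrices, without ever forming that product.
    """
    rank = len(B[0])
    out = [[0.0] * rank for _ in range(num_rows)]
    for (i, j, k), v in zip(coords, vals):
        row_b, row_c, out_row = B[j], C[k], out[i]
        for r in range(rank):
            out_row[r] += v * row_b[r] * row_c[r]
    return out
```

Each nonzero touches one output row and one row of each factor matrix, which is the irregular access pattern the abstract describes as a challenge for GPU implementations.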
Identifying and Alleviating Concept Drift in Streaming Tensor Decomposition
Tensor decompositions are used in various data mining applications from
social networks to medical applications and are extremely useful in discovering
latent structures or concepts in the data. Many real-world applications are
dynamic in nature and so are their data. To deal with this dynamic nature of
data, there exist a variety of online tensor decomposition algorithms. A
central assumption in all those algorithms is that the number of latent
concepts remains fixed throughout the entire stream. However, this need not be
the case. Every incoming batch in the stream may have a different number of
latent concepts, and the difference in latent concepts from one tensor batch to
another can provide insights into how our findings in a particular application
behave and deviate over time. In this paper, we define "concept" and "concept
drift" in the context of streaming tensor decomposition, as the manifestation
of the variability of latent concepts throughout the stream. Furthermore, we
introduce SeekAndDestroy, an algorithm that detects concept drift in streaming
tensor decomposition and is able to produce results robust to that drift. To
the best of our knowledge, this is the first work that investigates concept
drift in streaming tensor decomposition. We extensively evaluate SeekAndDestroy
on synthetic datasets, which exhibit a wide variety of realistic drift. Our
experiments demonstrate the effectiveness of SeekAndDestroy, both in the
detection of concept drift and in the alleviation of its effects, producing
results with similar quality to decomposing the entire tensor in one shot.
Additionally, in real datasets, SeekAndDestroy outperforms other streaming
baselines, while discovering novel useful components.
Comment: 16 Pages, Accepted at ECML-PKDD 201
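The notion of a changing set of latent concepts between batches can be illustrated with a small sketch. It is not SeekAndDestroy: it assumes each batch has already been decomposed, represents every concept by a single factor vector, and uses cosine-similarity matching with a fixed threshold as a stand-in technique. The names `cosine` and `detect_drift` and the default threshold of 0.9 are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def detect_drift(
    known: Sequence[Sequence[float]],
    incoming: Sequence[Sequence[float]],
    threshold: float = 0.9,
) -> Tuple[List[int], List[int]]:
    """Match each incoming concept to its most similar known concept.

    Returns (new_concepts, missing_concepts):
      new_concepts     - indices into `incoming` with no match at or above `threshold`
      missing_concepts - indices into `known` that no incoming concept matched
    Either list being non-empty signals a change in the set of concepts.
    """
    matched = set()
    new_concepts = []
    for idx, comp in enumerate(incoming):
        best_j, best_sim = None, threshold
        for j, ref in enumerate(known):
            sim = abs(cosine(comp, ref))  # sign of a factor vector is arbitrary
            if sim >= best_sim:
                best_j, best_sim = j, sim
        if best_j is None:
            new_concepts.append(idx)
        else:
            matched.add(best_j)
    missing = [j for j in range(len(known)) if j not in matched]
    return new_concepts, missing
```

For example, with known concepts `[[1, 0, 0], [0, 1, 0]]` and an incoming batch whose components are `[[0.99, 0.05, 0], [0, 0, 1]]`, the first incoming component matches the first known concept, the second is reported as new, and the second known concept is reported as missing from the batch.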