2,655 research outputs found
Tensor Networks for Big Data Analytics and Large-Scale Optimization Problems
In this paper we review basic and emerging models and associated algorithms
for large-scale tensor networks, especially Tensor Train (TT) decompositions
using novel mathematical and graphical representations. We discuss the concept
of tensorization (i.e., creating very high-order tensors from lower-order
original data) and the super-compression of data achieved via quantized tensor
train (QTT) networks. The purpose of tensorization and quantization is to
achieve, via low-rank tensor approximations, "super" compression and a
meaningful, compact representation of structured data. The main objective of
this paper is to show how tensor networks can be used to solve a wide class of
big data optimization problems (that are far from tractable by classical
numerical methods) by applying tensorization, performing all operations on
relatively small matrices and tensors, and iteratively applying optimized,
approximate tensor contractions.
Keywords: Tensor networks, tensor train (TT) decompositions, matrix product
states (MPS), matrix product operators (MPO), basic tensor operations,
tensorization, distributed representation of data, optimization problems for
very large-scale problems: generalized eigenvalue decomposition (GEVD),
PCA/SVD, canonical correlation analysis (CCA).
Comment: arXiv admin note: text overlap with arXiv:1403.204
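The tensorization and TT ideas summarized above can be sketched in a few lines of code. The following is a minimal illustration, not the paper's algorithm: it quantizes a length-2**10 signal into a 10th-order 2x2x...x2 tensor (the QTT view) and compresses it into TT cores by sequential truncated SVDs; all names and the rank cap are illustrative choices.

```python
import numpy as np

def tt_decompose(tensor, max_rank):
    """Split a d-way tensor into TT cores by sequential truncated SVDs."""
    dims = tensor.shape
    cores, rank, mat = [], 1, tensor
    for n in dims[:-1]:
        mat = mat.reshape(rank * n, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(rank, n, r))
        mat = s[:r, None] * vt[:r]          # remainder carried to the next core
        rank = r
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape([c.shape[1] for c in cores])

# Tensorize 2**10 samples of a smooth signal into a 10th-order 2x2x...x2
# tensor (the quantized, QTT view), then compress with TT ranks capped at 4.
x = np.sin(np.linspace(0.0, 2.0 * np.pi, 2**10))
t = x.reshape([2] * 10)
cores = tt_decompose(t, max_rank=4)
err = np.linalg.norm(tt_reconstruct(cores) - t) / np.linalg.norm(t)
n_params = sum(c.size for c in cores)
print(n_params, err)   # a few hundred parameters instead of 1024, tiny error
```

For smooth signals like this one the effective QTT ranks are very small, which is exactly the "super" compression the abstract refers to.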
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone separate texts, or indeed as a conjoint
comprehensive review of the exciting field of low-rank tensor networks and
tensor decompositions.
Comment: 232 pages
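The scalability claim behind both parts of the monograph can be made concrete with a back-of-envelope parameter count (mode size, order, and rank below are arbitrary illustrative values): a full order-d tensor stores n**d entries, while a rank-r TT network stores only on the order of d*n*r**2.

```python
def full_storage(n, d):
    """Entries in a full order-d tensor with mode size n."""
    return n ** d

def tt_storage(n, d, r):
    """Entries in its rank-r TT representation (boundary ranks are 1)."""
    return 2 * n * r + (d - 2) * n * r * r

for d in (10, 50, 100):
    print(d, full_storage(2, d), tt_storage(2, d, r=4))
```

Even at d = 100 the TT count stays in the thousands while the full tensor count is astronomically large; this is the sense in which tensor networks alleviate the curse of dimensionality.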
Tucker Tensor analysis of Matern functions in spatial statistics
In this work, we describe advanced numerical tools for working with
multivariate functions and for the analysis of large data sets. These tools
will drastically reduce the required computing time and the storage cost, and,
therefore, will allow us to consider much larger data sets or finer meshes.
Covariance matrices are crucial in spatio-temporal statistical tasks, but are
often very expensive to compute and store, especially in 3D. Therefore, we
approximate covariance functions by cheap surrogates in a low-rank tensor
format. We apply the Tucker and canonical tensor decompositions to a family of
Matern- and Slater-type functions with varying parameters and demonstrate
numerically that their approximations exhibit exponentially fast convergence.
We prove the exponential convergence of the Tucker and canonical approximations
in tensor rank parameters. Several statistical operations are performed in this
low-rank tensor format, including evaluating the conditional covariance matrix,
spatially averaged estimation variance, computing a quadratic form,
determinant, trace, loglikelihood, inverse, and Cholesky decomposition of a
large covariance matrix. Low-rank tensor approximations substantially reduce
the computing and storage costs. For example, the storage cost is reduced from
exponential O(n^d) to linear scaling O(dnr), where d is the spatial dimension,
n is the number of mesh points in one direction, and r is the tensor rank.
Prerequisites for applicability of the proposed techniques are the assumptions
that the data, locations, and measurements lie on a tensor (axes-parallel)
grid and that the covariance function depends on the distance ||x - y||.
Comment: 23 pages, 2 diagrams, 2 tables, 9 figures
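The exponential convergence in the rank parameter that the abstract claims is easy to observe numerically. The sketch below is not the paper's Tucker code: it takes a smooth Gaussian-type covariance block on a 1-D grid (grid size and length-scale are illustrative) and shows how fast a rank-r truncated SVD surrogate converges.

```python
import numpy as np

# Low-rank surrogate for a smooth (Gaussian-type) covariance block on a
# 1-D axes-parallel grid; the parameters below are illustrative only.
x = np.linspace(0.0, 1.0, 200)
C = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.1)   # dense covariance block
u, s, vt = np.linalg.svd(C)
errs = []
for r in (2, 4, 8, 16):
    Cr = (u[:, :r] * s[:r]) @ vt[:r]                  # rank-r truncation
    errs.append(np.linalg.norm(C - Cr) / np.linalg.norm(C))
    print(r, errs[-1])    # relative error drops roughly exponentially in r
```

The same principle, applied mode-by-mode in the Tucker or canonical format, is what makes the statistical operations listed above affordable on large grids.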
Learning Relevant Features of Data with Multi-scale Tensor Networks
Inspired by coarse-graining approaches used in physics, we show how similar
algorithms can be adapted for data. The resulting algorithms are based on
layered tree tensor networks and scale linearly with both the dimension of the
input and the training set size. Computing most of the layers with an
unsupervised algorithm, then optimizing just the top layer for supervised
classification of the MNIST and fashion-MNIST data sets gives very good
results. We also discuss mixing a prior guess for the supervised weights with
an unsupervised representation of the data, yielding a smaller number of
features that nevertheless give good performance.
Comment: 12 pages, 13 figures
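A linear caricature of one coarse-graining layer of such a tree tensor network can be sketched as follows (real tree tensor networks apply isometries on tensor-product feature spaces; this toy version, with made-up data, only pairs adjacent features and keeps each pair's leading unsupervised direction):

```python
import numpy as np

# One unsupervised coarse-graining pass: pair adjacent features and keep
# only each pair's leading principal direction, halving the feature count.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))          # toy data: 500 samples, 8 features

def coarse_grain(X, keep=1):
    cols = []
    for i in range(0, X.shape[1], 2):
        pair = X[:, i:i + 2]
        # principal directions of the pair, learned without labels
        _, _, vt = np.linalg.svd(pair - pair.mean(0), full_matrices=False)
        cols.append(pair @ vt[:keep].T)
    return np.hstack(cols)

layer1 = coarse_grain(X)               # 8 -> 4 features
layer2 = coarse_grain(layer1)          # 4 -> 2 features
print(layer2.shape)                    # (500, 2)
```

Stacking such layers gives the linear scaling in input dimension mentioned in the abstract, with only the top layer trained against labels.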
The connectome mapper: an open-source processing pipeline to map connectomes with MRI.
Researchers working in the field of global connectivity analysis using diffusion magnetic resonance imaging (MRI) can count on a wide selection of software packages for processing their data, with methods ranging from the reconstruction of the local intra-voxel axonal structure to the estimation of the trajectories of the underlying fibre tracts. However, each package is generally task-specific and uses its own conventions and file formats. In this article we present the Connectome Mapper, a software pipeline aimed at helping researchers through the tedious process of organising, processing and analysing diffusion MRI data to perform global brain connectivity analyses. Our pipeline is written in Python and is freely available as open-source at www.cmtk.org
Deep Positron: A Deep Neural Network Using the Posit Number System
The recent surge of interest in Deep Neural Networks (DNNs) has led to
increasingly complex networks that tax computational and memory resources. Many
DNNs presently use 16-bit or 32-bit floating point operations. Significant
performance and power gains can be obtained when DNN accelerators support
low-precision numerical formats. Despite considerable research, there is still
a knowledge gap on how low-precision operations can be realized for both DNN
training and inference. In this work, we propose a DNN architecture, Deep
Positron, with posit numerical format operating successfully at 8 bits
for inference. We propose a precision-adaptable FPGA soft core for exact
multiply-and-accumulate for uniform comparison across three numerical formats,
fixed, floating-point and posit. Preliminary results demonstrate that 8-bit
posit has better accuracy than 8-bit fixed or floating-point for three
different low-dimensional datasets. Moreover, the accuracy is comparable to
32-bit floating-point on a Xilinx Virtex-7 FPGA device. The trade-offs between
DNN performance and hardware resources, i.e. latency, power, and resource
utilization, show that posit outperforms in accuracy and latency at 8-bit and
below.
Comment: 6 pages, Design, Automation and Test in Europe 201
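The posit format itself uses tapered precision and is more involved than the sketch below, which only demonstrates the generic low-precision trade-off the paper studies: rounding weights to an 8-bit fixed-point grid (bit split and data are illustrative) bounds the per-weight error by half a least-significant bit.

```python
import numpy as np

def quantize_fixed(w, total_bits=8, frac_bits=6):
    """Round to the nearest 8-bit fixed-point value with 6 fractional bits."""
    scale = 2.0 ** frac_bits
    lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    return np.clip(np.round(w * scale), lo, hi) / scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.25, size=10000)     # toy "weights" well inside range
wq = quantize_fixed(w)
print(np.max(np.abs(w - wq)))              # at most half an LSB, i.e. 2**-7
```

Posit's advantage at 8 bits comes from spending more of those bits near 1.0, where trained DNN weights cluster, rather than spacing representable values uniformly as fixed point does.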
A Microbenchmark Characterization of the Emu Chick
The Emu Chick is a prototype system designed around the concept of migratory
memory-side processing. Rather than transferring large amounts of data across
power-hungry, high-latency interconnects, the Emu Chick moves lightweight
thread contexts to near-memory cores before the beginning of each memory read.
The current prototype hardware uses FPGAs to implement cache-less "Gossamer"
cores for doing computational work and a stationary core to run basic operating
system functions and migrate threads between nodes. In this multi-node
characterization of the Emu Chick, we extend an earlier single-node
investigation (Hein et al., AsHES 2018) of the memory bandwidth
characteristics of the system through benchmarks like STREAM, pointer chasing,
and sparse matrix-vector multiplication. We compare the Emu Chick hardware to
architectural simulation and an Intel Xeon-based platform. Our results
demonstrate that for many basic operations the Emu Chick can use available
memory bandwidth more efficiently than a more traditional, cache-based
architecture, although bandwidth usage suffers for computationally intensive
workloads like SpMV. Moreover, the Emu Chick provides stable, predictable
performance, with up to 65% of peak bandwidth utilization on a random-access
pointer-chasing benchmark with weak locality.
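The pointer-chasing pattern referred to above can be sketched in a few lines (this is a generic microbenchmark shape, not the authors' code): memory is linked into one random cycle so every load depends on the previous one, making the workload latency-bound rather than bandwidth-bound.

```python
import random
import time

def make_chain(n, seed=0):
    """Link n slots into one random cycle: nxt[a] is the slot after a."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    nxt = [0] * n
    for a, b in zip(order, order[1:] + order[:1]):
        nxt[a] = b
    return nxt

def chase(nxt, hops, start=0):
    i = start
    for _ in range(hops):
        i = nxt[i]          # each load depends on the previous one
    return i

chain = make_chain(1 << 16)
t0 = time.perf_counter()
end = chase(chain, 1 << 16)
dt = time.perf_counter() - t0
print(end, f"{dt:.4f} s for 65536 dependent loads")   # end == 0: full cycle
```

On a conventional cache-based machine this access pattern defeats prefetching, which is why it is a useful stress test for a migratory, near-memory design like the Emu Chick.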
Protecting Big Data Privacy Using Randomized Tensor Network Decomposition and Dispersed Tensor Computation
Data privacy is an important issue for organizations and enterprises to
securely outsource data storage, sharing, and computation on clouds / fogs.
However, data encryption is complicated in terms of the key management and
distribution; existing secure computation techniques are expensive in terms of
computational / communication cost and therefore do not scale to big data
computation. Tensor network decomposition and distributed tensor computation
have been widely used in signal processing and machine learning for
dimensionality reduction and large-scale optimization. However, the potential
of distributed tensor networks for big data privacy preservation has not been
considered before; this motivates the current study. Our primary intuition is
that tensor network representations are mathematically non-unique, unlinkable,
and uninterpretable; tensor network representations naturally support a range
of multilinear operations for compressed and distributed / dispersed
computation. Therefore, we propose randomized algorithms to decompose big data
into randomized tensor network representations and analyze the privacy leakage
for 1D to 3D data tensors. The randomness mainly comes from the complex
structural information commonly found in big data; randomization is based on
controlled perturbation applied to the tensor blocks prior to decomposition.
The distributed tensor representations are dispersed on multiple clouds / fogs
or servers / devices with metadata privacy; this provides both distributed
trust and management to seamlessly secure big data storage, communication,
sharing, and computation. Experiments show that the proposed randomization
techniques are helpful for big data anonymization and efficient for big data
storage and computation.
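The non-uniqueness the privacy argument rests on can be illustrated in the simplest (matrix) case, with made-up data rather than the paper's algorithm: any factorization A = U V can be re-randomized as (U R)(R^-1 V) with a secret invertible mixing matrix R, so parties holding individual factors see shares that look unrelated while the product is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(8, 5)) @ rng.normal(size=(5, 8))   # rank-5 "data"
u, s, vt = np.linalg.svd(A, full_matrices=False)
U, V = u[:, :5] * s[:5], vt[:5]                         # one factorization

R = rng.normal(size=(5, 5))                # secret, invertible mixing matrix
U2, V2 = U @ R, np.linalg.inv(R) @ V       # re-randomized factors
print(np.allclose(U2 @ V2, A))             # True: the product is unchanged
print(np.allclose(U2, U))                  # False: shares look unrelated
```

The paper's randomized tensor network decompositions generalize this idea to higher-order cores dispersed across multiple clouds or devices.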
User-transparent Distributed TensorFlow
Deep Learning (DL) algorithms have become the {\em de facto} choice for data
analysis. Several DL implementations -- primarily limited to a single compute
node -- such as Caffe, TensorFlow, Theano and Torch have become readily
available. Distributed DL implementations capable of execution on large scale
systems are becoming important to address the computational needs of large data
produced by scientific simulations and experiments. Yet, the adoption of
distributed DL implementations faces significant impediments: 1) most
implementations require DL analysts to modify their code significantly -- which
is a show-stopper, 2) several distributed DL implementations are geared towards
cloud computing systems -- which is inadequate for execution on massively
parallel systems such as supercomputers.
This work addresses each of these problems. We provide a distributed memory
DL implementation by incorporating required changes in the TensorFlow runtime
itself. This dramatically reduces the entry barrier for using a distributed
TensorFlow implementation. We use Message Passing Interface (MPI) -- which
provides performance portability, especially since MPI specific changes are
abstracted from users. Lastly -- and arguably most importantly -- we make our
implementation available for broader use, under the umbrella of Machine
Learning Toolkit for Extreme Scale (MaTEx) at http://hpc.pnl.gov/matex. We
refer to our implementation as MaTEx-TensorFlow.
Comment: 9 pages, 8 figures
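The data-parallel pattern such an MPI-backed runtime hides from the user can be sketched as follows. This is a simulated, in-process version with toy least-squares data, not MaTEx-TensorFlow internals: each "worker" computes a gradient on its shard, and the mean across workers stands in for an MPI allreduce.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(3)                                    # replicated model weights
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5])                 # ground-truth coefficients

def local_grad(w, Xs, ys):
    # least-squares gradient on one worker's data shard
    return 2.0 * Xs.T @ (Xs @ w - ys) / len(ys)

for step in range(200):
    shards = zip(np.array_split(X, 4), np.array_split(y, 4))
    grads = [local_grad(w, Xs, ys) for Xs, ys in shards]
    w -= 0.05 * np.mean(grads, axis=0)             # stands in for MPI_Allreduce
print(np.round(w, 2))                              # close to [1, -2, 0.5]
```

Abstracting exactly this averaging step into the runtime is what lets analysts keep their single-node training scripts unchanged.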
Neuroimaging study designs, computational analyses and data provenance using the LONI pipeline.
Modern computational neuroscience employs diverse software tools and multidisciplinary expertise to analyze heterogeneous brain data. The classical problems of gathering meaningful data, fitting specific models, and discovering appropriate analysis and visualization tools give way to a new class of computational challenges--management of large and incongruous data, integration and interoperability of computational resources, and data provenance. We designed, implemented and validated a new paradigm for addressing these challenges in the neuroimaging field. Our solution is based on the LONI Pipeline environment [3], [4], a graphical workflow environment for constructing and executing complex data processing protocols. We developed study-design, database and visual language programming functionalities within the LONI Pipeline that enable the construction of complete, elaborate and robust graphical workflows for analyzing neuroimaging and other data. These workflows facilitate open sharing and communication of data and metadata, concrete processing protocols, result validation, and study replication among different investigators and research groups. The LONI Pipeline features include distributed grid-enabled infrastructure, virtualized execution environment, efficient integration, data provenance, validation and distribution of new computational tools, automated data format conversion, and an intuitive graphical user interface. We demonstrate the new LONI Pipeline features using large scale neuroimaging studies based on data from the International Consortium for Brain Mapping [5] and the Alzheimer's Disease Neuroimaging Initiative [6]. User guides, forums, instructions and downloads of the LONI Pipeline environment are available at http://pipeline.loni.ucla.edu