Convolutional Neural Networks with Transformed Input based on Robust Tensor Network Decomposition
Tensor network decomposition, which originated in quantum physics to model
entangled many-particle quantum systems, has turned out to be a promising
mathematical technique for representing and processing big data in a
parsimonious manner. In this study, we show that tensor networks can
systematically partition structured data, e.g. color images, for distributed
storage and communication in a privacy-preserving manner. Leveraging the sea of
big data and metadata privacy, empirical results show that neighbouring
subtensors with implicit information stored in tensor network formats cannot be
identified for data reconstruction. This technique complements existing
encryption and randomization techniques, which store an explicit data
representation in one place and are highly susceptible to adversarial attacks
such as side-channel attacks and de-anonymization. Furthermore, we propose a
theory for adversarial examples that mislead convolutional neural networks into
misclassification, using subspace analysis based on the singular value
decomposition (SVD). The theory is extended to analyze higher-order tensors
using the tensor-train SVD (TT-SVD); it helps to explain the susceptibility of
different datasets to adversarial attacks, the structural similarity of
different adversarial attacks, both global and localized, and the efficacy of
different adversarial defenses based on input transformation. An efficient and
adaptive algorithm based on the robust TT-SVD is then developed to detect
strong and static adversarial attacks.
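The TT-SVD referenced above factorizes a higher-order tensor into a chain of low-order cores via sequential truncated SVDs. The sketch below illustrates the generic TT-SVD algorithm, not the paper's adaptive robust variant; the function name, `max_rank` parameter, and truncation tolerance are illustrative choices:

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Generic TT-SVD: peel off one core per mode by repeatedly
    reshaping the remainder and taking a truncated SVD."""
    shape = tensor.shape
    cores, rank = [], 1
    mat = tensor.reshape(shape[0], -1)
    for k in range(len(shape) - 1):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, int((S > 1e-12).sum()))  # truncate tiny modes
        cores.append(U[:, :r].reshape(rank, shape[k], r))
        rank = r
        # carry the remaining factor forward, refolded for the next mode
        mat = (S[:r, None] * Vt[:r]).reshape(rank * shape[k + 1], -1)
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores
```

Contracting the cores in order reproduces the original tensor up to the chosen truncation.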
Spectral Network Embedding: A Fast and Scalable Method via Sparsity
Network embedding aims to learn low-dimensional representations of the nodes in
a network while preserving the network structure and its inherent properties.
It has attracted tremendous attention recently due to significant progress in
downstream network learning tasks, such as node classification, link
prediction, and visualization. However, most existing network embedding methods
suffer from expensive computation due to the large volume of networks. In this
paper, we propose a faster network embedding method, called Progle, which
elegantly exploits the sparsity of online networks together with spectral
analysis. In Progle, we first construct a \textit{sparse} proximity matrix and
train the network embedding efficiently via sparse matrix decomposition. Then
we introduce a network propagation pattern via spectral analysis to incorporate
local and global structural information into the embedding. Besides, the model
can be generalized to quickly integrate network information into other
insufficiently trained embeddings. Benefiting from sparse spectral network
embedding, our experiments on four different datasets show that Progle
outperforms or is comparable to the state-of-the-art unsupervised baselines---
DeepWalk, LINE, node2vec, GraRep, and HOPE---in accuracy, while being faster
than the fastest word2vec-based method. Finally, we validate the scalability of
Progle on both real large-scale networks and synthetic networks of multiple
scales.
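The core idea, factoring a sparse proximity matrix with a truncated sparse SVD, can be sketched as follows. This is a generic illustration of sparse spectral embedding, not Progle itself; the row-normalized proximity matrix and the singular-value scaling are illustrative modeling choices:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds

def sparse_spectral_embedding(adj, dim):
    """Embed nodes by factoring a sparse row-normalized proximity
    matrix with a truncated sparse SVD."""
    deg = np.asarray(adj.sum(axis=1)).ravel()
    d_inv = sp.diags(1.0 / np.maximum(deg, 1.0))   # guard isolated nodes
    prox = (d_inv @ adj).astype(float)             # sparse proximity matrix
    u, s, _ = svds(prox, k=dim)                    # truncated sparse SVD
    return u * np.sqrt(s)                          # scale left vectors
```

Because `prox` stays sparse throughout, the cost scales with the number of edges rather than the square of the number of nodes.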
Spectral Learning on Matrices and Tensors
Spectral methods have been the mainstay in several domains such as machine learning, applied mathematics and scientific computing. They involve finding a certain kind of spectral decomposition to obtain basis functions that can capture important structures or directions for the problem at hand. The most common spectral method is principal component analysis (PCA). It uses the principal components, i.e. the top eigenvectors of the data covariance matrix, to carry out dimensionality reduction as one of its applications. This data pre-processing step is often effective in separating signal from noise. PCA and other spectral techniques applied to matrices have several limitations. By limiting themselves to pairwise moments, they effectively make a Gaussian approximation of the underlying data. Hence, they fail on data with hidden variables, which lead to non-Gaussianity. However, in almost any data set there are latent effects that cannot be directly observed, e.g., topics in a document corpus, or underlying causes of a disease. By extending spectral decomposition methods to higher-order moments, we demonstrate the ability to learn a wide range of latent variable models efficiently. Higher-order moments can be represented by tensors, and intuitively, they can encode more information than pairwise moment matrices alone. More crucially, tensor decomposition can pick up latent effects that are missed by matrix methods. For instance, tensor decomposition can uniquely identify non-orthogonal components. Exploiting these aspects turns out to be fruitful for provable unsupervised learning of a wide range of latent variable models. We also outline the computational techniques needed to design efficient tensor decomposition methods. They are embarrassingly parallel and thus scalable to large data sets. Whilst many optimized linear algebra software packages exist, efficient tensor algebra packages are only beginning to be developed.
We introduce TensorLy, which has a simple Python interface for expressing tensor operations. It has a flexible back-end system supporting NumPy, PyTorch, TensorFlow and MXNet, amongst others. This allows it to carry out multi-GPU and CPU operations, and it can also be seamlessly integrated with deep-learning functionality.
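The PCA step described above (projecting onto the top eigenvectors of the data covariance matrix) can be sketched in a few lines; the function name and interface below are illustrative:

```python
import numpy as np

def pca(X, k):
    """PCA as described: project centered data onto the top-k
    eigenvectors of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = Xc.T @ Xc / (len(X) - 1)           # sample covariance
    _, vecs = np.linalg.eigh(cov)            # eigenvalues ascending
    comps = vecs[:, ::-1][:, :k]             # top-k principal directions
    return Xc @ comps, comps                 # scores and loadings
```

The returned directions are orthonormal, and the projected scores are centered, which is exactly the signal-from-noise separation the abstract refers to.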
A literature survey of matrix methods for data science
Efficient numerical linear algebra is a core ingredient in many applications
across almost all scientific and industrial disciplines. With this survey we
want to illustrate that numerical linear algebra has played and is playing a
crucial role in enabling and improving data science computations with many new
developments being fueled by the availability of data and computing resources.
We highlight the role of various factorizations and the power of changing the
representation of the data, and we discuss topics such as randomized
algorithms, functions of matrices, and high-dimensional problems. We briefly
touch upon the role of techniques from numerical linear algebra used within
deep learning.
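As one example of the randomized algorithms such surveys cover, a basic randomized SVD projects the matrix onto a random subspace, orthonormalizes the projection, and then takes an exact SVD of the much smaller projected problem. The sketch below is a minimal version under illustrative parameter choices (`oversample`, fixed seed):

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Randomized SVD: sketch the range of A with a random projection,
    orthonormalize it, then take an exact SVD of the small problem."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)                 # approximate range basis
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vt[:k]
```

For a matrix whose numerical rank is at most k, the factorization is exact up to floating-point error, at a fraction of the cost of a full SVD.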
Deep Unfolded Robust PCA with Application to Clutter Suppression in Ultrasound
Contrast enhanced ultrasound is a radiation-free imaging modality which uses
encapsulated gas microbubbles for improved visualization of the vascular bed
deep within the tissue. It has recently been used to enable imaging with
unprecedented subwavelength spatial resolution by relying on super-resolution
techniques. A typical preprocessing step in super-resolution ultrasound is to
separate the microbubble signal from the cluttering tissue signal. This step
has a crucial impact on the final image quality. Here, we propose a new
approach to clutter removal based on robust principal component analysis (PCA)
and deep learning. We begin by modeling the acquired contrast-enhanced
ultrasound signal as a combination of low-rank and sparse components. This
model is used in robust PCA and was previously suggested in the context of
ultrasound Doppler processing and dynamic magnetic resonance imaging. We then
illustrate that an iterative algorithm based on this model exhibits improved
separation of microbubble signal from the tissue signal over commonly practiced
methods. Next, we apply the concept of deep unfolding to suggest a deep network
architecture tailored to our clutter filtering problem which exhibits improved
convergence speed and accuracy with respect to its iterative counterpart. We
compare the performance of the suggested deep network on both simulations and
in-vivo rat brain scans, with a commonly practiced deep-network architecture
and the fast iterative shrinkage algorithm, and show that our architecture
exhibits better image quality and contrast.
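The low-rank-plus-sparse model underlying robust PCA can be sketched with a toy iterative scheme: alternate singular-value thresholding (for the low-rank tissue component) with entrywise soft-thresholding (for the sparse microbubble component). This is a generic illustration of the shrinkage iterations that deep unfolding accelerates, not the paper's network; the thresholds `tau` and `lam` are illustrative:

```python
import numpy as np

def soft(x, t):
    """Entrywise soft-thresholding (the shrinkage operator)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def rpca(D, tau=1.0, lam=0.1, iters=100):
    """Toy low-rank + sparse separation: alternate singular-value
    thresholding (low-rank part) with entrywise soft-thresholding
    (sparse part)."""
    S = np.zeros_like(D)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(D - S, full_matrices=False)
        L = (U * soft(s, tau)) @ Vt     # shrink singular values
        S = soft(D - L, lam)            # shrink residual entries
    return L, S
```

Deep unfolding turns a fixed number of such iterations into network layers whose thresholds are learned from data rather than hand-tuned.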
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
Neural language models (NLMs) have recently attracted renewed interest by
achieving state-of-the-art performance across many natural language processing
(NLP) tasks. However, NLMs are very computationally demanding, largely due to
the computational cost of the softmax layer over a large vocabulary. We observe
that, when decoding in many NLP tasks, only the probabilities of the top-K
hypotheses need to be calculated precisely, and K is often much smaller than
the vocabulary size. This paper proposes a novel softmax layer approximation
algorithm, called Fast Graph Decoder (FGD), which quickly identifies, for a
given context, the set of K words that are most likely to occur according to an
NLM. We demonstrate that FGD reduces the decoding time by an order of magnitude
while attaining accuracy close to the full softmax baseline on neural machine
translation and language modeling tasks. We also prove a theoretical guarantee
on the softmax approximation quality.
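The quantity FGD approximates can be made concrete: for one context vector, find the K vocabulary items with the largest logits and their softmax probabilities. The sketch below performs the exact linear scan over the vocabulary that FGD replaces with graph-based nearest-neighbor search; the function name and interface are illustrative:

```python
import numpy as np

def topk_softmax(hidden, emb, bias, k):
    """Exact top-k softmax: the baseline computation that a fast
    decoder approximates without scanning the whole vocabulary."""
    logits = emb @ hidden + bias
    top = np.argpartition(-logits, k)[:k]        # k largest, unordered
    top = top[np.argsort(-logits[top])]          # order the survivors
    ex = np.exp(logits - logits.max())           # numerically stable
    return top, ex[top] / ex.sum()
```

The linear scan costs O(vocabulary size) per decoding step; the point of a graph-based search structure is to find the same top-K candidates in roughly logarithmic time.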
Convolutional Imputation of Matrix Networks
A matrix network is a family of matrices, with relatedness modeled by a
weighted graph. We consider the task of completing a partially observed matrix
network. We assume a novel sampling scheme where a fraction of matrices might
be completely unobserved. How can we recover the entire matrix network from
incomplete observations? This mathematical problem arises in many applications
including medical imaging and social networks.
To recover the matrix network, we propose a structural assumption that the
matrices have a graph Fourier transform which is low-rank. We formulate a
convex optimization problem and prove an exact recovery guarantee for the
optimization problem. Furthermore, we numerically characterize the exact
recovery regime for varying rank and sampling rate and discover a new phase
transition phenomenon. Then we give an iterative imputation algorithm to
efficiently solve the optimization problem and complete large scale matrix
networks. We demonstrate the algorithm on a variety of applications such as
MRI and the Facebook user network.
Comment: Accepted by ICML 201
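The structural assumption above, that the matrix network is low-rank after a graph Fourier transform, can be sketched directly: stack the matrices along the node axis and transform that axis with the Laplacian eigenbasis. This is a minimal illustration of the transform, not the paper's imputation algorithm:

```python
import numpy as np

def graph_fourier_transform(mats, adj):
    """Transform the node axis of a matrix network with the Laplacian
    eigenbasis (the graph Fourier transform). mats has shape
    (n_nodes, m, p); the recovery model assumes the transformed
    slices are low-rank."""
    lap = np.diag(adj.sum(axis=1)) - adj         # combinatorial Laplacian
    _, U = np.linalg.eigh(lap)                   # orthonormal GFT basis
    return np.tensordot(U.T, mats, axes=([1], [0]))
```

Because the basis is orthonormal, the transform preserves the Frobenius norm, so rank constraints can be imposed in the spectral domain without distorting the data-fitting term.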
Learning Efficient Tensor Representations with Ring Structure Networks
Tensor train (TT) decomposition is a powerful representation for high-order
tensors, which has been successfully applied to various machine learning tasks
in recent years. However, since the tensor product is not commutative,
permutation of data dimensions makes solutions and TT-ranks of TT decomposition
inconsistent. To alleviate this problem, we propose a permutation symmetric
network structure by employing circular multilinear products over a sequence of
low-order core tensors. This network structure can be graphically interpreted
as a cyclic interconnection of tensors, and thus we call it tensor ring (TR)
representation. We develop several efficient algorithms to learn TR
representation with adaptive TR-ranks by employing low-rank approximations.
Furthermore, mathematical properties are investigated, which enables us to
perform basic operations in a computationally efficient way using TR
representations. Experimental results on synthetic signals and real-world
datasets demonstrate that the proposed TR network is more expressive and
consistently more informative than existing TT networks.
Comment: arXiv admin note: substantial text overlap with arXiv:1606.0553
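The circular structure can be made concrete by evaluating a single entry of a tensor-ring representation: multiply the index-selected slice of each core around the ring, then take the trace, which closes the cycle (unlike TT, whose boundary ranks are fixed to 1). This sketch illustrates the TR contraction rule only, not the paper's learning algorithms:

```python
import numpy as np

def tr_element(cores, idx):
    """One entry of a tensor-ring tensor. Each core has shape
    (r_k, n_k, r_{k+1}); the trace closes the ring of ranks."""
    prod = np.eye(cores[0].shape[0])
    for core, i in zip(cores, idx):
        prod = prod @ core[:, i, :]
    return np.trace(prod)
```

The trace is invariant under cyclic permutation of the factors, which is exactly the permutation symmetry the abstract claims for the TR structure.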
Shortcut Matrix Product States and its applications
Matrix Product States (MPS), also known in mathematics as the Tensor Train
(TT) decomposition, were originally proposed for describing (especially
one-dimensional) quantum systems, and have recently found use in various
applications such as compressing high-dimensional data, supervised kernel
linear classifiers, and unsupervised generative modeling. However, when applied
to systems that are not defined on one-dimensional lattices, a serious drawback
of MPS is the exponential decay of correlations, which limits their power in
capturing long-range dependences among variables in the system. To alleviate
this problem, we propose to introduce long-range interactions, which act as
shortcuts, into MPS, resulting in a new model, \textit{Shortcut Matrix Product
States} (SMPS). When chosen properly, the shortcuts can significantly decrease
the correlation length of the MPS while preserving computational efficiency. We
develop efficient training methods for SMPS on various tasks, establish some of
their mathematical properties, and show how to find good locations at which to
add shortcuts. Finally, using extensive numerical experiments, we evaluate the
performance of SMPS in a variety of applications, including function fitting,
partition function calculation of the d Ising model, and unsupervised
generative modeling of handwritten digits, to illustrate its advantages over
vanilla matrix product states.
Comment: 15 pages, 11 figures
A survey of dimensionality reduction techniques
Experimental life sciences like biology and chemistry have seen in recent
decades an explosion in the amount of data available from experiments.
Laboratory instruments have become more and more complex and report hundreds or
thousands of measurements for a single experiment, and therefore statistical
methods face challenging tasks when dealing with such high-dimensional data.
However, much of the data is highly redundant and can be efficiently brought
down to a much smaller number of variables without significant loss of
information. The mathematical procedures that make this reduction possible are
called dimensionality reduction techniques; they have been widely developed in
fields such as Statistics and Machine Learning, and are currently a hot
research topic. In this review we categorize the plethora of dimensionality
reduction techniques available and give the mathematical insight behind them.