Isoparametric hypersurfaces in Randers space forms
In this paper, we discuss anisotropic submanifolds and isoparametric
hypersurfaces in a Randers space form (N,F) with the navigation datum (h,W). We
find that (N, F) with respect to the BH-volume and (N,h) have the same
isoparametric hypersurfaces although, in general, their isoparametric functions
are different. This implies that the classification of isoparametric
hypersurfaces in a Randers space form is the same as that in Riemannian case.
Lastly, we give some examples of isoparametric functions in Randers space
forms.
Comment: arXiv admin note: text overlap with arXiv:1709.0289
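For context, the navigation datum (h, W) determines the Randers metric by the standard Zermelo navigation formula (restated here from general knowledge, not from the abstract above): for a Riemannian metric h and a vector field W with h(W, W) < 1,

```latex
F(x, v) = \frac{\sqrt{\lambda\, h(v, v) + h(W, v)^2} \;-\; h(W, v)}{\lambda},
\qquad \lambda = 1 - h(W, W).
```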
Learning Speech Rate in Speech Recognition
A significant performance reduction is often observed in speech recognition
when the rate of speech (ROS) is too low or too high. Most of present
approaches to addressing the ROS variation focus on the change of speech
signals in dynamic properties caused by ROS, and accordingly modify the dynamic
model, e.g., the transition probabilities of the hidden Markov model (HMM).
However, an abnormal ROS changes not only the dynamic but also the static
property of speech signals, and thus cannot be compensated for purely by
modifying the dynamic model. This paper proposes an ROS learning approach based
on deep neural networks (DNN), which involves an ROS feature as the input of
the DNN model and so the spectrum distortion caused by ROS can be learned and
compensated for. The experimental results show that this approach can deliver
better performance for too slow and too fast utterances, demonstrating our
conjecture that ROS impacts both the dynamic and the static property of speech.
In addition, the proposed approach can be combined with the conventional HMM
transition adaptation method, offering additional performance gains.
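The core idea, feeding an ROS value alongside each acoustic frame so the DNN can learn the rate-dependent spectral distortion, can be sketched as follows (a minimal illustration with hypothetical feature dimensions, not the paper's implementation):

```python
import numpy as np

def append_ros_feature(frames: np.ndarray, ros: float) -> np.ndarray:
    """Append a scalar rate-of-speech value to every acoustic frame.

    frames: (num_frames, feat_dim) array, e.g. filterbank features.
    ros:    estimated rate of speech for the utterance (e.g. syllables/s).
    """
    ros_col = np.full((frames.shape[0], 1), ros)
    return np.hstack([frames, ros_col])

# Example: 100 frames of 40-dim filterbanks, ROS of 4.2 syllables/s.
feats = np.random.randn(100, 40)
augmented = append_ros_feature(feats, 4.2)
print(augmented.shape)  # (100, 41)
```

The augmented frames would then be used as DNN input in place of the plain acoustic features.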
Rademacher Complexity for Adversarially Robust Generalization
Many machine learning models are vulnerable to adversarial attacks; for
example, adding adversarial perturbations that are imperceptible to humans can
often make machine learning models produce wrong predictions with high
confidence. Moreover, although we may obtain robust models on the training
dataset via adversarial training, in some problems the learned models cannot
generalize well to the test data. In this paper, we focus on
attacks, and study the adversarially robust generalization problem through the
lens of Rademacher complexity. For binary linear classifiers, we prove tight
bounds for the adversarial Rademacher complexity, and show that the adversarial
Rademacher complexity is never smaller than its natural counterpart, and it has
an unavoidable dimension dependence, unless the weight vector has bounded
norm. The results also extend to multi-class linear classifiers. For
(nonlinear) neural networks, we show that the dimension dependence in the
adversarial Rademacher complexity also exists. We further consider a surrogate
adversarial loss for one-hidden layer ReLU network and prove margin bounds for
this setting. Our results indicate that having norm constraints on the
weight matrices might be a potential way to improve generalization in the
adversarial setting. We demonstrate experimental results that validate our
theoretical findings.
Comment: ICML 201
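Because the attack norm in this abstract was lost in text extraction, the following restates, as context from general knowledge, the empirical Rademacher complexity and the worst-case linear margin under an assumed ell-infinity attack model:

```latex
\mathfrak{R}_S(\mathcal{F})
  = \mathbb{E}_{\sigma}\!\left[\sup_{f \in \mathcal{F}}
      \frac{1}{n}\sum_{i=1}^{n} \sigma_i f(x_i)\right],
\qquad \sigma_i \overset{\text{i.i.d.}}{\sim} \mathrm{Unif}\{\pm 1\}.

% For a linear classifier f(x) = \langle w, x \rangle under an
% \ell_\infty-bounded perturbation of radius \epsilon, the worst-case
% margin has a closed form, which is why the \ell_1 norm of w appears:
\min_{\|x' - x\|_\infty \le \epsilon} \; y \langle w, x' \rangle
  = y \langle w, x \rangle - \epsilon \|w\|_1.
```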
Distributed Sequence Memory of Multidimensional Inputs in Recurrent Networks
Recurrent neural networks (RNNs) have drawn interest from machine learning
researchers because of their effectiveness at preserving past inputs for
time-varying data processing tasks. To understand the success and limitations
of RNNs, it is critical that we advance our analysis of their fundamental
memory properties. We focus on echo state networks (ESNs), which are RNNs with
simple memoryless nodes and random connectivity. In most existing analyses, the
short-term memory (STM) capacity results conclude that the ESN network size
must scale linearly with the input size for unstructured inputs. The main
contribution of this paper is to provide general results characterizing the STM
capacity for linear ESNs with multidimensional input streams when the inputs
have common low-dimensional structure: sparsity in a basis or significant
statistical dependence between inputs. In both cases, we show that the number
of nodes in the network must scale linearly with the information rate and
poly-logarithmically with the ambient input dimension. The analysis relies on
advanced applications of random matrix theory and results in explicit
non-asymptotic bounds on the recovery error. Taken together, this analysis
provides a significant step forward in our understanding of the STM properties
in RNNs.
Comment: 37 pages, 3 figures
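A linear ESN of the kind analyzed here, memoryless linear nodes with random connectivity driven by a multidimensional input stream, can be sketched in a few lines (the sizes and the 0.9 spectral-radius scaling are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: N reservoir nodes, m-dimensional input, T time steps.
N, m, T = 200, 3, 50

# Random recurrent weights, rescaled so the spectral radius is below 1
# (a standard sufficient condition for the echo state property).
W = rng.standard_normal((N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.standard_normal((N, m))  # random input weights

x = np.zeros(N)
for t in range(T):
    u_t = rng.standard_normal(m)  # one multidimensional input sample
    x = W @ x + W_in @ u_t        # linear state update (memoryless nodes)

print(x.shape)  # (200,)
```

Short-term memory questions then ask how well past inputs u_t can be linearly recovered from the final state x.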
Long-range Effects on the Pyroelectric Coefficient of Ferroelectric Superlattice
Long-range effects on the pyroelectric coefficient of a ferroelectric
superlattice consisting of two different ferroelectric materials are
investigated based on the Transverse Ising Model. The effects of the
interfacial coupling and the thickness of one period on the pyroelectric
coefficient of the ferroelectric superlattice are studied by taking into
account the long-range interaction. It is found that with the increase of the
strength of the long-range interaction, the pyroelectric coefficient decreases
when the temperature is lower than the phase transition temperature; the number
of the pyroelectric peaks decreases gradually and the phase transition
temperature increases. It is also found that with the decrease of the
interfacial coupling and the thickness of one period, the phase transition
temperature and the number of the pyroelectric peaks decrease.
Comment: 19 pages, 7 figures
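The Transverse Ising Model referenced above has the standard pseudo-spin Hamiltonian (restated from general knowledge; Omega_i is the transverse field and J_ij the exchange coupling, taken long-range in this work):

```latex
H = -\sum_i \Omega_i\, S_i^x \;-\; \frac{1}{2}\sum_{i \ne j} J_{ij}\, S_i^z S_j^z
```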
Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
We study robust distributed learning that involves minimizing a non-convex
loss function with saddle points. We consider the Byzantine setting where some
worker machines have abnormal or even arbitrary and adversarial behavior. In
this setting, the Byzantine machines may create fake local minima near a saddle
point that is far away from any true local minimum, even when robust gradient
estimators are used. We develop ByzantinePGD, a robust first-order algorithm
that can provably escape saddle points and fake local minima, and converge to
an approximate true local minimizer with low iteration complexity. As a
by-product, we give a simpler algorithm and analysis for escaping saddle points
in the usual non-Byzantine setting. We further discuss three robust gradient
estimators that can be used in ByzantinePGD, including median, trimmed mean,
and iterative filtering. We characterize their performance in concrete
statistical settings, and argue for their near-optimality in low and high
dimensional regimes.
Comment: ICML 201
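Two of the robust gradient estimators mentioned, coordinate-wise median and coordinate-wise trimmed mean, are easy to sketch (a minimal illustration; the worker count, dimension, and trimming fraction are hypothetical):

```python
import numpy as np

def coordinate_median(grads: np.ndarray) -> np.ndarray:
    """Coordinate-wise median over worker gradients (one row per worker)."""
    return np.median(grads, axis=0)

def trimmed_mean(grads: np.ndarray, beta: float) -> np.ndarray:
    """Coordinate-wise mean after discarding, in each coordinate, the
    beta-fraction largest and beta-fraction smallest worker values."""
    m = grads.shape[0]
    k = int(beta * m)
    s = np.sort(grads, axis=0)
    return s[k:m - k].mean(axis=0)

# 10 workers, 4-dim gradients; two Byzantine workers send huge values.
rng = np.random.default_rng(1)
g = rng.normal(0.0, 0.1, size=(10, 4))
g[:2] = 1e6  # adversarial outliers
print(coordinate_median(g))
print(trimmed_mean(g, beta=0.2))
```

Both aggregates stay close to the honest workers' mean despite the outliers; ByzantinePGD would plug such an estimator into its perturbed gradient steps.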
Scheduling Constraint Based Abstraction Refinement for Multi-Threaded Program Verification
Bounded model checking is among the most efficient techniques for the
automatic verification of concurrent programs. However, encoding all possible
interleavings often requires a huge and complex formula, which significantly
limits the scalability. This paper proposes a novel and efficient abstraction
refinement method for multi-threaded program verification. Observing that the
huge formula is usually dominated by the exact encoding of the scheduling
constraint, this paper proposes a scheduling-constraint-based abstraction refinement method,
which avoids the huge and complex encoding of BMC. In addition, to obtain an
effective refinement, we have devised two graph-based algorithms over event
order graph for counterexample validation and refinement generation, which can
always obtain a small yet effective refinement constraint. Enhanced by two
constraint-based algorithms for counterexample validation and refinement
generation, we have proved that our method is sound and complete w.r.t. the
given loop unwinding depth. Experimental results on SV-COMP concurrency benchmarks
indicate that our method is promising and significantly outperforms the
existing state-of-the-art tools.
Comment: 27 pages, 16 figures
Fast Linearized Bregman Iteration for Compressive Sensing and Sparse Denoising
We propose and analyze an extremely fast, efficient, and simple method for
solving the problem: min{ ||u||_1 : Au = f, u ∈ R^n }. This method was first
described in [J. Darbon and S. Osher, preprint,
2007], with more details in [W. Yin, S. Osher, D. Goldfarb and J. Darbon, SIAM
J. Imaging Sciences, 1(1), 143-168, 2008] and rigorous theory given in [J. Cai,
S. Osher and Z. Shen, Math. Comp., to appear, 2008, see also UCLA CAM Report
08-06] and [J. Cai, S. Osher and Z. Shen, UCLA CAM Report, 08-52, 2008]. The
motivation was compressive sensing, which now has a vast and exciting history,
which seems to have started with Candès et al. [E. Candes, J. Romberg and T.
Tao, IEEE Trans. Inform. Theory, 52(2), 489-509, 2006] and Donoho [D. L. Donoho, IEEE Trans. Inform.
Theory, 52, 1289-1306, 2006]. See [W. Yin, S. Osher, D. Goldfarb and J. Darbon,
SIAM J. Imaging Sciences 1(1), 143-168, 2008] and [J. Cai, S. Osher and Z.
Shen, Math. Comp., to appear, 2008, see also UCLA CAM Report, 08-06] and [J.
Cai, S. Osher and Z. Shen, UCLA CAM Report, 08-52, 2008] for a large set of
references. Our method introduces an improvement called "kicking" of the very
efficient method of [J. Darbon and S. Osher, preprint, 2007] and [W. Yin, S.
Osher, D. Goldfarb and J. Darbon, SIAM J. Imaging Sciences, 1(1), 143-168,
2008] and also applies it to the problem of denoising of undersampled signals.
The use of Bregman iteration for denoising of images began in [S. Osher, M.
Burger, D. Goldfarb, J. Xu and W. Yin, Multiscale Model. Simul, 4(2), 460-489,
2005] and led to improved results for total variation based methods. Here we
apply it to denoise signals, especially essentially sparse signals, which might
even be undersampled.
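The basic linearized Bregman iteration for min{ ||u||_1 : Au = f } alternates a gradient step on the residual with soft-thresholding. The sketch below omits the "kicking" acceleration introduced here, and its parameter defaults and problem sizes are illustrative assumptions:

```python
import numpy as np

def shrink(x, mu):
    """Soft-thresholding (shrinkage) operator."""
    return np.sign(x) * np.maximum(np.abs(x) - mu, 0.0)

def linearized_bregman(A, f, mu=5.0, delta=None, iters=2000):
    """Basic linearized Bregman iteration for min ||u||_1 s.t. Au = f.

    v accumulates residual gradients; u is the thresholded iterate.
    delta is chosen conservatively relative to the spectral norm of A.
    """
    n = A.shape[1]
    if delta is None:
        delta = 0.9 / np.linalg.norm(A, 2) ** 2
    u = np.zeros(n)
    v = np.zeros(n)
    for _ in range(iters):
        v = v + A.T @ (f - A @ u)   # gradient step on the residual
        u = delta * shrink(v, mu)   # shrinkage step
    return u

# Small compressive-sensing demo: recover a sparse u0 from m < n measurements.
rng = np.random.default_rng(0)
m, n = 60, 200
A = rng.standard_normal((m, n)) / np.sqrt(m)
u0 = np.zeros(n)
u0[rng.choice(n, 5, replace=False)] = rng.standard_normal(5)
f = A @ u0
u_hat = linearized_bregman(A, f)
print(np.linalg.norm(A @ u_hat - f) / np.linalg.norm(f))  # relative residual
```

"Kicking" accelerates the long stretches of iterations during which the support of u does not change, but the stationary points are the same.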
PhaseCode: Fast and Efficient Compressive Phase Retrieval based on Sparse-Graph-Codes
We consider the problem of recovering a -sparse complex signal from
intensity measurements. We propose the PhaseCode algorithm, and show that
in the noiseless case, PhaseCode can recover an arbitrarily-close-to-one
fraction of the non-zero signal components using only slightly more than
measurements when the support of the signal is uniformly random, with
order-optimal time and memory complexity of . It is known that the
fundamental limit for the number of measurements in compressive phase retrieval
problem is to recover the signal exactly and with no assumptions on
its support distribution. This shows that under mild relaxation of the
conditions, our algorithm is the first constructive \emph{capacity-approaching}
compressive phase retrieval algorithm: in fact, our algorithm is also
order-optimal in complexity and memory. Next, motivated by some important
practical classes of optical systems, we consider a Fourier-friendly
constrained measurement setting, and show that its performance matches that of
the unconstrained setting. In the Fourier-friendly setting that we consider,
the measurement matrix is constrained to be a cascade of Fourier matrices and
diagonal matrices. We further demonstrate how PhaseCode can be robustified to
noise. Throughout, we provide extensive simulation results that validate the
practical power of our proposed algorithms for the sparse unconstrained and
Fourier-friendly measurement settings, for noiseless and noisy scenarios. A key
contribution of our work is the novel use of coding-theoretic tools like
density evolution methods for the design and analysis of fast and efficient
algorithms for compressive phase-retrieval problems.
Comment: To appear in IEEE Transactions on Information Theory
Learning Mixtures of Sparse Linear Regressions Using Sparse Graph Codes
In this paper, we consider the mixture of sparse linear regressions model.
Let be unknown sparse
parameter vectors with a total of non-zero coefficients. Noisy linear
measurements are obtained in the form ,
each of which is generated randomly from one of the sparse vectors with the
label unknown. The goal is to estimate the parameter vectors
efficiently with low sample and computational costs. This problem presents
significant challenges as one needs to simultaneously solve the demixing
problem of recovering the labels as well as the estimation problem
of recovering the sparse vectors .
Our solution to the problem leverages the connection between modern coding
theory and statistical inference. We introduce a new algorithm, Mixed-Coloring,
which samples the mixture strategically using query vectors
constructed based on ideas from sparse graph codes. Our novel code design
allows for both efficient demixing and parameter estimation. In the noiseless
setting, for a constant number of sparse parameter vectors, our algorithm
achieves the order-optimal sample and time complexities of . In the
presence of Gaussian noise, for the problem with two parameter vectors (i.e.,
), we show that the Robust Mixed-Coloring algorithm achieves near-optimal
sample and time complexities. When for
some constant (i.e., is sublinear in ), we can achieve
sample and time complexities both sublinear in the ambient dimension. In one of
our experiments, to recover a mixture of two regressions with dimension
and sparsity , our algorithm is more than times faster than the EM
algorithm, with about one third of its sample cost.
Comment: To appear in IEEE Transactions on Information Theory