22,659 research outputs found

    Isoparametric hypersurfaces in Randers space forms

    Full text link
    In this paper, we discuss anisotropic submanifolds and isoparametric hypersurfaces in a Randers space form (N, F) with navigation data (h, W). We find that (N, F), equipped with the Busemann-Hausdorff (BH) volume form, and (N, h) have the same isoparametric hypersurfaces, although in general their isoparametric functions differ. This implies that the classification of isoparametric hypersurfaces in a Randers space form is the same as in the Riemannian case. Lastly, we give some examples of isoparametric functions in Randers space forms.
    Comment: arXiv admin note: text overlap with arXiv:1709.0289
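
    For reference, in the Riemannian case a non-constant smooth function $f$ on $(N, h)$ is called isoparametric if there exist functions $a$ and $b$ such that

        $$|\nabla f|_h^2 = a(f), \qquad \Delta_h f = b(f),$$

    and its regular level sets are the isoparametric hypersurfaces. (This is a sketch of the standard definition only; the anisotropic version used in the paper replaces the gradient and Laplacian with Finslerian counterparts taken with respect to $F$ and the BH-volume.)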

    Learning Speech Rate in Speech Recognition

    Full text link
    A significant performance degradation is often observed in speech recognition when the rate of speech (ROS) is too low or too high. Most present approaches to ROS variation focus on the changes in the dynamic properties of speech signals caused by ROS, and accordingly modify the dynamic model, e.g., the transition probabilities of the hidden Markov model (HMM). However, an abnormal ROS changes not only the dynamic but also the static properties of speech signals, and thus cannot be compensated for purely by modifying the dynamic model. This paper proposes an ROS learning approach based on deep neural networks (DNN), which takes an ROS feature as an additional input to the DNN model so that the spectral distortion caused by ROS can be learned and compensated for. The experimental results show that this approach delivers better performance on both too-slow and too-fast utterances, supporting our conjecture that ROS impacts both the dynamic and the static properties of speech. In addition, the proposed approach can be combined with the conventional HMM transition adaptation method, offering additional performance gains.
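
    As a concrete illustration of the input construction described above, the sketch below (Python/PyTorch; all names and dimensions are illustrative, not taken from the paper) appends an utterance-level ROS value to each frame's acoustic feature vector before a DNN that predicts HMM state posteriors:

        import torch
        import torch.nn as nn

        # Illustrative dimensions, not from the paper.
        n_frames, feat_dim, n_states = 200, 40, 1000

        fbank = torch.randn(n_frames, feat_dim)     # per-frame acoustic features
        ros = torch.full((n_frames, 1), 5.2)        # utterance-level ROS, repeated per frame

        x = torch.cat([fbank, ros], dim=1)          # ROS feature appended to the DNN input

        dnn = nn.Sequential(
            nn.Linear(feat_dim + 1, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, n_states),               # logits over HMM states
        )
        log_post = dnn(x).log_softmax(dim=1)        # state posteriors for decoding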

    Rademacher Complexity for Adversarially Robust Generalization

    Full text link
    Many machine learning models are vulnerable to adversarial attacks; for example, adding adversarial perturbations that are imperceptible to humans can often make machine learning models produce wrong predictions with high confidence. Moreover, although we may obtain robust models on the training dataset via adversarial training, in some problems the learned models cannot generalize well to the test data. In this paper, we focus on $\ell_\infty$ attacks, and study the adversarially robust generalization problem through the lens of Rademacher complexity. For binary linear classifiers, we prove tight bounds for the adversarial Rademacher complexity, and show that the adversarial Rademacher complexity is never smaller than its natural counterpart, and that it has an unavoidable dimension dependence unless the weight vector has bounded $\ell_1$ norm. The results also extend to multi-class linear classifiers. For (nonlinear) neural networks, we show that the dimension dependence in the adversarial Rademacher complexity also exists. We further consider a surrogate adversarial loss for one-hidden-layer ReLU networks and prove margin bounds for this setting. Our results indicate that having $\ell_1$ norm constraints on the weight matrices might be a potential way to improve generalization in the adversarial setting. We present experimental results that validate our theoretical findings.
    Comment: ICML 201
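
    For intuition, the $\ell_\infty$ attack on a binary linear classifier admits a closed form (a standard identity, consistent with the $\ell_1$ discussion above): for labels $y \in \{\pm 1\}$,

        $$\min_{\|\delta\|_\infty \le \epsilon} y\,\langle w, x + \delta \rangle = y\,\langle w, x \rangle - \epsilon \|w\|_1,$$

    so the adversarial margin of a linear classifier is its natural margin minus an $\epsilon \|w\|_1$ penalty, which is why bounding the $\ell_1$ norm of $w$ controls the dimension dependence.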

    Distributed Sequence Memory of Multidimensional Inputs in Recurrent Networks

    Full text link
    Recurrent neural networks (RNNs) have drawn interest from machine learning researchers because of their effectiveness at preserving past inputs for time-varying data processing tasks. To understand the success and limitations of RNNs, it is critical that we advance our analysis of their fundamental memory properties. We focus on echo state networks (ESNs), which are RNNs with simple memoryless nodes and random connectivity. Most existing analyses of short-term memory (STM) capacity conclude that the ESN network size must scale linearly with the input size for unstructured inputs. The main contribution of this paper is to provide general results characterizing the STM capacity of linear ESNs with multidimensional input streams when the inputs have common low-dimensional structure: sparsity in a basis or significant statistical dependence between inputs. In both cases, we show that the number of nodes in the network must scale linearly with the information rate and only poly-logarithmically with the ambient input dimension. The analysis relies on advanced applications of random matrix theory and yields explicit non-asymptotic bounds on the recovery error. Taken together, this analysis provides a significant step forward in our understanding of the STM properties of RNNs.
    Comment: 37 pages, 3 figures
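
    A minimal sketch of the linear ESN model analyzed above (Python/NumPy; the orthogonal recurrent matrix and the least-squares readout are illustrative choices, not the paper's construction): the final state is a linear sketch of the input history, so past inputs can be read out. The paper's point is that when the inputs are sparse or statistically dependent, sparse-recovery readouts let the network size scale with the information rate rather than the input length; the demo below stays in the easy regime where the network is larger than the history.

        import numpy as np

        rng = np.random.default_rng(0)
        n, T = 400, 300                                  # network size, input length (illustrative)

        # Linear ESN with memoryless nodes: x_t = W x_{t-1} + v * s_t.
        # A random orthogonal W preserves state norms (echo-state-friendly).
        W, _ = np.linalg.qr(rng.standard_normal((n, n)))
        v = rng.standard_normal(n)

        s = np.zeros(T)
        idx = rng.choice(T, 10, replace=False)
        s[idx] = rng.standard_normal(10)                 # sparse scalar input stream

        x = np.zeros(n)
        for t in range(T):
            x = W @ x + v * s[t]

        # The final state is a linear sketch of the whole history: x = A s,
        # with columns A[:, t] = W^(T-1-t) v, so past inputs can be read out.
        A = np.empty((n, T))
        col = v.copy()
        for t in range(T - 1, -1, -1):
            A[:, t] = col
            col = W @ col
        s_hat = np.linalg.lstsq(A, x, rcond=None)[0]
        print(np.max(np.abs(s_hat - s)))                 # tiny: exact recovery here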

    Long-range Effects on the Pyroelectric Coefficient of Ferroelectric Superlattice

    Full text link
    Long-range effects on the pyroelectric coefficient of a ferroelectric superlattice consisting of two different ferroelectric materials are investigated based on the Transverse Ising Model. The effects of the interfacial coupling and the thickness of one period on the pyroelectric coefficient of the superlattice are studied, taking the long-range interaction into account. It is found that as the strength of the long-range interaction increases, the pyroelectric coefficient decreases for temperatures below the phase transition temperature, the number of pyroelectric peaks gradually decreases, and the phase transition temperature increases. It is also found that as the interfacial coupling and the thickness of one period decrease, the phase transition temperature and the number of pyroelectric peaks decrease.
    Comment: 19 pages, 7 figures
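
    The Transverse Ising Model referred to above is commonly written in the following standard form (the long-range variant lets the coupling $J_{ij}$ extend beyond nearest neighbours; the exact couplings used in the paper are not reproduced here):

        $$H = -\Omega \sum_i S_i^x - \sum_{i<j} J_{ij}\, S_i^z S_j^z,$$

    where $\Omega$ is the transverse field. The polarization is $P \propto \frac{1}{N}\sum_i \langle S_i^z \rangle$, and the pyroelectric coefficient is its temperature derivative, $p(T) = \partial P / \partial T$ (up to sign convention).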

    Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning

    Full text link
    We study robust distributed learning that involves minimizing a non-convex loss function with saddle points. We consider the Byzantine setting, in which some worker machines have abnormal or even arbitrary and adversarial behavior. In this setting, the Byzantine machines may create fake local minima near a saddle point that is far away from any true local minimum, even when robust gradient estimators are used. We develop ByzantinePGD, a robust first-order algorithm that can provably escape saddle points and fake local minima and converge to an approximate true local minimizer with low iteration complexity. As a by-product, we give a simpler algorithm and analysis for escaping saddle points in the usual non-Byzantine setting. We further discuss three robust gradient estimators that can be used in ByzantinePGD: median, trimmed mean, and iterative filtering. We characterize their performance in concrete statistical settings, and argue for their near-optimality in both low- and high-dimensional regimes.
    Comment: ICML 201
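
    The sketch below (Python/NumPy; thresholds, step sizes, and the perturbation rule are illustrative, not the paper's exact procedure) combines the two ingredients named above: a robust aggregator over worker gradients, plus a random perturbation in the spirit of perturbed gradient descent, applied when the aggregated gradient is small:

        import numpy as np

        rng = np.random.default_rng(1)

        def trimmed_mean(grads, beta):
            # Coordinate-wise trimmed mean: in each coordinate, drop the
            # beta-fraction largest and smallest values, average the rest.
            g = np.sort(np.asarray(grads), axis=0)
            k = int(beta * len(grads))
            return g[k:len(grads) - k].mean(axis=0)

        def byzantine_pgd_step(w, worker_grads, lr=0.1, beta=0.1,
                               perturb_thresh=1e-3, radius=1e-2):
            g = trimmed_mean(worker_grads, beta)
            if np.linalg.norm(g) < perturb_thresh:
                # Near a first-order stationary point: a random perturbation
                # lets the iterate escape saddle points and fake local minima.
                w = w + rng.uniform(-radius, radius, size=w.shape)
            return w - lr * g

        # One Byzantine worker sends a wild gradient; the trimmed mean ignores it.
        w = np.zeros(5)
        grads = [rng.normal(size=5) for _ in range(9)] + [1e6 * np.ones(5)]
        w = byzantine_pgd_step(w, grads, beta=0.1)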

    Scheduling Constraint Based Abstraction Refinement for Multi-Threaded Program Verification

    Full text link
    Bounded model checking (BMC) is among the most efficient techniques for the automatic verification of concurrent programs. However, encoding all possible interleavings often requires a huge and complex formula, which significantly limits scalability. This paper proposes a novel and efficient abstraction refinement method for multi-threaded program verification. Observing that the huge formula is usually dominated by the exact encoding of the scheduling constraint, this paper proposes a scheduling constraint based abstraction refinement method, which avoids the huge and complex encoding of BMC. In addition, to obtain effective refinements, we have devised two graph-based algorithms over the event order graph for counterexample validation and refinement generation, which always obtain a small yet effective refinement constraint. We have proved that our method is sound and complete w.r.t. the given loop unwinding depth. Experimental results on SV-COMP concurrency benchmarks indicate that our method is promising and significantly outperforms the existing state-of-the-art tools.
    Comment: 27 pages, 16 figures
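
    As a toy illustration of the event-order-graph idea (Python; event names and edges are invented for illustration, and the paper's actual validation and refinement algorithms are more involved): a counterexample of the scheduling-abstracted formula induces a happens-before graph, and a cycle in that graph proves the counterexample infeasible, with the cycle's edges yielding a small refinement constraint.

        from collections import defaultdict

        def find_cycle(edges):
            # DFS over the directed happens-before graph induced by the abstract
            # counterexample; returns one cycle (a list of events) if the order
            # is infeasible, else None.
            graph = defaultdict(list)
            for u, v in edges:
                graph[u].append(v)
            WHITE, GRAY, BLACK = 0, 1, 2
            color, stack = defaultdict(int), []

            def dfs(u):
                color[u] = GRAY
                stack.append(u)
                for v in graph[u]:
                    if color[v] == GRAY:             # back edge closes a cycle
                        return stack[stack.index(v):]
                    if color[v] == WHITE:
                        cyc = dfs(v)
                        if cyc:
                            return cyc
                stack.pop()
                color[u] = BLACK
                return None

            for u in list(graph):
                if color[u] == WHITE:
                    cyc = dfs(u)
                    if cyc:
                        return cyc
            return None

        # Hypothetical events: a cycle means no real schedule realizes this
        # order, so its edges are returned as the refinement constraint.
        print(find_cycle([("w1", "r2"), ("r2", "w3"), ("w3", "w1")]))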

    Fast Linearized Bregman Iteration for Compressive Sensing and Sparse Denoising

    Full text link
    We propose and analyze an extremely fast, efficient, and simple method for solving the problem

        $$\min\{\|u\|_1 : Au = f,\ u \in \mathbb{R}^n\}.$$

    This method was first described in [J. Darbon and S. Osher, preprint, 2007], with more details in [W. Yin, S. Osher, D. Goldfarb and J. Darbon, SIAM J. Imaging Sciences, 1(1), 143-168, 2008] and rigorous theory given in [J. Cai, S. Osher and Z. Shen, Math. Comp., to appear, 2008; see also UCLA CAM Report 08-06] and [J. Cai, S. Osher and Z. Shen, UCLA CAM Report 08-52, 2008]. The motivation was compressive sensing, which now has a vast and exciting history that seems to have started with Candès et al. [E. Candès, J. Romberg and T. Tao, IEEE Trans. Inform. Theory, 52(2), 489-509, 2006] and Donoho [D. L. Donoho, IEEE Trans. Inform. Theory, 52, 1289-1306, 2006]; see [W. Yin, S. Osher, D. Goldfarb and J. Darbon, 2008] and [J. Cai, S. Osher and Z. Shen, 2008] for a large set of references. Our method introduces an improvement called "kicking" of the very efficient method of [J. Darbon and S. Osher, preprint, 2007] and [W. Yin, S. Osher, D. Goldfarb and J. Darbon, 2008], and also applies it to the denoising of undersampled signals. The use of Bregman iteration for denoising of images began in [S. Osher, M. Burger, D. Goldfarb, J. Xu and W. Yin, Multiscale Model. Simul., 4(2), 460-489, 2005] and led to improved results for total variation based methods. Here we apply it to denoise signals, especially essentially sparse signals, which might even be undersampled.
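
    A minimal NumPy sketch of the basic linearized Bregman iteration for this problem (parameter values are illustrative; the plain version below exhibits exactly the stagnation phases that the "kicking" improvement is designed to skip through):

        import numpy as np

        def shrink(x, mu):
            # Soft-thresholding: sign(x) * max(|x| - mu, 0).
            return np.sign(x) * np.maximum(np.abs(x) - mu, 0.0)

        def linearized_bregman(A, f, mu, delta, iters=5000):
            # Basic linearized Bregman iteration for min ||u||_1 s.t. Au = f.
            u = np.zeros(A.shape[1])
            v = np.zeros(A.shape[1])
            for _ in range(iters):
                v += A.T @ (f - A @ u)       # accumulate residual correlations
                u = delta * shrink(v, mu)    # thresholded proximal step
            return u

        rng = np.random.default_rng(0)
        m, n, k = 100, 400, 10
        A = rng.standard_normal((m, n)) / np.sqrt(m)
        u_true = np.zeros(n)
        u_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
        f = A @ u_true

        delta = 1.0 / np.linalg.norm(A, 2) ** 2   # satisfies the usual step-size condition
        u = linearized_bregman(A, f, mu=5.0, delta=delta)
        print(np.linalg.norm(u - u_true) / np.linalg.norm(u_true))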

    PhaseCode: Fast and Efficient Compressive Phase Retrieval based on Sparse-Graph-Codes

    Full text link
    We consider the problem of recovering a $K$-sparse complex signal $x$ from $m$ intensity measurements. We propose the PhaseCode algorithm, and show that in the noiseless case, PhaseCode can recover an arbitrarily-close-to-one fraction of the $K$ non-zero signal components using only slightly more than $4K$ measurements when the support of the signal is uniformly random, with order-optimal time and memory complexity of $\Theta(K)$. It is known that the fundamental limit on the number of measurements in the compressive phase retrieval problem is $4K - o(K)$ to recover the signal exactly, with no assumptions on its support distribution. This shows that, under a mild relaxation of the conditions, our algorithm is the first constructive capacity-approaching compressive phase retrieval algorithm; in fact, our algorithm is also order-optimal in complexity and memory. Next, motivated by some important practical classes of optical systems, we consider a Fourier-friendly constrained measurement setting, and show that its performance matches that of the unconstrained setting. In the Fourier-friendly setting that we consider, the measurement matrix is constrained to be a cascade of Fourier matrices and diagonal matrices. We further demonstrate how PhaseCode can be robustified to noise. Throughout, we provide extensive simulation results that validate the practical power of our proposed algorithms for the sparse unconstrained and Fourier-friendly measurement settings, in both noiseless and noisy scenarios. A key contribution of our work is the novel use of coding-theoretic tools like density evolution for the design and analysis of fast and efficient algorithms for compressive phase retrieval.
    Comment: To appear in IEEE Transactions on Information Theory
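
    A sketch of the Fourier-friendly measurement model described above (Python/NumPy; the number of diagonal masks is illustrative, and the PhaseCode peeling decoder itself is omitted):

        import numpy as np

        rng = np.random.default_rng(0)
        n, K = 256, 8

        # K-sparse complex signal (recoverable only up to a global phase).
        x = np.zeros(n, dtype=complex)
        supp = rng.choice(n, K, replace=False)
        x[supp] = rng.standard_normal(K) + 1j * rng.standard_normal(K)

        # Fourier-friendly measurements: intensities |F D x|^2, where D is a
        # random diagonal phase mask and F the DFT.
        intensities = []
        for _ in range(4):                               # number of masks: illustrative
            d = np.exp(2j * np.pi * rng.random(n))       # diagonal of D
            intensities.append(np.abs(np.fft.fft(d * x)) ** 2)
        y = np.concatenate(intensities)                  # phaseless data for the (omitted) decoder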

    Learning Mixtures of Sparse Linear Regressions Using Sparse Graph Codes

    Full text link
    In this paper, we consider the mixture of sparse linear regressions model. Let $\beta^{(1)},\ldots,\beta^{(L)}\in\mathbb{C}^n$ be $L$ unknown sparse parameter vectors with a total of $K$ non-zero coefficients. Noisy linear measurements are obtained in the form $y_i = x_i^H \beta^{(\ell_i)} + w_i$, each of which is generated randomly from one of the sparse vectors with the label $\ell_i$ unknown. The goal is to estimate the parameter vectors efficiently with low sample and computational costs. This problem presents significant challenges, as one needs to simultaneously solve the demixing problem of recovering the labels $\ell_i$ as well as the estimation problem of recovering the sparse vectors $\beta^{(\ell)}$. Our solution to the problem leverages the connection between modern coding theory and statistical inference. We introduce a new algorithm, Mixed-Coloring, which samples the mixture strategically using query vectors $x_i$ constructed based on ideas from sparse graph codes. Our novel code design allows for both efficient demixing and parameter estimation. In the noiseless setting, for a constant number of sparse parameter vectors, our algorithm achieves the order-optimal sample and time complexities of $\Theta(K)$. In the presence of Gaussian noise, for the problem with two parameter vectors (i.e., $L = 2$), we show that the Robust Mixed-Coloring algorithm achieves near-optimal $\Theta(K\,\mathrm{polylog}(n))$ sample and time complexities. When $K = O(n^{\alpha})$ for some constant $\alpha\in(0,1)$ (i.e., $K$ is sublinear in $n$), we can achieve sample and time complexities both sublinear in the ambient dimension. In one of our experiments, to recover a mixture of two regressions with dimension $n = 500$ and sparsity $K = 50$, our algorithm is more than $300$ times faster than the EM algorithm, with about one third of its sample cost.
    Comment: To appear in IEEE Transactions on Information Theory
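
    A sketch of the observation model above (Python/NumPy; the strategic query design and the peeling decoder are the paper's contribution and are omitted here), using the dimensions from the experiment mentioned in the abstract:

        import numpy as np

        rng = np.random.default_rng(0)
        n, L, K, m = 500, 2, 50, 2000        # n, K as in the experiment above; m illustrative

        # L sparse parameter vectors with K non-zero coefficients in total.
        betas = np.zeros((L, n), dtype=complex)
        support = rng.choice(n, K, replace=False)
        for j, chunk in enumerate(np.array_split(support, L)):
            betas[j, chunk] = rng.standard_normal(chunk.size) + 1j * rng.standard_normal(chunk.size)

        # Measurements y_i = x_i^H beta^(l_i) + w_i with hidden labels l_i.
        X = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
        labels = rng.integers(L, size=m)                      # unknown to the learner
        w = 0.01 * (rng.standard_normal(m) + 1j * rng.standard_normal(m))
        y = np.einsum('ij,ij->i', X.conj(), betas[labels]) + w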