142 research outputs found

    Lattice points in model domains of finite type in $\mathbb{R}^d$, II

    We study the lattice point problem associated with a special class of high-dimensional finite type domains by estimating the Fourier transforms of the corresponding indicator functions.

    An Optimal Transport View on Generalization

    We derive upper bounds on the generalization error of learning algorithms based on their \emph{algorithmic transport cost}: the expected Wasserstein distance between the output hypothesis and the output hypothesis conditioned on an input example. The bounds provide a novel approach to studying the generalization of learning algorithms from an optimal transport view and impose fewer constraints on the loss function, such as sub-Gaussianity or boundedness. We further provide several upper bounds on the algorithmic transport cost in terms of total variation distance, relative entropy (or KL-divergence), and VC dimension, thus further bridging optimal transport theory and information theory with statistical learning theory. Moreover, we study different conditions on loss functions under which the generalization error of a learning algorithm can be upper bounded by different probability metrics between distributions relating to the output hypothesis and/or the input data. Finally, under our established framework, we analyze generalization in deep learning and conclude that the generalization error in deep neural networks (DNNs) decreases exponentially to zero as the number of layers increases. Our analyses of generalization error in deep learning mainly exploit the hierarchical structure in DNNs and the contraction property of $f$-divergence, which may be of independent interest in analyzing other learning models with hierarchical structure. Comment: 27 pages, 2 figures, 1 tabl

    Theoretical Analysis of Adversarial Learning: A Minimax Approach

    Here we propose a general theoretical method for analyzing the risk bound in the presence of adversaries. Specifically, we try to fit the adversarial learning problem into the minimax framework. We first show that the original adversarial learning problem can be reduced to a minimax statistical learning problem by introducing a transport map between distributions. Then, we prove a new risk bound for this minimax problem in terms of covering numbers under a weak version of the Lipschitz condition. Our method can be applied to multi-class classification problems and commonly used loss functions such as the hinge and ramp losses. As illustrative examples, we derive the adversarial risk bounds for SVMs, deep neural networks, and PCA. Our bounds have two data-dependent terms, which can be optimized to achieve adversarial robustness. Comment: 27 pages, add some reference

    An Information-Theoretic View for Deep Learning

    Deep learning has transformed computer vision, natural language processing, and speech recognition \cite{badrinarayanan2017segnet, dong2016image, ren2017faster, ji20133d}. However, two critical questions remain obscure: (1) why do deep neural networks generalize better than shallow networks; and (2) does it always hold that a deeper network leads to better performance? Specifically, letting $L$ be the number of convolutional and pooling layers in a deep neural network, and $n$ be the size of the training sample, we derive an upper bound on the expected generalization error for this network, i.e., \begin{eqnarray*} \mathbb{E}[R(W)-R_S(W)] \leq \exp{\left(-\frac{L}{2}\log{\frac{1}{\eta}}\right)}\sqrt{\frac{2\sigma^2}{n}I(S,W) } \end{eqnarray*} where $\sigma > 0$ is a constant depending on the loss function, $0 < \eta < 1$ is a constant depending on the information loss for each convolutional or pooling layer, and $I(S, W)$ is the mutual information between the training sample $S$ and the output hypothesis $W$. This upper bound shows that as the number of convolutional and pooling layers $L$ increases in the network, the expected generalization error will decrease exponentially to zero. Layers with strict information loss, such as the convolutional layers, reduce the generalization error for the whole network; this answers the first question. However, zero expected generalization error does not imply a small test error $\mathbb{E}[R(W)]$. This is because the training error $\mathbb{E}[R_S(W)]$ is large when the information for fitting the data is lost as the number of layers increases. This suggests that the claim "the deeper the better" is conditioned on a small training error $\mathbb{E}[R_S(W)]$. Finally, we show that deep learning satisfies a weak notion of stability and that the sample complexity of deep neural networks will decrease as $L$ increases. Comment: Add details in the proof of Theorem
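
    To make the depth dependence of the bound above concrete, the following is a minimal sketch that simply evaluates its right-hand side. The function name and the numeric inputs are hypothetical, chosen for illustration only; in practice $\eta$, $\sigma$ and $I(S, W)$ are not directly observable and would have to be estimated or assumed.

```python
import math


def generalization_bound(num_layers, eta, sigma, n, mutual_info):
    """Evaluate exp(-(L/2) * log(1/eta)) * sqrt(2 * sigma^2 * I(S, W) / n).

    All arguments are illustrative inputs (assumptions), not quantities
    measured from a real network.
    """
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    if n <= 0 or mutual_info < 0.0:
        raise ValueError("n must be positive and mutual_info non-negative")
    # Contraction factor from L information-losing layers; equals eta ** (L / 2).
    contraction = math.exp(-(num_layers / 2.0) * math.log(1.0 / eta))
    # Depth-independent term driven by the sample size and I(S, W).
    base = math.sqrt(2.0 * sigma ** 2 * mutual_info / n)
    return contraction * base


# The bound shrinks geometrically as layers are added (hypothetical values).
for depth in (1, 2, 4, 8):
    print(depth, generalization_bound(depth, eta=0.5, sigma=1.0, n=1000, mutual_info=10.0))
```

    Because the contraction factor equals $\eta^{L/2}$, each additional layer multiplies the bound by $\sqrt{\eta}$. This says nothing about the training error $\mathbb{E}[R_S(W)]$, which the abstract notes can grow with depth.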

    Improving "Fast Iterative Shrinkage-Thresholding Algorithm": Faster, Smarter and Greedier

    The "fast iterative shrinkage-thresholding algorithm", a.k.a. FISTA, is one of the most well-known first-order optimisation schemes in the literature, as it achieves the worst-case $O(1/k^2)$ optimal convergence rate in terms of objective function value. However, despite such an optimal theoretical convergence rate, in practice the (local) oscillatory behaviour of FISTA often damps its efficiency. Over the past years, various efforts have been made in the literature to improve the practical performance of FISTA, such as monotone FISTA, restarting FISTA and backtracking strategies. In this paper, we propose a simple yet effective modification to the original FISTA scheme which has two advantages: it allows us to 1) prove the convergence of the generated sequence; 2) design a so-called "lazy-start" strategy which can be up to an order faster than the original scheme. Moreover, by exploring the properties of the FISTA scheme, we propose novel adaptive and greedy strategies which probe the limit of the algorithm. The advantages of the proposed schemes are tested on problems arising from inverse problems, machine learning and signal/image processing. Comment: correct proof of one lemm
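
    For reference, the baseline that the abstract sets out to improve can be sketched as follows. This is the original FISTA iteration applied to a small lasso problem, not the modified, lazy-start, adaptive or greedy variants proposed in the paper. The function names, the dense list-of-lists matrix format and the user-supplied Lipschitz constant are assumptions made for illustration.

```python
import math


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied element-wise."""
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]


def fista_lasso(A, b, lam, lipschitz, num_iters=200):
    """Original FISTA for min_x 0.5 * ||A x - b||^2 + lam * ||x||_1.

    A is a list of rows, and `lipschitz` must upper-bound the largest
    eigenvalue of A^T A (assumed to be supplied by the caller).
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n   # current iterate x_k
    y = list(x)     # extrapolated point y_k
    t = 1.0         # momentum parameter t_k
    for _ in range(num_iters):
        # Gradient of the smooth part at y: A^T (A y - b).
        residual = [sum(A[i][j] * y[j] for j in range(n)) - b[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Forward-backward (proximal gradient) step with step size 1 / lipschitz.
        x_new = soft_threshold(
            [y[j] - grad[j] / lipschitz for j in range(n)], lam / lipschitz
        )
        # Nesterov-type momentum update used by the original scheme.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = [x_new[j] + ((t - 1.0) / t_new) * (x_new[j] - x[j]) for j in range(n)]
        x, t = x_new, t_new
    return x
```

    With an identity matrix the minimiser is the soft-thresholded right-hand side, which gives a quick sanity check: `fista_lasso([[1, 0], [0, 1]], [3.0, -0.5], lam=1.0, lipschitz=1.0)` should shrink the first entry by 1 and set the second to zero.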

    A Novel Consensus-based Distributed Algorithm for Economic Dispatch Based on Local Estimation of Power Mismatch

    This paper proposes a novel consensus-based distributed control algorithm for solving the economic dispatch problem of distributed generators. A legacy central controller can be eliminated in order to avoid a single point of failure, relieve computational burden, maintain data privacy, and support plug-and-play functionalities. The optimal economic dispatch is achieved through the iterative coordination of local agents (consumers and distributed generators). As coordination information, the local estimation of power mismatch is shared among distributed generators through communication networks and does not contain any private information, ultimately contributing to a fair electricity market. Additionally, the proposed distributed algorithm is designed for easy implementation and configuration of a large number of agents, in which the distributed decision making can be implemented in a simple proportional-integral (PI) or integral (I) controller. In MATLAB/Simulink simulation, the accuracy of the proposed distributed algorithm is demonstrated in a 29-node system in comparison with the centralized algorithm. Scalability and a fast convergence rate are also demonstrated in a 1400-node case study. Further, an experimental test demonstrates the practical performance of the proposed distributed algorithm using the VOLTTRON platform and a cluster of low-cost credit-card-size single-board PCs. Comment: 16 pages, 13 figures. Figure order and references are corrected

    Experimental realization of quantum algorithms for linear system inspired by adiabatic quantum computing

    The quantum adiabatic algorithm is of vital importance in the field of quantum computation. It offers an alternative to the quantum gate model for manipulating a system. Recently, an interesting work, arXiv:1805.10549, indicated that a system of linear equations can be solved via an algorithm inspired by adiabatic quantum computing. Here we demonstrate the algorithm and realize the solution of an 8-dimensional system of linear equations $A\textbf{x}=\textbf{b}$ in a 4-qubit nuclear magnetic resonance system. To date, this is the highest-dimensional system of linear equations solved experimentally with a limited number of qubits, and the experiment includes some ingenious simplifications. Our experiment opens up a new possibility for solving many practical problems related to systems of linear equations and has potential applications in designing future quantum algorithms.

    Quantum Pure State Tomography via Variational Hybrid Quantum-Classical Method

    To obtain a complete description of a quantum system, one usually employs standard quantum state tomography, which however requires an exponential number of measurements and hence is impractical when the system grows large. In this work, we introduce a self-learning tomographic scheme based on the variational hybrid quantum-classical method. The key part of the scheme is a learning procedure, in which we learn a control sequence capable of driving the unknown target state coherently to a simple fiducial state, so that the target state can be directly reconstructed by applying the control sequence in reverse. In this manner, the state tomography problem is converted to a state-to-state transfer problem. To solve the latter problem, we use the closed-loop learning control approach. Our scheme is further tested experimentally on a 4-qubit nuclear magnetic resonance system. Experimental results indicate that the proposed tomographic scheme can handle a broad class of states, including entangled states in quantum information as well as dynamical states of quantum many-body systems common to condensed matter physics. Comment: 9 pages, 5 figures. To be published in Physical Review Applie

    Experimental observation of information flow in the anti-$\mathcal{PT}$-symmetric system

    Recent theoretical and experimental research on $\mathcal{PT}$-symmetric systems has attracted unprecedented attention because of their various novel features and their potential for extending canonical quantum mechanics. However, there has been little research on anti-$\mathcal{PT}$-symmetry, the counterpart of $\mathcal{PT}$-symmetry. Here, we propose an algorithm for simulating the universal anti-$\mathcal{PT}$-symmetric system with a quantum circuit. Using this protocol, an oscillation of information flow is observed for the first time in our nuclear magnetic resonance quantum simulator. We show that information recovers from the environment completely when the anti-$\mathcal{PT}$-symmetry is broken, whereas no information can be retrieved in the symmetry-unbroken phase. Our work opens the door to practical quantum simulation and experimental investigation of the universal anti-$\mathcal{PT}$-symmetric system on a quantum computer.

    AlignShift: Bridging the Gap of Imaging Thickness in 3D Anisotropic Volumes

    This paper addresses a fundamental challenge in 3D medical image processing: how to deal with imaging thickness. For anisotropic medical volumes, there is a significant performance gap between thin-slice (mostly 1mm) and thick-slice (mostly 5mm) volumes. Prior work tends to use 3D approaches for thin-slice volumes and 2D approaches for thick-slice volumes. We aim at a unified approach for both thin- and thick-slice medical volumes. Inspired by recent advances in video analysis, we propose AlignShift, a novel parameter-free operator that can, in theory, convert any 2D pretrained network into a thickness-aware 3D network. Remarkably, the converted networks behave like 3D networks on thin-slice volumes, yet adaptively degenerate to 2D on thick-slice volumes. The unified thickness-aware representation learning is achieved by shifting and fusing aligned "virtual slices" according to the input imaging thickness. Extensive experiments on the public large-scale DeepLesion benchmark, consisting of 32K lesions for universal lesion detection, validate the effectiveness of our method, which outperforms the previous state of the art by considerable margins without bells and whistles. More importantly, to our knowledge, this is the first method that bridges the performance gap between thin- and thick-slice volumes with a unified framework. To improve research reproducibility, our code in PyTorch is open source at https://github.com/M3DV/AlignShift. Comment: MICCAI 2020 (early accepted). Camera ready version. Code is available at https://github.com/M3DV/AlignShif