411 research outputs found

    Duality relation between coherence and path information in the presence of quantum memory

    The wave-particle duality expresses a competition between wave and particle behavior for a particle going through an interferometer. This duality can be formulated as an inequality that upper bounds the sum of interference visibility and path information. However, if the particle is entangled with a quantum memory, the bound may decrease. Here, we derive the duality relation between coherence and path information for a particle going through a multipath interferometer in the presence of a quantum memory, obtaining an upper bound that is directly connected to the amount of entanglement between the particle and the quantum memory. Comment: 6 pages, 1 figure; comments are welcome.
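    For orientation, the canonical two-path instance of this trade-off is Englert's inequality between fringe visibility and path distinguishability; the abstract's result generalizes it to multipath coherence with a memory-dependent bound. A minimal LaTeX sketch, with the memory-assisted form written only schematically (the symbols C, P, and B are illustrative, not the paper's notation):

```latex
% Canonical two-path duality (Englert, 1996): fringe visibility V and
% path distinguishability D obey
\[
  D^{2} + V^{2} \le 1 .
\]
% Schematic memory-assisted multipath form suggested by the abstract
% (assumed notation): coherence C plus path information P is bounded by a
% quantity that decreases with the particle-memory entanglement E.
\[
  C + P \;\le\; B(E), \qquad B \ \text{non-increasing in } E .
\]
```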

    Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate

    Recent works (e.g., Li and Arora, 2020) suggest that the popular normalization schemes used in today's deep learning (including Batch Normalization) can move it far from a traditional optimization viewpoint, e.g., by permitting exponentially increasing learning rates. The current paper highlights other ways in which the behavior of normalized nets departs from traditional viewpoints, and then initiates a formal framework for studying their mathematics via a suitable adaptation of the conventional framework: modeling the SGD-induced training trajectory via a stochastic differential equation (SDE) with a noise term that captures gradient noise. This yields: (a) a new 'intrinsic learning rate' parameter that is the product of the normal learning rate and the weight decay factor; analysis of the SDE shows how the effective speed of learning varies and equilibrates over time under the control of the intrinsic LR. (b) A challenge, via theory and experiments, to the popular belief that good generalization requires large learning rates at the start of training. (c) New experiments, backed by mathematical intuition, suggesting that the number of steps to equilibrium (in function space) scales as the inverse of the intrinsic learning rate, as opposed to the exponential-time convergence bound implied by SDE analysis. We name this the Fast Equilibrium Conjecture and suggest it holds the key to why Batch Normalization is effective. Comment: 25 pages, 12 figures. Accepted by the 34th Conference on Neural Information Processing Systems (NeurIPS 2020).
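    To make the quantity in (a) concrete, here is a minimal Python sketch assuming plain SGD with L2 weight decay, as the abstract describes; the function names and constants are illustrative, not taken from the paper:

```python
# Minimal sketch: SGD with L2 weight decay; the "intrinsic learning rate"
# is the product of the learning rate eta and the weight decay factor lam.
# Names and values here are illustrative assumptions, not from the paper.

def sgd_step(w: float, grad: float, eta: float = 0.1, lam: float = 5e-4) -> float:
    """One SGD step with weight decay folded into the gradient."""
    return w - eta * (grad + lam * w)

def intrinsic_lr(eta: float, lam: float) -> float:
    """The abstract's intrinsic LR: eta * lam. The claimed equilibration
    time in function space scales as the inverse of this product."""
    return eta * lam

print(intrinsic_lr(0.1, 5e-4))  # 5e-05
```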

    Quantifying the resource content of quantum channels: An operational approach

    We propose a general method to operationally quantify the resourcefulness of quantum channels via channel discrimination, an important information-processing task. A main result is that the maximum success probability of distinguishing a given channel from the set of free channels by free probe states is exactly characterized by the resource generating power, i.e., the maximum amount of resource produced by the action of the channel, as measured by the trace distance to the set of free states. We apply this framework to the resource theory of quantum coherence as an informative example. The general results can also be easily applied to other resource theories such as entanglement, magic states, and asymmetry. Comment: v2: 9 pages, new references added; v1: 8 pages, no figures.
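    For background, trace-distance characterizations of discrimination tasks rest on the Helstrom bound for two equiprobable states; the paper's channel-level statement arises by optimizing such a quantity over free probe states and the set of free channels. A sketch of the standard bound (the channel-level lift is only indicated, in assumed notation):

```latex
% Helstrom bound: optimal success probability for discriminating two
% equiprobable quantum states rho and sigma.
\[
  p_{\mathrm{succ}}(\rho,\sigma)
  = \frac{1}{2}\left(1 + \frac{1}{2}\,\lVert \rho - \sigma \rVert_{1}\right).
\]
% Schematic channel-level quantity suggested by the abstract (assumed
% notation): apply this bound to the outputs of the given channel N and of
% the free channels on a free probe state, then optimize over the probes;
% the abstract relates the result to the resource generating power of N,
% measured in trace distance to the set of free states.
```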

    Beyond Hard Samples: Robust and Effective Grammatical Error Correction with Cycle Self-Augmenting

    Recent studies have revealed that grammatical error correction (GEC) methods in the sequence-to-sequence paradigm are vulnerable to adversarial attacks, and that simply using adversarial examples in the pre-training or post-training process can significantly enhance the robustness of GEC models to certain types of attack without suffering much performance loss on clean data. In this paper, we conduct a thorough robustness evaluation of cutting-edge GEC methods against four different types of adversarial attacks and accordingly propose a simple yet very effective Cycle Self-Augmenting (CSA) method. By leveraging augmented data generated by the GEC models themselves in the post-training process and introducing regularization data for cycle training, our method can effectively improve the robustness of well-trained GEC models at the extra cost of only a few more training epochs. More concretely, further training on the regularization data prevents the GEC models from over-fitting on easy-to-learn samples, improving generalization and robustness to unseen data (adversarial noise/samples), while the self-augmented data provide more high-quality pseudo pairs that improve model performance on the original test data. Experiments on four benchmark datasets and seven strong models indicate that our training method significantly enhances robustness against four types of attacks without using purposely built adversarial examples in training. Evaluation on clean data further confirms that our CSA method significantly improves the performance of four baselines and yields results nearly comparable with other state-of-the-art models. Our code is available at https://github.com/ZetangForward/CSA-GEC.
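    The following Python sketch illustrates one plausible reading of the CSA post-training loop described above; the Model stub, the regularization criterion, and all names are assumptions for illustration, not the authors' implementation (see their repository for the real one):

```python
# Schematic sketch of the Cycle Self-Augmenting (CSA) post-training loop as
# described in the abstract. The Model stub and the criterion for picking
# regularization data are illustrative assumptions, not the paper's method.

class Model:
    """Stub GEC model: 'corrects' by uppercasing the first character."""
    def correct(self, src: str) -> str:
        return src[:1].upper() + src[1:]

    def fine_tune(self, pairs, epochs: int = 1) -> None:
        print(f"fine-tuning on {len(pairs)} pairs for {epochs} epoch(s)")

def csa_post_training(model: Model, train_pairs, cycles: int = 3) -> Model:
    for _ in range(cycles):
        # Self-augmentation: the model's own outputs become pseudo pairs.
        pseudo = [(src, model.correct(src)) for src, _ in train_pairs]
        # Regularization data: pairs the model already solves, kept in the
        # mix so training does not overfit easy-to-learn samples (assumed).
        reg = [(s, t) for s, t in train_pairs if model.correct(s) == t]
        model.fine_tune(pseudo + reg, epochs=1)
    return model

csa_post_training(Model(), [("she go home", "She goes home")])
```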

    The Marginal Value of Momentum for Small Learning Rate SGD

    Momentum is known to accelerate the convergence of gradient descent in strongly convex settings without stochastic gradient noise. In stochastic optimization, such as training neural networks, folklore suggests that momentum may help deep learning optimization by reducing the variance of the stochastic gradient update, but previous theoretical analyses do not find momentum to offer any provable acceleration. Theoretical results in this paper clarify the role of momentum in stochastic settings where the learning rate is small and gradient noise is the dominant source of instability, showing that SGD with and without momentum behaves similarly over both short and long time horizons. Experiments show that momentum indeed has limited benefits for both optimization and generalization in practical training regimes where the optimal learning rate is not very large, including small- to medium-batch training from scratch on ImageNet and fine-tuning language models on downstream tasks.
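    A toy Python experiment can make the comparison concrete: run vanilla SGD and heavy-ball SGD on a noisy 1-D quadratic with a small learning rate and paired noise, and observe that the two iterates stay close. This is a sketch under assumptions (the (1 - beta) step scaling is one common convention for matching effective step sizes), not the paper's experiments:

```python
import random

# Vanilla SGD vs. heavy-ball momentum on f(w) = w^2 / 2 with Gaussian
# gradient noise. Small learning rate; same noise fed to both updates so
# the trajectories are directly comparable. Toy setup, not from the paper.

def run(eta=0.01, beta=0.9, steps=2000, seed=0):
    rng = random.Random(seed)
    w, w_m, v = 1.0, 1.0, 0.0
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        g = w + noise                    # stochastic gradient at w
        g_m = w_m + noise                # same noise for the momentum run
        w -= eta * g                     # vanilla SGD step
        v = beta * v + g_m               # heavy-ball momentum buffer
        w_m -= eta * (1 - beta) * v      # scaled so effective LR matches
    return w, w_m

print(run())  # the two final iterates are close for small eta
```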

    Coexistence Designs of Radar and Communication Systems in a Multi-path Scenario

    This study focuses on spectrum sharing between multiple-input multiple-output (MIMO) communication systems and co-located MIMO radar systems in multi-path environments. The major challenge is to suppress the mutual interference between the two systems while combining the useful multi-path components received at each system. We tackle this challenge by jointly designing the communication precoder, the radar transmit waveform, and the radar receive filter. Specifically, the signal-to-interference-plus-noise ratio (SINR) at the radar receiver is maximized subject to constraints on the radar waveform, communication rate, and transmit power. Multi-path propagation complicates the expressions for the radar SINR and the communication rate, leading to a non-convex problem. To solve it, a sub-optimal algorithm based on alternating maximization optimizes the precoder, radar transmit waveform, and receive filter iteratively. Simulation results demonstrate the effectiveness of the proposed design.
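    A minimal Python sketch of the alternating-maximization structure described above; the subproblem solvers and the toy objective are placeholders (assumptions), not the paper's derivations:

```python
# Schematic alternating-maximization loop: optimize the communication
# precoder P, radar waveform S, and receive filter W in turn, each with the
# others fixed, until the radar SINR stops improving.

def alternating_maximization(init, solve_P, solve_S, solve_W, sinr,
                             iters=20, tol=1e-6):
    P, S, W = init
    prev = sinr(P, S, W)
    for _ in range(iters):
        P = solve_P(S, W)   # precoder step: max SINR s.t. rate/power limits
        S = solve_S(P, W)   # waveform step: max SINR s.t. waveform constraints
        W = solve_W(P, S)   # filter step: SINR maximization over the filter
        cur = sinr(P, S, W)
        if abs(cur - prev) < tol:   # monotone objective -> convergence check
            break
        prev = cur
    return P, S, W

# Toy usage with separable scalar subproblems (purely illustrative).
toy_sinr = lambda P, S, W: 10 - (P - 1) ** 2 - (S - 2) ** 2 - (W - 3) ** 2
print(alternating_maximization((0.0, 0.0, 0.0),
                               lambda S, W: 1.0, lambda P, W: 2.0,
                               lambda P, S: 3.0, toy_sinr))
```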
    • …