369 research outputs found

    Learning Pair Potentials using Differentiable Simulations

    Learning pair interactions from experimental or simulation data is of great interest for molecular simulations. We propose a general stochastic method for learning pair interactions from data using differentiable simulations (DiffSim). DiffSim defines a loss function based on structural observables, such as the radial distribution function, through molecular dynamics (MD) simulations. The interaction potentials are then learned directly by stochastic gradient descent, using backpropagation to calculate the gradient of the structural loss metric with respect to the interaction potential through the MD simulation. This gradient-based method is flexible and can be configured to simulate and optimize multiple systems simultaneously. For example, it is possible to simultaneously learn potentials for different temperatures or for different compositions. We demonstrate the approach by recovering simple pair potentials, such as Lennard-Jones systems, from radial distribution functions. We find that DiffSim can be used to probe a wider functional space of pair potentials than traditional methods like Iterative Boltzmann Inversion. We show that our method can simultaneously fit potentials for simulations at different compositions and temperatures to improve the transferability of the learned potentials. Comment: 12 pages, 10 figures
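    A minimal, self-contained sketch of the DiffSim idea in JAX, assuming a Lennard-Jones parameterization with learnable (epsilon, sigma) and a Gaussian-kernel "soft" RDF so the structural loss stays differentiable; the system size, integrator settings, and learning rate are illustrative assumptions, not the paper's setup:

```python
# Hedged sketch: learn LJ parameters by backpropagating an RDF loss
# through a short molecular dynamics trajectory (the DiffSim idea).
# All hyperparameters below are illustrative, not taken from the paper.
import jax
import jax.numpy as jnp

SIDE, DIM = 6, 2                      # 6x6 lattice of particles in 2D
N, BOX = SIDE * SIDE, 6.0             # particle count, periodic box length
DT, STEPS = 0.005, 200                # time step, MD steps per loss evaluation
BINS = jnp.linspace(0.5, 3.0, 50)     # RDF bin centers
KW = 0.05                             # kernel width of the soft histogram

def pair_dists(pos):
    dr = pos[:, None, :] - pos[None, :, :]
    dr -= BOX * jnp.round(dr / BOX)                   # minimum-image convention
    return jnp.sqrt(jnp.sum(dr**2, -1) + jnp.eye(N))  # eye keeps the diagonal finite

def energy(pos, theta):
    eps, sig = theta
    r = pair_dists(pos)
    u = 4.0 * eps * ((sig / r)**12 - (sig / r)**6)
    return 0.5 * jnp.sum(u * (1.0 - jnp.eye(N)))      # sum over pairs, no self terms

def soft_rdf(pos):
    r = pair_dists(pos)[jnp.triu_indices(N, 1)]
    # Gaussian kernels instead of a hard histogram keep the observable differentiable
    return jnp.exp(-(r[:, None] - BINS)**2 / (2 * KW**2)).sum(0)

def simulate(theta, pos, vel):
    force = lambda p: -jax.grad(energy)(p, theta)
    def step(carry, _):
        p, v = carry
        v = v + 0.5 * DT * force(p)                   # velocity Verlet
        p = (p + DT * v) % BOX
        v = v + 0.5 * DT * force(p)
        return (p, v), None
    (pos, _), _ = jax.lax.scan(step, (pos, vel), None, length=STEPS)
    return soft_rdf(pos)

def loss(theta, pos, vel, target):
    return jnp.mean((simulate(theta, pos, vel) - target)**2)

grid = jnp.arange(SIDE) + 0.5
pos0 = jnp.stack(jnp.meshgrid(grid, grid), -1).reshape(-1, DIM)
vel0 = 0.05 * jax.random.normal(jax.random.PRNGKey(0), (N, DIM))
target = simulate(jnp.array([1.0, 1.0]), pos0, vel0)  # synthetic "ground truth" RDF
theta = jnp.array([0.5, 1.2])                         # initial (epsilon, sigma) guess
for _ in range(50):                                   # plain SGD on the structural loss
    theta -= 0.05 * jax.grad(loss)(theta, pos0, vel0, target)
print(theta)  # expected to drift toward the generating values (1.0, 1.0)
```

    The same loop extends naturally to the multi-system setting described in the abstract: sum the structural losses of simulations run at several temperatures or compositions before taking the gradient.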

    Fine-Grained Prototypes Distillation for Few-Shot Object Detection

    Few-shot object detection (FSOD) aims at extending a generic detector to novel object detection with only a few training examples. It has attracted great interest recently due to its practical value. Meta-learning has been demonstrated to be an effective paradigm for this task. In general, meta-learning-based methods employ an additional support branch to encode novel examples (a.k.a. support images) into class prototypes, which are then fused with the query branch to facilitate the model's predictions. However, class-level prototypes are difficult to generate precisely, and they also lack detailed information, leading to unstable performance. New methods are required to capture the distinctive local context for more robust novel object detection. To this end, we propose to distill the most representative support features into fine-grained prototypes. These prototypes are then assigned to query feature maps based on the matching results, modeling the detailed feature relations between the two branches. This process is realized by our Fine-Grained Feature Aggregation (FFA) module. Moreover, for high-level feature fusion, we propose a Balanced Class-Agnostic Sampling (B-CAS) strategy and a Non-Linear Fusion (NLF) module designed from different perspectives. They are complementary to each other and capture the high-level feature relations more effectively. Extensive experiments on the PASCAL VOC and MS COCO benchmarks show that our method sets a new state of the art in most settings. Our code is available at https://github.com/wangchen1801/FPD. Comment: Accepted by AAAI 2024
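    A minimal sketch of the matching-and-assignment idea behind fine-grained prototypes; the top-k selection rule, cosine matching, softmax assignment, and additive fusion are illustrative assumptions, not the paper's exact FFA design (see the linked repository for that):

```python
# Hedged sketch of fine-grained prototype distillation and aggregation.
# Shapes and the fusion rule are assumptions for illustration only.
import jax
import jax.numpy as jnp

def distill_prototypes(support, m=16):
    """support: (H, W, C) feature map of a support image ->
    (m, C) fine-grained prototypes: the m local features with largest L2 norm."""
    feats = support.reshape(-1, support.shape[-1])        # (H*W, C)
    idx = jnp.argsort(-jnp.linalg.norm(feats, axis=-1))[:m]
    return feats[idx]                                     # most "representative" locations

def aggregate(query, protos, tau=0.1):
    """query: (H, W, C) query feature map; protos: (m, C).
    Softly assign each query location its best-matching prototype, fuse additively."""
    q = query.reshape(-1, query.shape[-1])                # (H*W, C)
    qn = q / (jnp.linalg.norm(q, axis=-1, keepdims=True) + 1e-6)
    pn = protos / (jnp.linalg.norm(protos, axis=-1, keepdims=True) + 1e-6)
    sim = qn @ pn.T                                       # cosine similarities (H*W, m)
    w = jax.nn.softmax(sim / tau, axis=-1)                # soft matching weights
    return (q + w @ protos).reshape(query.shape)          # fuse matched prototypes

key = jax.random.PRNGKey(0)
support = jax.random.normal(key, (32, 32, 256))           # stand-in backbone features
query = jax.random.normal(jax.random.split(key)[0], (32, 32, 256))
fused = aggregate(query, distill_prototypes(support))     # (32, 32, 256)
```

    The softmax keeps the assignment differentiable end to end; a hard argmax matching, closer to a literal "assignment", would need a straight-through trick to train.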

    Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult

    Large learning rates, when applied to gradient descent for nonconvex optimization, yield various implicit biases, including the edge of stability (Cohen et al., 2021), balancing (Wang et al., 2022), and catapult (Lewkowycz et al., 2020). These phenomena cannot be well explained by classical optimization theory. Though significant theoretical progress has been made in understanding these implicit biases, it remains unclear for which objective functions they are more likely to occur. This paper provides an initial step in answering this question and also shows that these implicit biases are in fact various tips of the same iceberg. To establish these results, we develop a global convergence theory under large learning rates for a family of nonconvex functions that lack the globally Lipschitz continuous gradient typically assumed in existing convergence analyses. Specifically, these phenomena are more likely to occur when the optimization objective has good regularity. This regularity, together with gradient descent using a large learning rate that favors flatter regions, results in these nontrivial dynamical behaviors. Another corollary is the first non-asymptotic convergence rate bound for large-learning-rate gradient descent on nonconvex functions. Although our theory so far applies only to specific functions, we also experimentally validate the possibility of extrapolating it to neural networks, for which different choices of loss, activation function, and other techniques such as batch normalization can all significantly affect regularity and lead to very different training dynamics.
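    As a concrete toy of one such bias (the balancing effect of Wang et al., 2022, not this paper's general theory), consider gradient descent on f(a, b) = (ab - 1)^2 / 2; the starting point and step sizes below are arbitrary illustrative choices:

```python
# Toy illustration (assumed setup, not the paper's): the balancing bias.
# Gradient flow on f(a,b) = (ab-1)^2/2 conserves a^2 - b^2, so tiny steps
# keep the initial imbalance; a large step size destabilizes sharp,
# unbalanced minima (top curvature roughly a^2 + b^2 > 2/eta), so the
# iterates can only settle at a balanced minimum with |a| close to |b|.
import jax
import jax.numpy as jnp

f = lambda p: 0.5 * (p[0] * p[1] - 1.0)**2
grad_f = jax.jit(jax.grad(f))

def run(eta, steps=2000):
    p = jnp.array([2.5, 0.3])               # deliberately unbalanced start
    for _ in range(steps):
        p = p - eta * grad_f(p)
    return p

for eta in (0.01, 0.5):                     # small vs. large learning rate
    a, b = run(eta)
    print(f"eta={eta}: a={float(a):.3f}, b={float(b):.3f}, "
          f"a^2-b^2={float(a * a - b * b):.3f}")
# Expected under these assumptions: eta=0.01 roughly preserves the initial
# imbalance a^2 - b^2 = 6.16, while eta=0.5 lands near a balanced minimum
# with |a| and |b| both close to 1.
```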

    3d N=4 mirror symmetry with 1-form symmetry

    The study of 3d mirror symmetry has greatly enhanced our understanding of various aspects of 3d N=4 theories. In this paper, starting with known mirror pairs of 3d N=4 quiver gauge theories and gauging discrete subgroups of the flavour or topological symmetry, we construct new mirror pairs with non-trivial 1-form symmetry. By providing explicit quiver descriptions of these theories, we thoroughly specify their symmetries (0-form, 1-form, and 2-group) and the mirror maps between them. Comment: v2: 38 pages + appendices; typos fixed, clarifications added, references added

    Magnetic quivers and line defects — on a duality between 3d N = 4 unitary and orthosymplectic quivers

    Supersymmetric Sp(k) quantum chromodynamics with 8 supercharges in spacetime dimensions 3 to 6 can be realised by two different Type II brane configurations in the presence of orientifolds. Consequently, two types of magnetic quivers describe the Higgs branch of the Sp(k) SQCD theory. This is a salient example of a general phenomenon: a given hyper-Kähler Higgs branch may admit several magnetic quiver constructions. It is then natural to wonder whether these different magnetic quivers, which are described by 3d N = 4 theories, are dual. In this work, the unitary and orthosymplectic magnetic quiver theories are subjected to a variety of tests, providing evidence that they are IR dual to each other. To this end, sphere partition functions and supersymmetric indices are compared. We also study half-BPS line defects and find interesting regularities from the viewpoints of exact results, brane configurations, and 1-form symmetry.