Learning Pair Potentials using Differentiable Simulations
Learning pair interactions from experimental or simulation data is of great
interest for molecular simulations. We propose a general stochastic method for
learning pair interactions from data using differentiable simulations
(DiffSim). DiffSim defines a loss function based on structural observables,
such as the radial distribution function, through molecular dynamics (MD)
simulations. The interaction potentials are then learned directly by stochastic
gradient descent, using backpropagation to calculate the gradient of the
structural loss metric with respect to the interaction potential through the MD
simulation. This gradient-based method is flexible and can be configured to
simulate and optimize multiple systems simultaneously. For example, it is
possible to simultaneously learn potentials for different temperatures or for
different compositions. We demonstrate the approach by recovering simple pair
potentials, such as Lennard-Jones systems, from radial distribution functions.
We find that DiffSim can be used to probe a wider functional space of pair
potentials compared to traditional methods like Iterative Boltzmann Inversion.
We show that our methods can be used to simultaneously fit potentials for
simulations at different compositions and temperatures to improve the
transferability of the learned potentials.
Comment: 12 pages, 10 figures
Fine-Grained Prototypes Distillation for Few-Shot Object Detection
Few-shot object detection (FSOD) aims at extending a generic detector for
novel object detection with only a few training examples. It has attracted much
attention recently due to its practical value. Meta-learning has been
demonstrated to be an effective paradigm for this task. In general, methods
based on meta-learning employ an additional support branch to encode novel
examples (a.k.a. support images) into class prototypes, which are then fused
with the query branch to facilitate model prediction. However, the class-level
prototypes are difficult to precisely generate, and they also lack detailed
information, leading to instability in performance. New methods are required to
capture the distinctive local context for more robust novel object detection.
To this end, we propose to distill the most representative support features
into fine-grained prototypes. These prototypes are then assigned to query
feature maps based on the matching results, modeling the detailed feature
relations between two branches. This process is realized by our Fine-Grained
Feature Aggregation (FFA) module. Moreover, in terms of high-level feature
fusion, we propose Balanced Class-Agnostic Sampling (B-CAS) strategy and
Non-Linear Fusion (NLF) module from different perspectives. They are
complementary to each other and depict the high-level feature relations more
effectively. Extensive experiments on PASCAL VOC and MS COCO benchmarks show
that our method sets a new state of the art in most settings. Our
code is available at https://github.com/wangchen1801/FPD.
Comment: Accepted by AAAI2024
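To make the prototype-matching idea concrete, here is a hypothetical NumPy sketch, not the paper's FPD implementation: support features are distilled into a few fine-grained prototypes (here by simple k-means), and each prototype is then softly assigned to query-map locations by cosine similarity, in the spirit of the FFA module. The function names and the additive fusion are illustrative assumptions.

```python
import numpy as np

def fine_grained_prototypes(support_feats, k, iters=10, seed=0):
    # Distill support features (n, d) into k prototypes via k-means
    # (an illustrative stand-in for the paper's distillation step).
    rng = np.random.default_rng(seed)
    protos = support_feats[rng.choice(len(support_feats), k, replace=False)]
    for _ in range(iters):
        dist = np.linalg.norm(support_feats[:, None] - protos[None], axis=-1)
        assign = dist.argmin(1)
        for j in range(k):
            members = support_feats[assign == j]
            if len(members):                # keep old centroid if cluster empty
                protos[j] = members.mean(0)
    return protos

def aggregate(query_map, protos):
    # Softly assign prototypes to query locations by cosine similarity
    # and fuse them additively into the query feature map (H, W, d).
    h, w, dim = query_map.shape
    q = query_map.reshape(-1, dim)
    qn = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-8)
    pn = protos / (np.linalg.norm(protos, axis=1, keepdims=True) + 1e-8)
    sim = qn @ pn.T                                   # (H*W, k) similarities
    weights = np.exp(sim) / np.exp(sim).sum(1, keepdims=True)   # softmax over k
    fused = q + weights @ protos
    return fused.reshape(h, w, dim)
```

In a detector, `aggregate` would sit between the support and query branches, letting each spatial location pull in the local context of its best-matching prototypes rather than a single class-level vector.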
Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult
Large learning rates, when applied to gradient descent for nonconvex
optimization, yield various implicit biases including the edge of stability
(Cohen et al., 2021), balancing (Wang et al., 2022), and catapult (Lewkowycz et
al., 2020). These phenomena cannot be well explained by classical optimization
theory. Though significant theoretical progress has been made in understanding
these implicit biases, it remains unclear for which objective functions they
are more likely to occur. This paper provides an initial step in answering this
question and also shows that these implicit biases are in fact various tips of
the same iceberg. To establish these results, we develop a global convergence
theory under large learning rates, for a family of nonconvex functions without
a globally Lipschitz continuous gradient, an assumption typically made in
existing convergence analyses. Specifically, these phenomena are more likely to occur
when the optimization objective function has good regularity. This regularity,
together with gradient descent using a large learning rate that favors flatter
regions, results in these nontrivial dynamical behaviors. Another corollary is
the first non-asymptotic convergence rate bound for large-learning-rate
gradient descent optimization of nonconvex functions. Although our theory only
applies to specific functions so far, the possibility of extrapolating it to
neural networks is also experimentally validated, for which different choices
of loss, activation functions, and other techniques such as batch normalization
can all affect regularity significantly and lead to very different training
dynamics.
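A toy experiment makes the balancing effect concrete. Gradient descent on f(u, v) = (uv - 1)^2 / 2, a standard two-parameter example in this literature rather than one taken from the paper itself, approximately conserves u^2 - v^2 in the small-learning-rate regime, while a large learning rate destabilizes unbalanced minima, whose sharpness u^2 + v^2 exceeds 2/lr, and drives the iterates toward a balanced solution:

```python
import numpy as np

def gd_on_factorization(lr, steps, u0=2.0, v0=0.1, target=1.0):
    # Gradient descent on f(u, v) = (u*v - target)**2 / 2.
    # At any minimum u*v = target, the nonzero Hessian eigenvalue
    # (the sharpness) is u**2 + v**2, so step size lr rules out
    # minima with u**2 + v**2 > 2/lr.
    u, v = u0, v0
    for _ in range(steps):
        r = u * v - target
        u, v = u - lr * r * v, v - lr * r * u   # simultaneous update
    return u, v, 0.5 * (u * v - target) ** 2
```

With lr = 0.01 the iterates settle near (2.06, 0.49), essentially preserving the initial u^2 - v^2 of about 3.99; with lr = 0.7, which makes that minimum unstable since its sharpness of about 4.47 exceeds 2/0.7 = 2.86, the same initialization is driven to a balanced minimum with |u| and |v| close to 1.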
3d N=4 mirror symmetry with 1-form symmetry
The study of 3d mirror symmetry has greatly enhanced our understanding of various aspects of 3d theories. In this paper, starting with known mirror pairs of 3d quiver gauge theories and gauging discrete subgroups of the flavour or topological symmetry, we construct new mirror pairs with non-trivial 1-form symmetry. By providing explicit quiver descriptions of these theories, we thoroughly specify their symmetries (0-form, 1-form, and 2-group) and the mirror maps between them.
Comment: v2: 38 pages + appendices; typos fixed, clarifications added, references added
Magnetic quivers and line defects — on a duality between 3d N = 4 unitary and orthosymplectic quivers
Supersymmetric Sp(k) quantum chromodynamics with 8 supercharges in spacetime dimensions 3 to 6 can be realised by two different Type II brane configurations in the presence of orientifolds. Consequently, two types of magnetic quivers describe the Higgs branch of the Sp(k) SQCD theory. This is a salient example of a general phenomenon: a given hyper-Kähler Higgs branch may admit several magnetic quiver constructions. It is then natural to wonder whether these different magnetic quivers, which are described by 3d N = 4 theories, are dual. In this work, the unitary and orthosymplectic magnetic quiver theories are subjected to a variety of tests, providing evidence that they are IR dual to each other. To this end, sphere partition functions and supersymmetric indices are compared. We also study half-BPS line defects and find interesting regularities from the viewpoints of exact results, brane configurations, and 1-form symmetry.