285 research outputs found
Quark: A Gradient-Free Quantum Learning Framework for Classification Tasks
As more practical and scalable quantum computers emerge, much attention has
been focused on realizing quantum supremacy in machine learning. Existing
quantum ML methods either (1) embed a classical model into a target Hamiltonian
to enable quantum optimization or (2) represent a quantum model using
variational quantum circuits and apply classical gradient-based optimization.
The former method leverages the power of quantum optimization but only supports
simple ML models, while the latter provides flexibility in model design but
relies on gradient calculation, resulting in barren plateaus (i.e., vanishing
gradients) and frequent classical-quantum interactions. To address the
limitations of existing quantum ML methods, we introduce Quark, a gradient-free
quantum learning framework that optimizes quantum ML models using quantum
optimization. Quark does not rely on gradient computation and therefore avoids
barren plateaus and frequent classical-quantum interactions. In addition, Quark
can support more general ML models than prior quantum ML methods and achieves a
dataset-size-independent optimization complexity. Theoretically, we prove that
Quark can outperform classical gradient-based methods by reducing model query
complexity for highly non-convex problems; empirically, evaluations on the Edge
Detection and Tiny-MNIST tasks show that Quark can support complex ML models
and significantly reduce the number of measurements needed for discovering
near-optimal weights for these tasks.
Comment: under review
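The contrast the abstract draws can be illustrated classically. The sketch below is not Quark's algorithm (Quark performs the search with quantum optimization); it only shows the gradient-free access pattern — query the model's loss, never its gradient — on a toy classifier whose data, loss, and step size are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(w, X, y):
    """Hinge-style loss of a tiny linear classifier, treated as a black box."""
    margins = y * (X @ w)
    return np.maximum(0.0, 1.0 - margins).mean()

# Toy linearly separable data (illustrative only).
X = rng.normal(size=(64, 3))
y = np.sign(X @ np.array([1.0, -2.0, 0.5]))

w = rng.normal(size=3)
best = loss(w, X, y)
for _ in range(500):                     # each iteration = one model query
    cand = w + 0.3 * rng.normal(size=3)  # random perturbation, no gradient
    c = loss(cand, X, y)
    if c < best:                         # greedy accept
        w, best = cand, c

print(best)  # loss of the best weights found, queries only, no gradients
```

Because only loss queries are issued, there is no gradient estimate whose variance can collapse — which is the failure mode (barren plateaus) the gradient-based variational approach suffers from.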
Duality and Parafermions Revisited
Given a two-dimensional bosonic theory with a non-anomalous $\mathbb{Z}_2$
symmetry, the orbifolding and fermionization can be understood holographically
using three-dimensional BF theory at level 2. From a Hamiltonian perspective,
the information about the dualities is encoded in a topological boundary state,
defined as an eigenstate of certain Wilson loop operators (anyons) in the bulk.
We generalize this story to two-dimensional theories with a non-anomalous
$\mathbb{Z}_N$ symmetry, focusing on parafermionization. We find the generic
operators defining the different topological boundary states, including
orbifolding and parafermionization with $\mathbb{Z}_N$ or its subgroups, and
discuss their algebraic properties as well as the duality web.
Comment: 39 pages, 5 figures
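For orientation, the standard level-$N$ BF setup the abstract alludes to (textbook background, not taken from this paper; the Lagrangian-subalgebra condition is the usual one) can be summarized as:

```latex
% Level-N BF action on a 3-manifold M_3, and the defining property of a
% topological boundary state |B> as a simultaneous Wilson-loop eigenstate.
S_{\mathrm{BF}} = \frac{N}{2\pi} \int_{M_3} b \wedge \mathrm{d}a,
\qquad
W(\gamma)\,\lvert B \rangle = \lvert B \rangle
\quad \text{for all } W(\gamma) \text{ in a chosen Lagrangian subalgebra.}
```

Different choices of the Lagrangian subalgebra of anyons yield the different boundary states, which is how the orbifolded and (para)fermionized theories arise from one bulk.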
Optimizing Mixture of Experts using Dynamic Recompilations
The Mixture of Experts architecture allows for outrageously large neural
networks by scaling model parameter size independently from computational
demand (FLOPs). However, current DNN frameworks cannot effectively support the
dynamic data flow in Mixture of Experts, and implementations on top of these
frameworks need to use workarounds that introduce significant overheads. To
address this limitation, we present DynaMoE, a DNN library
that uses dynamic recompilations to optimize and adapt the use of computational
resources to the dynamic needs of Mixture of Experts models. Our evaluation
shows that DynaMoE achieves a 1.8x speedup and supports 2.3x larger model sizes
when compared to existing MoE systems, even when not using recompilations. We
then present further optimizations enabled by dynamic recompilations that yield
an additional 1.7x speedup while simultaneously reducing memory pressure and
improving model quality.
Comment: 13 pages, 15 figures
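The "dynamic data flow" problem is easy to see in a few lines. This is a minimal NumPy sketch (not the DynaMoE API): top-1 gating routes a data-dependent number of tokens to each expert, so per-expert tensor shapes change from batch to batch — exactly what static-graph frameworks must pad around and what recompilation can instead adapt to. All sizes and weights below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_experts, n_tokens = 8, 4, 32

tokens = rng.normal(size=(n_tokens, d))
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # toy experts

logits = tokens @ gate_w
assignment = logits.argmax(axis=1)        # top-1 gating: one expert per token

out = np.empty_like(tokens)
for e in range(n_experts):
    idx = np.where(assignment == e)[0]    # this shape varies per batch!
    if idx.size:
        out[idx] = tokens[idx] @ experts[e]   # expert-local dense compute
    print(f"expert {e}: {idx.size} tokens")   # data-dependent expert load
```

A framework that compiles fixed shapes must either pad every expert's input to the worst case (wasting FLOPs and memory) or recompile when the observed shapes drift, which is the trade-off DynaMoE's dynamic recompilations manage.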
Graph Augmentation Clustering Network
Existing graph clustering networks heavily rely on a predefined graph and may
fail if the initial graph is of low quality. To tackle this issue, we propose a
novel graph augmentation clustering network capable of adaptively enhancing the
initial graph to achieve better clustering performance. Specifically, we first
integrate the node attribute and topology structure information to learn the
latent feature representation. Then, we explore the local geometric structure
information on the embedding space to construct an adjacency graph and
subsequently develop an adaptive graph augmentation architecture to fuse that
graph with the initial one dynamically. Finally, we minimize the Jeffreys
divergence between multiple derived distributions to conduct network training
in an unsupervised fashion. Extensive experiments on six commonly used
benchmark datasets demonstrate that the proposed method consistently
outperforms several state-of-the-art approaches. In particular, our method
improves the ARI by more than 9.39% over the best baseline on DBLP. The source
code and data have been submitted in the appendix.
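The training objective named in the abstract, the Jeffreys divergence, is simply the symmetrized KL divergence. A small NumPy sketch (the function names and the toy distributions are illustrative; the paper applies it to soft cluster-assignment distributions):

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """Row-wise KL(P || Q) for distributions along the last axis."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return np.sum(p * np.log(p / q), axis=-1)

def jeffreys(p, q):
    """J(P, Q) = KL(P||Q) + KL(Q||P): symmetric, zero iff P == Q."""
    return kl(p, q) + kl(q, p)

# Two soft assignment distributions over 3 clusters for 2 nodes.
P = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
Q = np.array([[0.6, 0.3, 0.1], [0.2, 0.6, 0.2]])
print(jeffreys(P, Q))   # per-node divergence, non-negative
print(jeffreys(P, P))   # vanishes when the distributions agree
```

Unlike plain KL, the symmetric form penalizes disagreement in both directions, which makes it a natural consistency loss between multiple derived distributions.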
Dynamical quantum phase transitions in a spinor Bose-Einstein condensate and criticality enhanced quantum sensing
Quantum phase transitions universally exist in the ground and excited states
of quantum many-body systems, and they have a close relationship with the
nonequilibrium dynamical phase transitions, which however are challenging to
identify. In the system of spin-1 Bose-Einstein condensates, though dynamical
phase transitions with correspondence to equilibrium phase transitions in the
ground state and uppermost excited state have been probed, those taking place in
intermediate excited states have remained untouched in experiments thus far. Here
we show that both the ground- and excited-state quantum phase transitions in
spinor condensates can be diagnosed with dynamical phase transitions. A
connection between equilibrium phase transitions and nonequilibrium behaviors
of the system is revealed in terms of the quantum Fisher information. We also
demonstrate that, near the critical points, parameter estimation beyond the
standard quantum limit can be implemented. This work not only advances the
exploration
of excited-state quantum phase transitions via a scheme that can immediately be
applied to a broad class of few-mode quantum systems, but also provides a new
perspective on the relationship between quantum criticality and quantum-enhanced
sensing.
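The quantum Fisher information (QFI) invoked above is, for a pure parametrized state, $F_Q = 4\,(\langle\partial_\theta\psi|\partial_\theta\psi\rangle - |\langle\psi|\partial_\theta\psi\rangle|^2)$, and it bounds precision through the quantum Cramér-Rao bound $(\Delta\theta)^2 \ge 1/F_Q$. A hedged numerical sketch — the NOON-like probe state below is a standard illustration, not the spinor-BEC state of the paper:

```python
import numpy as np

def qfi(psi, theta, h=1e-6):
    """QFI of a parametrized pure state via central finite differences."""
    d = (psi(theta + h) - psi(theta - h)) / (2 * h)   # |d psi / d theta>
    p = psi(theta)
    return float(4 * ((d.conj() @ d).real - abs(p.conj() @ d) ** 2))

def noon(N):
    """(|0> + e^{i N theta}|1>)/sqrt(2): phase accumulates N times faster."""
    return lambda th: np.array([1.0, np.exp(1j * N * th)]) / np.sqrt(2)

print(qfi(noon(1), 0.3))   # ≈ 1: standard-quantum-limit scaling
print(qfi(noon(4), 0.3))   # ≈ 16 = N²: Heisenberg-limited scaling
```

States prepared near a critical point can develop a similarly enhanced QFI, which is the mechanism behind the criticality-enhanced sensing the abstract describes.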
Deep Attention-guided Graph Clustering with Dual Self-supervision
Existing deep embedding clustering methods consider only the deepest layer when
learning a feature embedding and thus fail to fully exploit the available
discriminative information from cluster assignments, which limits their
performance. To this end, we propose a novel method, namely deep
attention-guided graph clustering with dual self-supervision (DAGC).
Specifically, DAGC first utilizes a heterogeneity-wise fusion module to
adaptively integrate the features of an auto-encoder and a graph convolutional
network in each layer and then uses a scale-wise fusion module to dynamically
concatenate the multi-scale features in different layers. Such modules are
capable of learning a discriminative feature embedding via an attention-based
mechanism. In addition, we design a distribution-wise fusion module that
leverages cluster assignments to acquire clustering results directly. To better
explore the discriminative information from the cluster assignments, we develop
a dual self-supervision solution consisting of a soft self-supervision strategy
with a triplet Kullback-Leibler divergence loss and a hard self-supervision
strategy with a pseudo supervision loss. Extensive experiments validate that
our method consistently outperforms state-of-the-art methods on six benchmark
datasets. In particular, our method improves the ARI by more than 18.14% over
the best baseline.
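The "soft self-supervision" idea can be sketched with the standard target-distribution trick from deep embedding clustering: sharpen the soft assignments Q into a target P and pull Q toward P with a KL loss. This is the classic DEC-style target ($p_{ij} \propto q_{ij}^2 / f_j$), shown here only as a stand-in; DAGC's actual strategy uses a triplet KL loss over three derived distributions.

```python
import numpy as np

def target_distribution(q):
    """Sharpened target: square assignments, normalize by soft cluster size f_j."""
    weight = q ** 2 / q.sum(axis=0)            # q_ij^2 / f_j
    return weight / weight.sum(axis=1, keepdims=True)

def kl_loss(p, q, eps=1e-12):
    """KL(P || Q) summed over all nodes: the self-supervision training signal."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

q = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.4, 0.4, 0.2]])                # soft assignments, rows sum to 1
p = target_distribution(q)
print(p.round(3))                              # rows still sum to 1, but sharper
print(kl_loss(p, q))                           # pseudo-label training signal
```

Squaring emphasizes high-confidence assignments while the $1/f_j$ factor keeps large clusters from dominating, so the network is trained toward its own most confident predictions without any labels.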