447 research outputs found
Initial-state-dependent quantum speed limit for dissipative state preparation: Framework and optimization
Dissipation has traditionally been considered a hindrance to quantum
information processing, but recent studies have shown that it can be harnessed
to generate desired quantum states. To be useful for practical applications,
the ability to speed up the dissipative evolution is crucial. In this study, we
focus on a Markovian dissipative state preparation scheme where the prepared
state is one of the energy eigenstates. We derive an initial-state-dependent
quantum speed limit (QSL) that offers a more refined measure of the actual
evolution time compared to the commonly used initial-state-independent
relaxation time. This allows for a passive optimization of dissipative
evolution across different initial states. By minimizing the dissipated heat
during the preparation process, conditioned on the minimization of evolution
time using the QSL, we find that the preferred initial state has a specific
permutation of diagonal elements with respect to an ordered energy eigenbasis
of increasing eigenvalues. In this configuration, the population on the
prepared state is the largest, and the remaining diagonal elements are sorted
in an order resembling that of a passive state in the same ordered energy
eigenbasis. We demonstrate the effectiveness of our strategy in a dissipative
Rydberg atom system for preparing the Bell state. Our work provides new
insights into the optimization of dissipative state preparation processes and
could have significant implications for practical quantum technologies.
Comment: 9 pages, 2 figures. Comments are welcome
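The ordering of initial-state populations described above can be made concrete
with a small numerical sketch. The snippet below is only an illustration of that
ordering, not the paper's construction: it assumes an energy eigenbasis already
sorted by increasing eigenvalue and a hypothetical helper that places the largest
population on the prepared state and distributes the remaining populations in a
passive-state-like (decreasing) order over the other levels.

```python
import numpy as np

# Hypothetical illustration: build the diagonal of an initial state following the
# ordering described in the abstract, assuming an energy eigenbasis sorted by
# increasing eigenvalue and a chosen target (prepared) eigenstate.
def ordered_initial_populations(populations, target_index):
    """Assign the largest population to the prepared state and distribute the
    remaining populations in decreasing order over the other levels, i.e. in a
    passive-state-like ordering with respect to increasing energy."""
    p = np.sort(np.asarray(populations, dtype=float))[::-1]  # descending weights
    diag = np.empty_like(p)
    diag[target_index] = p[0]                  # largest weight on the prepared state
    rest = np.delete(np.arange(p.size), target_index)
    diag[rest] = p[1:]                         # remaining weights decrease with energy
    return diag

# Example: four levels, prepared state is the second eigenstate (index 1).
print(ordered_initial_populations([0.1, 0.2, 0.3, 0.4], target_index=1))
# -> [0.3 0.4 0.2 0.1]
```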
$D^2$: Decentralized Training over Decentralized Data
While training a machine learning model using multiple workers, each of which
collects data from its own data sources, it would be most useful when the
data collected from different workers are {\em unique} and {\em different}.
Ironically, recent analysis of decentralized parallel stochastic gradient
descent (D-PSGD) relies on the assumption that the data hosted on different
workers are {\em not too different}. In this paper, we ask the question: {\em
Can we design a decentralized parallel stochastic gradient descent algorithm
that is less sensitive to the data variance across workers?} We present $D^2$,
a novel decentralized parallel stochastic gradient descent algorithm designed
for large data variance among workers (imprecisely, "decentralized" data). The
core of $D^2$ is a variance reduction extension of the standard D-PSGD
algorithm, which improves the convergence rate from
$O\!\left(\frac{\sigma}{\sqrt{nT}} + \frac{(n\zeta^2)^{1/3}}{T^{2/3}}\right)$ to
$O\!\left(\frac{\sigma}{\sqrt{nT}}\right)$, where $\zeta^2$
denotes the variance among data on different workers. As a result, $D^2$ is
robust to data variance among workers. We empirically evaluated $D^2$ on image
classification tasks where each worker has access to only the data of a limited
set of labels, and find that $D^2$ significantly outperforms D-PSGD.
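To make the setting concrete, the following sketch shows one round of the
baseline D-PSGD update that the abstract extends; it is not the $D^2$
variance-reduction update itself, and the ring mixing matrix, step size, and
quadratic per-worker objectives are illustrative assumptions.

```python
import numpy as np

# Schematic D-PSGD step (the baseline the abstract extends; not the D^2 update).
# Assumptions for illustration: n workers on a ring, a doubly stochastic mixing
# matrix W, and a user-supplied stochastic gradient oracle per worker.
def d_psgd_step(models, grad_fns, W, lr):
    """One round: each worker averages its neighbors' models (via W) and then
    takes a local stochastic gradient step on its own data shard."""
    models = np.asarray(models)                    # shape (n_workers, dim)
    mixed = W @ models                             # neighbor averaging
    grads = np.stack([g(x) for g, x in zip(grad_fns, models)])
    return mixed - lr * grads

# Toy example: 4 workers, each holding a different quadratic objective,
# mimicking highly "decentralized" (heterogeneous) data across workers.
rng = np.random.default_rng(0)
targets = rng.normal(size=(4, 3))                  # each worker's data "center"
grad_fns = [lambda x, t=t: x - t for t in targets] # grad of 0.5*||x - t||^2
W = np.array([[0.5, 0.25, 0.0, 0.25],
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])             # ring mixing matrix
models = np.zeros((4, 3))
for _ in range(200):
    models = d_psgd_step(models, grad_fns, W, lr=0.1)
print(models.mean(axis=0), targets.mean(axis=0))   # consensus near the global mean
```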
Universal Landauer-Like Inequality from the First Law of Thermodynamics
The first law of thermodynamics, which governs energy conservation, is
traditionally formulated as an equality. Surprisingly, we demonstrate that the
first law alone implies a universal Landauer-like inequality linking changes in
system entropy and energy. However, in contrast to the Landauer principle
derived from the second law of thermodynamics, our Landauer-like
inequality relies solely on system information and is applicable in scenarios
where implementing the Landauer principle becomes challenging. Furthermore, the
Landauer-like inequality can complement the Landauer principle by establishing
a dual {\it upper} bound on heat dissipation. We illustrate the practical
utility of the Landauer-like inequality in dissipative quantum state
preparation and quantum information erasure applications. Our findings offer
new insights into identifying thermodynamic constraints relevant to the fields
of quantum thermodynamics and the energetics of quantum information processing.
More specifically, this approach could facilitate investigations into
systems coupled to non-thermal baths or scenarios where access to bath
information is limited.
Comment: 9 pages, 3 figures, submitted. Comments are welcome
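The abstract does not state the new first-law-based inequality itself, so the
sketch below only evaluates the standard Landauer bound it is contrasted with:
erasing one bit of information at temperature T dissipates at least k_B T ln 2
of heat.

```python
import math

# Numerical sketch of the standard Landauer bound the abstract contrasts with
# (not the paper's first-law-based inequality, which is not stated in the
# abstract): erasing one bit at temperature T dissipates at least k_B * T * ln(2).
k_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_minimum_heat(temperature_K, bits=1.0):
    """Lower bound on dissipated heat (in joules) for erasing `bits` of information."""
    return bits * k_B * temperature_K * math.log(2)

print(f"{landauer_minimum_heat(300.0):.3e} J per bit at 300 K")  # ~2.87e-21 J
```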
Distributed Learning over Unreliable Networks
Most of today's distributed machine learning systems assume {\em reliable
networks}: whenever two machines exchange information (e.g., gradients or
models), the network should guarantee the delivery of the message. At the same
time, recent work exhibits the impressive tolerance of machine learning
algorithms to errors or noise arising from relaxed communication or
synchronization. In this paper, we connect these two trends, and consider the
following question: {\em Can we design machine learning systems that are
tolerant to network unreliability during training?} With this motivation, we
focus on a theoretical problem of independent interest---given a standard
distributed parameter server architecture, if every communication between the
worker and the server has a non-zero probability of being dropped, does
there exist an algorithm that still converges, and at what speed? The technical
contribution of this paper is a novel theoretical analysis proving that
distributed learning over an unreliable network can achieve a convergence
rate comparable to that of centralized or distributed learning over reliable
networks. Further, we prove that the influence of the packet drop rate
diminishes as the number of parameter servers grows. We map this theoretical
result onto a real-world scenario, training deep neural networks over an
unreliable network layer, and conduct network simulations to validate the
system improvement gained by allowing the networks to be unreliable.
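A toy simulation helps illustrate the setting: workers push stochastic
gradients to a parameter server, each message is dropped independently with
some probability, and the server averages whatever arrives. The drop
probability, objective, and learning rate below are illustrative assumptions,
not the paper's algorithm or analysis.

```python
import numpy as np

# Toy simulation of the setting described in the abstract (illustrative only,
# not the paper's algorithm): workers push gradients to a parameter server,
# each message is dropped independently with probability `drop_prob`, and the
# server averages whatever gradients actually arrive in that round.
def aggregate_with_drops(gradients, drop_prob, rng):
    delivered = [g for g in gradients if rng.random() >= drop_prob]
    if not delivered:                      # nothing arrived this round
        return np.zeros_like(gradients[0])
    return np.mean(delivered, axis=0)

rng = np.random.default_rng(42)
dim, n_workers, lr = 5, 8, 0.1
optimum = rng.normal(size=dim)             # shared quadratic objective 0.5*||x - opt||^2
x = np.zeros(dim)
for step in range(500):
    grads = [(x - optimum) + 0.1 * rng.normal(size=dim) for _ in range(n_workers)]
    x = x - lr * aggregate_with_drops(grads, drop_prob=0.3, rng=rng)
print(np.linalg.norm(x - optimum))         # small despite 30% message drops
```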
CAT: Causal Audio Transformer for Audio Classification
The attention-based Transformers have been increasingly applied to audio
classification because of their global receptive field and ability to handle
long-term dependencies. However, the existing frameworks, which are mainly
extended from Vision Transformers, are not perfectly compatible with audio
signals. In this paper, we introduce a Causal Audio Transformer (CAT)
consisting of a Multi-Resolution Multi-Feature (MRMF) feature extraction with
an acoustic attention block for more optimized audio modeling. In addition, we
propose a causal module that alleviates over-fitting, helps with knowledge
transfer, and improves interpretability. CAT obtains higher or comparable
state-of-the-art classification performance on ESC50, AudioSet and UrbanSound8K
datasets, and can be easily generalized to other Transformer-based models.
Comment: Accepted to ICASSP 202
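The abstract does not detail the MRMF block, so the snippet below only
illustrates one plausible reading of multi-resolution feature extraction:
computing log-mel spectrograms at several FFT window sizes with librosa. The
window sizes, hop lengths, and mel configuration are assumptions for
illustration, not the paper's settings.

```python
import numpy as np
import librosa

# Illustrative sketch only: the abstract does not specify the MRMF block, so this
# simply shows the general idea of extracting features at several spectrogram
# resolutions. Window sizes, hop lengths, and the mel-spectrogram choice are
# assumptions for illustration, not the paper's configuration.
def multi_resolution_features(waveform, sample_rate, n_ffts=(512, 1024, 2048)):
    """Return a list of log-mel spectrograms computed at different FFT window sizes."""
    features = []
    for n_fft in n_ffts:
        mel = librosa.feature.melspectrogram(
            y=waveform, sr=sample_rate, n_fft=n_fft, hop_length=n_fft // 4, n_mels=64
        )
        features.append(librosa.power_to_db(mel))
    return features

# Example on one second of synthetic audio.
sr = 16000
y = np.sin(2 * np.pi * 440 * np.arange(sr) / sr).astype(np.float32)
for f in multi_resolution_features(y, sr):
    print(f.shape)  # (64, frames), with the frame count depending on the resolution
```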