2,894 research outputs found
Quantum Compressed Sensing with Unsupervised Tensor-Network Machine Learning
We propose tensor-network compressed sensing (TNCS) by combining the ideas of
compressed sensing, tensor network (TN), and machine learning, which permits
novel and efficient quantum communications of realistic data. The strategy is
to use the unsupervised TN machine learning algorithm to obtain the entangled
state $|\Psi\rangle$ that describes the probability distribution of a huge
amount of classical information considered to be communicated. To transfer a
specific piece of information with $|\Psi\rangle$, our proposal is to encode
such information in the separable state with the minimal distance to the
measured state $|\Phi\rangle$ that is obtained by partially measuring on
$|\Psi\rangle$ in a designed way. To this end, a measuring protocol analogous
to the compressed sensing with neural-network machine learning is suggested,
where the measurements are designed to minimize uncertainty of information from
the probability distribution given by $|\Psi\rangle$. In this way, those who
have $|\Psi\rangle$ can reliably access the information by simply measuring on
$|\Phi\rangle$. We propose q-sparsity to characterize the sparsity of quantum
states and the efficiency of the quantum communications by TNCS. The high
q-sparsity is essentially due to the fact that the TN states describing nicely
the probability distribution obey the area law of entanglement entropy. Testing
on realistic datasets (hand-written digits and fashion images), TNCS is shown
to possess high efficiency and accuracy, where the security of communications
is guaranteed by the fundamental quantum principles.
Comment: 5+6 pages, 3+6 figures. Essential changes and new data were added to this new version
Graph ODE with Factorized Prototypes for Modeling Complicated Interacting Dynamics
This paper studies the problem of modeling interacting dynamical systems,
which is critical for understanding physical dynamics and biological processes.
Recent research predominantly uses geometric graphs to represent these
interactions, which are then captured by powerful graph neural networks (GNNs).
However, predicting interacting dynamics in challenging scenarios such as
out-of-distribution shift and complicated underlying rules remains unsolved. In
this paper, we propose a new approach named Graph ODE with factorized
prototypes (GOAT) to address the problem. The core of GOAT is to incorporate
factorized prototypes from contextual knowledge into a continuous graph ODE
framework. Specifically, GOAT employs representation disentanglement and system
parameters to extract both object-level and system-level contexts from
historical trajectories, which allows us to explicitly model their independent
influence and thus enhances the generalization capability under system changes.
Then, we integrate these disentangled latent representations into a graph ODE
model, which determines a combination of various interacting prototypes for
enhanced model expressivity. The entire model is optimized using an end-to-end
variational inference framework to maximize the likelihood. Extensive
experiments in both in-distribution and out-of-distribution settings validate
the superiority of GOAT.
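As a rough illustration of the continuous graph ODE idea mentioned in the abstract above, the following sketch evolves node states under a simple message-passing derivative using fixed-step Euler integration. The adjacency matrix `adj`, the weight matrix `W`, the tanh nonlinearity, the decay term, and the Euler solver are all assumptions made for illustration. This is not the GOAT architecture, which additionally relies on factorized prototypes and variational inference.

```python
import numpy as np

def graph_ode_derivative(h, adj, W):
    """Hypothetical dh/dt: mean-aggregate neighbor states, transform, and decay."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                  # avoid division by zero for isolated nodes
    messages = (adj @ h) / deg           # mean of neighbor states
    return np.tanh(messages @ W) - h     # interaction term minus a decay term

def integrate_euler(h0, adj, W, t_end=1.0, steps=100):
    """Integrate node states from t=0 to t=t_end with fixed-step Euler."""
    h = h0.astype(float).copy()
    dt = t_end / steps
    for _ in range(steps):
        h = h + dt * graph_ode_derivative(h, adj, W)
    return h

# Usage: three nodes on a path graph, two-dimensional states.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
rng = np.random.default_rng(0)
h_final = integrate_euler(rng.normal(size=(3, 2)), adj, rng.normal(size=(2, 2)))
```

In practice an adaptive ODE solver would usually replace the Euler loop; the fixed step is used here only to keep the sketch dependency-free.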
AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition
Wearable devices like smart glasses are approaching the compute capability to
seamlessly generate real-time closed captions for live conversations. We build
on our recently introduced directional Automatic Speech Recognition (ASR) for
smart glasses that have microphone arrays, which fuses multi-channel ASR with
serialized output training, for wearer/conversation-partner disambiguation as
well as suppression of cross-talk speech from non-target directions and noise.
When ASR work is part of a broader system-development process, one may be
faced with changes to microphone geometries as system development progresses.
This paper aims to make multi-channel ASR insensitive to limited variations
of microphone-array geometry. We show that a model trained on multiple similar
geometries is largely agnostic and generalizes well to new geometries, as long
as they are not too different. Furthermore, training the model this way
improves accuracy for seen geometries by 15 to 28% relative. Lastly, we refine
the beamforming by a novel Non-Linearly Constrained Minimum Variance criterion.
Comment: Accepted to ICASSP 202
DisenPOI: Disentangling Sequential and Geographical Influence for Point-of-Interest Recommendation
Point-of-Interest (POI) recommendation plays a vital role in various
location-aware services. It has been observed that POI recommendation is driven
by both sequential and geographical influences. However, since there is no
annotated label of the dominant influence during recommendation, existing
methods tend to entangle these two influences, which may lead to sub-optimal
recommendation performance and poor interpretability. In this paper, we address
the above challenge by proposing DisenPOI, a novel Disentangled dual-graph
framework for POI recommendation, which jointly utilizes sequential and
geographical relationships on two separate graphs and disentangles the two
influences with self-supervision. The key novelty of our model compared with
existing approaches is to extract disentangled representations of both
sequential and geographical influences with contrastive learning. To be
specific, we construct a geographical graph and a sequential graph based on the
check-in sequence of a user. We tailor their propagation schemes to become
sequence-/geo-aware to better capture the corresponding influences. Preference
proxies are extracted from check-in sequence as pseudo labels for the two
influences, which supervise the disentanglement via a contrastive loss.
Extensive experiments on three datasets demonstrate the superiority of the
proposed model.
Comment: Accepted by ACM International Conference on Web Search and Data Mining (WSDM'23)
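The abstract above does not state the exact form of the contrastive loss. As a minimal sketch, assuming an InfoNCE-style objective, the code below pulls each representation toward its own proxy and pushes it away from the other proxies in the batch. The function name, the cosine similarity, and the `temperature` value are illustrative assumptions, not details taken from DisenPOI.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE loss where row i of `positives` is the positive for row i of `anchors`.

    All other rows in the batch act as negatives (an assumption of this sketch).
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature                     # cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                  # matched pairs sit on the diagonal

# Usage: matched pairs should score a lower loss than mismatched ones.
reps = np.eye(3)
loss_matched = info_nce_loss(reps, reps)
loss_mismatched = info_nce_loss(reps, np.roll(reps, 1, axis=0))
```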
Directional Source Separation for Robust Speech Recognition on Smart Glasses
Modern smart glasses leverage advanced audio sensing and machine learning
technologies to offer real-time transcribing and captioning services,
considerably enriching human experiences in daily communications. However, such
systems frequently encounter challenges related to environmental noises,
resulting in degradation to speech recognition and speaker change detection. To
improve voice quality, this work investigates directional source separation
using the multi-microphone array. We first explore multiple beamformers to
assist source separation modeling by strengthening the directional properties
of speech signals. In addition to relying on predetermined beamformers, we
investigate neural beamforming in multi-channel source separation,
demonstrating that automatically learning directional characteristics effectively
improves separation quality. We further compare the ASR performance leveraging
separated outputs to noisy inputs. Our results show that directional source
separation benefits ASR for the wearer but not for the conversation partner.
Lastly, we perform the joint training of the directional source separation and
ASR model, achieving the best overall ASR performance.
Comment: Submitted to ICASSP 202
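The abstract above does not name the predetermined beamformers it explores. As one well-known example, the sketch below computes minimum variance distortionless response (MVDR) weights for a single frequency bin, w = R^{-1} d / (d^H R^{-1} d). The microphone layout, the far-field steering model and its sign convention, and the diagonal loading value are assumptions chosen for illustration.

```python
import numpy as np

def steering_vector(mic_positions, direction, freq_hz, speed_of_sound=343.0):
    """Far-field steering vector (one common sign convention, assumed here)."""
    direction = direction / np.linalg.norm(direction)
    delays = mic_positions @ direction / speed_of_sound   # seconds, one per mic
    return np.exp(-2j * np.pi * freq_hz * delays)

def mvdr_weights(noise_cov, d, diag_load=1e-6):
    """MVDR weights: minimize output noise power subject to w^H d = 1."""
    m = len(d)
    # Diagonal loading keeps the solve well conditioned.
    loaded = noise_cov + diag_load * np.trace(noise_cov).real / m * np.eye(m)
    r_inv_d = np.linalg.solve(loaded, d)
    return r_inv_d / (d.conj() @ r_inv_d)

# Usage: four microphones on a line, target along +x, one interferer along +y.
mics = np.array([[0.00, 0.0, 0.0], [0.05, 0.0, 0.0],
                 [0.10, 0.0, 0.0], [0.15, 0.0, 0.0]])
d_target = steering_vector(mics, np.array([1.0, 0.0, 0.0]), freq_hz=1000.0)
d_interf = steering_vector(mics, np.array([0.0, 1.0, 0.0]), freq_hz=1000.0)
noise_cov = np.eye(4) + 10.0 * np.outer(d_interf, d_interf.conj())
w = mvdr_weights(noise_cov, d_target)
```

The beamformed output for a frequency-domain snapshot `x` would then be `np.vdot(w, x)`, that is, w^H x.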
Cyclic deformation leads to defect healing and strengthening of small-volume metal crystals
When microscopic and macroscopic specimens of metals are subjected to cyclic loading, the creation, interaction, and accumulation of defects lead to damage, cracking, and failure. Here we demonstrate that when aluminum single crystals of submicrometer dimensions are subjected to low-amplitude cyclic deformation at room temperature, the density of preexisting dislocation lines and loops can be dramatically reduced with virtually no change of the overall sample geometry and essentially no permanent plastic strain. This “cyclic healing” of the metal crystal leads to significant strengthening through dramatic reductions in dislocation density, in distinct contrast to conventional cyclic strain hardening mechanisms arising from increases in dislocation density and interactions among defects in microcrystalline and macrocrystalline metals and alloys. Our real-time, in situ transmission electron microscopy observations of tensile tests reveal that pinned dislocation lines undergo shakedown during cyclic straining, with the extent of dislocation unpinning dependent on the amplitude, sequence, and number of strain cycles. Those unpinned mobile dislocations moving close enough to the free surface of the thin specimens as a result of such repeated straining are then further attracted to the surface by image forces that facilitate their egress from the crystal. These results point to a versatile pathway for controlled mechanical annealing and defect engineering in submicrometer-sized metal crystals, thereby obviating the need for thermal annealing or significant plastic deformation that could cause change in shape and/or dimensions of the specimen.
National Science Foundation (U.S.) (Grant DMR-1120901); National Science Foundation (U.S.) (DMR-1410636); Singapore-MIT Alliance