118 research outputs found
Unified Segment-to-Segment Framework for Simultaneous Sequence Generation
Simultaneous sequence generation is a pivotal task for real-time scenarios,
such as streaming speech recognition, simultaneous machine translation and
simultaneous speech translation, where the target sequence is generated while
receiving the source sequence. The crux of achieving high-quality generation
with low latency lies in identifying the optimal moments for generating,
accomplished by learning a mapping between the source and target sequences.
However, existing methods often rely on task-specific heuristics for different
sequence types, limiting the model's capacity to adaptively learn the
source-target mapping and hindering the exploration of multi-task learning for
various simultaneous tasks. In this paper, we propose a unified
segment-to-segment framework (Seg2Seg) for simultaneous sequence generation,
which learns the mapping in an adaptive and unified manner. During the process
of simultaneous generation, the model alternates between waiting for a source
segment and generating a target segment, making the segment serve as the
natural bridge between the source and target. To accomplish this, Seg2Seg
introduces a latent segment as the pivot between source and target and explores
all potential source-target mappings via the proposed expectation training,
thereby learning the optimal moments for generating. Experiments on multiple
simultaneous generation tasks demonstrate that Seg2Seg achieves
state-of-the-art performance and exhibits better generality across various
tasks.
Comment: Accepted at NeurIPS 202
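The alternation between waiting for a source segment and generating a target segment can be sketched as a generic read/write loop. Here `segment_ready` and `generate_segment` are hypothetical stand-ins for the learned segmentation and generation components, not the paper's actual interfaces:

```python
def simultaneous_generate(source_stream, segment_ready, generate_segment):
    """Alternate between READ (buffering source tokens) and WRITE (emitting a
    target segment). segment_ready(buffer) -> bool decides when the buffered
    source forms a segment; generate_segment(buffer, outputs) -> list emits
    the corresponding target tokens. Both are illustrative placeholders."""
    buffer, outputs = [], []
    pending = False                      # source read since the last WRITE?
    for token in source_stream:          # READ: consume streaming source
        buffer.append(token)
        pending = True
        if segment_ready(buffer):        # segment boundary detected
            outputs.extend(generate_segment(buffer, outputs))  # WRITE
            pending = False
    if pending:                          # flush any trailing partial segment
        outputs.extend(generate_segment(buffer, outputs))
    return outputs
```

With toy callbacks (a boundary every two tokens, each WRITE emitting the latest token uppercased), the loop interleaves reads and writes and flushes the remainder when the source ends.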
End-to-End Simultaneous Speech Translation with Differentiable Segmentation
End-to-end simultaneous speech translation (SimulST) outputs translation
while receiving the streaming speech inputs (a.k.a. streaming speech
translation), and hence needs to segment the speech inputs and then translate
based on the currently received speech. However, segmenting the speech inputs at
unfavorable moments can disrupt the acoustic integrity and adversely affect the
performance of the translation model. Therefore, learning to segment the speech
inputs at those moments that are beneficial for the translation model to
produce high-quality translation is the key to SimulST. Existing SimulST
methods, whether using fixed-length segmentation or an external segmentation
model, separate segmentation from the underlying translation model; this gap
results in segmentation outcomes that are not necessarily beneficial to the
translation process. In this paper, we propose
Differentiable Segmentation (DiSeg) for SimulST to directly learn segmentation
from the underlying translation model. DiSeg makes hard segmentation
differentiable through the proposed expectation training, enabling it to be
jointly trained with the translation model and thereby learn
translation-beneficial segmentation. Experimental results demonstrate that
DiSeg achieves state-of-the-art performance and exhibits superior segmentation
capability.
Comment: Accepted at ACL 2023 Findings
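A minimal sketch of how an expectation-style relaxation can make segmentation differentiable, assuming a sigmoid boundary probability per acoustic frame and a constraint on the expected segment count (the actual DiSeg objective differs in detail):

```python
import math

def expected_segmentation(boundary_logits, target_num_segments):
    """Relax hard segment boundaries into probabilities so segmentation can be
    trained jointly with the translation model. Returns a differentiable
    'soft segment index' per frame and a loss tying the expected number of
    segments to a supervision signal (here, a target count - an assumption)."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in boundary_logits]  # sigmoid
    soft_segment_id = []                 # cumulative boundary mass per frame
    running = 0.0
    for p in probs:
        running += p
        soft_segment_id.append(running)
    expected_count = running             # expected number of segments
    count_loss = (expected_count - target_num_segments) ** 2
    return soft_segment_id, count_loss
```

With confident logits the expected count approaches the number of hard boundaries, so the relaxed quantities recover the hard segmentation in the limit.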
Simultaneous Machine Translation with Tailored Reference
Simultaneous machine translation (SiMT) generates the translation while still
reading the source sentence. However, existing SiMT models are typically trained
with the same reference, disregarding the varying amounts of source information
available at different latencies. Training the model with the ground-truth at
low latency may introduce forced anticipations, whereas using a reference
consistent with the source word order at high latency results in performance
degradation. Consequently, it is crucial to train the SiMT model with an
appropriate reference that avoids forced anticipations during training while
maintaining high quality. In this paper, we propose a novel method that
provides a tailored reference for SiMT models trained at different latencies by
rephrasing the ground-truth. Specifically, we introduce the tailor, induced by
reinforcement learning, to modify the ground-truth into the tailored reference. The
SiMT model is trained with the tailored reference and jointly optimized with
the tailor to enhance performance. Importantly, our method is applicable to a
wide range of current SiMT approaches. Experiments on three translation tasks
demonstrate that our method achieves state-of-the-art performance in both fixed
and adaptive policies.
Comment: Accepted to EMNLP 2023; 15 pages, 8 figures
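To make "forced anticipation" concrete, a toy check under a wait-k policy, assuming a simplified one-to-one word alignment (a deliberate simplification; real alignments are many-to-many):

```python
def forced_anticipations(alignment, k):
    """Under a wait-k policy, the model has read source tokens 0..i+k-1 when
    emitting target token i. A target token aligned to a source token beyond
    that prefix must be anticipated. alignment[i] is the source index aligned
    to target position i (hypothetical one-to-one alignment)."""
    return [i for i, j in enumerate(alignment) if j > i + k - 1]
```

Training with a reference that reorders such tokens into the already-read prefix is the kind of rephrasing a tailored reference performs.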
Learning Optimal Policy for Simultaneous Machine Translation via Binary Search
Simultaneous machine translation (SiMT) starts to output translation while
reading the source sentence and needs a precise policy to decide when to output
the generated translation. Therefore, the policy determines the number of
source tokens read during the translation of each target token. However, it is
difficult to learn a precise translation policy to achieve good latency-quality
trade-offs, because there is no golden policy corresponding to parallel
sentences as explicit supervision. In this paper, we present a new method for
constructing the optimal policy online via binary search. By employing explicit
supervision, our approach enables the SiMT model to learn the optimal policy,
which can guide the model in completing the translation during inference.
Experiments on four translation tasks show that our method can exceed strong
baselines across all latency scenarios.
Comment: Accepted to ACL 2023. 14 pages, 5 figures
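The binary-search idea can be sketched as finding the smallest source prefix whose confidence clears a threshold. The scalar `score` function and its monotonicity are simplifying assumptions for illustration, not the paper's exact construction:

```python
def min_reads(score, num_source, threshold):
    """Binary-search the smallest number of source tokens to read before the
    model can confidently emit the next target token. score(n) is assumed
    monotonically non-decreasing in n, so the predicate score(n) >= threshold
    flips exactly once and binary search applies."""
    lo, hi = 1, num_source
    while lo < hi:
        mid = (lo + hi) // 2
        if score(mid) >= threshold:
            hi = mid                     # confident: try reading fewer tokens
        else:
            lo = mid + 1                 # not confident: must read more
    return lo
```

Recording this minimal prefix length for each target token yields an explicit policy that can supervise the SiMT model.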
Energy-Efficient Flow Scheduling and Routing with Hard Deadlines in Data Center Networks
The power consumption of enormous network devices in data centers has emerged
as a big concern to data center operators. Despite many
traffic-engineering-based solutions, very little attention has been paid to
performance-guaranteed energy saving schemes. In this paper, we propose a novel
energy-saving model for data center networks by scheduling and routing
"deadline-constrained flows", where the transmission of every flow must be
completed before a strict deadline, the most critical requirement in
production data center networks. Based on speed scaling and power-down energy
saving strategies for network devices, we aim to explore the most
energy-efficient way of scheduling and routing flows on the network, as well as
determining the transmission speed for every flow. We consider two general
versions of the problem. For the version of only flow scheduling where routes
of flows are pre-given, we show that it can be solved polynomially and we
develop an optimal combinatorial algorithm for it. For the version of joint
flow scheduling and routing, we prove that it is strongly NP-hard and cannot
have a Fully Polynomial-Time Approximation Scheme (FPTAS) unless P=NP. Based on
a relaxation and randomized rounding technique, we provide an efficient
approximation algorithm which can guarantee a provable performance ratio with
respect to a polynomial of the total number of flows.
Comment: 11 pages, accepted by ICDCS'1
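The speed-scaling intuition can be illustrated for a single deadline-constrained flow on one link, assuming the common convex power model p(s) = s**alpha with alpha > 2; the paper's algorithms handle many interacting flows, which is where the combinatorial difficulty lies:

```python
def min_energy_speed(volume, deadline, alpha=3.0):
    """For one flow of a given volume and deadline on a dedicated link,
    transmitting at the constant speed volume/deadline minimizes energy among
    all schedules finishing on time: p(s) = s**alpha is convex, so by Jensen's
    inequality any speed variation with the same average costs more energy.
    Returns (speed, energy). alpha = 3.0 is an illustrative exponent."""
    s = volume / deadline
    return s, (s ** alpha) * deadline
```

Running slower than this is infeasible (the deadline is missed) and running faster wastes energy, which is why deadlines and speed scaling interact so tightly.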
Non-autoregressive Streaming Transformer for Simultaneous Translation
Simultaneous machine translation (SiMT) models are trained to strike a
balance between latency and translation quality. However, training these models
to achieve high quality while maintaining low latency often leads to a tendency
for aggressive anticipation. We argue that this issue stems from the
autoregressive architecture upon which most existing SiMT models are built. To
address it, we propose the non-autoregressive streaming Transformer
(NAST) which comprises a unidirectional encoder and a non-autoregressive
decoder with intra-chunk parallelism. We enable NAST to generate the blank
token or repetitive tokens to adjust its READ/WRITE strategy flexibly, and
train it to maximize the non-monotonic latent alignment with an alignment-based
latency loss. Experiments on various SiMT benchmarks demonstrate that NAST
outperforms previous strong autoregressive SiMT baselines.
Comment: EMNLP 2023 main conference; Source code is available at
https://github.com/ictnlp/NAS
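The blank/repeat mechanism resembles CTC-style decoding, where the raw non-autoregressive output is collapsed before being read out; a minimal sketch (the `<blank>` symbol is illustrative):

```python
def collapse(tokens, blank="<blank>"):
    """Collapse consecutive repeated tokens and drop blanks, as in CTC-style
    decoding. Emitting blanks (produce nothing for this slot) or repeats
    (merged away here) is how a non-autoregressive decoder can vary how much
    output each chunk contributes, i.e. adjust its READ/WRITE behavior."""
    out, prev = [], None
    for t in tokens:
        if t != prev and t != blank:
            out.append(t)
        prev = t                         # track previous raw token for merging
    return out
```

Note that a genuine double word can still be produced by separating the repeats with a blank, as in the first test below.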
Computational Analysis of Missense Mutations Causing Snyder-Robinson Syndrome
The Snyder-Robinson syndrome is caused by missense mutations in the spermine synthase gene that encodes a protein (SMS) of 529 amino acids. Here we investigate, in silico, the molecular effect of three missense mutations, c.267G>A (p.G56S), c.496T>G (p.V132G), and c.550T>C (p.I150T), in SMS that were clinically identified to cause the disease. Single-point energy calculations, molecular dynamics simulations, and pKa calculations revealed the effects of these mutations on SMS's stability, flexibility, and interactions. It was predicted that the catalytic residue, Asp276, should be protonated prior to binding the substrates. The pKa calculations indicated that the p.I150T mutation causes pKa changes with respect to the wild-type SMS, which involve titratable residues interacting with the S-methyl-5′-thioadenosine (MTA) substrate. The p.I150T missense mutation was also found to decrease the stability of the C-terminal domain and to induce structural changes in the vicinity of the MTA binding site. The other two missense mutations, p.G56S and p.V132G, are away from the active site and do not perturb its wild-type properties, but affect the stability of both the monomers and the dimer. Specifically, the p.G56S mutation is predicted to greatly reduce the affinity of the monomers to form a dimer, and therefore should have a dramatic effect on SMS function because dimerization is essential for SMS activity. Hum Mutat 31:1043–1049, 2010
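The protonation-state prediction for Asp276 rests on its computed pKa relative to the ambient pH; the standard Henderson-Hasselbalch relation gives the protonated fraction (the pKa and pH values in the test are illustrative, not the paper's computed values):

```python
def protonated_fraction(pka, ph):
    """Fraction of an acidic group that is protonated (neutral HA form) at a
    given pH, from Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA]).
    A mutation-induced pKa shift directly changes this fraction, which is why
    the p.I150T pKa changes near the MTA site matter functionally."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))
```

At pH equal to the pKa the group is half-protonated; for a typical solvent-exposed Asp (pKa near 3.9) at physiological pH almost none is protonated, so predicting a protonated catalytic Asp276 implies a substantially elevated pKa.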
Robust Point Cloud Registration Framework Based on Deep Graph Matching (TPAMI Version)
3D point cloud registration is a fundamental problem in computer vision and
robotics. Recently, learning-based point cloud registration methods have made
great progress. However, these methods are sensitive to outliers, which leads to
more incorrect correspondences. In this paper, we propose a novel deep graph
matching-based framework for point cloud registration. Specifically, we first
transform point clouds into graphs and extract deep features for each point.
Then, we develop a module based on deep graph matching to calculate a soft
correspondence matrix. By using graph matching, not only the local geometry of
each point but also its structure and topology in a larger range are considered
in establishing correspondences, so that more correct correspondences are
found. We train the network with a loss directly defined on the
correspondences, and in the test stage the soft correspondences are transformed
into hard one-to-one correspondences so that registration can be performed by a
correspondence-based solver. Furthermore, we introduce a transformer-based
method to generate edges for graph construction, which further improves the
quality of the correspondences. Extensive experiments on object-level and
scene-level benchmark datasets show that the proposed method achieves
state-of-the-art performance. The code is available at:
\href{https://github.com/fukexue/RGM}{https://github.com/fukexue/RGM}.
Comment: accepted by TPAMI 2022. arXiv admin note: substantial text overlap
with arXiv:2103.0425
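Once hard one-to-one correspondences are extracted, registration reduces to a closed-form rigid fit by a correspondence-based solver. A 2-D sketch of that final step (the 3-D case uses an SVD, e.g. the Kabsch algorithm; this is not the paper's code):

```python
import math

def register_2d(src, dst):
    """Least-squares rigid transform (rotation theta, translation t) aligning
    2-D point sets with known one-to-one correspondences. Center both sets,
    accumulate the cross-covariance, and read the optimal angle off its
    dot/cross components."""
    n = len(src)
    csx = sum(p[0] for p in src) / n; csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n; cdy = sum(p[1] for p in dst) / n
    sxx = sxy = 0.0
    for (x, y), (u, v) in zip(src, dst):
        x, y, u, v = x - csx, y - csy, u - cdx, v - cdy
        sxx += x * u + y * v            # dot part   -> proportional to cos(theta)
        sxy += x * v - y * u            # cross part -> proportional to sin(theta)
    theta = math.atan2(sxy, sxx)        # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)      # translation maps rotated source
    ty = cdy - (s * csx + c * csy)      #   centroid onto target centroid
    return theta, (tx, ty)
```

The quality of this fit depends entirely on the correspondences fed in, which is why making the matching itself outlier-robust is the paper's focus.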
The Influence of Receiver Selection Strategy on Packet Success Probability in Ad Hoc Network
Considering the importance of the receiver (RX) selection strategy for the packet success probability (PSP) in ad hoc networks, this paper examines the PSP under the nearest-RX and farthest-RX selection strategies and determines the number of hops under each. The performance of the successful transmission probability (STP) and the PSP under the two strategies is then evaluated through numerical simulation. The simulation results show that the PSP is affected by the terminal density, the RX selection strategy, the packet length, and the STP; the number of hops depends mainly on the terminal density, the RX selection strategy, and the distance between the source TX and the destination RX. Furthermore, the nearest and farthest RX selection strategies differ insignificantly in the packet transmission duration from source TX to destination RX at small terminal densities.
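A minimal model of the reported dependence: if each hop succeeds independently with probability STP and the per-hop advance is set by the RX selection strategy (nearest RX gives short hops and many of them; farthest RX the opposite), the end-to-end PSP decays geometrically in the hop count. The independence assumption is for illustration only:

```python
import math

def packet_success_probability(stp, distance, hop_length):
    """End-to-end PSP over a multi-hop path, assuming each hop succeeds
    independently with probability stp and the hop count is the TX-RX
    distance divided by the per-hop advance (which in turn depends on the
    RX selection strategy and terminal density). Returns (psp, hops)."""
    hops = math.ceil(distance / hop_length)
    return stp ** hops, hops
```

This captures the qualitative trade-off: farthest-RX selection needs fewer hops but each hop typically has a lower STP, so the two strategies can end up with similar end-to-end behavior.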