
    Unified Segment-to-Segment Framework for Simultaneous Sequence Generation

    Simultaneous sequence generation is a pivotal task for real-time scenarios, such as streaming speech recognition, simultaneous machine translation, and simultaneous speech translation, where the target sequence is generated while the source sequence is still being received. The crux of achieving high-quality generation with low latency lies in identifying the optimal moments for generating, accomplished by learning a mapping between the source and target sequences. However, existing methods often rely on task-specific heuristics for different sequence types, limiting the model's capacity to adaptively learn the source-target mapping and hindering the exploration of multi-task learning for various simultaneous tasks. In this paper, we propose a unified segment-to-segment framework (Seg2Seg) for simultaneous sequence generation, which learns the mapping in an adaptive and unified manner. During simultaneous generation, the model alternates between waiting for a source segment and generating a target segment, making the segment a natural bridge between the source and target. To accomplish this, Seg2Seg introduces a latent segment as the pivot between source and target and explores all potential source-target mappings via the proposed expectation training, thereby learning the optimal moments for generating. Experiments on multiple simultaneous generation tasks demonstrate that Seg2Seg achieves state-of-the-art performance and exhibits better generality across various tasks.
    Comment: Accepted at NeurIPS 202
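The alternation the abstract describes (wait for a source segment, then emit a target segment) can be sketched as a simple loop. This is a minimal illustration, not the paper's model: `translate_segment` is a hypothetical callable standing in for the learned generator, and the segment boundaries here are given rather than learned adaptively.

```python
def seg2seg_stream(source_segments, translate_segment):
    """Toy wait-segment / emit-segment loop.

    `translate_segment` maps the source read so far to the next
    target segment; in Seg2Seg this decision is learned, here it
    is supplied by the caller.
    """
    read, output = [], []
    for seg in source_segments:                  # wait for a source segment
        read.append(seg)
        output.append(translate_segment(read))   # then emit a target segment
    return output

# toy "model": echo the latest segment upper-cased
print(seg2seg_stream(["wie", "geht's"], lambda r: r[-1].upper()))
# -> ['WIE', "GEHT'S"]
```

A real Seg2Seg model would additionally decide where each segment ends, which is what the expectation training optimizes.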

    End-to-End Simultaneous Speech Translation with Differentiable Segmentation

    End-to-end simultaneous speech translation (SimulST) outputs translation while receiving the streaming speech inputs (a.k.a. streaming speech translation), and hence needs to segment the speech inputs and then translate based on the speech received so far. However, segmenting the speech inputs at unfavorable moments can disrupt the acoustic integrity and adversely affect the performance of the translation model. Therefore, learning to segment the speech inputs at moments that are beneficial for the translation model to produce high-quality translation is the key to SimulST. Existing SimulST methods, whether using fixed-length segmentation or an external segmentation model, separate segmentation from the underlying translation model, and this gap results in segmentation outcomes that are not necessarily beneficial for the translation process. In this paper, we propose Differentiable Segmentation (DiSeg) for SimulST to learn segmentation directly from the underlying translation model. DiSeg makes hard segmentation differentiable through the proposed expectation training, enabling it to be jointly trained with the translation model and thereby learn translation-beneficial segmentation. Experimental results demonstrate that DiSeg achieves state-of-the-art performance and exhibits superior segmentation capability.
    Comment: Accepted at ACL 2023 Findings
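The core trick of making a hard segmentation decision differentiable is to work with boundary probabilities in expectation. The sketch below is an assumption-laden illustration of that flavour, not DiSeg's actual objective: each frame gets a sigmoid boundary probability, so the expected segment count is a smooth function of the logits and can be trained with gradients.

```python
import math

def expected_segments(boundary_logits):
    """Expectation-style surrogate for a segment count.

    Each logit z_t gives a boundary probability sigmoid(z_t); the
    sum of probabilities is the expected number of boundaries, which
    is differentiable in the logits (a hard 0/1 count is not).
    """
    probs = [1.0 / (1.0 + math.exp(-z)) for z in boundary_logits]
    return sum(probs)

# confident boundary, confident non-boundary, undecided frame
print(round(expected_segments([4.0, -4.0, 0.0]), 3))  # -> 1.5
```

DiSeg's actual expectation training couples such soft boundaries with the translation model's attention; this only shows why the expectation is trainable at all.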

    Simultaneous Machine Translation with Tailored Reference

    Simultaneous machine translation (SiMT) generates translation while reading the source sentence. However, existing SiMT models are typically trained using the same reference, disregarding the varying amounts of source information available at different latency levels. Training the model with the ground-truth at low latency may introduce forced anticipations, whereas utilizing a reference consistent with the source word order at high latency results in performance degradation. Consequently, it is crucial to train the SiMT model with an appropriate reference that avoids forced anticipations during training while maintaining high quality. In this paper, we propose a novel method that provides a tailored reference for SiMT models trained at different latency levels by rephrasing the ground-truth. Specifically, we introduce the tailor, induced by reinforcement learning, to modify the ground-truth into the tailored reference. The SiMT model is trained with the tailored reference and jointly optimized with the tailor to enhance performance. Importantly, our method is applicable to a wide range of current SiMT approaches. Experiments on three translation tasks demonstrate that our method achieves state-of-the-art performance with both fixed and adaptive policies.
    Comment: Accepted to EMNLP 2023; 15 pages, 8 figures

    Learning Optimal Policy for Simultaneous Machine Translation via Binary Search

    Simultaneous machine translation (SiMT) starts to output translation while reading the source sentence and needs a precise policy to decide when to output the generated translation. The policy therefore determines the number of source tokens read during the translation of each target token. However, it is difficult to learn a precise translation policy that achieves good latency-quality trade-offs, because there is no golden policy corresponding to the parallel sentences to serve as explicit supervision. In this paper, we present a new method for constructing the optimal policy online via binary search. By employing this explicit supervision, our approach enables the SiMT model to learn the optimal policy, which can guide the model in completing the translation during inference. Experiments on four translation tasks show that our method can exceed strong baselines across all latency scenarios.
    Comment: Accepted to ACL 2023; 14 pages, 5 figures
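The appeal of binary search here is that, if a quality criterion is monotone in how much source has been read, the smallest sufficient read count is found in O(log n) probes instead of n. The sketch below illustrates that search pattern only; the scoring function `score` and threshold `tau` are hypothetical stand-ins, not the paper's actual criterion.

```python
def min_reads(score, n_src, tau):
    """Binary-search the smallest number of source tokens whose
    prefix score reaches threshold `tau`.

    Assumes `score(k)` is non-decreasing in k (more source read,
    never worse), which is what makes binary search valid.
    """
    lo, hi = 1, n_src
    while lo < hi:
        mid = (lo + hi) // 2
        if score(mid) >= tau:
            hi = mid          # mid tokens suffice; try fewer
        else:
            lo = mid + 1      # need to read more source
    return lo

# toy score: fraction of the 10-token source read so far
print(min_reads(lambda k: k / 10, 10, 0.35))  # -> 4
```

The paper builds the optimal policy from such per-token search results and uses it as explicit supervision for the SiMT model.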

    Energy-Efficient Flow Scheduling and Routing with Hard Deadlines in Data Center Networks

    The power consumption of the enormous number of network devices in data centers has emerged as a big concern to data center operators. Despite many traffic-engineering-based solutions, very little attention has been paid to performance-guaranteed energy-saving schemes. In this paper, we propose a novel energy-saving model for data center networks that schedules and routes "deadline-constrained flows", where the transmission of every flow has to be accomplished before a rigorous deadline, the most critical requirement in production data center networks. Based on the speed-scaling and power-down energy-saving strategies for network devices, we aim to explore the most energy-efficient way of scheduling and routing flows on the network, as well as determining the transmission speed of every flow. We consider two general versions of the problem. For the version with only flow scheduling, where the routes of flows are pre-given, we show that it can be solved in polynomial time and develop an optimal combinatorial algorithm for it. For the version with joint flow scheduling and routing, we prove that it is strongly NP-hard and admits no Fully Polynomial-Time Approximation Scheme (FPTAS) unless P=NP. Based on a relaxation and randomized rounding technique, we provide an efficient approximation algorithm that guarantees a provable performance ratio polynomial in the total number of flows.
    Comment: 11 pages, accepted by ICDCS'1
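A small worked example of the speed-scaling intuition behind such models: under the commonly used convex power model P(s) = s**alpha (alpha is an assumed device constant, not a value from this paper), transmitting a flow at speed s costs energy s**alpha * (size / s), which grows with s, so the cheapest feasible choice for a single isolated flow is the slowest speed that still meets its hard deadline. This is only a one-flow illustration; the paper's algorithms handle many flows sharing links.

```python
def min_energy_speed(size_bits, deadline_s, alpha=3.0):
    """Slowest deadline-feasible speed and its energy for one flow.

    Power model P(s) = s**alpha (assumed, cube-power is a common
    choice); energy = P(s) * duration = size * s**(alpha - 1),
    which is increasing in s, so run just-in-time.
    """
    s = size_bits / deadline_s          # just-in-time speed
    energy = (s ** alpha) * (size_bits / s)
    return s, energy

s, e = min_energy_speed(1000.0, 10.0)
print(s, e)  # -> 100.0 10000000.0
```

With contention, power-down opportunities, and routing in play, this greedy per-flow choice is no longer optimal, which is where the paper's combinatorial and rounding-based algorithms come in.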

    Non-autoregressive Streaming Transformer for Simultaneous Translation

    Simultaneous machine translation (SiMT) models are trained to strike a balance between latency and translation quality. However, training these models to achieve high quality while maintaining low latency often leads to a tendency for aggressive anticipation. We argue that this issue stems from the autoregressive architecture upon which most existing SiMT models are built. To address it, we propose the non-autoregressive streaming Transformer (NAST), which comprises a unidirectional encoder and a non-autoregressive decoder with intra-chunk parallelism. We enable NAST to generate blank tokens or repetitive tokens to adjust its READ/WRITE strategy flexibly, and train it to maximize the non-monotonic latent alignment with an alignment-based latency loss. Experiments on various SiMT benchmarks demonstrate that NAST outperforms previous strong autoregressive SiMT baselines.
    Comment: EMNLP 2023 main conference; source code is available at https://github.com/ictnlp/NAS
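Generating blank and repeated tokens only works because the raw output is post-processed by a CTC-style collapse. The sketch below shows that standard collapse rule (merge consecutive repeats, then drop blanks); the token strings are made up for illustration and the rule is the generic CTC one, which is consistent with, but not copied from, NAST's decoding.

```python
BLANK = "<b>"

def collapse(tokens):
    """CTC-style collapse: merge consecutive duplicates, drop blanks.

    A blank between two identical tokens keeps them distinct, which
    is how the model can emit a genuinely repeated word.
    """
    out, prev = [], None
    for t in tokens:
        if t != prev:          # new symbol (repeats are merged)
            if t != BLANK:     # blanks never reach the output
                out.append(t)
            prev = t
    return out

print(collapse(["<b>", "I", "I", "<b>", "am", "am", "here"]))
# -> ['I', 'am', 'here']
```

Because blanks and repeats cost nothing after collapsing, the decoder can use them to "wait" without committing output, which is the flexible READ/WRITE behaviour the abstract describes.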

    Computational Analysis of Missense Mutations Causing Snyder-Robinson Syndrome

    The Snyder-Robinson syndrome is caused by missense mutations in the spermine synthase gene, which encodes a protein (SMS) of 529 amino acids. Here we investigate, in silico, the molecular effect of three missense mutations, c.267G>A (p.G56S), c.496T>G (p.V132G), and c.550T>C (p.I150T), in SMS that were clinically identified to cause the disease. Single-point energy calculations, molecular dynamics simulations, and pKa calculations revealed the effects of these mutations on SMS's stability, flexibility, and interactions. It was predicted that the catalytic residue, Asp276, should be protonated prior to binding the substrates. The pKa calculations indicated that the p.I150T mutation causes pKa changes with respect to the wild-type SMS that involve titratable residues interacting with the S-methyl-5′-thioadenosine (MTA) substrate. The p.I150T missense mutation was also found to decrease the stability of the C-terminal domain and to induce structural changes in the vicinity of the MTA binding site. The other two missense mutations, p.G56S and p.V132G, are away from the active site and do not perturb its wild-type properties, but they affect the stability of both the monomers and the dimer. Specifically, the p.G56S mutation is predicted to greatly reduce the affinity of the monomers to form a dimer and therefore should have a dramatic effect on SMS function, because dimerization is essential for SMS activity. Hum Mutat 31:1043–1049, 2010

    Robust Point Cloud Registration Framework Based on Deep Graph Matching (TPAMI Version)

    3D point cloud registration is a fundamental problem in computer vision and robotics. Recently, learning-based point cloud registration methods have made great progress. However, these methods are sensitive to outliers, which leads to more incorrect correspondences. In this paper, we propose a novel deep graph matching-based framework for point cloud registration. Specifically, we first transform the point clouds into graphs and extract deep features for each point. Then, we develop a module based on deep graph matching to calculate a soft correspondence matrix. By using graph matching, not only the local geometry of each point but also its structure and topology over a larger range are considered in establishing correspondences, so that more correct correspondences are found. We train the network with a loss defined directly on the correspondences, and at test time the soft correspondences are transformed into hard one-to-one correspondences so that registration can be performed by a correspondence-based solver. Furthermore, we introduce a transformer-based method to generate edges for graph construction, which further improves the quality of the correspondences. Extensive experiments on object-level and scene-level benchmark datasets show that the proposed method achieves state-of-the-art performance. The code is available at: https://github.com/fukexue/RGM
    Comment: Accepted by TPAMI 2022. arXiv admin note: substantial text overlap with arXiv:2103.0425
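The "soft to hard" step the abstract mentions can be illustrated with a tiny stand-in: given a soft correspondence matrix of match scores, repeatedly take the highest-scoring unused (row, column) pair. This greedy rule is an assumption for illustration; the paper's pipeline may use a proper assignment solver (e.g. Hungarian/Sinkhorn-style), which can differ from greedy on adversarial inputs.

```python
def harden(soft):
    """Greedy one-to-one matching from a soft correspondence matrix.

    `soft[i][j]` is the match score between source point i and
    target point j; each row and column is used at most once.
    """
    pairs, used_r, used_c = [], set(), set()
    cells = sorted(((s, i, j) for i, row in enumerate(soft)
                    for j, s in enumerate(row)), reverse=True)
    for s, i, j in cells:                 # best remaining score first
        if i not in used_r and j not in used_c:
            pairs.append((i, j))
            used_r.add(i); used_c.add(j)
    return sorted(pairs)

soft = [[0.9, 0.1],
        [0.2, 0.8]]
print(harden(soft))  # -> [(0, 0), (1, 1)]
```

The resulting hard pairs are what a correspondence-based solver (e.g. an SVD/Procrustes step) consumes to estimate the rigid transform.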

    The Influence of Receiver Selection Strategy on Packet Success Probability in Ad Hoc Network

    Considering the importance of the receiver (RX) selection strategy for the packet success probability (PSP) in ad hoc networks, this paper examines the PSP under the nearest-RX and farthest-RX selection strategies and determines the number of hops under each strategy. Next, the successful transmission probability (STP) and the PSP under the two strategies are evaluated through numerical simulation. The simulation results show that the PSP is affected by the terminal density, the RX selection strategy, the packet length, and the STP; the number of hops depends mainly on the terminal density, the RX selection strategy, and the distance between the source TX and the destination RX. Furthermore, the nearest-RX and farthest-RX selection strategies differ insignificantly in the packet transmission duration from the source TX to the destination RX at small terminal densities.
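The hop-count trade-off described above can be made concrete with a deliberately simplified chain model: the end-to-end distance is covered in hops of a fixed length, each hop succeeding independently with some per-hop STP. All numbers and the independence assumption below are illustrative, not from the paper's analysis, but they show why nearest-RX selection (many short, reliable hops) and farthest-RX selection (few long, lossier hops) pull the PSP in different directions.

```python
import math

def psp(total_dist, hop_dist, per_hop_stp):
    """End-to-end packet success probability on a multi-hop chain.

    h = ceil(distance / hop length) hops, each succeeding
    independently with probability `per_hop_stp` (a simplified
    independence model).
    """
    hops = math.ceil(total_dist / hop_dist)
    return hops, per_hop_stp ** hops

# nearest-RX style: many short, reliable hops
print(psp(100.0, 20.0, 0.99))   # 5 hops
# farthest-RX style: few long, lossier hops
print(psp(100.0, 50.0, 0.95))   # 2 hops
```

Which strategy wins depends on how fast the per-hop STP decays with hop length, which in the paper is governed by the terminal density and the packet length.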