    Unified Segment-to-Segment Framework for Simultaneous Sequence Generation

    Simultaneous sequence generation is a pivotal task for real-time scenarios, such as streaming speech recognition, simultaneous machine translation and simultaneous speech translation, where the target sequence is generated while receiving the source sequence. The crux of achieving high-quality generation with low latency lies in identifying the optimal moments for generating, accomplished by learning a mapping between the source and target sequences. However, existing methods often rely on task-specific heuristics for different sequence types, limiting the model's capacity to adaptively learn the source-target mapping and hindering the exploration of multi-task learning for various simultaneous tasks. In this paper, we propose a unified segment-to-segment framework (Seg2Seg) for simultaneous sequence generation, which learns the mapping in an adaptive and unified manner. During the process of simultaneous generation, the model alternates between waiting for a source segment and generating a target segment, making the segment serve as the natural bridge between the source and target. To accomplish this, Seg2Seg introduces a latent segment as the pivot between source and target and explores all potential source-target mappings via the proposed expectation training, thereby learning the optimal moments for generating. Experiments on multiple simultaneous generation tasks demonstrate that Seg2Seg achieves state-of-the-art performance and exhibits better generality across various tasks. Comment: Accepted at NeurIPS 202

    End-to-End Simultaneous Speech Translation with Differentiable Segmentation

    End-to-end simultaneous speech translation (SimulST) outputs translation while receiving the streaming speech inputs (a.k.a. streaming speech translation), and hence needs to segment the speech inputs and then translate based on the speech received so far. However, segmenting the speech inputs at unfavorable moments can disrupt the acoustic integrity and adversely affect the performance of the translation model. Therefore, learning to segment the speech inputs at moments that help the translation model produce high-quality translation is the key to SimulST. Existing SimulST methods, which use either fixed-length segmentation or an external segmentation model, always separate segmentation from the underlying translation model, and this gap results in segmentation outcomes that are not necessarily beneficial for the translation process. In this paper, we propose Differentiable Segmentation (DiSeg) for SimulST to directly learn segmentation from the underlying translation model. DiSeg makes hard segmentation differentiable through the proposed expectation training, enabling it to be jointly trained with the translation model and thereby learn translation-beneficial segmentation. Experimental results demonstrate that DiSeg achieves state-of-the-art performance and exhibits superior segmentation capability. Comment: Accepted at ACL 2023 finding

    Simultaneous Machine Translation with Tailored Reference

    Simultaneous machine translation (SiMT) generates translation while reading the whole source sentence. However, existing SiMT models are typically trained using the same reference, disregarding the varying amounts of source information available at different latency levels. Training the model with the ground truth at low latency may introduce forced anticipations, whereas using a reference consistent with the source word order at high latency results in performance degradation. Consequently, it is crucial to train the SiMT model with an appropriate reference that avoids forced anticipations during training while maintaining high quality. In this paper, we propose a novel method that provides a tailored reference for SiMT models trained at different latency levels by rephrasing the ground truth. Specifically, we introduce the tailor, induced by reinforcement learning, to modify the ground truth into the tailored reference. The SiMT model is trained with the tailored reference and jointly optimized with the tailor to enhance performance. Importantly, our method is applicable to a wide range of current SiMT approaches. Experiments on three translation tasks demonstrate that our method achieves state-of-the-art performance in both fixed and adaptive policies. Comment: Accepted to EMNLP 2023; 15 pages, 8 figure

    Learning Optimal Policy for Simultaneous Machine Translation via Binary Search

    Simultaneous machine translation (SiMT) starts to output translation while reading the source sentence and needs a precise policy to decide when to output the generated translation. The policy thus determines the number of source tokens read before each target token is translated. However, it is difficult to learn a precise translation policy that achieves good latency-quality trade-offs, because parallel sentences come with no golden policy that could serve as explicit supervision. In this paper, we present a new method for constructing the optimal policy online via binary search. By employing explicit supervision, our approach enables the SiMT model to learn the optimal policy, which can guide the model in completing the translation during inference. Experiments on four translation tasks show that our method can exceed strong baselines across all latency scenarios. Comment: Accepted to ACL 2023. 14 pages, 5 figure
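
    The abstract above does not spell out how the binary search is run, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes a hypothetical predicate `is_enough(t, n)` that reports whether the first `n` source tokens suffice to translate target token `t`, and it assumes this predicate is monotone in `n` (once true, it stays true as more source is read). The names `min_source_prefix` and `build_policy` are illustrative only.

```python
from typing import Callable, List


def min_source_prefix(num_source: int, is_enough: Callable[[int], bool]) -> int:
    """Return the smallest n in [1, num_source] for which is_enough(n) holds.

    Assumes is_enough is monotone (False ... False True ... True).
    If no prefix is enough, this falls back to the full source length.
    """
    lo, hi = 1, num_source
    while lo < hi:
        mid = (lo + hi) // 2
        if is_enough(mid):
            hi = mid          # mid may be the answer; keep it in range
        else:
            lo = mid + 1      # the answer must lie to the right of mid
    return lo


def build_policy(
    num_source: int,
    num_target: int,
    is_enough: Callable[[int, int], bool],  # hypothetical: (target index, source prefix length)
) -> List[int]:
    """For each target token, the number of source tokens read before writing it."""
    policy: List[int] = []
    read_so_far = 1
    for t in range(num_target):
        needed = min_source_prefix(num_source, lambda n: is_enough(t, n))
        # Source tokens cannot be "unread", so the policy is kept non-decreasing.
        read_so_far = max(read_so_far, needed)
        policy.append(read_so_far)
    return policy


if __name__ == "__main__":
    # Toy predicate (an assumption for illustration): target token t needs t + 2 source tokens.
    toy = lambda t, n: n >= min(t + 2, 5)
    print(build_policy(num_source=5, num_target=4, is_enough=toy))
```

    Each target token costs O(log n) predicate evaluations instead of O(n) for a linear scan, which is the usual reason to prefer binary search when the predicate is expensive (for example, a model forward pass).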

    Non-autoregressive Streaming Transformer for Simultaneous Translation

    Simultaneous machine translation (SiMT) models are trained to strike a balance between latency and translation quality. However, training these models to achieve high quality while maintaining low latency often leads to a tendency for aggressive anticipation. We argue that this issue stems from the autoregressive architecture upon which most existing SiMT models are built. To address it, we propose the non-autoregressive streaming Transformer (NAST), which comprises a unidirectional encoder and a non-autoregressive decoder with intra-chunk parallelism. We enable NAST to generate the blank token or repetitive tokens to adjust its READ/WRITE strategy flexibly, and train it to maximize the non-monotonic latent alignment with an alignment-based latency loss. Experiments on various SiMT benchmarks demonstrate that NAST outperforms previous strong autoregressive SiMT baselines. Comment: EMNLP 2023 main conference; Source code is available at https://github.com/ictnlp/NAS

    Neuroform stent – assisted coil embolization: New treatment strategy for complex intracranial aneurysms with midterm results

    Objective: To present detailed results of our experience using Neuroform stent-assisted coil embolization to treat complex cerebral aneurysms over a 3-year period, emphasizing technical difficulties and procedure-related complications, and to evaluate midterm results. Methods: Patients who underwent Neuroform stent-assisted coil embolization were registered in a database. We assessed patients’ history, aneurysm morphology, indications for stenting, technical details of the procedures, complications, and midterm follow-up data. Results: This study included twenty-six patients with 39 aneurysms. A total of 32 of 39 aneurysms were treated by Neuroform stent-assisted coil embolization (SAC). Three aneurysms were stented without coiling, 2 aneurysms were coiled without stenting, and 2 aneurysms were surgically clipped. The indications for use included broad-necked aneurysms (n = 28), giant or large aneurysms (n = 6), and fusiform aneurysms (n = 5). Of the 32 aneurysms treated by Neuroform SAC, we achieved complete (100%) or near-complete (> 95%) occlusion in 27 aneurysms, and partial (< 95%) occlusion in 5 aneurysms. Follow-up angiographic data were available for 22 of the 32 aneurysms treated by Neuroform SAC (68.7%) (average follow-up, 12 mo; range 4–24 mo), demonstrating recanalization in 3 aneurysms (13.6%) and stable occlusion in 19 aneurysms (86.4%). No delayed progressive embolization or in-stent stenosis was observed. Conclusion: The Neuroform microstent system has led to a significant evolution in the endovascular treatment of complex intracranial aneurysms. Our results and midterm follow-up showed that Neuroform stent-assisted coil embolization is a safe and effective technique in the treatment of complex cerebral aneurysms. Although clinically significant complications are uncommon and the evaluation at midterm follow-up is encouraging, further studies are needed to assess the long-term stability and durability of the stent

    Statistical models for text query-based image retrieval

    Image indexing and retrieval has been an active research area for more than a decade. Although many accomplishments have been made in this domain, it is still a challenging problem and far from being solved. Traditional content-based approaches make use of queries based on image examples or image attributes like color and texture, and images are retrieved according to the similarity of each target image with the query image. However, image-query-based retrieval systems do not really capture the semantics or meanings of images well. Furthermore, image queries are difficult and inconvenient to form for most users. To capture the semantics of images, libraries and other organizations have manually annotated each image with keywords and captions, and then searched those annotations using text retrieval engines. The disadvantages of this approach are the huge cost of annotating large numbers of images and the inconsistency of annotations by different people. In this work, we focus on general image and historical handwritten document retrieval based on textual queries. We explore statistical-model-based techniques that allow us to retrieve general images and historical handwritten document images with text queries. These techniques are (i) image retrieval based on automatic annotation, (ii) direct retrieval based on computing the posterior of an image given a text query, and (iii) handwritten document image recognition. We compare the performance of these approaches on several general image and historical handwritten document collections. The main contributions of this work include (i) two probabilistic generative models for annotation-based retrieval, (ii) a direct retrieval model for general images, and (iii) a thorough investigation of machine learning models for handwritten document recognition. Our experimental results and retrieval systems show that our proposed approaches may be applied to practical textual-query-based retrieval systems on large image data sets

    Postface

    Jointly organized by the Société chinoise d’études de l’histoire de France (Chinese Society for the Study of French History), the Maison des sciences de l’homme, Université Paris-I Panthéon-Sorbonne, and the Institute for Research on International Relations and Regional Development of East China Normal University (ECNU), the Autumn University (Université d’automne) already has six years of history behind it. Each edition of the university presents the latest research, at the highest level, on the history and ..

    Essai d’interprétation de la relation sino-européenne

    The European Union and China are among the most prominent political entities in the international community. The evolution of their bilateral relationship inevitably attracts the attention of all interested parties. In recent years, the prospects for the Sino-European relationship after the difficulties encountered in 2008 have been assessed very differently by public opinion and by Chinese and international analysts. I take the liberty of presenting my personal point of view here. Glo..