Unified Segment-to-Segment Framework for Simultaneous Sequence Generation
Simultaneous sequence generation is a pivotal task for real-time scenarios,
such as streaming speech recognition, simultaneous machine translation and
simultaneous speech translation, where the target sequence is generated while
receiving the source sequence. The crux of achieving high-quality generation
with low latency lies in identifying the optimal moments for generating,
accomplished by learning a mapping between the source and target sequences.
However, existing methods often rely on task-specific heuristics for different
sequence types, limiting the model's capacity to adaptively learn the
source-target mapping and hindering the exploration of multi-task learning for
various simultaneous tasks. In this paper, we propose a unified
segment-to-segment framework (Seg2Seg) for simultaneous sequence generation,
which learns the mapping in an adaptive and unified manner. During the process
of simultaneous generation, the model alternates between waiting for a source
segment and generating a target segment, making the segment serve as the
natural bridge between the source and target. To accomplish this, Seg2Seg
introduces a latent segment as the pivot between source and target and explores
all potential source-target mappings via the proposed expectation training,
thereby learning the optimal moments for generating. Experiments on multiple
simultaneous generation tasks demonstrate that Seg2Seg achieves
state-of-the-art performance and exhibits better generality across various
tasks.
Comment: Accepted at NeurIPS 2023
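For intuition, here is a minimal sketch of the expectation idea: source states are softly assigned to latent segments, and each segment pivot is the expectation of source states under that assignment. The function name, tensor shapes, and softmax parameterization are illustrative assumptions, not Seg2Seg's actual implementation.

```python
import torch

def latent_segment_pivot(src_states, assign_logits):
    """Expected latent-segment representations: marginalize source states
    over soft frame-to-segment assignments (sketch only)."""
    assign = torch.softmax(assign_logits, dim=-1)                    # (T, K)
    # Normalize per segment so each pivot is a weighted average of frames.
    weights = assign / assign.sum(dim=0, keepdim=True).clamp_min(1e-8)
    return weights.transpose(0, 1) @ src_states                      # (K, D)

T, D, K = 12, 256, 4
pivots = latent_segment_pivot(torch.randn(T, D), torch.randn(T, K))
print(pivots.shape)  # torch.Size([4, 256])
```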
End-to-End Simultaneous Speech Translation with Differentiable Segmentation
End-to-end simultaneous speech translation (SimulST) outputs translation
while receiving the streaming speech inputs (a.k.a. streaming speech
translation), and hence needs to segment the speech inputs and then translate
based on the current received speech. However, segmenting the speech inputs at
unfavorable moments can disrupt the acoustic integrity and adversely affect the
performance of the translation model. Therefore, learning to segment the speech
inputs at those moments that are beneficial for the translation model to
produce high-quality translation is the key to SimulST. Existing SimulST
methods, using either fixed-length segmentation or an external segmentation
model, always separate segmentation from the underlying translation model;
this gap results in segmentation outcomes that are not necessarily
beneficial for the translation process. In this paper, we propose
Differentiable Segmentation (DiSeg) for SimulST to directly learn segmentation
from the underlying translation model. DiSeg makes hard segmentation
differentiable through the proposed expectation training, enabling it to be
jointly trained with the translation model and thereby learn
translation-beneficial segmentation. Experimental results demonstrate that
DiSeg achieves state-of-the-art performance and exhibits superior segmentation
capability.
Comment: Accepted at ACL 2023 findings
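To see how a hard segmentation decision can be relaxed in expectation, consider the sketch below: each frame emits a Bernoulli probability of ending a segment, so the expected segment count admits gradients and can be constrained jointly with a translation loss. The target count and squared penalty are placeholders, not DiSeg's actual objective.

```python
import torch

def soft_segment_count(seg_logits):
    """Relax hard segmentation: sigmoid gives each frame a probability of
    ending a segment, so the expected number of segments is differentiable."""
    p = torch.sigmoid(seg_logits)   # (T,) per-frame segment-end probability
    return p, p.sum()               # soft boundaries and expected count

seg_logits = torch.randn(20, requires_grad=True)
p, n_seg = soft_segment_count(seg_logits)
# A count constraint can now be trained jointly with the translation loss,
# since gradients flow through the expectation rather than a hard decision.
loss = (n_seg - 5.0) ** 2
loss.backward()
```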
Simultaneous Machine Translation with Tailored Reference
Simultaneous machine translation (SiMT) generates the translation while still
reading the source sentence. However, existing SiMT models are typically
trained with the same reference, disregarding the varying amounts of source
information available at different latency levels. Training the model with the
ground truth at low latency may introduce forced anticipations, whereas using a
reference consistent with the source word order at high latency results in
performance degradation. Consequently, it is crucial to train the SiMT model
with an appropriate reference that avoids forced anticipations during training
while maintaining high quality. In this paper, we propose a novel method that
provides a tailored reference for SiMT models trained at different latency
levels by rephrasing the ground truth. Specifically, we introduce the tailor,
induced by reinforcement learning, to modify the ground truth into the tailored
reference. The
SiMT model is trained with the tailored reference and jointly optimized with
the tailor to enhance performance. Importantly, our method is applicable to a
wide range of current SiMT approaches. Experiments on three translation tasks
demonstrate that our method achieves state-of-the-art performance in both fixed
and adaptive policies.
Comment: Accepted to EMNLP 2023; 15 pages, 8 figures
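As a rough illustration of inducing a rewriting module with reinforcement learning, the sketch below applies a REINFORCE-style update in which a sampled rewrite is rewarded for quality and penalized for forcing anticipation. The reward shaping, shapes, and names are hypothetical, not the paper's exact formulation.

```python
import torch

def tailor_reinforce_loss(log_probs, quality, anticipation_penalty):
    """REINFORCE surrogate: maximize expected reward, where the reward trades
    translation quality against forced anticipation (hypothetical shaping)."""
    reward = quality - anticipation_penalty   # scalar reward for this sample
    return -(reward * log_probs.sum())        # ascend the expected reward

logits = torch.randn(8, 100, requires_grad=True)   # tailor's token scores
sampled = torch.randint(0, 100, (8,))              # one sampled tailored reference
log_probs = torch.log_softmax(logits, dim=-1)[torch.arange(8), sampled]
loss = tailor_reinforce_loss(log_probs, quality=0.8, anticipation_penalty=0.3)
loss.backward()
```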
Learning Optimal Policy for Simultaneous Machine Translation via Binary Search
Simultaneous machine translation (SiMT) starts to output translation while
reading the source sentence and needs a precise policy to decide when to output
the generated translation. Therefore, the policy determines the number of
source tokens read during the translation of each target token. However, it is
difficult to learn a precise translation policy to achieve good latency-quality
trade-offs, because there is no gold policy corresponding to parallel
sentences to serve as explicit supervision. In this paper, we present a new
method for
constructing the optimal policy online via binary search. By employing explicit
supervision, our approach enables the SiMT model to learn the optimal policy,
which can guide the model in completing the translation during inference.
Experiments on four translation tasks show that our method can exceed strong
baselines across all latency scenarios.
Comment: Accepted to ACL 2023. 14 pages, 5 figures
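The core idea lends itself to a compact sketch: binary-search the smallest source prefix whose score for the current target token clears a threshold, assuming the score is monotone non-decreasing in the prefix length. The scorer and threshold here are toy stand-ins; the paper constructs its supervision from the model's own translation probabilities.

```python
def optimal_read_count(score, num_src, threshold):
    """Binary search for the smallest prefix length k with score(k) >= threshold,
    assuming score is monotone non-decreasing in k (a sketch simplification)."""
    lo, hi = 1, num_src
    while lo < hi:
        mid = (lo + hi) // 2
        if score(mid) >= threshold:
            hi = mid        # a shorter prefix may still suffice
        else:
            lo = mid + 1    # must read more source tokens
    return lo

# Toy monotone scorer: confidence grows with the number of tokens read.
print(optimal_read_count(lambda k: k / 10.0, num_src=10, threshold=0.7))  # 7
```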
SiLLM: Large Language Models for Simultaneous Machine Translation
Simultaneous Machine Translation (SiMT) generates translations while reading
the source sentence, necessitating a policy to determine the optimal timing for
reading and generating words. Despite the remarkable performance achieved by
Large Language Models (LLMs) across various NLP tasks, existing SiMT methods
predominantly focus on conventional transformers, employing a single model to
concurrently determine the policy and generate the translations. However, given
the complexity of SiMT, it is challenging to effectively address both tasks
with a single model. Therefore, there is a need to decouple the SiMT task into
policy-decision and translation sub-tasks. We propose SiLLM, which delegates
the two sub-tasks to separate agents, thereby incorporating the LLM into SiMT.
The policy-decision agent is managed by a conventional SiMT model, responsible
for determining the translation policy. The translation agent, leveraging the
capabilities of the LLM, generates the translation using the partial source
sentence. The two agents collaborate to accomplish SiMT. To facilitate the
application of token-level policies determined by conventional SiMT models to
the LLM, we propose a word-level policy adapted for the LLM. Experiments on two
datasets demonstrate that, with a small amount of data for fine-tuning the LLM,
SiLLM attains state-of-the-art performance.
Comment: 13 pages, 6 tables, 7 figures
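One plausible reading of the word-level adaptation is sketched below: WRITE actions pass through, while a word-level READ fires only once every subword of a source word has been read, given a subword-to-word index map. The helper names and action encoding are assumptions; SiLLM's actual adapter may differ.

```python
def complete_words(word_ids, k):
    """Number of fully-read source words after reading the first k subwords;
    word_ids maps each subword index to the index of its word."""
    if k >= len(word_ids):
        return word_ids[-1] + 1 if word_ids else 0
    # word_ids[k] is the word of the next unread subword; every word with a
    # smaller index has been read in full.
    return word_ids[k]

def token_to_word_policy(token_actions, word_ids):
    """Lift a token-level READ/WRITE trace to word level: WRITEs pass through,
    and a word-level READ is emitted whenever another whole word completes."""
    k, prev_words, out = 0, 0, []
    for act in token_actions:
        if act == "READ":
            k += 1
            now = complete_words(word_ids, k)
            out.extend(["READ"] * (now - prev_words))
            prev_words = now
        else:
            out.append("WRITE")
    return out

# Source "unbelievable news" tokenized as [un, believ, able, news]:
print(token_to_word_policy(
    ["READ", "READ", "WRITE", "READ", "READ", "WRITE"],
    word_ids=[0, 0, 0, 1]))
# -> ['WRITE', 'READ', 'READ', 'WRITE']
```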
Non-autoregressive Streaming Transformer for Simultaneous Translation
Simultaneous machine translation (SiMT) models are trained to strike a
balance between latency and translation quality. However, training these models
to achieve high quality while maintaining low latency often leads to a tendency
for aggressive anticipation. We argue that this issue stems from the
autoregressive architecture upon which most existing SiMT models are built. To
address this issue, we propose the non-autoregressive streaming Transformer
(NAST) which comprises a unidirectional encoder and a non-autoregressive
decoder with intra-chunk parallelism. We enable NAST to generate the blank
token or repetitive tokens to adjust its READ/WRITE strategy flexibly, and
train it to maximize the non-monotonic latent alignment with an alignment-based
latency loss. Experiments on various SiMT benchmarks demonstrate that NAST
outperforms previous strong autoregressive SiMT baselines.
Comment: EMNLP 2023 main conference; source code is available at
https://github.com/ictnlp/NAST
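The blank/repeat mechanism behaves like CTC decoding: consecutive repeats are merged and blank tokens dropped, so emitting a blank effectively defers a WRITE. Below is a minimal reconstruction of that collapse rule; the blank symbol and token strings are placeholders.

```python
def ctc_collapse(tokens, blank="<blank>"):
    """CTC-style post-processing: merge consecutive repeats, drop blanks.
    Repeats separated by a blank remain distinct, as in standard CTC."""
    out, prev = [], None
    for tok in tokens:
        if tok != prev and tok != blank:
            out.append(tok)
        prev = tok
    return out

print(ctc_collapse(["<blank>", "Das", "Das", "<blank>", "ist", "gut", "<blank>"]))
# -> ['Das', 'ist', 'gut']
```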
Neuroform stent-assisted coil embolization: A new treatment strategy for complex intracranial aneurysms with midterm results
Objective: To present detailed results of our experience using Neuroform stent-assisted coil embolization to treat complex cerebral aneurysms over a 3-year period, emphasizing the technical difficulties and procedure-related complications, and to evaluate midterm results. Methods: Patients who underwent Neuroform stent-assisted coil embolization were registered in a database. We assessed patients' history, aneurysm morphology, indications for stenting, technical details of the procedures, complications, and midterm follow-up data. Results: This study included twenty-six patients with 39 aneurysms. A total of 32 of the 39 aneurysms were treated by Neuroform stent-assisted coil embolization (SAC). Three aneurysms were stented without coiling, 2 aneurysms were coiled without stenting, and 2 aneurysms were surgically clipped. The indications for use included broad-necked aneurysms (n = 28), giant or large aneurysms (n = 6), and fusiform aneurysms (n = 5). Of the 32 aneurysms treated by Neuroform SAC, we achieved complete (100%) or near-complete (> 95%) occlusion in 27 aneurysms and partial (< 95%) occlusion in 5 aneurysms. Follow-up angiographic data were available for 22 of the 32 aneurysms treated by Neuroform SAC (68.7%) (average follow-up, 12 mo; range, 4–24 mo), demonstrating recanalization in 3 aneurysms (13.6%) and stable occlusion in 19 aneurysms (86.4%). No delayed progressive embolization or in-stent stenosis was observed. Conclusion: The Neuroform microstent system has led to a significant evolution in the endovascular treatment of complex intracranial aneurysms. Our results and midterm follow-up show that Neuroform stent-assisted coil embolization is a safe and effective technique for the treatment of complex cerebral aneurysms. Although clinically significant complications are uncommon and the evaluation at midterm follow-up is encouraging, further studies are needed to assess the long-term stability and durability of the stent.
A Hierarchical, HMM-based Automatic Evaluation of OCR Accuracy for a Digital Library of Books
A number of projects are creating searchable digital libraries of printed books. These include the Million Book Project, the Google Book project and similar efforts from Yahoo and Microsoft. Content-based online book retrieval usually requires first converting printed text into machine-readable (e.g. ASCII) text using an optical character recognition (OCR) engine and then doing full-text search on the results. Many of these books are old and there are a variety of processing steps that are required to create an end-to-end system. Changing any step (including the scanning process) can affect OCR performance and hence a good automatic statistical evaluation of OCR performance on book-length material is needed. Evaluating OCR performance on the entire book is non-trivial. The only easily obtainable ground truth (the Gutenberg e-texts) must be automatically aligned with the OCR output over the entire length of a book. This may be viewed as equivalent to the problem of aligning two large (easily a million long) sequences. The problem is further complicated by OCR errors as well as the possibility of large chunks of missing material in one of the sequences. We propose a Hidden Markov Model (HMM) based hierarchical alignment algorithm to align OCR output and the ground truth for books. We believe this is the first work to automatically align a whole book without using any book structure information. The alignment process works by breaking up the problem of aligning two long sequences into the problem of aligning many smaller subsequences. This can be rapidly and effectively done. Experimental results show that our hierarchical alignment approach works very well even if the OCR output has a high recognition error rate. Finally, we evaluate the performance of a commercial OCR engine over a large dataset of books based on the alignment results.
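The divide-and-conquer step can be illustrated without the HMM machinery: anchor on a long common block, split both sequences around it, and recurse until the pieces are small enough to align exactly. The sketch below uses difflib as a stand-in matcher and hypothetical size thresholds; the paper's actual alignment is HMM-based.

```python
from difflib import SequenceMatcher

def hierarchical_align(ocr, truth, min_len=64):
    """Anchor-and-split alignment: find the longest common block, treat it as
    an anchor, and recurse on the material to its left and right; short pieces
    are returned as directly-alignable pairs (sketch, not the HMM method)."""
    if len(ocr) <= min_len or len(truth) <= min_len:
        return [(ocr, truth)]                  # small enough to align exactly
    sm = SequenceMatcher(None, ocr, truth, autojunk=False)
    m = sm.find_longest_match(0, len(ocr), 0, len(truth))
    if m.size < 8:                             # no reliable anchor found
        return [(ocr, truth)]
    return (hierarchical_align(ocr[:m.a], truth[:m.b], min_len)
            + [(ocr[m.a:m.a + m.size], truth[m.b:m.b + m.size])]
            + hierarchical_align(ocr[m.a + m.size:], truth[m.b + m.size:], min_len))

pairs = hierarchical_align("the quick brvwn fox jumps over a lazy dog" * 4,
                           "the quick brown fox jumps over the lazy dog" * 4)
print(len(pairs))  # one long alignment job split into several small ones
```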
Statistical models for text query-based image retrieval
Image indexing and retrieval has been an active research area for more than a decade. Although many accomplishments have been made in this domain, it is still a challenging problem and far from being solved. Traditional content-based approaches make use of queries based on image examples or image attributes like color and texture, and images are retrieved according to the similarity of each target image to the query image. However, image-query-based retrieval systems do not really capture the semantics or meanings of images well. Furthermore, image queries are difficult and inconvenient for most users to form. To capture the semantics of images, libraries and other organizations have manually annotated each image with keywords and captions, and then searched those annotations using text retrieval engines. The disadvantage of this approach is the huge cost of annotating a large number of images and the inconsistency of annotations by different people. In this work, we focus on general image and historical handwritten document retrieval based on textual queries. We explore statistical model-based techniques that allow us to retrieve general images and historical handwritten document images with text queries. These techniques are (i) image retrieval based on automatic annotation, (ii) direct retrieval based on computing the posterior of an image given a text query, and (iii) handwritten document image recognition. We compare the performance of these approaches on several general image and historical handwritten document collections. The main contributions of this work include (i) two probabilistic generative models for annotation-based retrieval, (ii) a direct retrieval model for general images, and (iii) a thorough investigation of machine learning models for handwritten document recognition. Our experimental results and retrieval systems show that our proposed approaches may be applied to practical text-query-based retrieval systems on large image data sets.
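Technique (ii), direct retrieval, can be pictured as ranking images by the posterior of the query under each image's word distribution: roughly the sum of log P(w | image) plus a log prior. The dictionary layout, smoothing floor, and omitted prior below are illustrative assumptions, not the thesis's models.

```python
import math

def rank_by_posterior(query_words, image_word_probs, prior=None):
    """Direct-retrieval sketch: score each image by log P(query | image)
    (+ log prior if given), with a small floor for unseen query words."""
    scores = {}
    for img, pw in image_word_probs.items():
        s = math.log(prior[img]) if prior else 0.0
        for w in query_words:
            s += math.log(pw.get(w, 1e-6))   # floor unseen words
        scores[img] = s
    return sorted(scores, key=scores.get, reverse=True)

probs = {"img1": {"tiger": 0.3, "grass": 0.2},
         "img2": {"ocean": 0.4, "sky": 0.3}}
print(rank_by_posterior(["tiger", "grass"], probs))  # -> ['img1', 'img2']
```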
Postface
Jointly organized by the Chinese Society for the Study of the History of France, the Maison des sciences de l'homme, the Université Paris-I Panthéon-Sorbonne, and the Institute for Research on International Relations and Regional Development of East China Normal University (ECNU), the Autumn University already has six years of history behind it. Each edition of the university presents the latest research, at the highest level, on the history and ...