OpusCleaner and OpusTrainer, open source toolkits for training Machine Translation and Large language models
Developing high-quality machine translation systems is a labour-intensive
process that is challenging and confusing for newcomers to the field. We present
a pair of tools, OpusCleaner and OpusTrainer, that aim to simplify the process,
reduce the amount of work and lower the entry barrier for newcomers.
OpusCleaner is a data downloading, cleaning, and preprocessing toolkit. It is
designed to allow researchers to quickly download, visualise and preprocess
bilingual (or monolingual) data that comes from many different sources, each of
them with different quality, issues, and unique filtering/preprocessing
requirements.
OpusTrainer is a data scheduling and data augmenting tool aimed at building
large scale, robust machine translation systems and large language models. It
features deterministic data mixing from many different sources, on-the-fly data
augmentation and more.
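
To make "deterministic data mixing" concrete, here is a minimal Python sketch of the general idea: each batch draws a fixed share of lines from every source, and all shuffling goes through one seeded random generator so that two runs with the same seed produce identical batches. This is not OpusTrainer's code or configuration format; the function `mix_batches`, its parameters, and the dataset names are hypothetical and chosen only for illustration.

```python
import random
from itertools import cycle


def mix_batches(datasets, weights, batch_size, num_batches, seed=1111):
    """Yield batches that draw a fixed share of lines from each dataset.

    Hypothetical helper for illustration; not part of OpusTrainer.
    """
    rng = random.Random(seed)  # private seeded RNG: same seed, same batches
    names = sorted(datasets)   # fixed source order, independent of dict order
    total = sum(weights[n] for n in names)

    # Per-batch quota for each source; the last source absorbs rounding error.
    quotas = {n: int(batch_size * weights[n] / total) for n in names}
    quotas[names[-1]] += batch_size - sum(quotas.values())

    # Shuffle each source once, then cycle through it so small sources repeat.
    streams = {}
    for n in names:
        lines = list(datasets[n])
        if not lines:
            raise ValueError(f"dataset {n!r} is empty")
        rng.shuffle(lines)
        streams[n] = cycle(lines)

    for _ in range(num_batches):
        batch = [next(streams[n]) for n in names for _ in range(quotas[n])]
        rng.shuffle(batch)  # interleave sources inside the batch
        yield batch
```

Calling `mix_batches({"clean": clean_lines, "noisy": noisy_lines}, {"clean": 4, "noisy": 1}, batch_size=10, num_batches=5)` would give batches with eight clean and two noisy lines each. Using a dedicated `random.Random` instance rather than the module-level functions keeps the result independent of any other code that touches the global random state.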
Using these tools, we showcase how to create high-quality machine translation
models that are robust to noisy user input, as well as multilingual models and
terminology-aware models.
Comment: Code on GitHub: https://github.com/hplt-project/OpusCleaner and
https://github.com/hplt-project/OpusTraine
Unusual magnetoelectric effect in paramagnetic rare-earth langasite
Violation of time reversal and spatial inversion symmetries has profound
consequences for elementary particles and cosmology. Spontaneous breaking of
these symmetries at phase transitions gives rise to unconventional physical
phenomena in condensed matter systems, such as ferroelectricity induced by
magnetic spirals, electromagnons, non-reciprocal propagation of light and spin
waves, and the linear magnetoelectric (ME) effect - the electric polarization
proportional to the applied magnetic field and the magnetization induced by the
electric field. Here, we report the experimental study of the holmium-doped
langasite, Ho$_x$La$_{3-x}$Ga$_5$SiO$_{14}$, showing a puzzling combination
of linear and highly non-linear ME responses in the disordered paramagnetic
state: its electric polarization grows linearly with the magnetic field but
oscillates many times upon rotation of the magnetic field vector. We propose a
simple phenomenological Hamiltonian describing this unusual behavior and derive
it microscopically using the coupling of magnetic multipoles of the rare-earth
ions to the electric field.
Comment: 8 pages, 3 figures
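
For reference, the linear ME effect defined in the abstract above is conventionally written with a magnetoelectric tensor. The symbols below ($F$, $\alpha_{ij}$, $E_i$, $H_j$) are standard textbook notation and are assumptions here, not taken from the paper; unit-dependent prefactors such as $\mu_0$ are omitted.

```latex
% Bilinear coupling term in the free energy (summation over repeated indices)
F_{\mathrm{ME}} = -\alpha_{ij} E_i H_j
% Polarization induced by a magnetic field
P_i = -\frac{\partial F_{\mathrm{ME}}}{\partial E_i} = \alpha_{ij} H_j
% Magnetization induced by an electric field
M_j = -\frac{\partial F_{\mathrm{ME}}}{\partial H_j} = \alpha_{ij} E_i
```

Both responses follow from the same tensor $\alpha_{ij}$, which is why the abstract describes them together as one effect.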
Initial experience with CytoSorb therapy in patients receiving left ventricular assist devices
BACKGROUND: The use of left ventricular assist devices (LVAD) in patients with advanced heart failure is still associated with an important risk of immune dysregulation and infections. The aim of this study was to determine whether extracorporeal blood purification using the CytoSorb device benefits patients after LVAD implantation in terms of complications and overall survival.
MATERIALS AND METHODS: Between August 2010 and January 2020, 207 consecutive patients underwent LVAD implantation, of whom 72 underwent CytoSorb therapy and 135 did not. Overall survival, major adverse events, and laboratory parameters were compared between 112 propensity score-matched patients (CytoSorb: 72 patients; non-CytoSorb: 40 patients).
RESULTS: WBC (p = .033), CRP (p = .001), and IL-6 (p < .001) increased significantly with LVAD implantation, while CytoSorb did not influence this response. In-hospital mortality and overall survival during follow-up were similar with CytoSorb. However, patients treated with CytoSorb were more likely to develop respiratory failure (54.2% vs. 30.0%, p = .024), need mechanical ventilation for longer than 6 days post-implant (50.0% vs. 27.5%, p = .035), and require tracheostomy during hospitalization (31.9% vs. 12.5%, p = .040). No other significant differences were observed with regard to major adverse events during follow-up.
CONCLUSIONS: Overall, our results showed that CytoSorb might not convey a significant morbidity or mortality benefit for patients undergoing LVAD implantation.