Dictionary-based Domain Adaptation of MT Systems without Retraining
We describe our submission to the IT-domain translation task of WMT 2016.
We perform domain adaptation with dictionary data on already trained MT systems with no further retraining.
We apply our approach to two conceptually different systems developed within the QTLeap project: TectoMT and Moses, as well as Chimera, their combination.
In all settings, our method improves the translation quality.
Moreover, the basic variant of our approach is applicable to any MT system, including a black-box one.
New Language Pairs in TectoMT
The TectoMT tree-to-tree machine translation system has been updated this year to support easier retraining for more translation directions. We use multilingual standards for morphology and syntax annotation and language-independent base rules. We include a simple, non-parametric way of combining TectoMT’s transfer model outputs.
Machine Translation of Medical Texts in the Khresmoi Project
The WMT 2014 Medical Translation Task poses an interesting challenge for Machine Translation (MT). In the standard translation task, the end application is the translation itself. In this task, the MT system is considered a part of a larger system for cross-lingual information retrieval (IR).
Adaptation of machine translation for multilingual information retrieval in the medical domain
Objective. We investigate machine translation (MT) of user search queries in the context of cross-lingual information retrieval (IR) in the medical domain. The main focus is on techniques to adapt MT to increase translation quality; however, we also explore MT adaptation to improve effectiveness of cross-lingual IR.
Methods and Data. Our MT system is Moses, a state-of-the-art phrase-based statistical machine translation system. The IR system is based on the BM25 retrieval model implemented in the Lucene search engine. The MT techniques employed in this work include in-domain training and tuning, intelligent training data selection, optimization of phrase table configuration, compound splitting, and exploiting synonyms as translation variants. The IR methods include morphological normalization and using multiple translation variants for query expansion. The experiments are performed and thoroughly evaluated on three language pairs: Czech–English, German–English, and French–English. MT quality is evaluated on data sets created within the Khresmoi project and IR effectiveness is tested on the CLEF eHealth 2013 data sets.
Results. The search query translation results achieved in our experiments are outstanding – our systems outperform not only our strong baselines, but also Google Translate and Microsoft Bing Translator in direct comparison carried out on all the language pairs. The baseline BLEU scores increased from 26.59 to 41.45 for Czech–English, from 23.03 to 40.82 for German–English, and from 32.67 to 40.82 for French–English. This is a 55% improvement on average. In terms of the IR performance on this particular test collection, a significant improvement over the baseline is achieved only for French–English. For Czech–English and German–English, the increased MT quality does not lead to better IR results.
Conclusions. Most of the MT techniques employed in our experiments improve MT of medical search queries. Intelligent training data selection in particular proves to be very successful for domain adaptation of MT. Certain improvements are also obtained from German compound splitting on the source language side. Translation quality, however, does not appear to correlate with the IR performance – better translation does not necessarily yield better retrieval. We discuss in detail the contribution of the individual techniques and state-of-the-art features and provide future research directions.
The Cytokinin Complex Associated With Rhodococcus fascians: Which Compounds Are Critical for Virulence?
Virulent strains of Rhodococcus fascians cause a range of disease symptoms, many of which can be mimicked by application of cytokinin. Both virulent and avirulent strains produce a complex of cytokinins, most of which can be derived from tRNA degradation. To test the three current hypotheses regarding the involvement of cytokinins as virulence determinants, we used PCR to detect specific genes, previously associated with a linear virulence plasmid, including two methyl transferase genes (mt1 and mt2) and fas4 (dimethyl transferase), of multiple strains of R. fascians. We inoculated Pisum sativum (pea) seeds with virulent and avirulent strains of R. fascians, monitored the plants over time and compared these to mock-inoculated controls. We used RT-qPCR to monitor the expression of mt1, mt2, and fas4 in inoculated tissues and LC-MS/MS to obtain a comprehensive picture of the cytokinin complement of inoculated cotyledons, roots and shoots over time. The presence and expression of mt1 and mt2 were associated with those strains of R. fascians classed as virulent, and not those classed as avirulent. Expression of mt1, mt2, and fas4 peaked at 9 days post-inoculation (dpi) in cotyledons and at 15 dpi in shoots and roots developed from seeds inoculated with virulent strain 602. Pea plants inoculated with virulent and avirulent strains of R. fascians both contained cytokinins likely to have been derived from tRNA turnover, including the 2-methylthio cytokinins and cis-zeatin derivatives. Along with the isopentenyladenine-type cytokinins, the levels of these compounds did not correlate with virulence. Only the novel 1- and 2-methylated isopentenyladenine cytokinins were uniquely associated with infection by the virulent strains and are, therefore, the likely causative factors of the disease symptoms.
Optical Observations of XTE J1118+480 during the 2000 Outburst
We report on photometric and spectroscopic observations of a possible halo black-hole X-ray nova, XTE J1118+480 (KV UMa), during outburst. Our photometric monitoring during the main outburst revealed that the optical maximum as well as the onset of the outburst precede those in the X-ray region. This indicates that the event was an "outside-in" type outburst and that its optical flux was dominated by viscous heating itself, and not by the effect of X-ray irradiation. Based on these results, we suggest an outburst scenario analogous to superoutbursts in SU UMa-type dwarf novae. This scenario predicts a superhump phenomenon, which we indeed detected throughout the outburst. We determined its period to be , which is slightly longer than the orbital periods suggested from spectroscopic observations. We have furthermore revealed the first evidence of a continuous period decrease in X-ray novae. The most prominent feature in our optical spectrum is a double-peaked He II 4686 Å emission line having an asymmetric profile with an outstanding blue-side peak. Using a Doppler mapping method, we found that the He II emission originates from the accretion disk and is particularly concentrated on the hot spot. The time at which the blue peak becomes strongest corresponds to a superhump peak. This implies that we see an elongated side of an eccentric disk at that time, which may cause the asymmetric emission profile. Substituting the observed fractional superhump excess into a theoretically expected relation between it and the mass ratio, we estimate that the black-hole mass is larger than . XTE J1118+480 thus has a massive compact object compared with typical black-hole X-ray novae.
Survey of Period Variations of Superhumps in SU UMa-Type Dwarf Novae
We systematically surveyed period variations of superhumps in SU UMa-type
dwarf novae based on newly obtained data and past publications. In many
systems, the evolution of the superhump period is found to be composed of three
distinct stages: an early evolutionary stage with a longer superhump period,
a middle stage with systematically varying periods, and a final stage with a shorter,
stable superhump period. During the middle stage, many systems with superhump
periods less than 0.08 d show positive period derivatives. Contrary to the
earlier claim, we found no clear evidence for variation of period derivatives
between superoutbursts of the same object. We present an interpretation that the
lengthening of the superhump period is a result of outward propagation of the
eccentricity wave and is limited by the radius near the tidal truncation. We
interpret late-stage superhumps as a rejuvenated excitation of the 3:1
resonance when the superhumps in the outer disk are effectively quenched. Many
WZ Sge-type dwarf novae showed long-enduring superhumps during the
post-superoutburst stage having periods longer than those during the main
superoutburst. The period derivatives in WZ Sge-type dwarf novae are found to
be strongly correlated with the fractional superhump excess, or consequently,
mass ratio. WZ Sge-type dwarf novae with a long-lasting rebrightening or with
multiple rebrightenings tend to have smaller period derivatives and are
excellent candidates for systems around or after the period minimum in the
evolution of cataclysmic variables (abridged).
Comment: 239 pages, 225 figures, PASJ accepted
The 2001 Superoutburst of WZ Sagittae
We report the results of a worldwide campaign to observe WZ Sagittae during
its 2001 superoutburst. After a 23-year slumber at V=15.5, the star rose within
2 days to a peak brightness of 8.2, and showed a main eruption lasting 25 days.
The return to quiescence was punctuated by 12 small eruptions, of ~1 mag
amplitude and 2-day recurrence time; these "echo outbursts" are of uncertain
origin, but somewhat resemble the normal outbursts of dwarf novae. After 52
days, the star began a slow decline to quiescence.
Periodic waves in the light curve closely followed the pattern seen in the
1978 superoutburst: a strong orbital signal dominated the first 12 days,
followed by a powerful "common superhump" at 0.05721(5) d, 0.92(8)% longer than
P_orb. The latter endured for at least 90 days, although probably mutating into
a "late" superhump with a slightly longer mean period [0.05736(5) d]. The
superhump appeared to follow familiar rules for such phenomena in dwarf novae,
with components given by linear combinations of two basic frequencies: the
orbital frequency omega_o and an unseen low frequency Omega, believed to
represent the accretion disk's apsidal precession. Long time series reveal an
intricate fine structure, with ~20 incommensurate frequencies. Essentially all
components occurred at a frequency n(omega_o)-m(Omega), with m=1, ..., n. But
during its first week, the common superhump showed primary components at
n(omega_o)-Omega, for n=1, 2, 3, 4, 5, 6, 7, 8, 9 (i.e., m=1 consistently); a
month later, the dominant power shifted to components with m=n-1. This may
arise from a shift in the disk's spiral-arm pattern, likely to be the
underlying cause of superhumps.
The great majority of frequency components ... . (etc., abstract continues)
Comment: PDF, 54 pages, 4 tables, 21 figures, 1 appendix; accepted, in press,
to appear July 2002, PASP; more info at http://cba.phys.columbia.edu
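The frequency bookkeeping in this abstract can be made concrete with a short sketch. It uses only the two figures quoted above (a superhump period of 0.05721 d that is 0.92% longer than P_orb) and assumes that the common superhump is the n=1, m=1 component, so that Omega = omega_o - omega_sh. The function names and the cutoff n_max=3 are illustrative choices, not taken from the paper.

```python
# Figures quoted in the abstract: superhump period in days and its
# fractional excess over the orbital period.
P_SH_DAYS = 0.05721
EXCESS = 0.0092


def orbital_period(p_sh, excess):
    """Orbital period implied by a superhump period that is `excess` longer."""
    return p_sh / (1.0 + excess)


def sideband_frequencies(omega_o, big_omega, n_max):
    """Map (n, m) -> n*omega_o - m*big_omega for 1 <= m <= n <= n_max."""
    return {
        (n, m): n * omega_o - m * big_omega
        for n in range(1, n_max + 1)
        for m in range(1, n + 1)
    }


p_orb = orbital_period(P_SH_DAYS, EXCESS)
omega_o = 1.0 / p_orb        # orbital frequency, cycles per day
omega_sh = 1.0 / P_SH_DAYS   # superhump frequency, cycles per day
# Assumption: the common superhump is the n=1, m=1 component,
# so Omega = omega_o - omega_sh.
big_omega = omega_o - omega_sh
components = sideband_frequencies(omega_o, big_omega, n_max=3)

if __name__ == "__main__":
    print(f"P_orb ~ {p_orb:.5f} d, Omega ~ {big_omega:.3f} cycles/day")
    for (n, m), freq in sorted(components.items()):
        print(f"n={n} m={m}: {freq:.3f} cycles/day")
```

Under these assumptions the (1, 1) entry reproduces the superhump frequency by construction, and the m=1 and m=n-1 families mentioned in the abstract can be read off the same table.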
Superhumps in Cataclysmic Binaries. XXIV. Twenty More Dwarf Novae
We report precise measures of the orbital and superhump period in twenty more
dwarf novae. For ten stars, we report new and confirmed spectroscopic periods
(signifying the orbital period P_o) as well as the superhump period P_sh. These
are GX Cas, HO Del, HS Vir, BC UMa, RZ Leo, KV Dra, KS UMa, TU Crt, QW Ser, and
RZ Sge. For the remaining ten, we report a medley of P_o and P_sh measurements
from photometry; most are new, with some confirmations of previous values.
These are KV And, LL And, WX Cet, MM Hya, AO Oct, V2051 Oph, NY Ser, KK Tel, HV
Vir, and RX J1155.4-5641.
Periods, as usual, can be measured to high accuracy, and these are of special
interest since they carry dynamical information about the binary. We still have
not quite learned how to read the music, but a few things are clear. The
fractional superhump excess epsilon [=(P_sh-P_o)/P_o] varies smoothly with P_o.
The scatter of the points about that smooth curve is quite low, and can be used
to limit the intrinsic scatter in M_1, the white dwarf mass, and the
mass-radius relation of the secondary. The dispersion in M_1 does not exceed
24%, and the secondary-star radii scatter by no more than 11% from a fixed
mass-radius relation. For the well-behaved part of epsilon(P_o) space, we
estimate from superhump theory that the secondaries are 18±6% larger than
theoretical ZAMS stars. This affects some other testable predictions about the
secondaries: at a fixed P_o, it suggests that the secondaries are (compared
with ZAMS predictions) 40±14% less massive, 12±4% smaller, 19±6% cooler, and
less luminous by a factor of 2.5(7). The presence of a well-defined mass-radius
relation, reflected in a well-defined epsilon(P_o) relation, strongly limits
effects of nuclear evolution in the secondaries.
Comment: PDF, 62 pages, 7 tables, 21 figures; accepted, in press, to appear
November 2003, PASP; more info at http://cba.phys.columbia.edu
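The definition epsilon = (P_sh - P_o)/P_o given in this abstract is simple enough to state as code. The following is a minimal sketch; the function names are hypothetical, and the example periods are illustrative values chosen for round numbers, not measurements from the paper.

```python
def superhump_excess(p_sh, p_o):
    """Fractional superhump excess: epsilon = (P_sh - P_o) / P_o."""
    if p_o <= 0:
        raise ValueError("orbital period must be positive")
    return (p_sh - p_o) / p_o


def superhump_period(p_o, epsilon):
    """Invert the definition: P_sh = P_o * (1 + epsilon)."""
    return p_o * (1.0 + epsilon)


# Illustrative periods in days; NOT measurements from the paper.
example_p_o = 0.0600
example_p_sh = 0.0618
example_epsilon = superhump_excess(example_p_sh, example_p_o)

if __name__ == "__main__":
    print(f"epsilon = {example_epsilon:.4f}")
```

Both periods must be in the same unit; epsilon itself is dimensionless, which is what allows it to be compared across systems with different P_o.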
Khresmoi Professional: Multilingual Semantic Search for Medical Professionals
There is increasing interest in and need for innovative solutions to medical search. In this paper we present the EU-funded Khresmoi medical search and access system, currently in year 3 of 4 of development across 12 partners. The Khresmoi system uses a component-based architecture housed in the cloud to allow for the development of several innovative applications to support target users' medical information needs. The Khresmoi search systems based on this architecture have been designed to support the multilingual and multimodal information needs of three target groups: the general public, general practitioners and consultant radiologists. In this paper we focus on the presentation of the systems to support the latter two groups using semantic, multilingual text- and image-based (including 2D and 3D radiology images) search.