26 research outputs found
Increasing Social Integration in an Interdisciplinary MA Programme through Group Work
In an interdisciplinary MA programme, it is especially important that the students get socially integrated from the beginning. Most often not only the place and people will be new but also the field of study. This can be difficult to handle without a network. In this project I will investigate using group work to help initiate social integration. In this context, I will also reflect on the different group work I conducted
Syntactic reordering in statistical machine translation
Reordering has been an important topic in statistical machine translation
(SMT) as long as SMT has been around. State-of-the-art SMT systems such
as Pharaoh (Koehn, 2004a) still employ a simplistic model of the reordering
process to do non-local reordering. This model penalizes any reordering,
regardless of the words involved. A reordering is selected only if it leads
to a translation that looks like a much better sentence than the alternative.
Recent developments have, however, seen improvements in translation
quality following from syntax-based reordering. One such development
is the pre-translation approach that adjusts the source sentence to resemble
target language word order prior to translation. This is done based on
rules that are either manually created or automatically learned from
word-aligned parallel corpora.
We introduce a novel approach to syntactic reordering. This approach
provides better exploitation of the information in the reordering rules and
eliminates problematic biases of previous approaches. Although the approach
is examined within a pre-translation reordering framework, it easily
extends to other frameworks. Our approach significantly outperforms a
state-of-the-art phrase-based SMT system and previous approaches to pre-translation
reordering, including (Li et al., 2007; Zhang et al., 2007b; Crego
& Mariño, 2007). This is consistent both for a very close language pair,
English-Danish, and a very distant language pair, English-Arabic.
We also propose automatic reordering rule learning based on a rich set
of linguistic information. As opposed to most previous approaches that
extract a large set of rules, our approach produces a small set of predominantly
general rules. These provide a good reflection of the main reordering
issues of a given language pair. We examine several
parameters that may influence the quality of the rules learned.
Finally, we provide a new approach for improving automatic word alignment.
This word alignment is used in the above task of automatically learning
reordering rules. Our approach learns from hand-aligned data how to
combine several automatic word alignments into one superior word alignment.
The automatic word alignments are created from the same data that
has been preprocessed with different tokenization schemes, thus utilizing
the different strengths that different tokenization schemes exhibit in word
alignment. We achieve a 38% error reduction for the automatic word alignment.
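The word-independent reordering penalty described at the start of this abstract can be illustrated with a small sketch. It assumes the "simplistic model" is the standard distance-based distortion penalty of phrase-based decoders, where each translated phrase is charged the size of the jump from the end of the previously translated source phrase. The function name and the example spans are hypothetical and are not taken from the thesis.

```python
def distortion_cost(phrase_spans):
    """Distance-based distortion cost of a phrase segmentation.

    phrase_spans: list of (start, end) source positions (inclusive),
    listed in the order in which the phrases are translated.
    The cost is the sum of |start_i - end_{i-1} - 1| over all phrases,
    so it depends only on jump distances, never on the words themselves.
    """
    cost = 0
    prev_end = -1  # position just before the first source word
    for start, end in phrase_spans:
        cost += abs(start - prev_end - 1)
        prev_end = end
    return cost


# Hypothetical 5-word source sentence split into three phrases.
monotone = [(0, 1), (2, 2), (3, 4)]   # translated left to right
reordered = [(0, 1), (3, 4), (2, 2)]  # last two phrases swapped

print(distortion_cost(monotone))
print(distortion_cost(reordered))
```

Under this assumption a monotone translation is never penalized, while any swap incurs a cost regardless of whether the swap is linguistically motivated, which is the bias that syntax-based reordering aims to avoid.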
Incremental Re-training for Post-editing SMT
A method is presented for incremental retraining
of an SMT system, in which a local
phrase table is created and incrementally updated
as a file is translated and post-edited.
It is shown that translation data from within
the same file has higher value than other
domain-specific data. In two technical domains,
within-file data increases BLEU score
by several full points. Furthermore, a strong
recency effect is documented; nearby data
within the file has greater value than more
distant data. It is also shown that the value
of translation data is strongly correlated with
a metric defined over new occurrences of n-grams.
Finally, it is argued that the incremental
re-training prototype could serve as the basis
for a practical system which could be interactively
updated in real time in a post-editing
setting. Based on the results here, such an interactive
system has the potential to dramatically
improve translation quality.
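The abstract does not define its metric over new occurrences of n-grams, so the following is only a minimal sketch of one plausible reading: for each incoming segment, compute the fraction of its n-grams that have not yet been seen earlier in the file. The function names, the `max_n` parameter, and the example sentences are assumptions made for illustration.

```python
def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def new_ngram_rate(seen, tokens, max_n=3):
    """Fraction of n-grams (orders 1..max_n) in `tokens` not yet in `seen`.

    `seen` is a set that is updated in place, so calling this function on
    successive segments of a file mimics incremental processing.
    """
    total = 0
    new = 0
    for n in range(1, max_n + 1):
        for gram in ngrams(tokens, n):
            total += 1
            if gram not in seen:
                new += 1
                seen.add(gram)  # count only the first occurrence as new
    return new / total if total else 0.0


# Hypothetical segments from one technical file.
seen = set()
first = new_ngram_rate(seen, "open the file menu".split(), max_n=2)
second = new_ngram_rate(seen, "open the edit menu".split(), max_n=2)
print(first, second)
```

In this sketch the second segment scores lower than the first because most of its n-grams already occurred earlier in the file.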
Effect of implantable cardioverter-defibrillators in patients with non-ischaemic systolic heart failure and concurrent coronary atherosclerosis
AIMS: Prophylactic implantable cardioverter-defibrillators (ICD) reduce mortality in patients with ischaemic heart failure (HF), whereas the effect of ICD in patients with non-ischaemic HF is less clear. We aimed to investigate the association between concomitant coronary atherosclerosis and mortality in patients with non-ischaemic HF and the effect of ICD implantation in these patients. METHODS AND RESULTS: Patients were included from DANISH (Danish Study to Assess the Efficacy of Implantable Cardioverter Defibrillators in Patients with Non-Ischaemic Systolic Heart Failure on Mortality), randomizing patients to ICD or control. Study inclusion criteria for HF were left ventricular ejection fraction ≤ 35% and increased levels (>200 pg/mL) of N-terminal pro-brain natriuretic peptide. Of the 1116 patients from DANISH, 838 (75%) patients had available data from coronary angiogram and were included in this subgroup analysis. We used Cox regression to assess the relationship between coronary atherosclerosis and mortality and the effect of ICD implantation. Of the included patients, 266 (32%) had coronary atherosclerosis. Of these, 216 (81%) had atherosclerosis without significant stenoses, and 50 (19%) had significant stenosis. Patients with atherosclerosis were significantly older {67 [interquartile range (IQR) 61–73] vs. 61 [IQR 54–68] years; P < 0.0001}, and more were men (77% vs. 70%; P = 0.03). During a median follow-up of 64.3 months (IQR 47–82), 174 (21%) of the patients died. The effect of ICD on all-cause mortality was not modified by coronary atherosclerosis [hazard ratio (HR) 0.94; 0.58–1.52; P = 0.79 vs. HR 0.82; 0.56–1.20; P = 0.30], P for interaction = 0.67. In univariable analysis, coronary atherosclerosis was a significant predictor of all-cause mortality [HR, 1.41; 95% confidence interval (CI), 1.04–1.91; P = 0.03]. However, this association disappeared when adjusting for cardiovascular risk factors (age, gender, diabetes, hypertension, smoking, and estimated glomerular filtration rate) (HR 1.05, 0.76–1.45, P = 0.76). CONCLUSIONS: In patients with non-ischaemic systolic heart failure, ICD implantation did not reduce all-cause mortality in patients either with or without concomitant coronary atherosclerosis. The concomitant presence of coronary atherosclerosis was associated with increased mortality. However, this association was explained by other risk factors.
Computational Linguistics
We present a novel approach to word reordering which successfully integrates syntactic structural knowledge with phrase-based SMT. This is done by constructing a lattice of alternatives based on automatically learned probabilistic syntactic rules. In decoding, the alternatives are scored based on the output word order, not the order of the input. Unlike previous approaches, this makes it possible to successfully integrate syntactic reordering with phrase-based SMT. On an English-Danish task, we achieve an absolute improvement in translation quality of 1.1% BLEU. Manual evaluation supports the claim that the present approach is significantly superior to previous approaches.
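To make the idea of rule-generated reordering alternatives concrete, here is a rough sketch under strong simplifying assumptions. Rules are modelled as a part-of-speech pattern, a permutation of the matched positions, and a probability, and the output is a flat list of alternative word orders rather than a true lattice. The rule format, the function name, and the example rule are hypothetical and do not reproduce the paper's rule representation or its decoder scoring.

```python
def reordering_alternatives(tokens, tags, rules):
    """Generate alternative source word orders from POS-based rules.

    rules: list of (pattern, permutation, probability), where `pattern`
    is a tuple of POS tags and `permutation` gives the new order of the
    matched positions. Each rule match yields one alternative; the
    original order is always kept by the caller as a further option.
    """
    alternatives = []
    for pattern, permutation, prob in rules:
        n = len(pattern)
        for i in range(len(tags) - n + 1):
            if tuple(tags[i:i + n]) == tuple(pattern):
                reordered = (
                    tokens[:i]
                    + [tokens[i + j] for j in permutation]
                    + tokens[i + n:]
                )
                alternatives.append((reordered, prob))
    return alternatives


# Hypothetical rule: swap a finite verb and a following adverb.
rules = [(("VBZ", "RB"), (1, 0), 0.7)]
tokens = ["he", "has", "never", "seen", "it"]
tags = ["PRP", "VBZ", "RB", "VBN", "PRP"]

for words, prob in reordering_alternatives(tokens, tags, rules):
    print(prob, " ".join(words))
```

A lattice would share the unchanged prefix and suffix between alternatives instead of copying them, but the flat list is enough to show how a small set of general rules can propose word orders for the decoder to choose among.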