Contextual bitext-derived paraphrases in automatic MT evaluation
In this paper we present a novel method for deriving paraphrases during automatic MT evaluation using only the source and reference texts, which are necessary for
the evaluation, and word and phrase alignment software. Using target-language paraphrases produced through word and
phrase alignment, a number of alternative reference sentences are constructed automatically for each candidate translation. The method produces lexical and low-level
syntactic paraphrases that are relevant to the domain at hand, does not use external knowledge resources, and can be
combined with a variety of automatic MT evaluation system
Learning labelled dependencies in machine translation evaluation
Recently, novel MT evaluation metrics have been presented which go beyond pure string matching, and which correlate
better than other existing metrics with human judgements. Other research in this area has presented machine learning
methods which learn directly from human judgements. In this paper, we present a novel combination of dependency- and
machine learning-based approaches to automatic MT evaluation, and demonstrate greater correlations with human judgement than the existing state-of-the-art methods.
In addition, we examine the extent to which our novel method can be generalised across different tasks and domains
Robust Dialog State Tracking for Large Ontologies
The Dialog State Tracking Challenge 4 (DSTC 4) differentiates itself from the
previous three editions as follows: the number of slot-value pairs present in
the ontology is much larger, no spoken language understanding output is given,
and utterances are labeled at the subdialog level. This paper describes a novel
dialog state tracking method designed to work robustly under these conditions,
using elaborate string matching, coreference resolution tailored for dialogs
and a few other improvements. The method can correctly identify many values
that are not explicitly present in the utterance. On the final evaluation, our
method came in first among 7 competing teams and 24 entries. The F1-score
achieved by our method was 9 and 7 percentage points higher than that of the
runner-up for the utterance-level evaluation and for the subdialog-level
evaluation, respectively. Comment: Paper accepted at IWSDS 201
Proving termination of evaluation for System F with control operators
We present new proofs of termination of evaluation in reduction semantics
(i.e., a small-step operational semantics with explicit representation of
evaluation contexts) for System F with control operators. We introduce a
modified version of Girard's proof method based on reducibility candidates,
where the reducibility predicates are defined on values and on evaluation
contexts as prescribed by the reduction semantics format. We address both
abortive control operators (callcc) and delimited-control operators (shift and
reset) for which we introduce novel polymorphic type systems, and we consider
both the call-by-value and call-by-name evaluation strategies. Comment: In Proceedings COS 2013, arXiv:1309.092
A novel cassette method for probe evaluation in the designed biochips
A critical step in biochip design is the selection of probes with identical hybridisation characteristics. In this article we describe a novel method for evaluating DNA hybridisation probes, allowing the fine-tuning of biochips, that uses cassettes with multiple probes. Each cassette contains probes in equimolar proportions so that their hybridisation performance can be assessed in a single reaction. The model used to demonstrate this method was a series of probes developed to detect TORCH pathogens. DNA probes were designed for Toxoplasma gondii, Chlamydia trachomatis, Rubella, Cytomegalovirus, and Herpes virus, and these were used to construct the DNA cassettes. Five cassettes were constructed to detect TORCH pathogens using a variety of genes coding for membrane proteins, viral matrix protein, an early expressed viral protein, viral DNA polymerase and the repetitive gene B1 of Toxoplasma gondii. All of these probes, except that for the B1 gene, exhibited similar profiles under the same hybridisation conditions. The failure of the B1 gene probe to hybridise was not due to a position effect, which indicated that the probe was unsuitable for inclusion in the biochip. The redesigned probe for the B1 gene exhibited hybridisation properties identical to those of the other probes, making it suitable for inclusion in a biochip.
Method of calibration for glucose sensor implemented in an integrated microdialysis based system
This paper presents a novel method for calibrating an amperometric glucose sensor implemented in an integrated microdialysis-based microsystem. The method consists of evaluating the charge transferred to the electrode under stop-flow conditions, which results from glucose consumption in the enzymatic reaction.
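The abstract above does not say how the transferred charge is evaluated. As a minimal sketch, assuming the sensor current is sampled at known times during the stop-flow interval, the charge can be estimated as the time integral of the current using the trapezoidal rule. The function name `charge_from_current` and the sample values are hypothetical and are not taken from the paper.

```python
def charge_from_current(times_s, currents_a):
    """Estimate charge Q = integral of I dt (coulombs) with the trapezoidal rule.

    times_s    -- sample times in seconds, strictly increasing (assumed input)
    currents_a -- measured currents in amperes, one per sample time
    """
    if len(times_s) != len(currents_a):
        raise ValueError("times and currents must have the same length")
    if len(times_s) < 2:
        raise ValueError("at least two samples are needed")

    charge = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("sample times must be strictly increasing")
        # Area of the trapezoid between two consecutive samples.
        charge += 0.5 * (currents_a[i] + currents_a[i - 1]) * dt
    return charge


if __name__ == "__main__":
    # Hypothetical stop-flow recording: a constant 2 A current for 3 s.
    print(charge_from_current([0, 1, 2, 3], [2, 2, 2, 2]))
```

A calibration curve would then relate this charge to known glucose concentrations; that step depends on details the abstract does not give.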
Probabilistic Adaptive Computation Time
We present a probabilistic model with discrete latent variables that control
the computation time in deep learning models such as ResNets and LSTMs. A prior
on the latent variables expresses the preference for faster computation. The
amount of computation for an input is determined via amortized maximum a
posteriori (MAP) inference. MAP inference is performed using a novel stochastic
variational optimization method. The recently proposed Adaptive Computation
Time mechanism can be seen as an ad-hoc relaxation of this model. We
demonstrate training using the general-purpose Concrete relaxation of discrete
variables. Evaluation on ResNet shows that our method matches the
speed-accuracy trade-off of Adaptive Computation Time, while allowing for
evaluation with a simple deterministic procedure that has a lower memory
footprint
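To illustrate the Concrete relaxation mentioned in the abstract above, here is a minimal sketch of sampling from a binary Concrete (relaxed Bernoulli) distribution. It assumes a single binary latent variable parameterised by a logit; the paper's actual latent-variable structure and training procedure are not given in the abstract, and the function names are illustrative only.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binary_concrete_sample(logit, temperature, rng):
    """Draw one relaxed binary sample in [0, 1].

    logit       -- log-odds of the underlying Bernoulli variable
    temperature -- positive relaxation temperature; values near 0 push
                   samples towards the discrete endpoints 0 and 1
    rng         -- a random.Random instance
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Clamp the uniform draw away from 0 and 1 so both logs are finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    logistic_noise = math.log(u) - math.log(1.0 - u)
    return sigmoid((logit + logistic_noise) / temperature)


if __name__ == "__main__":
    rng = random.Random(0)
    for temperature in (1.0, 0.1, 0.01):
        draws = [binary_concrete_sample(0.0, temperature, rng) for _ in range(5)]
        print(temperature, [round(d, 3) for d in draws])
```

Because the sample is a differentiable function of the logit for a fixed noise draw, gradients can flow through it during training, which is what makes the relaxation usable as a stand-in for a discrete variable.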
An Enhanced Method For Evaluating Automatic Video Summaries
Evaluation of automatic video summaries is a challenging problem. In past
years, several evaluation methods have been presented that use only a single feature,
such as color, to detect similarity between automatic video summaries and
ground-truth user summaries. One drawback of using a single feature is
that it sometimes yields false similarity detections, which makes the assessment
of the quality of the generated video summary less perceptual and less accurate.
In this paper, a novel method for evaluating automatic video summaries is
presented. This method is based on comparing automatic video summaries
generated by video summarization techniques with ground-truth user summaries.
The objective of this evaluation method is to quantify the quality of video
summaries, and allow comparing different video summarization techniques
utilizing both color and texture features of the video frames and using the
Bhattacharyya distance as a dissimilarity measure due to its advantages. Our
experiments show that the proposed evaluation method overcomes the drawbacks of
other methods and gives a more perceptual evaluation of the quality of the
automatic video summaries. Comment: This paper has been withdrawn by the author due to some errors and
incomplete stud
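As an illustration of the dissimilarity measure named in the abstract above, the following sketch computes the Bhattacharyya distance between two feature histograms in its common form, the negative logarithm of the Bhattacharyya coefficient. The abstract does not state which variant the authors use or how the color and texture histograms are built, so the inputs here are hypothetical bin counts.

```python
import math


def normalize(hist):
    """Scale non-negative bin counts so that they sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [h / total for h in hist]


def bhattacharyya_distance(hist_p, hist_q):
    """Return -ln(BC), where BC = sum_i sqrt(p_i * q_i).

    The result is 0 for identical distributions and infinite when the
    two histograms share no non-empty bin.
    """
    if len(hist_p) != len(hist_q):
        raise ValueError("histograms must have the same number of bins")
    p = normalize(hist_p)
    q = normalize(hist_q)
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    if coefficient <= 0.0:
        return math.inf
    # Clamp to 1 to absorb floating-point rounding slightly above 1.
    return -math.log(min(coefficient, 1.0))


if __name__ == "__main__":
    # Hypothetical 4-bin histograms for one automatic and one user keyframe.
    automatic_frame = [10, 20, 30, 40]
    user_frame = [12, 18, 33, 37]
    print(bhattacharyya_distance(automatic_frame, user_frame))
```

In an evaluation of this kind, a keyframe pair would presumably count as a match when the distance falls below a chosen threshold; that threshold is not given in the abstract.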
