    Automatic Quality Estimation for ASR System Combination

    Recognizer Output Voting Error Reduction (ROVER) has been widely used for system combination in automatic speech recognition (ASR). In order to select the most appropriate words to insert at each position in the output transcriptions, some ROVER extensions rely on critical information such as confidence scores and other ASR decoder features. This information, which is not always available, depends heavily on the decoding process and sometimes tends to overestimate the real quality of the recognized words. In this paper we propose a novel variant of ROVER that takes advantage of ASR quality estimation (QE) for ranking the transcriptions at "segment level" instead of: i) relying on confidence scores, or ii) feeding ROVER with randomly ordered hypotheses. We first introduce an effective set of features to compensate for the absence of ASR decoder information. Then, we apply QE techniques to perform accurate hypothesis ranking at segment level before starting the fusion process. The evaluation is carried out on two different tasks, in which we respectively combine hypotheses coming from independent ASR systems and multi-microphone recordings. In both tasks, it is assumed that the ASR decoder information is not available. The proposed approach significantly outperforms standard ROVER and is competitive with two strong oracles that exploit prior knowledge about the real quality of the hypotheses to be combined. Compared to standard ROVER, the absolute WER improvements in the two evaluation scenarios range from 0.5% to 7.3%.
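The WER figures quoted in this abstract refer to word error rate, which is conventionally the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, an absolute improvement of 0.5% means the combined output's WER is 0.005 lower than that of the baseline on the same reference.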

    An Analysis of Source-Side Grammatical Errors in NMT

    The quality of Neural Machine Translation (NMT) has been shown to significantly degrade when confronted with source-side noise. We present the first large-scale study of state-of-the-art English-to-German NMT on real grammatical noise, by evaluating on several Grammar Correction corpora. We present methods for evaluating NMT robustness without true references, and we use them for extensive analysis of the effects that different grammatical errors have on the NMT output. We also introduce a technique for visualizing the divergence distribution caused by a source-side error, which allows for additional insights. Comment: Accepted and to be presented at BlackboxNLP 201

    Benefits of data augmentation for NMT-based text normalization of user-generated content

    One of the most persistent characteristics of written user-generated content (UGC) is the use of non-standard words. This characteristic makes UGC harder to process and analyze automatically. Text normalization is the task of transforming lexical variants to their canonical forms and is often used as a pre-processing step for conventional NLP tasks in order to overcome the performance drop that NLP systems experience when applied to UGC. In this work, we follow a Neural Machine Translation approach to text normalization. To train such an encoder-decoder model, large parallel training corpora of sentence pairs are required. However, obtaining large data sets with UGC and their normalized version is not trivial, especially for languages other than English. In this paper, we explore how to overcome this data bottleneck for Dutch, a low-resource language. We start off with a tiny, publicly available parallel Dutch data set comprising three UGC genres and compare two different approaches. The first is to manually normalize and add training data, a costly and time-consuming task. The second approach is a set of data augmentation techniques which increase data size by converting existing resources into synthesized non-standard forms. Our results reveal that a combination of both approaches leads to the best results.

    Reducing “Structure from Motion”: a general framework for dynamic vision. 2. Implementation and experimental assessment

    For pt.1 see ibid., p.933-42 (1998). A number of methods have been proposed in the literature for estimating scene-structure and ego-motion from a sequence of images using dynamical models. Despite the fact that all methods may be derived from a “natural” dynamical model within a unified framework, from an engineering perspective there are a number of trade-offs that lead to different strategies depending upon the applications and the goals one is targeting. We want to characterize and compare the properties of each model so that the engineer may choose the one best suited to the specific application. We analyze the properties of filters derived from each dynamical model under a variety of experimental conditions, and assess the accuracy of the estimates, their robustness to measurement noise, sensitivity to initial conditions and visual angle, effects of the bas-relief ambiguity and occlusions, and dependence upon the number of image measurements and their sampling rate.

    A Framework for Image Segmentation Using Shape Models and Kernel Space Shape Priors

    ©2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or distribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder. DOI: 10.1109/TPAMI.2007.70774. Segmentation involves separating an object from the background in a given image. The use of image information alone often leads to poor segmentation results due to the presence of noise, clutter or occlusion. The introduction of shape priors in the geometric active contour (GAC) framework has proved to be an effective way to ameliorate some of these problems. In this work, we propose a novel segmentation method combining image information with prior shape knowledge, using level-sets. Following the work of Leventon et al., we propose to revisit the use of PCA to introduce prior knowledge about shapes in a more robust manner. We utilize kernel PCA (KPCA) and show that this method outperforms linear PCA by allowing only those shapes that are close enough to the training data. In our segmentation framework, shape knowledge and image information are encoded into two energy functionals entirely described in terms of shapes. This consistent description makes it possible to take full advantage of the Kernel PCA methodology and leads to promising segmentation results. In particular, our shape-driven segmentation technique allows for the simultaneous encoding of multiple types of shapes, and offers a convincing level of robustness with respect to noise, occlusions, or smearing.

    Low-resource machine translation using MATREX: The DCU machine translation system for IWSLT 2009

    In this paper, we give a description of the Machine Translation (MT) system developed at DCU that was used for our fourth participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2009). Two techniques are deployed in our system in order to improve the translation quality in a low-resource scenario. The first technique is to use multiple segmentations in MT training and to utilise word lattices in the decoding stage. The second technique is used to select the optimal training data that can be used to build MT systems. In this year’s participation, we use three different prototype SMT systems, and the outputs of the systems are combined using a standard system combination method. Our system is the top system for the Chinese–English CHALLENGE task in terms of BLEU score.

    Unscented Bayesian Optimization for Safe Robot Grasping

    We address the robot grasp optimization problem of unknown objects considering uncertainty in the input space. Grasping unknown objects can be achieved by using a trial and error exploration strategy. Bayesian optimization is a sample efficient optimization algorithm that is especially suitable for this setup as it actively reduces the number of trials for learning about the function to optimize. In fact, this active object exploration is the same strategy that infants use to learn optimal grasps. One problem that arises while learning grasping policies is that some configurations of grasp parameters may be very sensitive to error in the relative pose between the object and robot end-effector. We call these configurations unsafe because small errors during grasp execution may turn good grasps into bad grasps. Therefore, to reduce the risk of grasp failure, grasps should be planned in safe areas. We propose a new algorithm, Unscented Bayesian optimization, which is able to perform sample efficient optimization while taking into consideration input noise to find safe optima. The contribution of Unscented Bayesian optimization is twofold, as it provides a new decision process that drives exploration to safe regions and a new selection procedure that chooses the optimum in terms of its safety without extra analysis or computational cost. Both contributions are rooted in the strong theory behind the unscented transformation, a popular nonlinear approximation method. We show its advantages with respect to classical Bayesian optimization both in synthetic problems and in realistic robot grasp simulations. The results highlight that our method achieves optimal and robust grasping policies after few trials while the selected grasps remain in safe regions. Comment: conference paper
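The unscented transformation mentioned in this abstract approximates the expected value of a nonlinear function under input noise by evaluating it at a few deterministically chosen sigma points. The following is a minimal one-dimensional sketch of the textbook transformation, assuming Gaussian input noise; it is not the paper's algorithm, and the name `unscented_expectation` and the default `kappa=2.0` are choices made here for illustration.

```python
import math
from typing import Callable


def unscented_expectation(
    f: Callable[[float], float],
    mean: float,
    std: float,
    kappa: float = 2.0,
) -> float:
    """Approximate E[f(x)] for x ~ N(mean, std**2) with three sigma points.

    Assumption: scalar input (dimension n = 1) and the classic weighting
    with a single tuning parameter kappa.
    """
    n = 1
    spread = math.sqrt(n + kappa) * std
    # One point at the mean and one on each side of it.
    points = [mean, mean + spread, mean - spread]
    weights = [
        kappa / (n + kappa),
        1.0 / (2.0 * (n + kappa)),
        1.0 / (2.0 * (n + kappa)),
    ]
    return sum(w * f(x) for w, x in zip(weights, points))
```

Scoring a candidate by this noise-averaged value rather than by `f(mean)` alone penalizes optima whose value drops sharply under small input perturbations, which is the intuition behind preferring safe regions.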