1,140 research outputs found

    Monte Carlo inference and maximization for phrase-based translation

    Get PDF
    Recent advances in statistical machine translation have used beam search for approximate NP-complete inference within probabilistic translation models. We present an alternative approach of sampling from the posterior distribution defined by a translation model. We define a novel Gibbs sampler for sampling translations given a source sentence and show that it effectively explores this posterior distribution. In doing so we overcome the limitations of heuristic beam search and obtain theoretically sound solutions to inference problems such as finding the maximum probability translation and minimum expected risk training and decoding.

    Maximum Fidelity

    Full text link
    The most fundamental problem in statistics is the inference of an unknown probability distribution from a finite number of samples. For a specific observed data set, answers to the following questions would be desirable: (1) Estimation: Which candidate distribution provides the best fit to the observed data?, (2) Goodness-of-fit: How concordant is this distribution with the observed data?, and (3) Uncertainty: How concordant are other candidate distributions with the observed data? A simple unified approach for univariate data that addresses these traditionally distinct statistical notions is presented called "maximum fidelity". Maximum fidelity is a strict frequentist approach that is fundamentally based on model concordance with the observed data. The fidelity statistic is a general information measure based on the coordinate-independent cumulative distribution and critical yet previously neglected symmetry considerations. An approximation for the null distribution of the fidelity allows its direct conversion to absolute model concordance (p value). Fidelity maximization allows identification of the most concordant model distribution, generating a method for parameter estimation, with neighboring, less concordant distributions providing the "uncertainty" in this estimate. Maximum fidelity provides an optimal approach for parameter estimation (superior to maximum likelihood) and a generally optimal approach for goodness-of-fit assessment of arbitrary models applied to univariate data. Extensions to binary data, binned data, multidimensional data, and classical parametric and nonparametric statistical tests are described. Maximum fidelity provides a philosophically consistent, robust, and seemingly optimal foundation for statistical inference. All findings are presented in an elementary way to be immediately accessible to all researchers utilizing statistical analysis.Comment: 66 pages, 32 figures, 7 tables, submitte
    • …
    corecore