3,185 research outputs found

    Improving the translation environment for professional translators

    Computer-aided translation systems leave room for improvement at several stages of a typical professional translation workflow. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these stages, from both a human-computer interaction perspective and a purely technological one. This paper describes the SCATE research on improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.

    Probabilistic Inference Modulo Theories

    We present SGDPLL(T), an algorithm that solves (among many other problems) probabilistic inference modulo theories, that is, inference problems over probabilistic models defined via a logic theory provided as a parameter (currently propositional, equalities on discrete sorts, and inequalities, more specifically difference arithmetic, on bounded integers). While many solutions to probabilistic inference over logic representations have been proposed, SGDPLL(T) is simultaneously (1) lifted, (2) exact and (3) modulo theories, that is, parameterized by a background logic theory. This offers a foundation for extending it to rich logic languages such as data structures and relational data. By lifted, we mean algorithms whose complexity is constant in the domain size (the number of values that variables can take). We also detail a solver for summations with difference arithmetic and show experimental results from a scenario in which SGDPLL(T) is much faster than a state-of-the-art probabilistic solver. Comment: Submitted to the StarAI-16 workshop as a closely revised version of the IJCAI-16 paper.
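As a toy illustration of what "lifted" means in the abstract above (and not a rendering of SGDPLL(T) itself), the sketch below counts the integers in a bounded range that satisfy one strict lower bound and one disequality. The function names and the choice of constraints are assumptions made for illustration. The enumerating version does work proportional to the domain size, while the closed-form version does a constant amount of work regardless of how large the range is.

```python
def count_enumerated(lo, hi, lower_exclusive, excluded):
    """Count x in [lo, hi] with x > lower_exclusive and x != excluded
    by visiting every value: cost grows with the domain size."""
    return sum(
        1
        for x in range(lo, hi + 1)
        if x > lower_exclusive and x != excluded
    )


def count_lifted(lo, hi, lower_exclusive, excluded):
    """Same count in closed form: cost does not depend on hi - lo."""
    start = max(lo, lower_exclusive + 1)  # smallest value passing the bound
    if start > hi:
        return 0
    size = hi - start + 1
    if start <= excluded <= hi:  # remove the single excluded value
        size -= 1
    return size
```

The closed form stays usable for domains where enumeration is impractical, for example `count_lifted(0, 10**12, 5, 7)`.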

    Bit-Vector Model Counting using Statistical Estimation

    Approximate model counting for bit-vector SMT formulas (generalizing #SAT) has many applications such as probabilistic inference and quantitative information-flow security, but it is computationally difficult. Adding random parity constraints (XOR streamlining) and then checking satisfiability is an effective approximation technique, but it requires a prior hypothesis about the model count to produce useful results. We propose an approach inspired by statistical estimation to continually refine a probabilistic estimate of the model count for a formula, so that each XOR-streamlined query yields as much information as possible. We implement this approach, with an approximate probability model, as a wrapper around an off-the-shelf SMT solver or SAT solver. Experimental results show that the implementation is faster than the most similar previous approaches which used simpler refinement strategies. The technique also lets us model count formulas over floating-point constraints, which we demonstrate with an application to a vulnerability in differential privacy mechanisms.
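The basic XOR-streamlining idea mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's statistical refinement: a brute-force list of models stands in for the SMT or SAT solver, the search simply adds one more random parity constraint per round, and all function names are hypothetical. Each random XOR keeps any given model with probability 1/2, so a formula that stays satisfiable under k random XORs in most trials has roughly 2^k models or more.

```python
import itertools
import random


def random_xor(n_bits, rng):
    """Draw a random parity constraint: a subset of bits and a target parity."""
    mask = [rng.random() < 0.5 for _ in range(n_bits)]
    parity = rng.random() < 0.5
    return mask, parity


def satisfies_xor(bits, xor):
    """True if the XOR of the selected bits equals the target parity."""
    mask, parity = xor
    selected = sum(b for b, m in zip(bits, mask) if m)
    return (selected % 2 == 1) == parity


def sat_with_xors(models, xors):
    """Stand-in for a solver call: does any model satisfy all XORs?"""
    return any(all(satisfies_xor(m, x) for x in xors) for m in models)


def estimate_log2_count(models, n_bits, trials, rng):
    """Return the largest k (searched upward) such that the formula stays
    satisfiable under k random XORs in a majority of trials."""
    k = 0
    while k < n_bits:
        sat_votes = sum(
            sat_with_xors(models, [random_xor(n_bits, rng) for _ in range(k + 1)])
            for _ in range(trials)
        )
        if 2 * sat_votes <= trials:  # majority unsatisfiable: stop
            break
        k += 1
    return k


def example_models(n_bits=10, fixed=4):
    """Models of a toy formula: the first `fixed` bits must all be zero."""
    return [
        bits
        for bits in itertools.product([0, 1], repeat=n_bits)
        if not any(bits[:fixed])
    ]
```

For the toy formula, `example_models()` has 64 models, so `estimate_log2_count(example_models(), 10, 41, random.Random(0))` should land near 6. The estimate is only accurate to within a few bits, which is the kind of coarseness that more careful refinement strategies aim to reduce.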

    How much hybridisation does machine translation need?

    This is the peer reviewed version of the following article: [Costa-jussà, M. R. (2015), How much hybridization does machine translation need? J Assn Inf Sci Tec, 66: 2160–2165. doi:10.1002/asi.23517], which has been published in final form at [10.1002/asi.23517]. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving. Rule-based and corpus-based machine translation (MT) have coexisted for more than 20 years. Recently, boundaries between the two paradigms have narrowed and hybrid approaches are gaining interest from both academia and businesses. However, since hybrid approaches involve the multidisciplinary interaction of linguists, computer scientists, engineers, and information specialists, understandably a number of issues exist. While statistical methods currently dominate research work in MT, most commercial MT systems are technically hybrid systems. The research community should investigate the benefits and questions surrounding the hybridization of MT systems more actively. This paper discusses various issues related to hybrid MT including its origins, architectures, achievements, and frustrations experienced in the community. It can be said that both rule-based and corpus-based MT systems have benefited from hybridization when effectively integrated. In fact, many of the current rule/corpus-based MT approaches are already hybridized since they do include statistics/rules at some point. Peer reviewed. Postprint (author's final draft).

    Approximate weighted model integration on DNF structures

    Weighted model counting consists of computing the weighted sum of all satisfying assignments of a propositional formula. Weighted model counting is well known to be #P-hard for exact solving, but admits a fully polynomial randomized approximation scheme when restricted to DNF structures. In this work, we study weighted model integration, a generalization of weighted model counting which involves real variables in addition to propositional variables, and pose the following question: Does weighted model integration on DNF structures admit a fully polynomial randomized approximation scheme? Building on classical results from approximate weighted model counting and approximate volume computation, we show that weighted model integration on DNF structures can indeed be approximated for a class of weight functions. Our approximation algorithm is based on three subroutines, each of which can be a weak (i.e., approximate) or a strong (i.e., exact) oracle, and in all cases comes with accuracy guarantees. We experimentally verify our approach over randomly generated DNF instances of varying sizes, and show that our algorithm scales to large problem instances, involving up to 1K variables, which are currently out of reach for existing, general-purpose weighted model integration solvers.
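To make the propositional starting point of the abstract above concrete, here is a minimal sketch of weighted model counting on a DNF formula, computed exactly by enumeration and approximately by a Karp-Luby style sampler. It covers only the classical propositional case, not the paper's weighted model integration algorithm. The representation is an assumption made for illustration: each variable `v` has weight `weights[v]` when true and `1 - weights[v]` when false (with weights strictly between 0 and 1), and a clause is a dict mapping variable index to its required truth value.

```python
import itertools
import random


def literal_weight(weights, var, value):
    """Weight of assigning `value` to variable `var`."""
    return weights[var] if value else 1.0 - weights[var]


def satisfies(assignment, clause):
    """A DNF clause is a conjunction: every listed literal must match."""
    return all(assignment[v] == val for v, val in clause.items())


def exact_wmc(weights, clauses):
    """Sum the weights of all assignments satisfying at least one clause.
    Exponential in the number of variables; only for small checks."""
    total = 0.0
    for bits in itertools.product([False, True], repeat=len(weights)):
        if any(satisfies(bits, c) for c in clauses):
            w = 1.0
            for v, val in enumerate(bits):
                w *= literal_weight(weights, v, val)
            total += w
    return total


def karp_luby_wmc(weights, clauses, samples, rng):
    """Unbiased sampling estimate of the same quantity."""
    # Weight of each clause alone; unmentioned variables marginalize to 1.
    clause_w = []
    for c in clauses:
        w = 1.0
        for v, val in c.items():
            w *= literal_weight(weights, v, val)
        clause_w.append(w)
    upper = sum(clause_w)  # over-counts assignments covered by several clauses

    acc = 0.0
    for _ in range(samples):
        # Pick a clause proportionally to its weight, then an assignment
        # satisfying it, drawn according to the variable weights.
        j = rng.choices(range(len(clauses)), weights=clause_w)[0]
        assignment = [rng.random() < weights[v] for v in range(len(weights))]
        for v, val in clauses[j].items():
            assignment[v] = val
        # Correct the over-count by the number of clauses covering it.
        cover = sum(satisfies(assignment, c) for c in clauses)
        acc += 1.0 / cover
    return upper * acc / samples
```

A usage example: with `weights = [0.3, 0.6, 0.5, 0.8]` and `clauses = [{0: True, 1: True}, {1: False, 2: True}, {2: True, 3: False}]`, the value returned by `karp_luby_wmc(weights, clauses, 20000, random.Random(0))` should be close to `exact_wmc(weights, clauses)`. The sampler's cost per sample is linear in the formula size, which is what makes a polynomial-time approximation scheme possible for DNF even though exact counting is hard.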