3,185 research outputs found
Improving the translation environment for professional translators
When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view, as well as from a purely technological side.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project
Probabilistic Inference Modulo Theories
We present SGDPLL(T), an algorithm that solves (among many other problems)
probabilistic inference modulo theories, that is, inference problems over
probabilistic models defined via a logic theory provided as a parameter
(currently, propositional, equalities on discrete sorts, and inequalities, more
specifically difference arithmetic, on bounded integers). While many solutions
to probabilistic inference over logic representations have been proposed,
SGDPLL(T) is simultaneously (1) lifted, (2) exact and (3) modulo theories, that
is, parameterized by a background logic theory. This offers a foundation for
extending it to rich logic languages such as data structures and relational
data. By lifted, we mean algorithms with constant complexity in the domain size
(the number of values that variables can take). We also detail a solver for
summations with difference arithmetic and show experimental results from a
scenario in which SGDPLL(T) is much faster than a state-of-the-art
probabilistic solver.Comment: Submitted to StarAI-16 workshop as closely revised version of
IJCAI-16 pape
Bit-Vector Model Counting using Statistical Estimation
Approximate model counting for bit-vector SMT formulas (generalizing \#SAT)
has many applications such as probabilistic inference and quantitative
information-flow security, but it is computationally difficult. Adding random
parity constraints (XOR streamlining) and then checking satisfiability is an
effective approximation technique, but it requires a prior hypothesis about the
model count to produce useful results. We propose an approach inspired by
statistical estimation to continually refine a probabilistic estimate of the
model count for a formula, so that each XOR-streamlined query yields as much
information as possible. We implement this approach, with an approximate
probability model, as a wrapper around an off-the-shelf SMT solver or SAT
solver. Experimental results show that the implementation is faster than the
most similar previous approaches which used simpler refinement strategies. The
technique also lets us model count formulas over floating-point constraints,
which we demonstrate with an application to a vulnerability in differential
privacy mechanisms
How much hybridisation does machine translation need?
This is the peer reviewed version of the following article: [Costa-jussà , M. R. (2015), How much hybridization does machine translation Need?. J Assn Inf Sci Tec, 66: 2160–2165. doi:10.1002/asi.23517], which has been published in final form at [10.1002/asi.23517]. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.Rule-based and corpus-based machine translation (MT)have coexisted for more than 20 years. Recently, bound-aries between the two paradigms have narrowed andhybrid approaches are gaining interest from bothacademia and businesses. However, since hybridapproaches involve the multidisciplinary interaction oflinguists, computer scientists, engineers, and informa-tion specialists, understandably a number of issuesexist.While statistical methods currently dominate researchwork in MT, most commercial MT systems are techni-cally hybrid systems. The research community shouldinvestigate the bene¿ts and questions surrounding thehybridization of MT systems more actively. This paperdiscusses various issues related to hybrid MT includingits origins, architectures, achievements, and frustra-tions experienced in the community. It can be said thatboth rule-based and corpus- based MT systems havebene¿ted from hybridization when effectively integrated.In fact, many of the current rule/corpus-based MTapproaches are already hybridized since they do includestatistics/rules at some point.Peer ReviewedPostprint (author's final draft
Approximate weighted model integration on DNF structures
Weighted model counting consists of computing the weighted sum of all satisfying assignments of a propositional formula. Weighted model counting is well-known to be #P-hard for exact solving, but admits a fully polynomial randomized approximation scheme when restricted to DNF structures. In this work, we study weighted model integration, a generalization of weighted model counting which involves real variables in addition to propositional variables, and pose the following question: Does weighted model integration on DNF structures admit a fully polynomial randomized approximation scheme? Building on classical results from approximate weighted model counting and approximate volume computation, we show that weighted model integration on DNF structures can indeed be approximated for a class of weight functions. Our approximation algorithm is based on three subroutines, each of which can be a weak (i.e., approximate), or a strong (i.e., exact) oracle, and in all cases, comes along with accuracy guarantees. We experimentally verify our approach over randomly generated DNF instances of varying sizes, and show that our algorithm scales to large problem instances, involving up to 1K variables, which are currently out of reach for existing, general-purpose weighted model integration solvers
- …