Language and Proofs for Higher-Order SMT (Work in Progress)
Satisfiability modulo theories (SMT) solvers have throughout the years been
able to cope with increasingly expressive formulas, from ground logics to full
first-order logic modulo theories. Nevertheless, higher-order logic within SMT
is still little explored. One main goal of the Matryoshka project, which
started in March 2017, is to extend the reasoning capabilities of SMT solvers
and other automatic provers beyond first-order logic. In this preliminary
report, we report on an extension of the SMT-LIB language, the standard input
format of SMT solvers, to handle higher-order constructs. We also discuss how
to augment the proof format of the SMT solver veriT to accommodate these new
constructs and the solving techniques they require. Comment: In Proceedings PxTP 2017, arXiv:1712.0089
Rewriting Modulo SMT and Open System Analysis
Rewriting modulo SMT is a new technique that combines the power of SMT solving, rewriting modulo theories, and model checking. Rewriting modulo SMT is ideally suited to model and analyze reachability properties of infinite-state open systems, i.e., systems that interact with a nondeterministic environment. Such systems exhibit both internal nondeterminism, which is proper to the system, and external nondeterminism, which is due to the environment. In a reflective formalism, such as rewriting logic, rewriting modulo SMT can be reduced to standard rewriting. Hence, rewriting modulo SMT naturally extends rewriting-based reachability analysis techniques, which are available for closed systems, to open systems. In this talk, I will be discussing the main conceptual and technical ideas behind rewriting modulo SMT, its state of implementation in the Maude system, and some research challenges to be tackled during the next few years. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
Hybridity in MT: experiments on the Europarl corpus
(Way & Gough, 2005) demonstrate that their Marker-based EBMT system is capable of outperforming a word-based SMT system trained on reasonably large data sets. (Groves & Way, 2005) take this a stage further and demonstrate that while the EBMT system also outperforms a phrase-based SMT (PBSMT) system, a hybrid 'example-based SMT' system incorporating marker chunks and SMT sub-sentential alignments is capable of outperforming both baseline translation models for French-English translation.
In this paper, we show that similar gains are to be had from constructing a hybrid 'statistical EBMT' system capable of outperforming the baseline system of (Way & Gough, 2005). Using the Europarl (Koehn, 2005) training and test sets we show that this time around, although all 'hybrid' variants of the EBMT system fall short of the quality achieved by the baseline PBSMT system, merging elements of the marker-based and SMT data, as in (Groves & Way, 2005), to create a hybrid 'example-based SMT' system, outperforms the baseline SMT and EBMT systems from which it is derived.
Furthermore, we provide further evidence in favour of hybrid systems by adding an SMT target language model to all EBMT system variants and demonstrate that this too has a positive effect on translation quality
A Simple and Flexible Way of Computing Small Unsatisfiable Cores in SAT Modulo Theories
Finding small unsatisfiable cores for SAT problems has recently received a lot of interest, mostly for its applications in formal verification. However, propositional logic is often not expressive enough for representing many interesting verification problems, which can be more naturally addressed in the framework of Satisfiability Modulo Theories, SMT. Surprisingly, the problem of finding unsatisfiable cores in SMT has received very little attention in the literature; in particular, we are not aware of any work aiming at producing small unsatisfiable cores in SMT. In this paper we present a novel approach to this problem. The main idea is to combine an SMT solver with an external propositional core extractor: the SMT solver produces the theory lemmas found during the search; the core extractor is then called on the boolean abstraction of the original SMT problem and of the theory lemmas. This results in an unsatisfiable core for the original SMT problem, once the remaining theory lemmas have been removed. The approach is conceptually interesting, since the SMT solver is used to dynamically lift the suitable amount of theory information to the boolean level, and it also has several advantages in practice. In fact, it is extremely simple to implement and to update, and it can be interfaced with every propositional core extractor in a plug-and-play manner, so as to benefit for free from all unsat-core reduction techniques which have been or will be made available. We have evaluated our approach by an extensive empirical test on SMT-LIB benchmarks, which confirms the validity and potential of this approach
TMX markup: a challenge when adapting SMT to the localisation environment
Translation memory (TM) plays an important role in localisation workflows and is used as an efficient and fundamental tool to carry out translation. In recent years, statistical machine translation (SMT) techniques have been rapidly developed, and the translation quality and speed have been significantly improved as well. However, when applying SMT techniques to facilitate post-editing in the localisation industry, we need to adapt SMT to the TM data which is formatted with special mark-up. In this paper, we explore some issues when adapting SMT to Symantec formatted TM data.
Three different methods are proposed to handle the Translation Memory eXchange (TMX) markup and a comparative study is carried out between them. Furthermore, we also compare the TMX-based SMT systems with a customised SYSTRAN system through human evaluation and automatic evaluation metrics. The results of experiments conducted on the French and English language pair show that SMT can perform well using TMX as input format either during training or at runtime
Optimization Modulo Theories with Linear Rational Costs
In the contexts of automated reasoning (AR) and formal verification (FV),
important decision problems are effectively encoded into Satisfiability Modulo
Theories (SMT). In the last decade efficient SMT solvers have been developed
for several theories of practical interest (e.g., linear arithmetic, arrays,
bit-vectors). Surprisingly, little work has been done to extend SMT to deal
with optimization problems; in particular, we are not aware of any previous
work on SMT solvers able to produce solutions which minimize cost functions
over arithmetical variables. This is unfortunate, since some problems of
interest require this functionality.
In the work described in this paper we start filling this gap. We present and
discuss two general procedures for leveraging SMT to handle the minimization of
linear rational cost functions, combining SMT with standard minimization
techniques. We have implemented the procedures within the MathSAT SMT solver.
Due to the absence of competitors in the AR, FV and SMT domains, we have
experimentally evaluated our implementation against state-of-the-art tools for
the domain of linear generalized disjunctive programming (LGDP), which is
closest in spirit to our domain, on sets of problems which have been previously
proposed as benchmarks for the latter tools. The results show that our tool is
very competitive with, and often outperforms, these tools on these problems,
clearly demonstrating the potential of the approach. Comment: Submitted in January 2014 to ACM Transactions on Computational Logic,
currently under revision. arXiv admin note: text overlap with arXiv:1202.140
Taking statistical machine translation to the student translator
Despite the growth of statistical machine translation (SMT) research and development in recent years, it remains somewhat out of reach for the translation community where programming expertise and knowledge of statistics tend not to be commonplace. While the concept of SMT is relatively straightforward, its implementation in functioning systems remains difficult for most, regardless of expertise. More recently, however, developments such as SmartMATE have emerged which aim to assist users in creating their own customized SMT systems and thus reduce the learning curve associated with SMT. In addition to commercial uses, translator training stands to benefit from such increased levels of inclusion and access to state-of-the-art approaches to MT. In this paper we draw on experience in developing and evaluating a new syllabus in SMT for a cohort of post-graduate student translators: we identify several issues encountered in the introduction of student translators to SMT, and report on data derived from repeated measures questionnaires that aim to capture data on students' self-efficacy in the use of SMT. Overall, results show that participants report significant increases in their levels of confidence and knowledge of MT in general, and of SMT in particular. Additional benefits (such as increased technical competence and confidence) and future refinements are also discussed
Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation
The goal of counterfactual learning for statistical machine translation (SMT)
is to optimize a target SMT system from logged data that consist of user
feedback to translations that were predicted by another, historic SMT system. A
challenge arises from the fact that risk-averse commercial SMT systems
deterministically log the most probable translation. The lack of sufficient
exploration of the SMT output space seemingly contradicts the theoretical
requirements for counterfactual learning. We show that counterfactual learning
from deterministic bandit logs is possible nevertheless by smoothing out
deterministic components in learning. This can be achieved by additive and
multiplicative control variates that avoid degenerate behavior in empirical
risk minimization. Our simulation experiments show improvements of up to 2 BLEU
points by counterfactual learning from deterministic bandit feedback. Comment: Conference on Empirical Methods in Natural Language Processing
(EMNLP), 2017, Copenhagen, Denmark
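To make the "degenerate behavior" mentioned in the abstract above concrete, here is a minimal sketch, not the authors' implementation. It assumes each log entry holds a non-negative reward and the probability the target system assigns to the logged translation. The function names and toy numbers are hypothetical. Because logging is deterministic, no propensity score is available, so the plain estimate can be inflated just by raising the probability of every logged output. Dividing by the summed probabilities (a self-normalizing, multiplicative correction) removes that incentive.

```python
from typing import Sequence


def plain_estimate(rewards: Sequence[float], probs: Sequence[float]) -> float:
    """Average of reward times target-policy probability of the logged output.

    Deterministic logging gives no propensity score to divide by.
    """
    if len(rewards) != len(probs) or not rewards:
        raise ValueError("rewards and probs must be non-empty and equally long")
    return sum(r * p for r, p in zip(rewards, probs)) / len(rewards)


def reweighted_estimate(rewards: Sequence[float], probs: Sequence[float]) -> float:
    """Same numerator, normalized by the summed probabilities (multiplicative correction)."""
    if len(rewards) != len(probs) or not rewards:
        raise ValueError("rewards and probs must be non-empty and equally long")
    total = sum(probs)
    if total == 0:
        raise ValueError("probabilities must not all be zero")
    return sum(r * p for r, p in zip(rewards, probs)) / total


# Toy log (hypothetical numbers): doubling every probability doubles the plain
# estimate but leaves the reweighted estimate unchanged.
rewards = [1.0, 0.0, 0.5]
low = [0.1, 0.2, 0.2]
high = [0.2, 0.4, 0.4]
print(plain_estimate(rewards, low), plain_estimate(rewards, high))
print(reweighted_estimate(rewards, low), reweighted_estimate(rewards, high))
```

The additive control variate named in the abstract is not sketched here, since the abstract gives no detail on how it is constructed.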
Integrating N-best SMT outputs into a TM system
In this paper, we propose a novel framework to enrich Translation Memory (TM) systems with Statistical Machine Translation (SMT) outputs using ranking. In order to offer the human translators multiple choices, instead of only using the top SMT output and top TM hit, we merge the N-best output from the SMT system and the k-best hits with highest fuzzy match scores from the TM system. The merged list is then ranked according to the prospective post-editing effort and provided to the translators to aid their work. Experiments show that our ranked output achieves 0.8747 precision at top 1 and 0.8134 precision at top 5. Our framework facilitates a tight integration between SMT and TM, where full advantage is taken of TM while high quality SMT output is availed of to improve the productivity of human translators
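The merge-then-rank step described in the abstract above can be sketched as follows. This is a hedged illustration only: the `Candidate` type, the `merge_and_rank` function, and the `estimate_effort` callback are hypothetical names. The abstract does not say how post-editing effort is predicted, so the scorer is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    text: str
    source: str  # "SMT" or "TM"


def merge_and_rank(
    smt_nbest: Sequence[str],
    tm_hits: Sequence[Tuple[str, float]],  # (target text, fuzzy match score)
    k: int,
    estimate_effort: Callable[[Candidate], float],  # hypothetical scorer, lower is better
) -> List[Candidate]:
    """Merge the N-best SMT outputs with the k best TM hits and rank the result."""
    # Keep only the k TM hits with the highest fuzzy match scores.
    top_tm = sorted(tm_hits, key=lambda hit: hit[1], reverse=True)[:k]
    merged = [Candidate(text, "SMT") for text in smt_nbest]
    merged += [Candidate(text, "TM") for text, _score in top_tm]
    # Rank by predicted post-editing effort; sorted() is stable, so ties keep merge order.
    return sorted(merged, key=estimate_effort)


# Toy usage with a stand-in scorer: distance from a three-word target length.
ranked = merge_and_rank(
    smt_nbest=["a b c", "a b"],
    tm_hits=[("a b c d", 0.9), ("x", 0.3), ("a", 0.7)],
    k=2,
    estimate_effort=lambda c: abs(len(c.text.split()) - 3),
)
print([c.text for c in ranked])
```

In practice the scorer would be a trained ranking model. The toy lambda only keeps the example self-contained.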