An Expression Tree Decoding Strategy for Mathematical Equation Generation
Generating mathematical equations from natural language requires an accurate
understanding of the relations among math expressions. Existing approaches can
be broadly categorized into token-level and expression-level generation. The
former treats equations as a mathematical language, sequentially generating
math tokens. Expression-level methods generate each expression one by one.
However, each expression represents a solving step, and there naturally exist
parallel or dependent relations between these steps, which are ignored by
current sequential methods. Therefore, we integrate tree structure into the
expression-level generation and advocate an expression tree decoding strategy.
To generate a tree with an expression at each node, we employ a layer-wise
parallel decoding strategy: we decode multiple independent expressions (leaf
nodes) in parallel at each layer, then repeat this parallel decoding layer by
layer to sequentially generate the parent-node expressions that depend on
earlier results.
In addition, a bipartite matching algorithm is adopted to align the multiple
predictions with the annotations at each layer. Experiments show our method
outperforms other baselines, especially on equations with complex structures.
Comment: Accepted to EMNLP 2023, camera-ready version
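The layer-wise alignment can be illustrated with a small sketch: at each layer, the set of parallel predictions is matched one-to-one against the gold expressions by minimizing a pairwise cost. This is a minimal illustration using the Hungarian algorithm from SciPy; the cost matrix below is a made-up placeholder, not the paper's actual scoring.

```python
# Sketch: aligning one layer's parallel predictions with its gold
# expressions via bipartite matching (Hungarian algorithm).
# cost[i, j] is assumed to score prediction i against gold j; the
# real system would derive it from model likelihoods.
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_layer(cost: np.ndarray):
    """Return (pred_idx, gold_idx) pairs minimizing total cost."""
    pred_idx, gold_idx = linear_sum_assignment(cost)
    return list(zip(pred_idx.tolist(), gold_idx.tolist()))

# Example: 3 parallel predictions vs. 3 gold leaf expressions.
cost = np.array([[0.1, 0.9, 0.8],
                 [0.7, 0.2, 0.6],
                 [0.8, 0.5, 0.1]])
print(align_layer(cost))  # [(0, 0), (1, 1), (2, 2)]
```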
A Theme-Rewriting Approach for Generating Algebra Word Problems
Texts present coherent stories that have a particular theme or overall
setting, for example science fiction or western. In this paper, we present a
text generation method called {\it rewriting} that edits existing
human-authored narratives to change their theme without changing the underlying
story. We apply the approach to math word problems, where it might help
students stay more engaged by quickly transforming all of their homework
assignments to the theme of their favorite movie without changing the math
concepts that are being taught. Our rewriting method uses a two-stage decoding
process, which proposes new words from the target theme and scores the
resulting stories according to a number of factors defining aspects of
syntactic, semantic, and thematic coherence. Experiments demonstrate that the
final stories typically represent the new theme well while still testing the
original math concepts, outperforming a number of baselines. We also release a
new dataset of human-authored rewrites of math word problems in several themes.
Comment: To appear at EMNLP 2016
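A minimal sketch of the two-stage propose-and-score decoding idea, under assumed inputs: `propose` suggests theme-word substitutes and `score` stands in for the paper's combination of syntactic, semantic, and thematic coherence factors. The names and scoring below are illustrative, not the authors' implementation.

```python
# Sketch: two-stage rewriting. Stage 1 proposes theme-word
# substitutions per token; stage 2 rescores whole candidate
# stories with a combined coherence score.
from itertools import product
from typing import Callable

def rewrite(tokens: list[str],
            propose: Callable[[str], list[str]],
            score: Callable[[list[str]], float]) -> list[str]:
    # Stage 1: per-token candidates from the target theme
    # (fall back to the original token if nothing is proposed).
    options = [propose(t) or [t] for t in tokens]
    # Stage 2: score full rewrites and keep the best one.
    # (The real system searches rather than enumerating.)
    return list(max(product(*options), key=lambda c: score(list(c))))

# Toy usage with a hypothetical theme lexicon and placeholder scorer.
lexicon = {"train": ["starship"], "station": ["spaceport"]}
best = rewrite("a train leaves the station".split(),
               propose=lambda t: lexicon.get(t, []),
               score=lambda c: -sum(len(w) for w in c))  # placeholder
print(" ".join(best))  # a starship leaves the spaceport
```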
Polyglot Semantic Parsing in APIs
Traditional approaches to semantic parsing (SP) work by training individual
models for each available parallel dataset of text-meaning pairs. In this
paper, we explore the idea of polyglot semantic translation, or learning
semantic parsing models that are trained on multiple datasets and natural
languages. In particular, we focus on translating text to code signature
representations using the software component datasets of Richardson and Kuhn
(2017a,b). The advantage of such models is that they can be used for parsing a
wide variety of input natural languages and output programming languages, or
mixed input languages, using a single unified model. To facilitate modeling of
this type, we develop a novel graph-based decoding framework that achieves
state-of-the-art performance on the above datasets, and apply this method to
two other benchmark SP tasks.
Comment: Accepted for NAACL 2018 (camera-ready version)
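A minimal sketch of decoding an output as a best-scoring path through a graph, which is the general idea behind graph-based decoders; the graph construction and edge scores below are illustrative placeholders, not the framework from the paper.

```python
# Sketch: best-path decoding over a DAG whose edges carry output
# tokens (e.g., pieces of a code signature) and model scores.
# Dynamic programming in topological order finds the best path.
def topo_order(edges, start):
    seen, order = set(), []
    def visit(u):
        if u in seen:
            return
        seen.add(u)
        for v, _, _ in edges.get(u, []):
            visit(v)
        order.append(u)
    visit(start)
    return reversed(order)

def best_path(edges, start, goal):
    """edges: {node: [(next_node, token, score), ...]}"""
    best = {start: (0.0, [])}          # node -> (score, tokens)
    for u in topo_order(edges, start):
        if u not in best:
            continue
        s, toks = best[u]
        for v, tok, w in edges.get(u, []):
            cand = (s + w, toks + [tok])
            if v not in best or cand[0] > best[v][0]:
                best[v] = cand
    return best.get(goal)

# Toy graph: decode "max of two ints" into a signature path.
g = {0: [(1, "int", 0.9)],
     1: [(2, "max", 0.8), (2, "min", 0.1)],
     2: [(3, "(int, int)", 0.7)]}
print(best_path(g, 0, 3))  # (2.4, ['int', 'max', '(int, int)'])
```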
Discrete logarithm computations over finite fields using Reed-Solomon codes
Cheng and Wan have related the decoding of Reed-Solomon codes to the
computation of discrete logarithms over finite fields, with the aim of proving
the hardness of their decoding. In this work, we experiment with solving the
discrete logarithm over GF(q^h) using Reed-Solomon decoding. For fixed h and q
going to infinity, we introduce an algorithm (RSDL) needing O(h! q^2)
operations over GF(q), operating on a q × q matrix with (h+2)q non-zero
coefficients. We give faster variants including an incremental version and
another one that uses auxiliary finite fields that need not be subfields of
GF(q^h); this variant is very practical for moderate values of q and h. We
include some numerical results of our first implementations.
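As a worked instance of the stated cost, with illustrative parameters q = 64 and h = 3 (values chosen here for concreteness, not taken from the paper):

```latex
\[
  h!\,q^2 \;=\; 3! \cdot 64^2 \;=\; 6 \cdot 4096 \;=\; 24576
  \quad \text{operations over } \mathrm{GF}(64),
\]
\[
  \text{on a } q \times q \text{ matrix with }
  (h+2)\,q \;=\; 5 \cdot 64 \;=\; 320 \text{ non-zero coefficients.}
\]
```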
Non-Autoregressive Math Word Problem Solver with Unified Tree Structure
Existing MWP solvers employ a sequence or binary tree to represent the
solution expression and decode it from the given problem description. However,
such structures fail to handle the variants that can be derived via
mathematical manipulation, e.g., $a_1 \times (a_2 + a_3)$ and
$a_1 \times a_2 + a_1 \times a_3$ can both be valid solutions for the same
problem yet are formulated as different expression sequences or trees. The
multiple solution variants, depicting different possible solving procedures
for the same input problem, raise two issues: 1) making it hard for the model
to learn the mapping function between the input and output spaces effectively,
and 2) causing a valid expression variant to be wrongly judged incorrect
during evaluation. To address these
issues, we introduce a unified tree structure to represent a solution
expression, whose elements are permutable and identical across all expression
variants. We propose a novel non-autoregressive solver, named
\textit{MWP-NAS}, to parse the problem and deduce the solution expression
based on the unified tree. To evaluate the possible expression variants, we
design a path-based metric that measures the partial accuracy of expressions
against the unified tree. The
results from extensive experiments conducted on Math23K and MAWPS demonstrate
the effectiveness of our proposed MWP-NAS. The codes and checkpoints are
available at: \url{https://github.com/mengqunhan/MWP-NAS}.
Comment: Accepted at EMNLP 2023
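A minimal sketch of why a unified tree can make expression variants identical: operands of commutative operators are treated as unordered, so a canonical form coincides for all variants. This normalization illustrates the idea only; it is not the authors' MWP-NAS implementation.

```python
# Sketch: canonicalize an expression tree so that variants that
# differ only in operand order of commutative ops ('+', '*')
# map to the same structure.
def canon(node):
    if isinstance(node, str):      # leaf: an operand token
        return node
    op, children = node            # internal node: (operator, [subtrees])
    kids = [canon(c) for c in children]
    if op in ("+", "*"):           # commutative: sort operands
        kids.sort(key=repr)
    return (op, tuple(kids))

# "a1 + a2" and "a2 + a1" become the same unified tree.
t1 = ("+", ["a1", "a2"])
t2 = ("+", ["a2", "a1"])
assert canon(t1) == canon(t2)
print(canon(t1))  # ('+', ('a1', 'a2'))
```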
On joint detection and decoding of linear block codes on Gaussian vector channels
Optimal receivers recovering signals transmitted across noisy communication channels employ a maximum-likelihood (ML) criterion to minimize the probability of error. The problem of finding the most likely transmitted symbol is often equivalent to finding the closest lattice point to a given point and is known to be NP-hard. In systems that employ error-correcting coding for data protection, the symbol space forms a sparse lattice, where the sparsity structure is determined by the code. In such systems, ML data recovery may be geometrically interpreted as a search for the closest point in the sparse lattice. In this paper, motivated by the idea of the "sphere decoding" algorithm of Fincke and Pohst, we propose an algorithm that finds the closest point in the sparse lattice to the given vector. This given vector is not arbitrary, but rather is an unknown sparse lattice point that has been perturbed by an additive noise vector whose statistical properties are known. The complexity of the proposed algorithm is thus a random variable. We study its expected value, averaged over the noise and over the lattice. For binary linear block codes, we find the expected complexity in closed form. Simulation results indicate significant performance gains over systems employing separate detection and decoding; these gains are obtained at a complexity that is practically feasible over a wide range of system parameters.
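A minimal sketch of Fincke-Pohst style sphere decoding for the unstructured case (no code constraint): after QR-decomposing the channel matrix, symbols are enumerated from the last coordinate upward, pruning branches whose partial distance already exceeds the best sphere found so far. The channel, alphabet, and noise level below are illustrative; the paper's algorithm additionally restricts the search to the sparse lattice defined by the code.

```python
# Sketch: sphere decoding for y = H @ x + noise, x in {-1, +1}^n.
# Since Q is orthogonal, ||y - H x|| = ||Q.T y - R x||, so the
# upper-triangular R lets us accumulate the distance level by level.
import numpy as np

def sphere_decode(H, y, alphabet=(-1.0, 1.0)):
    """Find x in alphabet^n minimizing ||y - H x||."""
    Q, R = np.linalg.qr(H)
    z = Q.T @ y
    n = H.shape[1]
    best = {"x": None, "d2": np.inf}

    def search(k, x, d2):
        if d2 >= best["d2"]:
            return                       # prune: branch leaves the sphere
        if k < 0:
            best["x"], best["d2"] = x.copy(), d2
            return
        for s in alphabet:
            x[k] = s
            r = z[k] - R[k, k:] @ x[k:]  # level-k residual
            search(k - 1, x, d2 + r * r)

    search(n - 1, np.zeros(n), 0.0)
    return best["x"], float(np.sqrt(best["d2"]))

# Toy usage with a random channel and a known transmitted vector.
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 4))
x_true = np.array([1.0, -1.0, -1.0, 1.0])
y = H @ x_true + 0.05 * rng.normal(size=4)
x_hat, dist = sphere_decode(H, y)
print(x_hat, dist)
```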
Self-consistent Reasoning For Solving Math Word Problems
Math word problem (MWP) solving is a task that automatically derives a
solution expression from a given math problem in text. Previous studies suffer
from spurious correlations between input text and output expression. To
mitigate this issue, we propose a self-consistent reasoning framework called
SCR, which adopts a pruning strategy to correct the output distribution shift
and thereby implicitly fix spuriously correlated samples.
Specifically, we first obtain a sub-network by pruning a roberta2tree model,
and use the gap in output distribution between the original roberta2tree
model and the pruned sub-network to expose spuriously correlated
samples. Then, we calibrate the output distribution shift by applying a symmetric
Kullback-Leibler divergence to alleviate spurious correlations. In addition,
SCR generates equivalent expressions, thereby capturing the logic of the
original text rather than relying on surface hints from it. Extensive experiments on
two large-scale benchmarks demonstrate that our model substantially outperforms
the strong baseline methods.
Comment: Submitted to IEEE ICASSP 2023
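A minimal sketch of the symmetric Kullback-Leibler term used to pull two output distributions together, here between a full model and its pruned sub-network; the random tensors stand in for real network outputs and this is not the SCR training code.

```python
# Sketch: symmetric KL divergence between the output distributions
# of an original model and its pruned sub-network, usable as a
# calibration loss term.
import torch
import torch.nn.functional as F

def symmetric_kl(logits_full: torch.Tensor,
                 logits_pruned: torch.Tensor) -> torch.Tensor:
    p = F.log_softmax(logits_full, dim=-1)
    q = F.log_softmax(logits_pruned, dim=-1)
    # F.kl_div(input, target, log_target=True) computes KL(target || input).
    kl_pq = F.kl_div(q, p, log_target=True, reduction="batchmean")
    kl_qp = F.kl_div(p, q, log_target=True, reduction="batchmean")
    return 0.5 * (kl_pq + kl_qp)

# Toy usage on random logits standing in for the two networks.
full = torch.randn(8, 20)             # e.g., 8 steps, 20 ops/operands
pruned = full + 0.1 * torch.randn(8, 20)
print(symmetric_kl(full, pruned))
```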