Adaptive Probability Theory: Human Biases as an Adaptation
Humans make mistakes in decision-making and probability judgments. While the heuristics used for decision-making have been explained as adaptations that are both efficient and fast, the reasons why people deal with probabilities using the reported biases have not been clear. We will see that some of these biases can be understood as heuristics developed to explain a complex world when little information is available. That is, they approximate Bayesian inferences for situations more complex than the ones in laboratory experiments and in this sense might have appeared as an adaptation to those situations. When ideas such as uncertainty and limited sample sizes are included in the problem, the correct probabilities shift to values close to the observed behavior. These ideas will be used to explain the observed weight functions and the violations of coalescing and stochastic dominance reported in the literature.
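As a rough illustration of how limited sample sizes can move a "correct" probability away from the stated one, the sketch below computes the mean of a Beta posterior. The uniform Beta(1, 1) prior, the function name `posterior_mean`, and the sample size of 10 are assumptions chosen for illustration; they are not taken from the paper, whose model may differ.

```python
from fractions import Fraction


def posterior_mean(successes: int, trials: int, alpha: int = 1, beta: int = 1) -> Fraction:
    """Mean of the Beta posterior after `successes` out of `trials`.

    Assumes a Beta(alpha, beta) prior; alpha = beta = 1 is the uniform prior
    (an illustrative choice, not necessarily the one used in the paper).
    """
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + alpha, trials + alpha + beta)


if __name__ == "__main__":
    # With only 10 observations, extreme frequencies are pulled toward 1/2.
    for k in (0, 5, 10):
        print(f"observed {k}/10 -> inferred {posterior_mean(k, 10)}")
```

Under these assumptions an event never observed in 10 trials is still assigned a nonzero probability, and an event always observed is assigned less than 1, which is qualitatively the direction of the overweighting of small probabilities and underweighting of large ones.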
Parsing as Reduction
We reduce phrase-representation parsing to dependency parsing. Our reduction
is grounded on a new intermediate representation, "head-ordered dependency
trees", shown to be isomorphic to constituent trees. By encoding order
information in the dependency labels, we show that any off-the-shelf, trainable
dependency parser can be used to produce constituents. When this parser is
non-projective, we can perform discontinuous parsing in a very natural manner.
Despite the simplicity of our approach, experiments show that the resulting
parsers are on par with strong baselines, such as the Berkeley parser for
English and the best single system in the SPMRL-2014 shared task. Results are
particularly striking for discontinuous parsing of German, where we surpass the
current state of the art by a wide margin.
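To make the idea of encoding order information in dependency labels concrete, here is a minimal sketch for the projective case. The label format `X#j` (constituent label `X`, attachment order `j`), the function name `to_constituents`, and the example sentence are assumptions for illustration; the paper's actual encoding and its handling of unary nodes and discontinuities may differ.

```python
from collections import defaultdict


def to_constituents(words, heads, labels):
    """Rebuild a bracketed constituent tree from a dependency tree.

    heads[i] is the index of token i's head (-1 for the root).
    labels[i] is a hypothetical "X#j" label: modifiers of the same head that
    share order j attach at the same level, forming a constituent labeled X.
    """
    mods = defaultdict(lambda: defaultdict(list))
    root = None
    for i, h in enumerate(heads):
        if h < 0:
            root = i
        else:
            cat, order = labels[i].rsplit("#", 1)
            mods[h][int(order)].append((i, cat))

    def build(h):
        # Returns (leftmost token index, bracketed string) for head h.
        left, text = h, words[h]
        for order in sorted(mods[h]):
            group = mods[h][order]
            cat = group[0][1]
            parts = [build(i) for i, _ in group] + [(left, text)]
            parts.sort()  # restore surface order by leftmost token
            left = parts[0][0]
            text = "(" + cat + " " + " ".join(t for _, t in parts) + ")"
        return left, text

    return build(root)[1]


if __name__ == "__main__":
    words = ["The", "dog", "barks", "loudly"]
    heads = [1, 2, -1, 2]
    labels = ["NP#1", "S#2", None, "VP#1"]
    print(to_constituents(words, heads, labels))
```

In this toy example "loudly" attaches to "barks" first (order 1, forming a VP) and "dog" attaches second (order 2, forming the S), so the attachment order alone determines the nesting.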
Selective Attention for Context-aware Neural Machine Translation
Despite the progress made in sentence-level NMT, current systems still fall
short of achieving fluent, good-quality translation for a full document. Recent
works in context-aware NMT consider only a few previous sentences as context
and may not scale to entire documents. To this end, we propose a novel and
scalable top-down approach to hierarchical attention for context-aware NMT
which uses sparse attention to selectively focus on relevant sentences in the
document context and then attends to key words in those sentences. We also
propose single-level attention approaches based on sentence or word-level
information in the context. The document-level context representation, produced
from these attention modules, is integrated into the encoder or decoder of the
Transformer model depending on whether we use monolingual or bilingual context.
Our experiments and evaluation on English-German datasets in different document
MT settings show that our selective attention approach not only significantly
outperforms context-agnostic baselines but also surpasses context-aware
baselines in most cases.
Comment: Accepted at NAACL-HLT 201
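The top-down idea described in this abstract, first selecting sentences sparsely and then attending to words within them, can be sketched roughly as follows. This is a simplified illustration, not the authors' model: the use of sparsemax as the sparse attention function, softmax at the word level, plain dot-product scores, and all function names are assumptions.

```python
import math


def sparsemax(scores):
    """Sparsemax projection: like softmax, but can assign exactly zero weight."""
    z_sorted = sorted(scores, reverse=True)
    running, tau = 0.0, 0.0
    for k, z in enumerate(z_sorted, start=1):
        running += z
        if 1 + k * z > running:
            tau = (running - 1) / k  # threshold for the current support size
        else:
            break
    return [max(z - tau, 0.0) for z in scores]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hierarchical_context(query, sentence_keys, word_keys, word_values):
    """Sentence-level sparse attention, then word-level attention.

    sentence_keys[s] is a vector for sentence s; word_keys[s][w] and
    word_values[s][w] are vectors for word w of sentence s.
    """
    sent_weights = sparsemax([dot(k, query) for k in sentence_keys])
    context = [0.0] * len(query)
    for s, weight in enumerate(sent_weights):
        if weight == 0.0:
            continue  # sentences pruned by sparse attention are skipped
        word_weights = softmax([dot(k, query) for k in word_keys[s]])
        for w, ww in enumerate(word_weights):
            for d, v in enumerate(word_values[s][w]):
                context[d] += weight * ww * v
    return context, sent_weights
```

Because sparsemax can return exact zeros, sentences judged irrelevant contribute nothing and their word-level attention need not be computed, which is one reason a sparse top level may help such an approach scale to longer documents.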