7,059 research outputs found
Optimal CDMA signatures: a finite-step approach
A description of optimal sequences for direct-sequence code division multiple access is a byproduct of recent characterizations of the sum capacity. The paper restates the sequence design problem as an inverse singular value problem and shows that it can be solved with finite-step algorithms from matrix analysis. Relevant algorithms are reviewed and a new one-sided construction is proposed that obtains the sequences directly instead of computing the Gram matrix of the optimal signatures
The Parametric Ordinal-Recursive Complexity of Post Embedding Problems
Post Embedding Problems are a family of decision problems based on the
interaction of a rational relation with the subword embedding ordering, and are
used in the literature to prove non multiply-recursive complexity lower bounds.
We refine the construction of Chambart and Schnoebelen (LICS 2008) and prove
parametric lower bounds depending on the size of the alphabet.Comment: 16 + vii page
Finite-step algorithms for constructing optimal CDMA signature sequences
A description of optimal sequences for direct-spread code-division multiple access (DS-CDMA) is a byproduct of recent characterizations of the sum capacity. This paper restates the sequence design problem as an inverse singular value problem and shows that the problem can be solved with finite-step algorithms from matrix theory. It proposes a new one-sided algorithm that is numerically stable and faster than previous methods
A Formal Model of Ambiguity and its Applications in Machine Translation
Systems that process natural language must cope with and resolve ambiguity. In this dissertation, a model of language processing is advocated in which multiple inputs and multiple analyses of inputs are considered concurrently and a single analysis is only a last resort. Compared to conventional models, this approach can be understood as replacing single-element inputs and outputs with weighted sets of inputs and outputs. Although processing components must deal with sets (rather than individual elements), constraints are imposed on the elements of these sets, and the representations from existing models may be reused. However, to deal efficiently with large (or infinite) sets, compact representations of sets that share structure between elements, such as weighted finite-state transducers and synchronous context-free grammars, are necessary. These representations and algorithms for manipulating them are discussed in depth in depth.
To establish the effectiveness and tractability of the proposed processing model, it is applied to several problems in machine translation. Starting with spoken language translation, it is shown that translating a set of transcription hypotheses yields better translations compared to a baseline in which a single (1-best) transcription hypothesis is selected and then translated, independent of the translation model formalism used. More subtle forms of ambiguity that arise even in text-only translation (such as decisions conventionally made during system development about how to preprocess text) are then discussed, and it is shown that the ambiguity-preserving paradigm can be employed in these cases as well, again leading to improved translation quality. A model for supervised learning that learns from training data where sets (rather than single elements) of correct labels are provided for each training instance and use it to learn a model of compound word segmentation is also introduced, which is used as a preprocessing step in machine translation
Macro Grammars and Holistic Triggering for Efficient Semantic Parsing
To learn a semantic parser from denotations, a learning algorithm must search
over a combinatorially large space of logical forms for ones consistent with
the annotated denotations. We propose a new online learning algorithm that
searches faster as training progresses. The two key ideas are using macro
grammars to cache the abstract patterns of useful logical forms found thus far,
and holistic triggering to efficiently retrieve the most relevant patterns
based on sentence similarity. On the WikiTableQuestions dataset, we first
expand the search space of an existing model to improve the state-of-the-art
accuracy from 38.7% to 42.7%, and then use macro grammars and holistic
triggering to achieve an 11x speedup and an accuracy of 43.7%.Comment: EMNLP 201
TuLiPA : towards a multi-formalism parsing environment for grammar engineering
In this paper, we present an open-source parsing environment (TĂĽbingen Linguistic Parsing Architecture, TuLiPA) which uses Range Concatenation Grammar (RCG) as a pivot formalism, thus opening the way to the parsing of several mildly context-sensitive formalisms. This environment currently supports tree-based grammars (namely Tree-Adjoining Grammars (TAG) and Multi-Component Tree-Adjoining Grammars with Tree Tuples (TT-MCTAG)) and allows computation not only of syntactic structures, but also of the corresponding semantic representations. It is used for the development of a tree-based grammar for German
TuLiPA : towards a multi-formalism parsing environment for grammar engineering
In this paper, we present an open-source parsing environment (TĂĽbingen Linguistic Parsing Architecture, TuLiPA) which uses Range Concatenation Grammar (RCG) as a pivot formalism, thus opening the way to the parsing of several mildly context-sensitive formalisms. This environment currently supports tree-based grammars (namely Tree-Adjoining Grammars (TAG) and Multi-Component Tree-Adjoining Grammars with Tree Tuples (TT-MCTAG)) and allows computation not only of syntactic structures, but also of the corresponding semantic representations. It is used for the development of a tree-based grammar for German
- …