Combining semantic and syntactic structure for language modeling
Structured language models for speech recognition have been shown to remedy
the weaknesses of n-gram models. All current structured language models are,
however, limited in that they do not take into account dependencies between
non-headwords. We show that non-headword dependencies contribute to
significantly improved word error rate, and that a data-oriented parsing model
trained on semantically and syntactically annotated data can exploit these
dependencies. This paper also contains the first DOP model trained by means of
a maximum likelihood reestimation procedure, which solves some of the
theoretical shortcomings of previous DOP models. Comment: 4 pages
Algebraicity of the Appell-Lauricella and Horn hypergeometric functions
We extend Schwarz' list of irreducible algebraic Gauss functions to the four
classes of Appell-Lauricella functions in several variables and the 14 complete
Horn functions in two variables. This gives an example of a family of functions
such that for any number of variables there are infinitely many algebraic
functions, namely the Lauricella functions. Comment: 24 pages, 6 tables, 2 figures
An improved parser for data-oriented lexical-functional analysis
We present an LFG-DOP parser which uses fragments from LFG-annotated
sentences to parse new sentences. Experiments with the Verbmobil and Homecentre
corpora show that (1) Viterbi n-best search performs about 100 times faster
than Monte Carlo search while both achieve the same accuracy; (2) the DOP
hypothesis which states that parse accuracy increases with increasing fragment
size is confirmed for LFG-DOP; (3) LFG-DOP's relative frequency estimator
performs worse than a discounted frequency estimator; and (4) LFG-DOP
significantly outperforms Tree-DOP if evaluated on tree structures only. Comment: 8 pages
A classification of the irreducible algebraic A-hypergeometric functions associated to planar point configurations
We consider A-hypergeometric functions associated to normal sets in the
plane. We give a classification of all point configurations for which there
exists a parameter vector such that the associated hypergeometric function is
algebraic. In particular, we show that there are no irreducible algebraic
functions if the number of boundary points is sufficiently large and A is not a
pyramid. Comment: 24 pages, 8 tables, 13 figures
Data-Oriented Language Processing. An Overview
During the last few years, a new approach to language processing has started
to emerge, which has become known under various labels such as "data-oriented
parsing", "corpus-based interpretation", and "tree-bank grammar" (cf. van den
Berg et al. 1994; Bod 1992-96; Bod et al. 1996a/b; Bonnema 1996; Charniak
1996a/b; Goodman 1996; Kaplan 1996; Rajman 1995a/b; Scha 1990-92; Sekine &
Grishman 1995; Sima'an et al. 1994; Sima'an 1995-96; Tugwell 1995). This
approach, which we will call "data-oriented processing" or "DOP", embodies the
assumption that human language perception and production work with
representations of concrete past language experiences, rather than with
abstract linguistic rules. The models that instantiate this approach therefore
maintain large corpora of linguistic representations of previously occurring
utterances. When processing a new input utterance, analyses of this utterance
are constructed by combining fragments from the corpus; the
occurrence-frequencies of the fragments are used to estimate which analysis is
the most probable one.
In this paper we give an in-depth discussion of a data-oriented processing
model which employs a corpus of labelled phrase-structure trees. Then we review
some other models that instantiate the DOP approach. Many of these models also
employ labelled phrase-structure trees, but use different criteria for
extracting fragments from the corpus or employ different disambiguation
strategies (Bod 1996b; Charniak 1996a/b; Goodman 1996; Rajman 1995a/b; Sekine &
Grishman 1995; Sima'an 1995-96); other models use richer formalisms for their
corpus annotations (van den Berg et al. 1994; Bod et al., 1996a/b; Bonnema
1996; Kaplan 1996; Tugwell 1995). Comment: 34 pages, Postscript
A Data-Oriented Model of Literary Language
We consider the task of predicting how literary a text is, with a gold
standard from human ratings. Aside from a standard bigram baseline, we apply
rich syntactic tree fragments, mined from the training set, and a series of
hand-picked features. Our model is the first to distinguish degrees of highly
and less literary novels using a variety of lexical and syntactic features, and
explains 76.0 % of the variation in literary ratings. Comment: To be published in EACL 2017, 11 pages
Reasoning processes involved in ICT-mediated design communication
Conversational interaction is central to architectural design practice. New information and communication technologies (ICT) change the designer’s traditional way of communicating and interacting. In this paper we investigate how communication in the design process might be supported using ICT. With this aim, we study a text-based Skype conversation between a design teacher and a design student. We consider this conversation as part of an architectural design process and analyse it using linkography. From the linkograph analysis, we identify features that apply specifically to text-based Skype interactions. We conclude that online text-based Skype interaction can be one of the many possible interactions by means of communication media (sketching, conversation, modelling, and so forth) during the design process, and provides a distinct set of characteristics that might be considered by the designer.