The syntactic processing of particles in Japanese spoken language
Particles fulfill several distinct central roles in the Japanese language.
They can mark arguments as well as adjuncts, can be functional or have semantic
functions. There is, however, no straightforward matching from particles to
functions, as, e.g., GA can mark the subject, the object or an adjunct of a
sentence. Particles can cooccur. Verbal arguments that could be identified by
particles can be eliminated in the Japanese sentence. And finally, in spoken
language particles are often omitted. A proper treatment of particles is thus
necessary to make an analysis of Japanese sentences possible. Our treatment is
based on an empirical investigation of 800 dialogues. We set up a type
hierarchy of particles motivated by their subcategorizational and
modificational behaviour. This type hierarchy is part of the Japanese syntax in
VERBMOBIL. Comment: 8 pages
Efficient deep processing of Japanese
We present a broad-coverage Japanese grammar written in the HPSG formalism with MRS semantics. The grammar is created for use in real-world applications, so robustness and performance issues play an important role. It is connected to a POS tagging and word segmentation tool. This grammar is being developed in a multilingual context, requiring MRS structures that are easily comparable across languages.
Head-initial constructions in Japanese
Japanese is often taken to be strictly head-final in its syntax. In our work on a broad-coverage, precision implemented HPSG for Japanese, we have found that while this is generally true, there are nonetheless a few minor exceptions to the broad trend. In this paper, we describe the grammar engineering project, present the exceptions we have found, and conclude that this kind of phenomenon both motivates the HPSG type-hierarchical approach, which allows for the statement of broad generalizations as well as exceptions to those generalizations, and shows the usefulness of grammar engineering as a means of testing linguistic hypotheses.
A preliminary bibliography on focus
[I]n its present form, the bibliography contains approximately 1100 entries. Bibliographical work is never complete, and the present one is still modest in a number of respects. It is not annotated, and it still contains a lot of mistakes and inconsistencies. It has nevertheless reached a stage which justifies considering the possibility of making it available to the public. The first step towards this is its pre-publication in the form of this working paper. […]
The bibliography is less complete for earlier years. For works before 1970, the bibliographies of Firbas and Golkova 1975 and Tyl 1970 may be consulted, which have not been included here.
A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena
Word reordering is one of the most difficult aspects of statistical machine
translation (SMT), and an important factor of its quality and efficiency.
Despite the vast amount of research published to date, the interest of the
community in this problem has not decreased, and no single method appears to be
strongly dominant across language pairs. Instead, the choice of the optimal
approach for a new translation task still seems to be mostly driven by
empirical trials. To orientate the reader in this vast and complex research
area, we present a comprehensive survey of word reordering viewed as a
statistical modeling challenge and as a natural language phenomenon. The survey
describes in detail how word reordering is modeled within different
string-based and tree-based SMT frameworks and as a stand-alone task, including
systematic overviews of the literature in advanced reordering modeling. We then
question why some approaches are more successful than others in different
language pairs. We argue that, besides measuring the amount of reordering, it
is important to understand which kinds of reordering occur in a given language
pair. To this end, we conduct a qualitative analysis of word reordering
phenomena in a diverse sample of language pairs, based on a large collection of
linguistic knowledge. Empirical results in the SMT literature are shown to
support the hypothesis that a few linguistic facts can be very useful to
anticipate the reordering characteristics of a language pair and to select the
SMT framework that best suits them. Comment: 44 pages, to appear in Computational Linguistics
Modal Markers in Japanese: A Study of Learners’ Use before and after Study Abroad
Japanese discourse requires speakers to index, in a relatively explicit manner, their stance toward the propositional information as well as the hearer. This is done, among other things, by means of a grammaticalized set of modal markers. Although previous research suggests that the use of modal expressions by second language learners differs from that of native users, little is known about “typical” native or non-native behavior. This study aims (a) to delineate native and non-native usage by a quantitative examination of a broad range of Japanese modal categories, and qualitative analyses of a subset of potentially problematic categories among them, and (b) to identify possible developmental trajectories, by means of a longitudinal observation of learners’ verbal production before and after study abroad in Japan. We find that modal categories realized by non-transparent or non-salient markers (e.g., explanatory modality no da, or utterance modality sentence-final particles) pose particular challenges in spite of their relatively high availability in the input, and we discuss this finding in terms of processing constraints that arguably affect learners’ acquisition of the grammaticalized modal markers.