Why swimming is just as difficult as dying for Japanese learners of English
While both Japanese and English have a grammatical form denoting the progressive, the two forms (te-iru & be+ing) interact differently with the inherent semantics of the verb to which they attach (Kindaichi, 1950; McClure, 1995; Shirai, 2000). Japanese change of state verbs are incompatible with a progressive interpretation, allowing only a resultative interpretation of V+te-iru, while a progressive interpretation is preferred for activity predicates. English be+ing denotes a progressive interpretation regardless of the lexical semantics of the verb. The question that arises is how we can account for the fact that change of state verbs like dying can denote a progressive interpretation in English, but not in Japanese. While researchers such as Kageyama (1996) and Ogihara (1998, 1999) propose that the difference lies in the lexical semantics of the verbs themselves, others such as McClure (1995) have argued that the difference lies in the semantics of the grammatical forms, be+ing and te-iru. We present results from an experimental study of Japanese learners' interpretation of the English progressive which provide support for McClure's proposal. Results indicate that independent of verb type, learners had significantly more difficulty with the past progressive. We argue that knowledge of L2 semantics-syntax correspondences proceeds not on the basis of L1 lexical semantic knowledge, but on the basis of grammatical forms.
Dependency parsing of Turkish
The suitability of different parsing methods for different languages is an important topic in
syntactic parsing. Especially lesser-studied languages, typologically different from the languages
for which the methods were originally developed, pose interesting challenges in this respect.
This article presents an investigation of data-driven dependency parsing of Turkish, an agglutinative
free constituent order language that can be seen as a representative of a wider class
of languages of similar type. Our investigations show that morphological structure plays an
essential role in finding syntactic relations in such a language. In particular, we show that
employing sublexical representations called inflectional groups, rather than word forms, as the
basic parsing units improves parsing accuracy. We compare two different parsing methods, one
based on a probabilistic model with beam search, the other based on discriminative classifiers and
a deterministic parsing strategy, and show that the usefulness of sublexical units holds regardless
of parsing method. We examine the impact of morphological and lexical information in detail and
show that, properly used, this kind of information can improve parsing accuracy substantially.
Applying the techniques presented in this article, we achieve the highest reported accuracy for
parsing the Turkish Treebank.
Gathering Statistics to Aspectually Classify Sentences with a Genetic Algorithm
This paper presents a method for large corpus analysis to semantically
classify an entire clause. In particular, we use cooccurrence statistics among
similar clauses to determine the aspectual class of an input clause. The
process examines linguistic features of clauses that are relevant to aspectual
classification. A genetic algorithm determines what combinations of linguistic
features to use for this task.
Comment: postscript, 9 pages, Proceedings of the Second International
Conference on New Methods in Language Processing, Oflazer and Somers (eds.)
Learning Sentence-internal Temporal Relations
In this paper we propose a data intensive approach for inferring
sentence-internal temporal relations. Temporal inference is relevant for
practical NLP applications which either extract or synthesize temporal
information (e.g., summarisation, question answering). Our method bypasses the
need for manual coding by exploiting the presence of markers like "after", which
overtly signal a temporal relation. We first show that models trained on main
and subordinate clauses connected with a temporal marker achieve good
performance on a pseudo-disambiguation task simulating temporal inference
(during testing the temporal marker is treated as unseen and the models must
select the right marker from a set of possible candidates). Secondly, we assess
whether the proposed approach holds promise for the semi-automatic creation of
temporal annotations. Specifically, we use a model trained on noisy and
approximate data (i.e., main and subordinate clauses) to predict
intra-sentential relations present in TimeBank, a corpus annotated with rich
temporal information. Our experiments compare and contrast several
probabilistic models differing in their feature space, linguistic assumptions
and data requirements. We evaluate performance against gold standard corpora
and also against human subjects.
- …