
    Learning Sentence-internal Temporal Relations

    In this paper we propose a data intensive approach for inferring sentence-internal temporal relations. Temporal inference is relevant for practical NLP applications which either extract or synthesize temporal information (e.g., summarisation, question answering). Our method bypasses the need for manual coding by exploiting the presence of markers like "after", which overtly signal a temporal relation. We first show that models trained on main and subordinate clauses connected with a temporal marker achieve good performance on a pseudo-disambiguation task simulating temporal inference (during testing the temporal marker is treated as unseen and the models must select the right marker from a set of possible candidates). Secondly, we assess whether the proposed approach holds promise for the semi-automatic creation of temporal annotations. Specifically, we use a model trained on noisy and approximate data (i.e., main and subordinate clauses) to predict intra-sentential relations present in TimeBank, a corpus annotated with rich temporal information. Our experiments compare and contrast several probabilistic models differing in their feature space, linguistic assumptions and data requirements. We evaluate performance against gold standard corpora and also against human subjects.

    Lice in the fur of our language? German irrelevance particles between Dutch and English

    The present paper compares the distribution of English ‑ever, German immer and/or auch, and Dutch (dan) ook in universal concessive-conditional and nonspecific free relative subordinate clauses (e.g. G. Was auch immer du willst ‘Whatever you want’) and in their elliptically reduced versions (e.g. D. … of wat dan ook ‘… or whatever’). By combining large language-specific corpora such as the DeReKo, SoNaR, and BYU corpora with the smaller multilingual ConverGENTiecorpus, 38,748 instances were obtained while maintaining comparability. Whereas present-day English has only one option in both clausal and elliptical constructions, viz. WH-ever, Dutch and German show more variation: in Dutch, discontinuous W … ook is by far the most frequent option in subordinate clauses, while the complex particle dan ook is largely confined to elliptical constructions. In German subordinate clauses, immer in adjacency to the W-word is the most frequent option, thus corresponding to English WH-ever, but in elliptical constructions auch immer predominates, thus corresponding to Dutch dan ook.

    Memory-Based Shallow Parsing

    We present memory-based learning approaches to shallow parsing and apply these to five tasks: base noun phrase identification, arbitrary base phrase recognition, clause detection, noun phrase parsing and full parsing. We use feature selection techniques and system combination methods for improving the performance of the memory-based learner. Our approach is evaluated on standard data sets and the results are compared with those of other systems. This reveals that our approach works well for base phrase identification while its application towards recognizing embedded structures leaves some room for improvement.

    Emergence phenomena in German W-immer/auch-subordinators

    The present study is concerned with the distributional patterns of the irrelevance particles immer ‘ever’ and auch ‘also’ in German universal concessive conditionals and free relatives (e.g. was immer er auch sagt ‘whatever he says’). Whereas irrelevance is conveyed by a single element in a fixed position in languages like English (-ever), immer and auch occur in multiple positions and combinations. Following the example of Leuschner (2000), the distribution of particles and their combinations is documented and explained using functional motivations. Compared with Leuschner (2000), however, the present study is based on a much larger sample of 23,299 clauses with the W-words was and wer (incl. their inflected forms) from the DeReKo-corpus, allowing for a far more detailed statistical analysis. Special attention is devoted to the distribution of immer and auch (including their combinations) in full subordinate clauses vs. elliptically reduced forms, and to the nature of the resulting patterns as a case of emergent grammar

    Contrastive corpus analysis of cross-linguistic asymmetries in concessive conditionals


    Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation

    Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD) despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD using inductive logic programming to learn theories from first-order logic representations that allows corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. It is important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources.

    Relevance verbs in English and French: synonymy and its structural properties

    This study deals with a particular group of predicates called "predicates/verbs of relevance" or "predicates/verbs of indifference" in the literature. Its purpose is to investigate to what extent verbs of this particular group present common structural properties. It therefore seeks to establish the structural manifestations of synonymy. These structural manifestations are not to be found in argument-function mapping à la Levin (1993), but rather in polarity, decategorialization and sentence structure. Corpus data reveal that syntax, semantics and pragmatics interact in particular ways in the field of relevance. This interaction appears to be grounded in pragmatic constraints arising from the principle of relevance (Sperber and Wilson 1986). The basic idea is that, as relevance is presupposed in human communication and need not be expressed, verbs of relevance are more likely to be used with negative than with positive polarity. Used with positive polarity, they tend to occur in sentence forms that present them as strongly presupposed. Used with negative polarity, they are more likely to occur in the focal area of the sentence. As statements about relevance express speakers' points of view, relevance verbs are also markers of intersubjectivity and are therefore subject to grammaticalization phenomena, such as the omission of prepositions.

    Same time, across time: simultaneity clauses from Late Modern to Present-day English

    In this paper we offer a diachronic analysis of the simultaneity subordinator "as" against the background of the simultaneity subordinators while, whilst and when from 1650 to the end of the 20th century. The present survey makes use of data extracted from the British English component of ARCHER (version 3.1), focusing in particular on fiction, the register par excellence for the use of simultaneity subordinators. We analyse our data according to a selection of parameters (ordering, verb type, duration, tense and aspect, subject identity, simultaneity type) and show that, against a background of relative stability, the major change is a dramatic increase in the frequency of simultaneity as-clauses from the first half of the 19th century onwards. Adapting the historical work on stylistic change by Biber and Finegan (1989, 1997), as well as theoretical and experimental accounts of the semantics of English simultaneity markers, we highlight an interesting parallelism between the spread of as-clauses in oral narrative from childhood to adulthood and the spread of as-clauses in modern fiction. In either case, the spread of "as" may be symptomatic of an evolution in narrative techniques, particularly in respect of the means by which complex events are typically represented.

    Combination Strategies for Semantic Role Labeling

    This paper introduces and analyzes a battery of inference models for the problem of semantic role labeling: one based on constraint satisfaction, and several strategies that model the inference as a meta-learning problem using discriminative classifiers. These classifiers are developed with a rich set of novel features that encode proposition and sentence-level information. To our knowledge, this is the first work that: (a) performs a thorough analysis of learning-based inference models for semantic role labeling, and (b) compares several inference strategies in this context. We evaluate the proposed inference strategies in the framework of the CoNLL-2005 shared task using only automatically-generated syntactic information. The extensive experimental evaluation and analysis indicate that all the proposed inference strategies are successful (they all outperform the current best results reported in the CoNLL-2005 evaluation exercise), but each of the proposed approaches has its advantages and disadvantages. Several important traits of a state-of-the-art SRL combination strategy emerge from this analysis: (i) individual models should be combined at the granularity of candidate arguments rather than at the granularity of complete solutions; (ii) the best combination strategy uses an inference model based on learning; and (iii) the learning-based inference benefits from max-margin classifiers and global feedback.

    A usage based approach into the acquisition of relative clauses

    Previous research has shown that cross-linguistically relative clauses are acquired late and are considered a signal of linguistic complexity. This study adopts a usage-based account of relative clause acquisition in Turkish. A corpus based on three databases including 170 recordings of naturalistic mother-child interaction was analysed. The ages of the children in these three databases are 02;00-03;06, 01;00-02;04 and 00;09-02;09, respectively. The analyses revealed that the use of relative clauses in both the children’s productions and in child-directed speech was extremely scarce. Though previous research underlined the linguistic complexity of relative clauses as a reason for late acquisition, the results of this study point out that scarcity of input should also be regarded as a powerful predictor. The study underlines the availability of other constructions that are functionally parallel to relative clauses. The findings suggest that such structures, which are syntactically and morphologically less complex than relative clauses, are common in both child-directed speech and in children’s productions.