707 research outputs found

    Mining Large-scale Event Knowledge from Web Text

    This paper addresses the automatic acquisition of semantic relations between events. Previous work on acquiring semantic relations relied on annotated text corpora, and it remains unclear how to develop more generic methods for identifying related event pairs and extracting event arguments (especially the predicate, subject and object). Motivated by this limitation, we develop a three-phase approach that acquires causality from Web text. First, we use explicit connective markers (such as “because”) as linguistic cues to discover causally related events. Next, we extract the event arguments from local dependency parse trees of the event expressions. Finally, we propose a statistical model to measure potential causal relations. Empirical evaluation on a large-scale Web text corpus shows that (a) using local dependency trees substantially improves both the accuracy and recall of event-argument extraction, and (b) our measure improves on the traditional PMI method.
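    As a rough illustration of the first phase and of the PMI baseline named in the abstract, the sketch below splits a sentence on an explicit causal connective and scores (cause, effect) pairs by pointwise mutual information. It is a minimal sketch under stated assumptions, not the paper's method: the function names `split_on_marker` and `pmi_scores`, the one-item marker list, and the treatment of events as plain strings are all hypothetical, and the dependency-parse argument extraction and the authors' own statistical measure are not reproduced.

```python
import math
import re
from collections import Counter

# Hypothetical marker list; the abstract gives only "because" as an example.
CAUSAL_MARKERS = ("because",)


def split_on_marker(sentence, markers=CAUSAL_MARKERS):
    """Return (effect, cause) text spans for '<effect> because <cause>',
    or None when no marker is found in that position."""
    for marker in markers:
        pattern = r"^(.+?)\s+" + re.escape(marker) + r"\s+(.+?)[.!?]?$"
        match = re.match(pattern, sentence.strip(), flags=re.IGNORECASE)
        if match:
            return match.group(1).strip(), match.group(2).strip()
    return None


def pmi_scores(pairs):
    """Pointwise mutual information of each (cause, effect) pair,
    estimated from raw counts: log2(P(c, e) / (P(c) * P(e)))."""
    pair_counts = Counter(pairs)
    cause_counts = Counter(c for c, _ in pairs)
    effect_counts = Counter(e for _, e in pairs)
    total = len(pairs)
    return {
        (c, e): math.log2(n * total / (cause_counts[c] * effect_counts[e]))
        for (c, e), n in pair_counts.items()
    }


# Toy data: events are already reduced to single labels (an assumption).
toy_pairs = [
    ("rain", "cancel"),
    ("rain", "cancel"),
    ("rain", "flood"),
    ("sun", "picnic"),
]
scores = pmi_scores(toy_pairs)
```

    In this toy data, a pair whose cause and effect occur only with each other scores higher than a pair whose cause also appears with other effects, which is the behaviour a PMI baseline is expected to show.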

    Using Tree Kernels for Classifying Temporal Relations between Events

    PACLIC 23 / City University of Hong Kong / 3-5 December 200

    CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania

    CLIFF is the Computational Linguists' Feedback Forum. We are a group of students and faculty who gather once a week to hear a presentation and discuss work currently in progress. The 'feedback' in the group's name is important: we are interested in sharing ideas, in discussing ongoing research, and in bringing together work done by the students and faculty in Computer Science and other departments. However, there are only so many presentations which we can have in a year. We felt that it would be beneficial to have a report which would have, in one place, short descriptions of the work in Natural Language Processing at the University of Pennsylvania. This report, then, is a collection of abstracts from both faculty and graduate students, in Computer Science, Psychology and Linguistics. We want to stress the close ties between these groups, as one of the things that we pride ourselves on here at Penn is the communication among different departments and the inter-departmental work. Rather than try to summarize the varied work currently underway at Penn, we suggest reading the abstracts to see how the students and faculty themselves describe their work. The report illustrates the diversity of interests among the researchers here, as well as explaining the areas of common interest. In addition, since it was our intent to put together a document that would be useful both inside and outside of the university, we hope that this report will explain to everyone some of what we are about.

    Human Simulations of Vocabulary Learning

    The work reported here experimentally investigates a striking generalization about vocabulary acquisition: Noun learning is superior to verb learning in the earliest moments of child language development. The dominant explanation of this phenomenon in the literature invokes differing conceptual requirements for items in these lexical categories: Verbs are cognitively more complex than nouns and so their acquisition must await certain mental developments in the infant. In the present work, we investigate an alternative hypothesis; namely, that it is the information requirements of verb learning, not the conceptual requirements, that crucially determine the acquisition order. Efficient verb learning requires access to structural features of the exposure language and thus cannot take place until a scaffolding of noun knowledge enables the acquisition of clause-level syntax. More generally, we experimentally investigate the hypothesis that vocabulary acquisition takes place via an incremental constraint-satisfaction procedure that bootstraps itself into successively more sophisticated linguistic representations which, in turn, enable new kinds of vocabulary learning. If the experimental subjects were young children, it would be difficult to distinguish between this information-centered hypothesis and the conceptual change hypothesis. Therefore the experimental learners are adults. The items to be “acquired” in the experiments were the 24 most frequent nouns and 24 most frequent verbs from a sample of maternal speech to 18-24-month old infants. The various experiments ask about the kinds of information that will support identification of these words as they occur in mother-to-child discourse. In Experiment 1, subjects were required to identify the words from observing several extralinguistic contexts for their use (silent videos in which mothers are seen uttering the “mystery word” several times to the infants, with each such use cued by a beep or a nonsense word). 
    The findings under these conditions mimicked the known learning trajectory for infants at the inception of speech and comprehension: Nouns are learned far more efficiently than verbs. Experiment 2 showed that the Experiment 1 results are best understood as concreteness differences that are correlated with lexical class membership in the common usage of mothers to young children. Experiment 3 presented (different) subject groups with 24 verbs under varying information Conditions; namely: (1) extralinguistic information; (2) noun-co-occurrence information; (3) both (1) and (2); (4) syntactic-frame information but with nouns and verbs represented by nonsense words; (5) both (2) and (4); (6) both (1) and (5). Each Condition led to greater identification success than the preceding Condition. Moreover, not only the number but the type of verb that was efficiently learned was different under the different information conditions. We discuss these results as consistent with the incremental construction of a highly lexicalized grammar by cognitively and pragmatically sophisticated human infants, but inconsistent with a procedure in which lexical acquisition is independent of and antecedent to syntax acquisition.

    Can human association norm evaluate latent semantic analysis?

    This paper compares a word association norm created in a psycholinguistic experiment with association lists generated by algorithms operating on text corpora. We compare lists generated by the Church and Hanks algorithm with lists generated by the LSA algorithm. We present an argument about how these automatically generated lists reflect real semantic relations.

    Research in the Language, Information and Computation Laboratory of the University of Pennsylvania

    This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However, the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania. It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as: Combinatorial Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition. Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it’s easier than ever to do so: this document is accessible on the “information superhighway”. Just call up http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html. In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authors’ abstracts in the web version of this report. The abstracts describe the researchers’ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn.

    Proceedings of the Seventh International Conference Formal Approaches to South Slavic and Balkan languages

    Proceedings of the Seventh International Conference Formal Approaches to South Slavic and Balkan Languages publishes 17 papers that were presented at the conference organised in Dubrovnik, Croatia, 4-6 October 2010.