42 research outputs found

    CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania

    Get PDF
    CLIFF is the Computational Linguists\u27 Feedback Forum. We are a group of students and faculty who gather once a week to hear a presentation and discuss work currently in progress. The \u27feedback\u27 in the group\u27s name is important: we are interested in sharing ideas, in discussing ongoing research, and in bringing together work done by the students and faculty in Computer Science and other departments. However, there are only so many presentations which we can have in a year. We felt that it would be beneficial to have a report which would have, in one place, short descriptions of the work in Natural Language Processing at the University of Pennsylvania. This report then, is a collection of abstracts from both faculty and graduate students, in Computer Science, Psychology and Linguistics. We want to stress the close ties between these groups, as one of the things that we pride ourselves on here at Penn is the communication among different departments and the inter-departmental work. Rather than try to summarize the varied work currently underway at Penn, we suggest reading the abstracts to see how the students and faculty themselves describe their work. The report illustrates the diversity of interests among the researchers here, as well as explaining the areas of common interest. In addition, since it was our intent to put together a document that would be useful both inside and outside of the university, we hope that this report will explain to everyone some of what we are about

    Anaphoric resolution of zero pronouns in Chinese in translation and reading comprehension

    Get PDF
    The primary aim of the thesis is to investigate some of the processes of reading Chinese text by means of comparing and analysing approximately 100 parallel translations of four texts from Chinese to English. The translations are answers to A Level examination questions. The focus of the investigation is interpretation of the zero pronoun, a common phenomenon in Chinese, which often requires explicitation when translated into English. The secondary aim is to show how translation gives evidence of comprehension, as shown by the variation in interpretation of zero pronouns. The thesis reviews relevant psycholinguistic research into reading, particularly reading of Chinese text. This is followed by reviews of relevant research into translation as a reading activity, and a discussion of its role in language teaching and testing.The core of the thesis is the discussion of the zero pronoun in Chinese, including discussion of anaphoric choice - the writer's decision on when to use zero in preference to an explicit anaphoric form - and of anaphoric resolution - how a reader decides what a zero pronoun refers to. Anaphoric resolution may be problematic for less experienced readers of Chinese owing to its lack of rich morphological inflection which, in other languages, provides the reader with information. Some of the key ideas on anaphoric choice and resolution are then applied to the analysis of the data in the parallel translations. It would appear that factors in Chinese texts which have an effect on comprehending zero pronouns are antecedent distance, topic persistence, abstraction, multiplicity of arguments and the meaning of the verb. Characteristics of the reader which may affect comprehension of the zero pronoun include personal schemata which may lead to elaborative inferences. On the basis of the data I suggest that mark schemes could be devised on a scalar system encompassing optimal solution, proximal solution and nonsolution, which might help to solve the problem of variability in marking translation.A by-product of the thesis, and an avenue for further research, is the apparent close relationship between idea units, clause length, punctuation breaks and antecedent distance in Chinese texts and saccade length and working memory capacity in the reader of Chinese

    CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania

    Get PDF
    The Computational Linguistics Feedback Forum (CLIFF) is a group of students and faculty who gather once a week to discuss the members\u27 current research. As the word feedback suggests, the group\u27s purpose is the sharing of ideas. The group also promotes interdisciplinary contacts between researchers who share an interest in Cognitive Science. There is no single theme describing the research in Natural Language Processing at Penn. There is work done in CCG, Tree adjoining grammars, intonation, statistical methods, plan inference, instruction understanding, incremental interpretation, language acquisition, syntactic parsing, causal reasoning, free word order languages, ... and many other areas. With this in mind, rather than trying to summarize the varied work currently underway here at Penn, we suggest reading the following abstracts to see how the students and faculty themselves describe their work. Their abstracts illustrate the diversity of interests among the researchers, explain the areas of common interest, and describe some very interesting work in Cognitive Science. This report is a collection of abstracts from both faculty and graduate students in Computer Science, Psychology and Linguistics. We pride ourselves on the close working relations between these groups, as we believe that the communication among the different departments and the ongoing inter-departmental research not only improves the quality of our work, but makes much of that work possible

    Proceedings of the Sixth International Conference Formal Approaches to South Slavic and Balkan languages

    Get PDF
    Proceedings of the Sixth International Conference Formal Approaches to South Slavic and Balkan Languages publishes 22 papers that were presented at the conference organised in Dubrovnik, Croatia, 25-28 Septembre 2008

    A constraint-based hypergraph partitioning approach to coreference resolution

    Get PDF
    The objectives of this thesis are focused on research in machine learning for coreference resolution. Coreference resolution is a natural language processing task that consists of determining the expressions in a discourse that mention or refer to the same entity. The main contributions of this thesis are (i) a new approach to coreference resolution based on constraint satisfaction, using a hypergraph to represent the problem and solving it by relaxation labeling; and (ii) research towards improving coreference resolution performance using world knowledge extracted from Wikipedia. The developed approach is able to use entity-mention classi cation model with more expressiveness than the pair-based ones, and overcome the weaknesses of previous approaches in the state of the art such as linking contradictions, classi cations without context and lack of information evaluating pairs. Furthermore, the approach allows the incorporation of new information by adding constraints, and a research has been done in order to use world knowledge to improve performances. RelaxCor, the implementation of the approach, achieved results in the state of the art, and participated in international competitions: SemEval-2010 and CoNLL-2011. RelaxCor achieved second position in CoNLL-2011.La resolució de correferències és una tasca de processament del llenguatge natural que consisteix en determinar les expressions d'un discurs que es refereixen a la mateixa entitat del mon real. La tasca té un efecte directe en la minería de textos així com en moltes tasques de llenguatge natural que requereixin interpretació del discurs com resumidors, responedors de preguntes o traducció automàtica. Resoldre les correferències és essencial si es vol poder “entendre” un text o un discurs. Els objectius d'aquesta tesi es centren en la recerca en resolució de correferències amb aprenentatge automàtic. Concretament, els objectius de la recerca es centren en els següents camps: + Models de classificació: Els models de classificació més comuns a l'estat de l'art estan basats en la classificació independent de parelles de mencions. Més recentment han aparegut models que classifiquen grups de mencions. Un dels objectius de la tesi és incorporar el model entity-mention a l'aproximació desenvolupada. + Representació del problema: Encara no hi ha una representació definitiva del problema. En aquesta tesi es presenta una representació en hypergraf. + Algorismes de resolució. Depenent de la representació del problema i del model de classificació, els algorismes de ressolució poden ser molt diversos. Un dels objectius d'aquesta tesi és trobar un algorisme de resolució capaç d'utilitzar els models de classificació en la representació d'hypergraf. + Representació del coneixement: Per poder administrar coneixement de diverses fonts, cal una representació simbòlica i expressiva d'aquest coneixement. En aquesta tesi es proposa l'ús de restriccions. + Incorporació de coneixement del mon: Algunes correferències no es poden resoldre només amb informació lingüística. Sovint cal sentit comú i coneixement del mon per poder resoldre coreferències. En aquesta tesi es proposa un mètode per extreure coneixement del mon de Wikipedia i incorporar-lo al sistem de resolució. Les contribucions principals d'aquesta tesi son (i) una nova aproximació al problema de resolució de correferències basada en satisfacció de restriccions, fent servir un hypergraf per representar el problema, i resolent-ho amb l'algorisme relaxation labeling; i (ii) una recerca per millorar els resultats afegint informació del mon extreta de la Wikipedia. L'aproximació presentada pot fer servir els models mention-pair i entity-mention de forma combinada evitant així els problemes que es troben moltes altres aproximacions de l'estat de l'art com per exemple: contradiccions de classificacions independents, falta de context i falta d'informació. A més a més, l'aproximació presentada permet incorporar informació afegint restriccions i s'ha fet recerca per aconseguir afegir informació del mon que millori els resultats. RelaxCor, el sistema que ha estat implementat durant la tesi per experimentar amb l'aproximació proposada, ha aconseguit uns resultats comparables als millors que hi ha a l'estat de l'art. S'ha participat a les competicions internacionals SemEval-2010 i CoNLL-2011. RelaxCor va obtenir la segona posició al CoNLL-2010

    A neurolinguistic approach to pronominal resumption in Akan focus constructions

    Get PDF
    The current project explored the phonological and syntactic aspects of Akan pronominal resumption. The grammatical tone features of the resumptive pronoun and the clause determiner were assessed in Akan speakers with agrammatism. We found that the resumptive pronoun worsens wh-question comprehension in agrammatic speakers. However, the production of pronominal resumption was relatively spared. The ERP study investigated Akan native speakers’ sensitivity to the distribution of the resumptive pronoun by creating word-order and animacy violations. Our study represents a novel addition to the sentence processing field, as it looks into the interface between syntax, semantics, and phonology in Akan pronominal resumption

    Proceedings of the Conference on Natural Language Processing 2010

    Get PDF
    This book contains state-of-the-art contributions to the 10th conference on Natural Language Processing, KONVENS 2010 (Konferenz zur Verarbeitung natürlicher Sprache), with a focus on semantic processing. The KONVENS in general aims at offering a broad perspective on current research and developments within the interdisciplinary field of natural language processing. The central theme draws specific attention towards addressing linguistic aspects ofmeaning, covering deep as well as shallow approaches to semantic processing. The contributions address both knowledgebased and data-driven methods for modelling and acquiring semantic information, and discuss the role of semantic information in applications of language technology. The articles demonstrate the importance of semantic processing, and present novel and creative approaches to natural language processing in general. Some contributions put their focus on developing and improving NLP systems for tasks like Named Entity Recognition or Word Sense Disambiguation, or focus on semantic knowledge acquisition and exploitation with respect to collaboratively built ressources, or harvesting semantic information in virtual games. Others are set within the context of real-world applications, such as Authoring Aids, Text Summarisation and Information Retrieval. The collection highlights the importance of semantic processing for different areas and applications in Natural Language Processing, and provides the reader with an overview of current research in this field
    corecore