
1,369 research outputs found

Treebank-based acquisition of Chinese LFG resources for parsing and generation

Author: Guo Yuqing
Publication venue: Dublin City University. School of Computing
Publication date: 01/11/2009

This thesis describes a treebank-based approach to automatically acquire robust, wide-coverage Lexical-Functional Grammar (LFG) resources for Chinese parsing and generation, which is part of a larger project on the rapid construction of deep, large-scale, constraint-based, multilingual grammatical resources. I present an application-oriented LFG analysis for Chinese core linguistic phenomena and (in cooperation with PARC) develop a gold-standard dependency-bank of Chinese f-structures for evaluation. Based on the Penn Chinese Treebank, I design and implement two architectures for inducing Chinese LFG resources, one annotation-based and the other dependency conversion-based. I then apply the f-structure acquisition algorithm together with external, state-of-the-art parsers to parsing new text into "proto" f-structures. In order to convert "proto" f-structures into "proper" f-structures or deep dependencies, I present a novel Non-Local Dependency (NLD) recovery algorithm using subcategorisation frames and f-structure paths linking antecedents and traces in NLDs extracted from the automatically-built LFG f-structure treebank. Based on the grammars extracted from the f-structure annotated treebank, I develop a PCFG-based chart generator and a new n-gram based pure dependency generator to realise Chinese sentences from LFG f-structures. The work reported in this thesis is the first effort to scale treebank-based, probabilistic Chinese LFG resources from proof-of-concept research to unrestricted, real text. Although this thesis concentrates on Chinese and LFG, many of the methodologies, e.g. the acquisition of predicate-argument structures, NLD resolution and the PCFG- and dependency n-gram-based generation models, are largely language- and formalism-independent and should generalise to diverse languages as well as to labelled bilexical dependency representations other than LFG.

Irish Universities

DCU Online Research Access Service

The Future of Information Sciences : INFuture2009 : Digital Resources and Knowledge Sharing

Publication venue: Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb
Publication date: 01/11/2009

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Automatic correction of grammatical errors in non-native English text

Author: Lee John Sie Yuen, 1977-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2009

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 99-107). Learning a foreign language requires much practice outside of the classroom. Computer-assisted language learning systems can help fill this need, and one desirable capability of such systems is the automatic correction of grammatical errors in texts written by non-native speakers. This dissertation concerns the correction of non-native grammatical errors in English text, and the closely related task of generating test items for language learning, using a combination of statistical and linguistic methods. We show that syntactic analysis enables extraction of more salient features. We address issues concerning robustness in feature extraction from non-native texts, and also design a framework for simultaneous correction of multiple error types. Our proposed methods are applied to some of the most common usage errors, including prepositions, verb forms, and articles. The methods are evaluated on sentences with synthetic and real errors, and in both restricted and open domains. A secondary theme of this dissertation is that of user customization. We perform a detailed analysis of a non-native corpus, illustrating the utility of an error model based on the mother tongue. We study the benefits of adjusting the correction models based on the quality of the input text, and also present novel methods to generate high-quality multiple-choice items that are tailored to the interests of the user. By John Sie Yuen Lee. Ph.D.

DSpace@MIT

JTEC panel report on machine translation in Japan

Authors: Carbonell Jaime; Johnson David; Rich Elaine; Tomita Masaru; Vasconcellos Muriel; Wilks Yorick

The goal of this report is to provide an overview of the state of the art of machine translation (MT) in Japan and to provide a comparison between Japanese and Western technology in this area. The term 'machine translation', as used here, includes both the science and the technology required for automating the translation of text from one human language to another. Machine translation is viewed in Japan as an important strategic technology that is expected to play a key role in Japan's increasing participation in the world economy. MT is seen in Japan as important both for assimilating information into Japanese and for disseminating Japanese information throughout the world. Most of the MT systems now available in Japan are transfer-based systems. The majority of them exploit a case-frame representation of the source text as the basis of the transfer process. There is a gradual movement toward the use of deeper semantic representations, and some groups are beginning to look at interlingua-based systems.

NASA Technical Reports Server

English Index

Author: Pálfi Lórand-Levente
Publication venue: Aarhus University, Faculty of Arts, School of Communication and Culture
Publication date: 13/03/2007

No abstract

Tidsskrift.dk (Det Kongelige Bibliotek)


Natural Arabic language text understanding

Author: Al-Khonaizi Mohammed Taqi
Publication venue: University of Greenwich
Publication date: 01/03/1999

The most challenging part of natural language understanding is the representation of meaning. The current representation techniques are not sufficient to resolve the ambiguities, especially when the meaning is to be used for interrogation at a later stage. The Arabic language represents a challenging field for Natural Language Processing (NLP) because of its rich eloquence and free word order, but at the same time it is a good platform to capture understanding because of its rich computational, morphological and grammar rules. Among different representation techniques, Lexical Functional Grammar (LFG) theory is found to be best suited for this task because of its structural approach. LFG lays down a computational approach towards NLP, especially the constituent and the functional structures, and models the completeness of relationships among the contents of each structure internally, as well as among the structures externally. The introduction of Artificial Intelligence (AI) techniques, such as knowledge representation and inferencing, enhances the capture of meaning by utilising domain specific common sense knowledge embedded in the model of domain of discourse and the linguistic rules that have been captured from the Arabic language grammar. This work has achieved the following results: (i) It is the first attempt to apply the LFG formalism to a full Arabic declarative text that consists of more than one paragraph. (ii) It extends the semantic structure of the LFG theory by incorporating a representation based on the thematic-role frames theory. (iii) It extends the LFG theory to represent domain specific common sense knowledge. (iv) It automates the production process of the functional and semantic structures. (v) It automates the production process of domain specific common sense knowledge structure, which enhances the understanding ability of the system and resolves most ambiguities in subsequent question-answer sessions.

Greenwich Academic Literature Archive

Statistical Models and Search Algorithms for Machine Translation

Author: Bertoldi Nicola
Publication date: 01/01/2005

Not available

Archivio della ricerca - Fondazione Bruno Kessler

Unitn-eprints Research

Bostonia: v. 64, no. 1

Authors: Bushkoff Len; Cox Jay; Goldman Martin S.; Hivnor Robert; Kay Jane Holtz; Kington Miles; Levin Jack; Lubin Peter; Minkin Tracey; Murray-Brown Jeremy; O'Donnell James; Papale Richard; Queijo Jon; Riely Elizabeth; Skvorecky Josef
Publication venue: Boston University
Publication date: 01/01/1990

Founded in 1900, Bostonia magazine is Boston University's main alumni publication. It covers alumni and student life, as well as university activities, events, and programs.

Boston University Institutional Repository (OpenBU)

Studies in the linguistic sciences. 17-18 (1987-1988)

Publication venue: Urbana, Ill.: Dept. of Linguistics, University of Illinois

Illinois Digital Environment for Access to Learning and Scholarship Repository