7,527 research outputs found

GED - a generalised syntax editor : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Computer Science at Massey University

Author: Moretti Giovanni Serafino
Publication venue: 'Massey University'
Publication date: 01/01/1984

This thesis traces the development of a full-screen syntax-directed editor - a type of editor that operates on a program in terms of its syntactic tree structure instead of its sequential character representation. The editor is table-driven, reading as input an extended BNF syntax of the target language. It can therefore be used for any language whose syntax can be defined in EBNF. Print formatting information can be included with the syntactic definition to enable programs to be pretty-printed when they are displayed. The user is presented with a pretty-printed skeletal outline of a program, with the currently selected construct highlighted and all required syntactic items provided by the editor. Any constructs with alternatives, such as "", which occurs in many languages, are initially denoted by a placeholder in the form of a non-terminal name (i.e. "") which is expanded when the user indicates which alternative is wanted. All symbols entered by the user are parsed immediately and any erroneous symbols are rejected, making it impossible to create a syntactically incorrect program. The editor cannot detect semantic errors, as no semantic information is available from the EBNF syntax. However, the first use of each identifier is flagged by the editor as an aid to the detection of undeclared identifiers. A "help" area at the bottom of the screen continuously displays a list of the correct next symbols and the syntactic definition of the currently selected program construct. This display, together with a multi-level "undo" command and the provision of a skeletal program by the editor, provides a way of exploring the various constructs in a programming language while ensuring the syntactic correctness of the resultant program.
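The placeholder mechanism described in this abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the toy grammar, the non-terminal names (`<stmt>`, `<expr>`, `<ident>`) and the functions `alternatives` and `expand` are all hypothetical, and alternatives are selected here by their first symbol purely for simplicity.

```python
# Toy grammar table (an assumption for illustration, not the thesis's EBNF).
# Each non-terminal maps to a list of alternatives; each alternative is a
# sequence of symbols. Names in angle brackets are non-terminal placeholders.
GRAMMAR = {
    "<stmt>": [
        ["if", "<expr>", "then", "<stmt>"],
        ["while", "<expr>", "do", "<stmt>"],
        ["<ident>", ":=", "<expr>"],
    ],
}


def alternatives(symbol):
    """List the first symbol of each alternative, as a 'help' area might."""
    return [alt[0] for alt in GRAMMAR.get(symbol, [])]


def expand(program, index, choice):
    """Replace the placeholder at `index` with the alternative starting with `choice`.

    Raises ValueError for anything the grammar does not allow, so an
    invalid symbol never enters the program.
    """
    symbol = program[index]
    if symbol not in GRAMMAR:
        raise ValueError(f"{symbol!r} cannot be expanded")
    for alt in GRAMMAR[symbol]:
        if alt[0] == choice:
            return program[:index] + alt + program[index + 1:]
    raise ValueError(f"{choice!r} is not a valid alternative for {symbol!r}")


skeleton = ["<stmt>"]
step1 = expand(skeleton, 0, "while")
```

Because `expand` returns a new list instead of mutating its input, keeping earlier versions in a stack would be one simple way to support a multi-level undo; that design choice is likewise an assumption of this sketch.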

Massey Research Online

Hybrid rule-based - example-based MT: feeding apertium with sub-sentential translation units

Authors: Forcada Mikel; Sánchez-Martínez Felipe; Way Andy
Publication date: 01/01/2009

This paper describes a hybrid machine translation (MT) approach that consists of integrating bilingual chunks (sub-sentential translation units) obtained from parallel corpora into an MT system built using the Apertium free/open-source rule-based machine translation platform, which uses a shallow-transfer translation approach. In the integration of bilingual chunks, special care has been taken so as not to break the application of the existing Apertium structural transfer rules, since this would increase the number of ungrammatical translations. The method consists of (i) the application of a dynamic-programming algorithm to compute the best translation coverage of the input sentence given the collection of bilingual chunks available; (ii) the translation of the input sentence as usual by Apertium; and (iii) the application of a language model to choose one of the possible translations for each of the bilingual chunks detected. Results are reported for translation from English to Spanish, and vice versa, when marker-based bilingual chunks automatically obtained from parallel corpora are used.
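Step (i) can be sketched as a small dynamic program. The abstract does not define "best coverage", so this sketch assumes it means covering as many input words as possible with non-overlapping source-side chunks, breaking ties in favour of fewer chunks. The function name `best_coverage` and the sample chunks are hypothetical and do not come from the paper or from Apertium.

```python
def best_coverage(tokens, chunks):
    """Return (covered_word_count, spans) for the best chunk coverage.

    `chunks` is a set of tuples of source-side words. `spans` lists the
    (start, end) token ranges matched by chunks, in left-to-right order.
    """
    n = len(tokens)
    max_len = max((len(c) for c in chunks), default=0)
    # best[i] = (covered words, -number of chunks, spans) for tokens[:i]
    best = [(0, 0, [])] + [None] * n
    for i in range(1, n + 1):
        # Option 1: leave token i-1 uncovered.
        candidate = best[i - 1]
        # Option 2: end a chunk of some length exactly at position i.
        for length in range(1, min(max_len, i) + 1):
            j = i - length
            if tuple(tokens[j:i]) in chunks:
                covered, neg_count, spans = best[j]
                option = (covered + length, neg_count - 1, spans + [(j, i)])
                if option[:2] > candidate[:2]:
                    candidate = option
        best[i] = candidate
    covered, _, spans = best[n]
    return covered, spans


# Hypothetical source-side chunks, for illustration only.
CHUNKS = {("the", "big", "house"), ("big", "house", "is"), ("very", "old"), ("the",)}
result = best_coverage("the big house is very old".split(), CHUNKS)
```

In this example a greedy left-to-right longest match would pick `the big house` first and leave `is` uncovered, whereas the dynamic program finds a segmentation that covers all six words.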

Repositorio Institucional de la Universidad de Alicante

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Irish Universities

DCU Online Research Access Service

Preparing, restructuring, and augmenting a French treebank: lexicalised parsers or coherent treebanks?

Authors: Schluter Natalie; van Genabith Josef
Publication date: 01/01/2007

We present the Modified French Treebank (MFT), a completely revamped French Treebank, derived from the Paris 7 Treebank (P7T), which is cleaner, more coherent, has several transformed structures, and introduces new linguistic analyses. To determine the effect of these changes, we investigate how the MFT fares in statistical parsing. Probabilistic parsers trained on the MFT training set (currently 3800 trees) already perform better than their counterparts trained on five times the P7T data (18,548 trees), providing an extreme example of the importance of data quality over quantity in statistical parsing. Moreover, regression analysis on the learning curve of parsers trained on the MFT leads to the prediction that parsers trained on the full projected 18,548-tree MFT training set will far outscore their counterparts trained on the full P7T. These analyses also show how problematic data can lead to problematic conclusions; in particular, we find that lexicalisation in the probabilistic parsing of French is probably not as crucial as was once thought (Arun and Keller (2005)).

Irish Universities

DCU Online Research Access Service

An approach to software maintenance support using a syntactic source code analyser data base : this thesis is presented in a partial fulfillment of the requirements for the degree of Master of Arts in Computer Science at Massey University

Author: Parkin Peter Vivian
Publication venue: 'Massey University'
Publication date: 01/01/1987

In this thesis, the development of a software maintenance tool called a syntactic source code analyser (SSCA) is summarised. An SSCA supports other maintenance tools which interact with source code by creating a data base of source information which has links to a formatted version of program source code. The particular SSCA presented handles programs written in a version of COBOL. Before developing an SSCA system, aspects of software maintenance need to be considered. Hence, the scope, definitions and problems of maintenance activities are briefly reviewed, and maintenance support through environments, software metrics, and specific tools and techniques is examined. A complete maintenance support environment for an application is found to overlap considerably with the application documentation system and to share some tools with development environments. Program source code is also identified as the fundamental documentation of an application, and interaction with this source code is a requirement of many maintenance support tools.

Massey Research Online

Parsing Using the Role and Reference Grammar Paradigm

Author: Guest E
Publication date: 01/06/2009

Much effort has been put into finding ways of parsing natural language. Role and Reference Grammar (RRG) is a linguistic paradigm that has credibility in linguistic circles. In this paper we give a brief overview of RRG and show how it can be implemented in a standard rule-based parser. We used the chart parser to test the concept on sentences from student work. We present results that show the potential role of this method for parsing ungrammatical sentences.
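For readers unfamiliar with chart parsing, the following is a minimal sketch of one common variant, a CYK-style chart over a context-free grammar in Chomsky normal form. It is not the parser used in the paper and does not implement RRG; the grammar, the lexicon and the function `cyk_chart` are hypothetical and chosen only to show how a chart records constituents over spans.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (an assumption for illustration only).
BINARY = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
LEXICON = {"the": {"Det"}, "student": {"N"}, "essay": {"N"}, "wrote": {"V"}}


def cyk_chart(words):
    """Fill a chart where chart[(i, j)] holds the categories spanning words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    # Width-1 spans come straight from the lexicon.
    for i, word in enumerate(words):
        chart[(i, i + 1)] |= LEXICON.get(word, set())
    # Wider spans combine two adjacent, already-filled sub-spans.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= BINARY.get((left, right), set())
    return chart


good = cyk_chart("the student wrote the essay".split())
bad = cyk_chart("student the wrote the essay".split())
```

Even when no category spans the whole input, as with the scrambled second sentence, the chart still holds well-formed fragments such as the final noun phrase. This property is one reason chart-based approaches are often considered for ungrammatical input.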

Leeds Beckett Repository