Training and Scaling Preference Functions for Disambiguation

Authors: Hiyan Alshawi, David Carter
Publication date: 01/01/1994

We present an automatic method for weighting the contributions of preference functions used in disambiguation. Initial scaling factors are derived as the solution to a least-squares minimization problem, and improvements are then made by hill-climbing. The method is applied to disambiguating sentences in the ATIS (Air Travel Information System) corpus, and the performance of the resulting scaling factors is compared with hand-tuned factors. We then focus on one class of preference function, those based on semantic lexical collocations. Experimental results are presented showing that such functions vary considerably in selecting correct analyses. In particular, we define a function that performs significantly better than ones based on mutual information and likelihood ratios of lexical associations. Comment: To appear in Computational Linguistics (probably volume 20, December 94). LaTeX, 21 pages
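The two-stage procedure named in this abstract (a least-squares fit for initial scaling factors, followed by hill-climbing) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the data layout (each candidate analysis is a list of preference-function scores paired with a target quality score), the function names, the step size, and the hill-climbing objective (the number of sentences whose top-ranked analysis is the best one). The normal equations are assumed to be nonsingular.

```python
from typing import List, Tuple

# Hypothetical data layout: a candidate analysis is
# (preference-function scores, target quality score).
Candidate = Tuple[List[float], float]
Sentence = List[Candidate]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes the matrix is nonsingular.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares_weights(sentences: List[Sentence]) -> List[float]:
    """Initial scaling factors: minimize the sum of (w . f - t)^2 over all candidates."""
    rows = [(f, t) for s in sentences for f, t in s]
    k = len(rows[0][0])
    # Normal equations (F^T F) w = F^T t.
    ftf = [[sum(f[i] * f[j] for f, _ in rows) for j in range(k)] for i in range(k)]
    ftt = [sum(f[i] * t for f, t in rows) for i in range(k)]
    return solve_linear(ftf, ftt)


def num_correct(weights: List[float], sentences: List[Sentence]) -> int:
    """Count sentences whose highest-weighted candidate has the best target score."""
    correct = 0
    for s in sentences:
        best = max(s, key=lambda c: sum(w * x for w, x in zip(weights, c[0])))
        if best[1] == max(t for _, t in s):
            correct += 1
    return correct


def hill_climb(weights: List[float], sentences: List[Sentence],
               step: float = 0.1, rounds: int = 20) -> List[float]:
    """Nudge one weight at a time, keeping only changes that raise num_correct."""
    weights = weights[:]
    best = num_correct(weights, sentences)
    for _ in range(rounds):
        improved = False
        for i in range(len(weights)):
            for delta in (step, -step):
                trial = weights[:]
                trial[i] += delta
                score = num_correct(trial, sentences)
                if score > best:
                    weights, best, improved = trial, score, True
        if not improved:
            break
    return weights
```

A typical call would be `hill_climb(least_squares_weights(data), data)`, where `data` is a list of sentences in the layout above. The least-squares stage gives a cheap closed-form starting point; the hill-climbing stage then optimizes the ranking criterion directly, which the squared-error fit only approximates.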

arXiv.org e-Print Archive

CiteSeerX

Transfer through quasi logical form - A new approach to machine translation

Authors: Hiyan Alshawi, David Carter, Björn Gambäck, Steve Pulman, Manny Rayner
Publication venue: Swedish Institute of Computer Science
Publication date: 01/01/1991

This document is an introduction to a research project aimed at producing a prototype system for on-line translation of typed dialogues between speakers of different natural languages. The work was carried out jointly by SICS and SRI Cambridge. The resulting prototype system (called Bilingual Conversation Interpreter, or BCI) translates between English and Swedish in both directions. The major components of the BCI are two copies of the SRI Core Language Engine, equipped with English and Swedish grammars respectively. These are linked by the transfer and disambiguation components. Translation takes place by analyzing the source-language sentence into Quasi Logical Form (QLF), a linguistically motivated logical representation, transferring this into a target-language QLF, and generating a target-language sentence. We believe that the project was successful in demonstrating the feasibility of using these techniques for interactive translation applications, and provides a sound basis for development of a large-scale message translator system. The final section of the paper points to several possible follow-on projects aimed in the direction of practically usable commercial systems.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Bilingual conversation interpreter: a prototype interactive message translator. Final report

Authors: Hiyan Alshawi, Carl Brown, David Carter, Björn Gambäck, Steve Pulman, Manny Rayner
Publication venue: Swedish Institute of Computer Science
Publication date: 01/01/1991

This document is the final report for a research project aimed at producing a prototype system for on-line translation of typed dialogues between speakers of different natural languages. The work was carried out jointly by SICS and SRI Cambridge. The resulting prototype system (called Bilingual Conversation Interpreter, or BCI) translates between English and Swedish in both directions. The major components of the BCI are two copies of the SRI Core Language Engine, equipped with English and Swedish grammars respectively. These are linked by the transfer and disambiguation components. Translation takes place by analyzing the source-language sentence into Quasi Logical Form (QLF), a linguistically motivated logical representation, transferring this into a target-language QLF, and generating a target-language sentence. When ambiguities occur that cannot be resolved automatically, they are clarified by querying the appropriate user. The clarification dialogue presupposes no knowledge of either linguistics or the other language. The prototype system has a broad grammatical coverage, an initial vocabulary of about 1000 words together with vocabulary expansion tools, and a set of English-Swedish transfer rules. The formalism developed for coding this linguistic information makes it relatively easy to extend the system. We believe that the project was successful in demonstrating the feasibility of using these techniques for interactive translation applications, and provides a sound basis for development of a large-scale message translator system with potential for commercial exploitation.
The main sections of this report are the following:
* A non-technical introduction, summarizing the BCI's design and containing a sample session.
* An overview of the Swedish version of the CLE.
* A detailed discussion of the theory and practice of QLF transfer.
* A description of the interactive disambiguation method.
* Suggestions for possible follow-on projects aimed in the direction of practically usable commercial systems.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Effective Utterance Classification with Unsupervised Phonotactic Models

Author: Hiyan Alshawi
Publication date: 01/01/2003

This paper describes a method for utterance classification that does not require manual transcription of training data. The method combines domain-independent acoustic models with off-the-shelf classifiers to give utterance classification performance that is surprisingly close to what can be achieved using conventional word-trigram recognition requiring manual transcription. In our method, unsupervised training is first used to train a phone n-gram model for a particular domain; the output of recognition with this model is then passed to a phone-string classifier. The classification accuracy of the method is evaluated on three different spoken language system domains.
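The final stage described in this abstract, passing recognized phone strings to a classifier, can be illustrated with a small sketch. It covers only that stage and makes several assumptions that are not from the paper: the classifier is a multinomial naive Bayes model over phone bigrams with add-one smoothing (swapped in here as a simple stand-in for an off-the-shelf classifier), and the class and function names, the phone symbols, and the labels are all hypothetical. The acoustic models and the unsupervised phone n-gram training are not modeled.

```python
import math
from collections import Counter, defaultdict
from typing import Iterable, List, Sequence, Tuple


def phone_ngrams(phones: Sequence[str], n: int = 2) -> List[Tuple[str, ...]]:
    """Return the n-grams of a phone sequence as tuples (no padding)."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


class NaiveBayesPhoneClassifier:
    """Multinomial naive Bayes over phone n-grams (illustrative stand-in)."""

    def __init__(self, n: int = 2) -> None:
        self.n = n
        self.class_counts: Counter = Counter()
        self.ngram_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples: Iterable[Tuple[Sequence[str], str]]) -> None:
        """Accumulate n-gram counts per class from (phone string, label) pairs."""
        for phones, label in examples:
            self.class_counts[label] += 1
            for gram in phone_ngrams(phones, self.n):
                self.ngram_counts[label][gram] += 1
                self.vocab.add(gram)

    def classify(self, phones: Sequence[str]) -> str:
        """Return the label with the highest add-one-smoothed log probability."""
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab)

        def log_score(label: str) -> float:
            counts = self.ngram_counts[label]
            denom = sum(counts.values()) + vocab_size
            score = math.log(self.class_counts[label] / total)
            for gram in phone_ngrams(phones, self.n):
                score += math.log((counts[gram] + 1) / denom)
            return score

        return max(self.class_counts, key=log_score)
```

In use, the training pairs would come from running the phone recognizer over untranscribed utterances with known class labels, for example `clf.train([("b ih l".split(), "billing"), ("r ey t".split(), "rates")])` followed by `clf.classify("b ih l z".split())`. Working on phone n-grams rather than words is what lets this stage avoid manually transcribed training text.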

CiteSeerX