Morphological, syntactic and diacritics rules for automatic diacritization of Arabic sentences
The diacritical marks of the Arabic language are characters other than letters and are in most cases absent from Arabic writing. This paper presents a hybrid system for automatic diacritization of Arabic sentences, combining linguistic rules and statistical treatments. The approach is based on four stages. The first phase consists of a morphological analysis using the second version of the morphological analyzer Alkhalil Morpho Sys. Morphosyntactic outputs from this step are used in the second phase to eliminate invalid word transitions according to syntactic rules. The third stage uses a discrete hidden Markov model and the Viterbi algorithm to determine the most probable diacritized sentence; transitions unseen in the training corpus are handled with smoothing techniques. Finally, the last step deals with words not analyzed by the Alkhalil analyzer, for which we use letter-based statistical treatments. The word error rate of our system is around 2.58% if we ignore the diacritic of the last letter of the word, and around 6.28% when this diacritic is taken into account.
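The decoding stage described above (a discrete HMM decoded with the Viterbi algorithm) can be sketched as follows. This is a minimal illustration only: the candidate forms, probabilities, and smoothing floor are invented toy values, not the paper's corpus-trained model.

```python
import math

# Toy illustration of stage three: a discrete HMM whose hidden states are
# candidate diacritized forms and whose observations are bare
# (undiacritized) words. All forms and probabilities are invented for
# this sketch; the paper's model is trained on an Arabic corpus.
STATES = ["kataba", "kutiba"]                     # candidate diacritizations
START = {"kataba": 0.6, "kutiba": 0.4}
TRANS = {"kataba": {"kataba": 0.7, "kutiba": 0.3},
         "kutiba": {"kataba": 0.4, "kutiba": 0.6}}
EMIT = {"kataba": {"ktb": 1.0}, "kutiba": {"ktb": 1.0}}
FLOOR = 1e-12                                     # crude stand-in for smoothing

def viterbi(observations):
    """Return the most probable sequence of diacritized forms."""
    # Each column maps state -> (log probability, best path so far).
    col = {s: (math.log(START[s]) + math.log(EMIT[s].get(observations[0], FLOOR)), [s])
           for s in STATES}
    for obs in observations[1:]:
        nxt = {}
        for s in STATES:
            prev = max(STATES, key=lambda p: col[p][0] + math.log(TRANS[p][s]))
            score = (col[prev][0] + math.log(TRANS[prev][s])
                     + math.log(EMIT[s].get(obs, FLOOR)))
            nxt[s] = (score, col[prev][1] + [s])
        col = nxt
    best = max(STATES, key=lambda s: col[s][0])
    return col[best][1]
```

Log probabilities are used instead of raw products to avoid numerical underflow on long sentences, and the `FLOOR` constant stands in for the smoothing techniques the paper applies to unseen transitions.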
Using Natural Language as Knowledge Representation in an Intelligent Tutoring System
Knowledge used in an intelligent tutoring system to teach students is usually acquired from authors who are experts in the domain. A problem is that they cannot directly add and update knowledge unless they learn the formal language used in the system. Using natural language to represent knowledge can allow authors to update knowledge easily. This thesis presents a new approach that uses unconstrained natural language as the knowledge representation for a physics tutoring system, so that non-programmers can add knowledge without learning a new knowledge representation. This approach allows domain experts to add not only problem statements but also background knowledge, such as commonsense and domain knowledge, including principles, in natural language. Rather than being translated into a formal language, the natural language representation is used directly in inference, so that domain experts can understand the internal process, detect knowledge bugs, and revise the knowledge base easily. In authoring studies with the new system based on this approach, the amount of added knowledge was small enough for a domain expert to add, and converged to near zero as more problems were added in one mental-model test. After the no-new-knowledge state was reached in that test, 5 out of 13 problems (38 percent) were solved automatically by the system without adding new knowledge.
An Investigation into Automatic Translation of Prepositions in IT Technical Documentation from English to Chinese
Machine Translation (MT) technology has been widely used in the localisation industry to boost the productivity of professional translators. However, given the high quality of translation expected, the performance of an MT system in isolation is less than satisfactory owing to various generated errors. This study focuses on the translation of prepositions from English into Chinese within technical documents in an industrial localisation context. The aim of the study is to reveal the salient errors in the translation of prepositions and to explore possible methods to remedy these errors.
This study proposes three new approaches to improve the translation of prepositions. All approaches attempt to make use of the strengths of the two most popular MT architectures at the moment: Rule-Based MT (RBMT) and Statistical MT (SMT). The approaches include: firstly, building an automatic preposition dictionary for the RBMT system; secondly, exploring and modifying the process of Statistical Post-Editing (SPE); and thirdly, pre-processing the source texts to better suit the RBMT system. Overall evaluation results (both human and automatic evaluation) show the potential of our new approaches for improving the translation of prepositions. In addition, the current study also reveals a new function of automatic metrics in assisting researchers to obtain more valid or purpose-specific human evaluation results.
Generating natural language specifications from UML class diagrams
Early phases of software development are known to be problematic and difficult to manage, and errors occurring during these phases are expensive to correct. Many systems have been developed to aid the transition from informal Natural Language requirements to semi-structured or formal specifications. Furthermore, consistency checking is seen by many software engineers as the solution for reducing the number of errors occurring during the software development life cycle and for allowing early verification and validation of software systems. However, this is confined to the models developed during analysis and design and fails to include the early Natural Language requirements. This excludes proper user involvement and creates a gap between the original requirements and the updated and modified models and implementations of the system. To improve this process, we propose a system that generates Natural Language specifications from UML class diagrams. We first investigate the variation of the input language used in naming the components of a class diagram, based on the study of a large number of examples from the literature, and then develop rules for removing ambiguities in the subset of Natural Language used within UML. We use WordNet, a linguistic ontology, to disambiguate the lexical structures of the UML string names and generate semantically sound sentences. Our system is developed in Java and is tested on an independent, though academic, case study.
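Before any lexical disambiguation of the kind described above can happen, UML string names (class, attribute, and operation identifiers) must be split into word tokens. The sketch below illustrates that preprocessing step; the function name, regex, and examples are our own illustration, not the system's Java code.

```python
import re

# Minimal sketch: tokenize a UML identifier into lowercase words so each
# token can later be looked up in a lexical resource such as WordNet.
# Handles camelCase, acronym runs, underscores, and digit runs.
def split_identifier(name: str) -> list:
    parts = re.findall(
        r"[A-Z]+(?![a-z])"   # acronym run not followed by lowercase, e.g. "XML"
        r"|[A-Z][a-z]*"      # capitalized word, e.g. "Order"
        r"|[a-z]+"           # lowercase word, e.g. "customer"
        r"|\d+",             # digit run, e.g. "2"
        name)
    return [p.lower() for p in parts]
```

For example, `split_identifier("customerOrderID")` yields the tokens `customer`, `order`, and `id`, each of which can then be disambiguated against an ontology.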
Opportunities and Challenges in Neural Dialog Tutoring
Designing dialog tutors has been challenging, as it involves modeling the diverse and complex pedagogical strategies employed by human tutors. Although there have been significant recent advances in neural conversational systems using large language models (LLMs) and growth in available dialog corpora, dialog tutoring has largely remained unaffected by these advances. In this paper, we rigorously analyze various generative language models on two dialog tutoring datasets for language learning, using automatic and human evaluations to understand the new opportunities brought by these advances as well as the challenges we must overcome to build models that would be usable in real educational settings. We find that although current approaches can model tutoring in constrained learning scenarios, where the number of concepts to be taught and the set of possible teacher strategies are small, they perform poorly in less constrained scenarios. Our human quality evaluation shows that both models and ground-truth annotations exhibit low performance in terms of equitable tutoring, which measures learning opportunities for students and how engaging the dialog is. To understand the behavior of our models in a real tutoring setting, we conduct a user study with expert annotators and find model reasoning errors in 45% of conversations. Finally, we connect our findings to outline future work.
Comment: EACL 2023 (main conference, camera-ready).
Proceedings of QG2010: The Third Workshop on Question Generation
These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge".
QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010)
Evaluation of an Esperanto-Based Interlingua Multilingual Survey Form Machine Translation Mechanism Incorporating a Sublanguage Translation Methodology
Translation costs restrict the preparation of medical surveys and other questionnaires for migrant communities in Western Australia. This restriction is compounded by a lack of affordable and accurate machine translation mechanisms. This research investigated and evaluated combined strategies intended to provide an efficacious and affordable machine translator by:
• using an interlingua or pivot language that requires fewer resources for its construction than contemporary systems and has the additional benefit of significant error reduction; and
• defining smaller lexical environments to restrict data, thereby reducing the complexity of translation rules and enhancing correct semantic transfer between natural languages.
This research focused on producing a prototype machine translation mechanism that would accept questionnaire texts as discrete questions and suggested answers from which a respondent may select. The prototype was designed to accept non-ambiguous English as the source language, translate it to a pivot language or interlingua, Esperanto, and thence to a selected target language, French. Subsequently, a reverse path of translation from the target language back to the source language enabled validation of minimal or zero change in both the syntax and semantics of the original input. Jade, an object-oriented (OO) database application hosting the relationship between the natural languages and the interlingua, was used to facilitate the accurate transfer of meaning between the natural languages. Translation, interpretation and validation of sample texts were undertaken by linguists qualified in English, French and Esperanto. Translation output from the prototype model was compared, again with assistance from linguists, with a 'control' model, the SYSTRAN On-Line Translator, a more traditional transfer translation product.
Successful completion of this research constitutes a step towards an increased availability of low-cost machine translation to assist in the development of reliable and efficient survey translation systems for use in specific user environments. These environments include, but are not exclusive to, medical, hospital and Australian indigenous-contact environments.
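The pivot-translation and back-translation validation idea described above can be sketched at word level. This toy is heavily simplified: the vocabulary is invented, and the actual prototype translates full question structures through an object-oriented database, not word-for-word dictionary lookup.

```python
# Toy sketch of pivot translation: English -> Esperanto -> French,
# with a round trip back to English to check that meaning survived.
# All dictionary entries are invented for illustration.
en_to_eo = {"yes": "jes", "no": "ne", "pain": "doloro"}
eo_to_fr = {"jes": "oui", "ne": "non", "doloro": "douleur"}
fr_to_eo = {v: k for k, v in eo_to_fr.items()}   # inverted dictionaries
eo_to_en = {v: k for k, v in en_to_eo.items()}

def pivot_translate(words, first, second):
    """Translate word by word through the interlingua."""
    return [second[first[w]] for w in words]

def round_trip_ok(words):
    """Back-translate target -> pivot -> source and compare with the input."""
    fr = pivot_translate(words, en_to_eo, eo_to_fr)
    back = pivot_translate(fr, fr_to_eo, eo_to_en)
    return back == list(words)
```

The round-trip check mirrors the thesis's validation step: if the reverse path reproduces the original input, the forward translation has preserved (toy-level) semantics.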