8 research outputs found

    Towards the Development of a Hybrid Parser for Natural Languages

    To understand natural languages, we have to be able to determine the relations between words; in other words, we have to be able to 'parse' the input text. This is a difficult task, especially for Arabic, which has a number of properties that make it particularly hard to handle. There are two approaches to parsing natural languages: grammar-driven and data-driven. Each of these approaches poses its own set of problems, which we discuss in this paper. The goal of our work is to produce a hybrid parser that retains the advantages of the data-driven approach but is guided by grammar rules in order to produce more accurate output. This work consists of two stages. The first stage is to develop a baseline data-driven parser, which is guided by a machine learning algorithm for establishing dependency relations between words. The second stage is to integrate grammar rules into the baseline parser. In this paper, we describe the first stage of our work, which is now implemented, and a number of experiments that have been conducted on this parser. We also discuss the results of these experiments and highlight the factors that affect parsing speed and the correctness of the parser's output.

    Towards the development of a hybrid parser for natural languages


    UNIARAB: An Universal Machine Translator System For Arabic Based On Role And Reference Grammar

    This paper presents a machine translation system (Hutchins 2003) called UniArab (Salem, Hensman and Nolan 2008). It is a proof-of-concept system supporting the fundamental aspects of Arabic, such as the parts of speech, agreement and tenses. UniArab is based on the linking algorithm of RRG (syntax to semantics and vice versa). UniArab takes MSA Arabic as input in the native orthography, parses the sentence(s) into a logical meta-representation based on the fully expanded RRG logical structures and, using this, generates perfectly grammatical English output with full agreement and morphological resolution. UniArab utilizes an XML-based implementation of elements of the Role and Reference Grammar theory in software. In order to analyse Arabic by computer we first extract the lexical properties of the Arabic words (Al-Sughaiyer and Al-Kharashi 2004). From the parse, it then creates a computer-based representation for the logical structure of the Arabic sentence(s). We use the RRG theory to motivate the computational implementation of the architecture of the lexicon in software. We also implement in software the RRG bidirectional linking system to build the parse and generate functions between the syntax-semantic interfaces. Through seven input phases, including the morphological and syntactic unpacking, UniArab extracts the logical structure of an Arabic sentence. Using the XML-based metadata representing the RRG logical structure, UniArab then accurately generates an equivalent grammatical sentence in the target language through four output phases. We discuss the technologies used to support its development and also the user interface that allows for the addition of lexical items directly to the lexicon in real time. The UniArab system has been tested and evaluated generating equivalent grammatical sentences, in English, via the logical structure of Arabic sentences, based on MSA Arabic input with very significant and accurate results (Izwaini 2006). 
    At present we are working to greatly extend the coverage by adding more verbs to the lexicon. We have demonstrated in this research that RRG is a viable linguistic model for building accurate, rule-based, semantically oriented machine translation software. Role and Reference Grammar (RRG) is a functional theory of grammar that posits a direct mapping between the semantic representation of a sentence and its syntactic representation. The theory allows a sentence in a specific language to be described in terms of its logical structure and grammatical procedures. RRG creates a linking relationship between syntax and semantics, and can account for how semantic representations are mapped into syntactic representations. We claim that RRG is very suitable for machine translation of Arabic, notwithstanding well-documented difficulties found within Arabic MT (Izwaini 2006), and that RRG can be implemented in software as the rule-based kernel of an Interlingua bridge MT engine.

    Language and philosophy: an analysis of the turn to the subject in modern philosophy with historical linguistic approach

    One of the main characteristics of the philosophy of Descartes, which marked the starting point of modern philosophy and was continued by English empiricism and German Idealism, is a special attention to the subject instead of the cosmos, being or God. But what caused such a turn to the subject? With a historical linguistic approach it can be shown that the replacement of the old languages of philosophy, namely Greek, Arabic and Latin, by modern European languages, namely French, English and German, can be one of the causes of that turn in the history of philosophy. It is notable that the change in modern European languages occurred at the same time as modern philosophy appeared. In this research we concentrate on word order and the possibility of omitting the subject in the sentence. In modern European languages there is an insistence that the subject appear at the beginning of the sentence. This can lead to a special attention to the subject in the philosophical sense. The old languages of philosophy, by contrast, are null-subject languages, in the sense that the subject can be omitted. In these languages the subject sometimes comes after the verb or does not appear in the phrase at all. That may be why a philosophy similar to modern philosophy did not appear in those languages. Finally, it will be shown that the change was not confined to language alone and has important philosophical implications.

    The Role of Translation in Language Change: A Corpus-based Study on English Influence on the Arabic Passive

    Studies conducted around the world have shown that the structures of various languages have shifted over time towards that of English. This phenomenon could be attributed to the use of English as a lingua franca or to these languages’ contact with English via translation. This thesis investigates this shift towards English-language structure in translated and original Arabic scientific texts. To this end, I developed a diachronic corpus of scientific articles from two periods, 1997-2000 and 2016-2018, to generate findings for this genre. The study used both parallel and comparable corpora, allowing an investigation of the influence of English not only on translated texts but also on original scientific texts written within the same time frame. The results reveal that the English language has affected the Arabic passive voice structure in translated scientific texts, and that the English passive voice structure seems also to have affected original modern Arabic scientific texts. As for the agentive passive, English does not seem to have increased its influence between 1997-2000 and 2016-2018 in the translated texts, as most agentive English passives are translated into active Arabic sentences in both the 1997-2000 and 2016-2018 corpora. There also does not seem to be a significant increase in the agentive passive in original texts between 1997-2000 and 2016-2018.