Morphological variation of Arabic queries
Although test-collection-based studies have shown that stemming improves
retrieval effectiveness in an information retrieval system, the morphological
variations of queries searching on the same topic are less well
understood. This work examines the broad morphological variation that
searchers of an Arabic retrieval system put into their queries. In this study, 15
native Arabic speakers were asked to generate queries, and the morphological variants
of query words were collated across users. Queries composed of either the
commonest or rarest variants of each word were submitted to a retrieval system
and the effectiveness of the searches was measured. It was found that queries
composed of the more popular morphological variants were more likely to
retrieve relevant documents than those composed of less popular
Morfologi Bahasa Arab: Reformulasi Sistem Derivasi dan Infleksi (Arabic Morphology: Reformulating the Derivation and Inflection System)
Morphology, known in Arabic linguistics as the discipline of ʿilm al-ṣarf, is the part of grammar that examines the internal structure of words, and it calls for in-depth study, especially because Arabic belongs to the typology of complex inflectional languages. This article examines the Arabic morphological system from a modern linguistic perspective, focusing on derivational and inflectional changes. The discussion begins with the concepts of derivation and inflection in modern linguistics, as an introduction to the system of derivational and inflectional changes in Arabic. The morphological theories of the Arabic grammarians are developed in this article and brought into dialogue with modern linguistic theories. From this development, a new formulation for the study of Arabic morphology is produced, which is expected to provide a more systematic description of the Arabic morphological system
A morphological generator for the indexing of Arabic audio
This paper presents a novel Arabic morphological generator (AMG) for Modern Standard Arabic (MSA) which is designed and implemented using Prolog. The AMG is used to generate inflected forms of words used for the indexing of Arabic audio. These words are also the relevant terms in the Arab authority system (library information retrieval system) used in this study. The AMG generates inflected Arabic words from the root according to pre-specified morphological features that can be extended as needed. The Arabic word is represented as a feature structure which is handled through unification during the morphological generation process. The inflected forms can then be inserted automatically into a speech recognition grammar which is used to identify these words in an audio sequence or utterance
MADA+TOKAN Manual
MADA is a system for Morphological Analysis and Disambiguation for Arabic. TOKAN is a general tokenizer for MADA-disambiguated text. Internally, MADA also makes use of ALMORGEANA, an Arabic lexeme-based morphology analyzer
Excavating a linguistic category: on the properties of Ism al-Fiʿl and the limits of Kalām al-ʿArab
Examining the occurrence of ism fiʿl murtajal (an obscure lexical class whose words syntactically are verbs, while morphologically resemble irregular nouns) in three early, founding works of Arabic grammar and lexicology, affords analysis of the words’ structures and origins, and informs our understanding of the Classical Arabic linguistic register at whose edges they existed. These works’ terminology for the items differs from modern terms. Said terminology seems furthermore not yet standardized. Many items do not fit into conventional root-pattern morphological analysis, though creative or unprecedented derivational methods render them pliable to Arabic’s triradical morphosyntactic system. Some items do correspond to known roots, and a few are recognizable as basically conventional, if irregular, imperatives. A few times items exhibit archaic or irregular phonetics or morphophonology. This lexeme class’ presence in the performative Classical Arabic (ʿarabiyyah) suggests its founding corpus (kalām al-ʿarab) was not merely linguistic (i.e., “Arabic language”) but also cultural (i.e., perceptions of ʿurūbah, or Arabness, itself). Middle Eastern Studie
Dialectal to Standard Arabic Paraphrasing to Improve Arabic-English Statistical Machine Translation
This paper aims to improve the quality of Arabic-English statistical machine translation (SMT) on highly dialectal Arabic text using morphological knowledge. We present a lightweight rule-based approach to producing Modern Standard Arabic (MSA) paraphrases of dialectal Arabic out-of-vocabulary words and low-frequency words. Our approach extends an existing MSA analyzer with a small number of morphological clitics and transfer rules. The generated paraphrase lattices are input to a state-of-the-art phrase-based SMT system, improving BLEU scores on a blind test set by 0.56 absolute BLEU points (1.5% relative)
Roots and patterns in Beja (Cushitic): the issue of language contact with Arabic
A large part of the morphology of Beja, the sole language of the Northern branch of Cushitic (Afroasiatic), belongs to the root and pattern system. This system is typologically similar to the Semitic one (particularly robust in Arabic) and is also found to a lesser extent in two neighboring Cushitic languages, Afar and Saho, but not in any other Cushitic language. This paper reviews the different patterns of the Beja morphological system, and compares them with the systems of its main Semitic contact language (Arabic) and with other Cushitic languages (Afar and Saho). No clear case of borrowing, copying, or replication from dominant and prestigious Arabic could be found, but sociolinguistic and linguistic data favor an interpretation in terms of a convergence phenomenon. The paper argues that contact with Arabic was a strong factor in the preservation of a crosslinguistically uncommon system in a large part of the Beja morphology. It also argues that intensive language contact between genetically related languages may help to preserve a morphological system which otherwise would have disappeared, as is the case in most other Cushitic languages
Arabic open information extraction system using dependency parsing
Arabic is a Semitic language and one of the natural languages most distinguished by its richness in morphological enunciation and derivation. This special and complex nature makes extracting information from Arabic text difficult and leaves continuing room for improvement. Open information extraction (OIE) systems have emerged and been used in different languages, especially English. However, they have hardly been applied to Arabic. Accordingly, this paper introduces an OIE system that extracts relation tuples from Arabic web text, exploiting Arabic dependency parsing and considering all possible text relations. Based on clause types' propositions as extractable relations and constituents' grammatical functions, the identities of corresponding clause types are established. The proposed system, named Arabic open information extraction (AOIE), can extract Arabic text relations at scale while being domain independent. The implementation handles the problem using supervised strategies, while the system relies on unsupervised extraction strategies. The system has also been applied in several domains so that extraction is not tied to a specific field. The results show that the system achieves high efficiency in extracting clauses from large amounts of text