5 research outputs found

    An Intelligent Framework for Natural Language Stems Processing

    Get PDF
    This work describes an intelligent framework that enables the derivation of stems from inflected words. Word stemming is one of the most important factors affecting the performance of many language applications including parsing, syntactic analysis, speech recognition, retrieval systems, medical systems, tutoring systems, biological systems,…, and translation systems. Computational stemming is essential for dealing with some natural language processing such as Arabic Language, since Arabic is a highly inflected language. Computational stemming is an urgent necessity for dealing with Arabic natural language processing. The framework is based on logic programming that creates a program to enabling the computer to reason logically. This framework provides information on semantics of words and resolves ambiguity. It determines the position of each addition or bound morpheme and identifies whether the inflected word is a subject, object, or something else. Position identification (expression) is vital for enhancing understandability mechanisms. The proposed framework adapts bi-directional approaches. It can deduce morphemes from inflected words or it can build inflected words from stems. The proposed framework handles multi-word expressions and identification of names. The framework is based on definiteclause grammar where rules are built according to Arabic patterns (templates) using programming language prolog as predicates in first-order logic. This framework is based on using predicates in firstorder logic with object-oriented programming convention which can address problems of complexity. This complexity of natural language processing comes from the huge amount of storage required. This storage reduces the efficiency of the software system. In order to deal with this complexity, the research uses Prolog as it is based on efficient and simple proof routines. It has dynamic memory allocation of automatic garbage collection. This facility, in addition to relieve th

    A rules based system for named entity recognition in modern standard Arabic

    Get PDF
    The amount of textual information available electronically has made it difficult for many users to find and access the right information within acceptable time. Research communities in the natural language processing (NLP) field are developing tools and techniques to alleviate these problems and help users in exploiting these vast resources. These techniques include Information Retrieval (IR) and Information Extraction (IE). The work described in this thesis concerns IE and more specifically, named entity extraction in Arabic. The Arabic language is of significant interest to the NLP community mainly due to its political and economic significance, but also due to its interesting characteristics. Text usually contains all kinds of names such as person names, company names, city and country names, sports teams, chemicals and lots of other names from specific domains. These names are called Named Entities (NE) and Named Entity Recognition (NER), one of the main tasks of IE systems, seeks to locate and classify automatically these names into predefined categories. NER systems are developed for different applications and can be beneficial to other information management technologies as it can be built over an IR system or can be used as the base module of a Data Mining application. In this thesis we propose an efficient and effective framework for extracting Arabic NEs from text using a rule based approach. Our approach makes use of Arabic contextual and morphological information to extract named entities. The context is represented by means of words that are used as clues for each named entity type. Morphological information is used to detect the part of speech of each word given to the morphological analyzer. Subsequently we developed and implemented our rules in order to recognise each position of the named entity. Finally, our system implementation, evaluation metrics and experimental results are presented.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Arabic Programming Environment

    Get PDF
    Computer Scienc

    A Rules Based System for Named Entity Recognition in Modern Standard Arabic

    Get PDF
    The amount of textual information available electronically has made it difficult formany users to find and access the right information within acceptable time. Researchcommunities in the natural language processing (NLP) field are developing tools andtechniques to alleviate these problems and help users in exploiting these vast resources.These techniques include Information Retrieval (IR) and Information Extraction (IE). Thework described in this thesis concerns IE and more specifically, named entity extraction inArabic. The Arabic language is of significant interest to the NLP community mainly due toits political and economic significance, but also due to its interesting characteristics.Text usually contains all kinds of names such as person names, company names,city and country names, sports teams, chemicals and lots of other names from specificdomains. These names are called Named Entities (NE) and Named Entity Recognition(NER), one of the main tasks of IE systems, seeks to locate and classify automaticallythese names into predefined categories. NER systems are developed for differentapplications and can be beneficial to other information management technologies as it canbe built over an IR system or can be used as the base module of a Data Mining application.In this thesis we propose an efficient and effective framework for extracting Arabic NEsfrom text using a rule based approach. Our approach makes use of Arabic contextual andmorphological information to extract named entities. The context is represented by meansof words that are used as clues for each named entity type. Morphological information isused to detect the part of speech of each word given to the morphological analyzer.Subsequently we developed and implemented our rules in order to recognise each positionof the named entity. Finally, our system implementation, evaluation metrics andexperimental results are presented
    corecore