28 research outputs found

    Improved Arabic characters recognition by combining multiple machine learning classifiers.

    In this paper, we investigate a range of strategies for combining multiple machine learning techniques to recognize Arabic characters when the input characters are imperfect and vary in dimensions. Experimental results show that combined confidence-based backoff strategies can produce more accurate results than each technique alone, and even more accurate results than the majority voting combination.
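
The contrast between majority voting and a confidence-based backoff can be illustrated with a minimal sketch. This is not the paper's method: the function names, the priority ordering of classifiers, and the `threshold` value are all assumptions made for illustration.

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: list of (label, confidence) pairs, one per classifier.
    Returns the most frequent label; ties go to the label seen first."""
    counts = Counter(label for label, _ in predictions)
    return counts.most_common(1)[0][0]

def confidence_backoff(predictions, threshold=0.8):
    """Hypothetical backoff: classifiers are listed in priority order.
    Accept the first label whose confidence reaches the threshold;
    if none does, back off to majority voting."""
    for label, confidence in predictions:
        if confidence >= threshold:
            return label
    return majority_vote(predictions)

# Three classifiers disagree; the second one is very confident.
preds = [("ba", 0.55), ("ta", 0.95), ("ba", 0.60)]
print(majority_vote(preds))       # label chosen by vote count
print(confidence_backoff(preds))  # label chosen by confidence
```

In this toy case the two strategies disagree, which is the kind of situation where a backoff scheme can differ from plain voting.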

    Evaluation of Combining Data-Driven Dependency Parsers for Arabic

    In recent years, there has been considerable interest in dependency parsing, for several reasons. First, dependency-based syntactic representations seem to be effective in many areas of NLP, such as machine translation, question answering, and relation extraction, thanks to their transparent encoding of predicate-argument structure. Second, dependency parsing is well suited to free word order languages (e.g. Arabic and Czech). Third, and most importantly, the dependency-based approach has led to the development of fast, robust, and reasonably accurate syntactic parsers for a number of languages. In this paper, we investigate the technique of combining multiple data-driven dependency parsers for parsing Arabic. Arabic has a number of characteristics, described throughout the paper, that make it challenging to parse. Experimental results show that combined parsers can produce more accurate results, even on imperfectly tagged text, than each parser produces by itself on text with gold-standard tags.

    Optimal k-means clustering using artificial bee colony algorithm with variable food sources length

    Clustering is a machine learning task that divides data points into groups with similar traits. One of the most widely used methods is the k-means clustering algorithm, owing to its simplicity and effectiveness. However, this algorithm requires the number and coordinates of the initial cluster centers to be set in advance. In this paper, a method based on the first artificial bee colony algorithm with variable-length individuals is proposed to overcome this limitation of the k-means algorithm. The proposed technique automatically predicts the number of clusters (the value of k) and determines the most suitable coordinates for the initial cluster centers, instead of requiring them to be preset manually. On three real-life clustering datasets, the proposed algorithm outperformed the traditional k-means algorithm in every case.
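
The idea of comparing candidate sets of initial centers that differ in length can be sketched as follows. This is only an illustration on one-dimensional data, not the artificial bee colony search from the paper: the candidate lists are supplied by hand, and the score (sum of squared errors plus a per-cluster `penalty`) is an assumed stand-in for whatever fitness function the authors use.

```python
def assign(points, centers):
    """Index of the nearest center for each 1-D point."""
    return [min(range(len(centers)), key=lambda j: (p - centers[j]) ** 2)
            for p in points]

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given initial centers."""
    centers = list(centers)
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous center.
            new_centers.append(sum(members) / len(members) if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min((p - c) ** 2 for c in centers) for p in points)

def best_candidate(points, candidates, penalty=5.0):
    """Run k-means from each variable-length candidate and keep the one
    with the lowest penalized score (assumed fitness, for illustration)."""
    scored = []
    for init in candidates:
        final = kmeans(points, init)
        scored.append((sse(points, final) + penalty * len(final), final))
    return min(scored, key=lambda s: s[0])[1]

points = [1, 2, 3, 10, 11, 12]
candidates = [[1], [2, 11], [1, 3, 11]]  # k = 1, 2, 3
print(best_candidate(points, candidates))
```

A search procedure such as the one the abstract describes would generate and refine the candidates itself rather than take them from a fixed list.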

    Improved Arabic Characters Recognition by Combining Multiple Machine Learning Classifiers

    In this paper, we investigate a range of strategies for combining multiple machine learning techniques to recognize Arabic characters when the input characters are imperfect and vary in dimensions. Experimental results show that combined confidence-based backoff strategies can produce more accurate results than each technique alone, and even more accurate results than the majority voting combination.

    Textual Entailment for Modern Standard Arabic


    A Dataset for Arabic Textual Entailment

    There are fewer resources for textual entailment (TE) for Arabic than for other languages, and the manpower for constructing such a resource is hard to come by. We describe here a semi-automatic technique for creating a first dataset for TE systems for Arabic using an extension of the ‘headline-lead paragraph’ technique. We also sketch the difficulties inherent in volunteer annotators-based judgment, and describe a regime to ameliorate some of these.