Classifying Amharic News Text Using Self-Organizing Maps
The paper addresses using artificial neural networks for classification of Amharic news items. Amharic is the language for countrywide communication in Ethiopia and has its own writing system containing extensive systematic redundancy. It is quite dialectally diversified and probably representative of the languages of a continent that so far has received little attention within the language processing field.
The experiments investigated document clustering around user queries using Self-Organizing Maps, an unsupervised neural network learning strategy. The best ANN model showed a precision of 60.0% when trying to cluster unseen data, and a precision of 69.5% when trying to classify it.
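As a rough illustration of the clustering idea in the abstract above, here is a minimal Self-Organizing Map sketch in plain Python. It is not the paper's model: the grid size, decay schedules, toy document vectors, and the names `train_som`, `best_matching_unit`, and `precision` are all assumptions made for the example.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid node whose weight vector is closest to `vector`."""
    return min(
        weights,
        key=lambda node: sum((w - x) ** 2 for w, x in zip(weights[node], vector)),
    )


def train_som(vectors, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)               # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.01  # neighbourhood radius shrinks
        for v in vectors:
            bmu = best_matching_unit(weights, v)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                # Pull each node towards the input, weighted by grid distance.
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights


def precision(retrieved, relevant):
    """Fraction of retrieved items that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


# Toy term-weight vectors standing in for news documents (hypothetical data).
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
som = train_som(docs)
for d in docs:
    print(d, "->", best_matching_unit(som, d))
```

Documents mapped to the same or neighbouring nodes form a cluster; a query can be mapped the same way, and `precision` can then be computed over the documents retrieved from its node.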
Methods for Amharic part-of-speech tagging
The paper describes a set of experiments involving the application of three state-of-the-art part-of-speech taggers to Ethiopian Amharic, using three different tagsets. The taggers showed worse performance than previously reported results for English, in particular having problems with unknown words. The best results were obtained using a Maximum Entropy approach, while HMM-based and SVM-based taggers got comparable results.
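The unknown-word problem mentioned in the abstract above can be made concrete by scoring known and unknown tokens separately. The sketch below uses a most-frequent-tag baseline, which is not one of the three taggers from the paper (Maximum Entropy, HMM, SVM); it is swapped in only because it is short. The tokens, tags, and function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Learn each word's most frequent tag plus a default tag for unseen words."""
    per_word = defaultdict(Counter)
    all_tags = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            per_word[word][tag] += 1
            all_tags[tag] += 1
    lexicon = {w: c.most_common(1)[0][0] for w, c in per_word.items()}
    default_tag = all_tags.most_common(1)[0][0]
    return lexicon, default_tag


def evaluate(lexicon, default_tag, tagged_sentences):
    """Return accuracy separately for known and unknown words (None if no tokens)."""
    hits = {"known": 0, "unknown": 0}
    totals = {"known": 0, "unknown": 0}
    for sentence in tagged_sentences:
        for word, gold in sentence:
            kind = "known" if word in lexicon else "unknown"
            predicted = lexicon.get(word, default_tag)
            totals[kind] += 1
            hits[kind] += int(predicted == gold)
    return {k: (hits[k] / totals[k] if totals[k] else None) for k in totals}


# Placeholder tokens (w1, w2, ...) stand in for Amharic words.
train = [
    [("w1", "N"), ("w2", "V"), ("w3", "N")],
    [("w3", "N"), ("w2", "V")],
]
test = [[("w1", "N"), ("w4", "V"), ("w2", "V"), ("w5", "N")]]

lexicon, default_tag = train_baseline(train)
print(evaluate(lexicon, default_tag, test))
```

Reporting the two accuracies side by side shows how much of a tagger's overall error comes from words it never saw in training.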
Extraction of Arabic word roots: An Approach Based on Computational Model and Multi-Backpropagation Neural Networks
Stemming is a process of extracting the root of a given word, by stripping
off the affixes attached to this word. Many attempts have been made
to address the problem of stemming Arabic words. The majority of the
existing Arabic stemming algorithms require a complete set of morphological
rules and large vocabulary lookup tables. Furthermore, many of them give
more than one potential stem or root for a given Arabic word. According to
Ahmad [11], the Arabic stemming process based on the language morphological
rules is still a very difficult task due to the nature of the language itself.
The limitations of the current Arabic stemming methods have motivated this
research, in which we investigate a novel approach to extracting Arabic word
roots, named here MUAIDI-STEMMER (see note 2). This approach attempts
to exploit numerical relations between Arabic letters, avoiding the need for a list
of the root and pattern of each word in the language, and giving a single root solution.
This approach is composed of two phases. Phase I depends on basic
calculations derived from linguistic analysis of Arabic patterns and affixes.
Phase II is based on an artificial neural network trained with the backpropagation
learning rule. In this proposed phase, we formulate the root extraction problem
as a classification problem and use the neural network as the classifier.
This study demonstrates that a neural network can be used effectively to extract the roots of Arabic words.
The stemmer developed was tested using 46,895 Arabic word types (see note 3). Error-counting accuracy evaluation was employed to evaluate the performance of
the stemmer. It was successful in producing the stems of 44,107 Arabic words
from the given test datasets, with an accuracy of 94.81%.
2. Muaidi is the author's father's name.
3. Types are distinct (unique) words.
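To illustrate the affix-stripping definition of stemming given in the abstract above, and why rule-based stemmers often return more than one candidate, here is a minimal light-stemming sketch. It is not MUAIDI-STEMMER and does not use numerical letter relations or a neural network. The short prefix and suffix lists, the minimum stem length, and the name `candidate_stems` are assumptions made for the example, and the output is a set of stems, not a root.

```python
# Small, illustrative affix lists (assumed; real stemmers use far larger sets).
PREFIXES = ["وال", "بال", "ال", "و"]
SUFFIXES = ["ات", "ون", "ين", "ة"]


def candidate_stems(word, min_len=3):
    """Return every stem obtainable by stripping at most one listed prefix and
    at most one listed suffix, keeping stems of at least `min_len` letters."""
    stems = {word}
    for prefix in [""] + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in [""] + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            stem = rest[: len(rest) - len(suffix)] if suffix else rest
            if len(stem) >= min_len:
                stems.add(stem)
    return sorted(stems)


# "المعلمون" ("the teachers") yields several candidates, not a single answer.
for stem in candidate_stems("المعلمون"):
    print(stem)
```

The ambiguity in the output is the behaviour the abstract criticises in existing stemmers; choosing one root from such candidates is the step the abstract assigns to its classifier.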