
    Arabic Dictionary Compression Using An Invertible Integer Transformation and a Bitmap Representation

    No full text
    Advances in computer technology are making language dictionaries increasingly important. Great achievements have been reported in areas such as natural language processing, speech recognition and other AI applications. Most of these applications require a dictionary that can be maintained and accessed very efficiently. Dictionaries are also required in conventional applications such as spelling correction and database management. This paper describes a flexible, efficient bitmap compression method for Arabic dictionaries. The method is based on an invertible integer transformation and may be used to store either root-based or stem-based dictionaries. In the first case, each root costs at most 5 bits (less than two bits per character) if all Arabic roots are stored. In practice, only one bit is needed to store the whole root; the additional bits are due to the sparsity of the resulting structure. Reducing the cost to fewer than 2 bits per root appears quite feasible. For the second type of dictionary, at most 17 bits are needed to represent a stem of up to 7 characters (less than 3 bits per character). Again, this figure may be improved if the sparsity problem is solved. The operations required to maintain the dictionary are very simple and very cheap in terms of processor time. The flexibility of the resulting structure is reflected in the fact that any number of attributes may be added as parallel arrays or bitmaps.
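
The abstract does not say which invertible integer transformation the paper uses, so the following is only a minimal sketch under stated assumptions: roots are three letters long, the alphabet is the 28 basic Arabic letters, and the transformation is a plain base-28 positional code. The names `root_to_int`, `int_to_root` and `RootBitmap` are hypothetical and are not taken from the paper. The sketch shows how a root can map to a unique integer and how one bit at that position can record its presence.

```python
# Assumption: a base-28 positional code stands in for the paper's
# (unspecified) invertible integer transformation.
ALPHABET = "ابتثجحخدذرزسشصضطظعغفقكلمنهوي"  # 28 basic Arabic letters
BASE = len(ALPHABET)
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
ROOT_LEN = 3  # assumption: triliteral roots only


def root_to_int(root):
    """Map a three-letter root to a unique integer in [0, BASE**ROOT_LEN)."""
    if len(root) != ROOT_LEN:
        raise ValueError("expected a root of exactly %d letters" % ROOT_LEN)
    n = 0
    for ch in root:
        n = n * BASE + INDEX[ch]
    return n


def int_to_root(n):
    """Invert root_to_int: recover the letters from the integer."""
    letters = []
    for _ in range(ROOT_LEN):
        n, r = divmod(n, BASE)
        letters.append(ALPHABET[r])
    return "".join(reversed(letters))


class RootBitmap:
    """One bit per possible letter combination; a set bit means 'root present'."""

    def __init__(self):
        self.bits = bytearray((BASE ** ROOT_LEN + 7) // 8)

    def add(self, root):
        i = root_to_int(root)
        self.bits[i >> 3] |= 1 << (i & 7)

    def remove(self, root):
        i = root_to_int(root)
        self.bits[i >> 3] &= ~(1 << (i & 7)) & 0xFF

    def __contains__(self, root):
        i = root_to_int(root)
        return bool(self.bits[i >> 3] & (1 << (i & 7)))


dictionary = RootBitmap()
dictionary.add("كتب")
print("كتب" in dictionary, "درس" in dictionary)
```

Under these assumptions the bitmap occupies a fixed 28³ = 21,952 bits however many roots are stored, so the cost per stored root is that total divided by the number of roots present. This is one plausible reading of the sparsity overhead the abstract mentions. Attributes could be kept in further arrays or bitmaps indexed by the same integer, in the spirit of the parallel structures the abstract describes.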
