352 research outputs found

    Transliteration of Hiragana and Katakana Handwritten Characters Using CNN-SVM

    Handwritten hiragana and katakana characters are used constantly in written Japanese, both by native speakers and by people learning Japanese around the world. The characters are difficult to learn because many of them closely resemble one another. This study uses hiragana and basic katakana characters, together with their dakuten, handakuten, and youon forms, collected from respondents through a questionnaire. A CNN is compared with a combined CNN-SVM model, both designed to identify each of the prepared characters. Character images are preprocessed by resizing, grayscaling, binarization, dilation, and erosion. The preprocessed images are fed to the CNN, which acts as the feature extractor, and the SVM then performs the character recognition. The reported accuracy was obtained with the following parameters: 69×69 image size, a callback patience of 3 with val_loss as the monitored quantity, the Nadam optimizer, a learning rate of 0.001, 30 epochs, and an RBF kernel for the SVM. The CNN alone reaches an accuracy of 87.82%, while the combined CNN-SVM reaches 88.21%.
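
The preprocessing chain named in this abstract (resizing, grayscaling, binarization, dilation, erosion) can be sketched roughly as below. This is an illustrative NumPy-only sketch, not the authors' code: the function names, the nearest-neighbour resize, the threshold of 128, and the 3×3 structuring element are all assumptions; only the 69×69 target size comes from the abstract.

```python
import numpy as np

def resize_nearest(img, size):
    # Nearest-neighbour resize to size x size (assumed; the abstract names no method).
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def to_grayscale(rgb):
    # Weighted sum of the R, G, B channels (ITU-R BT.601 luma weights).
    return rgb[..., :3] @ np.array([0.299, 0.587, 0.114])

def binarize(gray, threshold=128):
    # Dark ink becomes 1, light background becomes 0 (threshold is an assumption).
    return (gray < threshold).astype(np.uint8)

def _neighbourhoods(binary, pad_value):
    # Stack the nine 3x3-shifted copies of the image along a new first axis.
    padded = np.pad(binary, 1, constant_values=pad_value)
    h, w = binary.shape
    return np.stack([padded[r:r + h, c:c + w] for r in range(3) for c in range(3)])

def dilate(binary):
    # A pixel is set if any pixel in its 3x3 neighbourhood is set.
    return _neighbourhoods(binary, 0).max(axis=0)

def erode(binary):
    # A pixel stays set only if its whole 3x3 neighbourhood is set.
    return _neighbourhoods(binary, 1).min(axis=0)

def preprocess(rgb, size=69):
    # Same order as listed in the abstract: resize, grayscale, binarize, dilate, erode.
    return erode(dilate(binarize(to_grayscale(resize_nearest(rgb, size)))))
```

Applying dilation and then erosion with the same structuring element is a morphological closing, which tends to fill small gaps in pen strokes without thickening them overall.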

    Towards robust real-world historical handwriting recognition

    In this thesis, we build a bridge from the past to the future by applying artificial-intelligence methods to text recognition in a historical Dutch collection of the Natuurkundige Commissie, which explored Indonesia (1820-1850). In spite of the successes of systems like 'ChatGPT', reading historical handwriting is still quite challenging for AI. Whereas GPT-like methods work on digital texts, historical manuscripts are only available as extremely diverse collections of (pixel) images. Despite their strong results, current deep learning (DL) methods are very data-hungry and time-consuming, depend heavily on human experts from the humanities for labeling, and require machine-learning experts to design the models. Ideally, the use of deep learning methods should require minimal human effort, have an algorithm observe the evolution of the training process, and avoid inefficient use of the already sparse labeled data. We present several approaches to these problems, aiming to improve the robustness of current methods and the autonomy of training. We applied our novel word and line text recognition approaches to nine data sets differing in time period, language, and difficulty: three locally collected historical Latin-based data sets from Naturalis, Leiden; four public Latin-based benchmark data sets for comparability with other approaches; and two Arabic data sets. Using ensemble voting of just five neural networks, we achieved a level of accuracy that required hundreds of neural networks in earlier studies. Moreover, we increased the speed of evaluating each training epoch without the need for labeled data.

    A limited-size ensemble of homogeneous CNN/LSTMs for high-performance word classification

    The strength of long short-term memory neural networks (LSTMs), as applied so far, lies more in handling sequences of variable length than in handling geometric variability of the image patterns. In this paper, an end-to-end convolutional LSTM neural network is used to handle both geometric variation and sequence variability. The best results for LSTMs are often based on large-scale training of an ensemble of network instances. We show that high performance can be reached on a common benchmark set with just five such networks, given proper data augmentation, a proper coding scheme, and a proper voting scheme. The networks have similar architectures: a convolutional neural network (CNN) of five layers and a bidirectional LSTM (BiLSTM) of three layers, followed by a connectionist temporal classification (CTC) processing step. The approach assumes differently scaled input images and different feature map sizes. Three datasets are used: the standard benchmark RIMES dataset (French); a historical handwritten dataset, KdK (Dutch); and the standard benchmark George Washington (GW) dataset (English). Final performance obtained for the word-recognition test of RIMES was 96.6%, a clear improvement over other state-of-the-art approaches that did not use a pre-trained network. On the KdK and GW datasets, our approach also shows good results. The proposed approach is deployed in the Monk search engine for historical-handwriting collections.
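
The abstract does not say which voting scheme combines the five networks. As one possible illustration, the sketch below uses simple plurality voting over the word labels predicted by each network; the function name and the tie-breaking rule (the earliest network in the list wins) are assumptions, not the paper's method.

```python
from collections import Counter

def plurality_vote(predictions):
    # predictions: one predicted word label per network, in a fixed network order.
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    # Ties are broken in favour of the earliest network (an assumed rule).
    for label in predictions:
        if counts[label] == best:
            return label

# Example with five hypothetical network outputs for one word image.
winner = plurality_vote(["maison", "maison", "raison", "maison", "saison"])
```

A weighted scheme that uses each network's confidence would be a natural alternative, but it requires scores that this sketch does not assume are available.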

    Handwritten Text Recognition Using Deep Neural Networks

    Doctor of Engineering, Tokyo University of Agriculture and Technology

    Chinese calligraphy: character style recognition based on full-page document

    Calligraphy plays a very important role in the history of China. From ancient times to modern times, the beauty of calligraphy has been passed down to the present. Its different styles and structures have made calligraphy an embodiment of beauty in the field of writing. However, the recognition of calligraphy styles and fonts has long been a gap in computer science, and the structural complexity of the different styles poses many challenges for computer recognition. This research reviews the main recognition techniques and popular machine learning algorithms used in this field over more than 20 years, with the aim of finding a new method for recognizing Chinese calligraphy styles and exploring its feasibility. We surveyed research papers from the past 20 years; most of the results concern content recognition of modern Chinese characters. We first analyze the development of Chinese characters and basic Chinese character theory. In reviewing current Chinese character recognition (including online and offline handwriting), we focus on the algorithms and their results, on how the experimental data are used, and on how the test data sets were constructed. Research on image processing methods for Chinese calligraphy works is very limited, as is the data available for testing, and the test data sets used by different recognition techniques differ widely. Nevertheless, such research has far-reaching significance for inheriting and carrying forward traditional Chinese culture, and it is necessary to develop and promote the recognition of Chinese characters by means of computer techniques.
In current applications, font recognition for Chinese calligraphy can help library administrators classify copybooks, a task that is difficult to perform manually when it relies only on subjective experience. Over the past 10 years, several techniques for recognizing the font of single Chinese calligraphy characters have been proposed. Most of them consist of preprocessing the calligraphy characters, extracting stroke primitives, extracting style features, and a final machine learning classification that gives the class probabilities for the calligraphy work. Such techniques are very demanding for complex Chinese characters: the splitting and recognition effort is large, and many complex characters are difficult to segment accurately. As a result, the recognition rate is low, or the recognition accuracy for a specific character is high while the overall font recognition accuracy is low. Chinese calligraphy therefore has clear research value. Many research papers in the recognition field analyze Chinese calligraphy at the level of characters and strokes. In contrast, we propose a new method for font recognition that is based on the whole page of the document. It is studied in three steps: first, we apply the Fourier transform to a set of Chinese calligraphy images and analyze the results; second, we train CNNs on different data sets and compare the results; finally, we make some improvements to the CNN structure. The experimental results of the thesis show that the proposed full-page document recognition method achieves high accuracy with the support of CNN technology and can effectively distinguish 5 different styles of Chinese calligraphy.
Compared with traditional analysis methods, our experimental results show that the method based on the full-page document is feasible and avoids the cumbersome character segmentation problem, which makes it more efficient and more accurate.
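
The first step described in this abstract applies the Fourier transform to full-page calligraphy images, without giving details. One common way to inspect such a transform is the centred log-magnitude spectrum, sketched here with NumPy; the function name and the choice of a log-magnitude view are assumptions rather than the thesis's stated procedure.

```python
import numpy as np

def log_magnitude_spectrum(page):
    # page: 2-D grayscale array holding a full-page image.
    # fft2 computes the 2-D discrete Fourier transform; fftshift moves the
    # zero-frequency (DC) term to the centre of the array for easier viewing.
    spectrum = np.fft.fftshift(np.fft.fft2(page.astype(np.float64)))
    # log1p compresses the large dynamic range of the magnitudes.
    return np.log1p(np.abs(spectrum))
```

In such a spectrum, regular stroke spacing and dominant stroke directions tend to appear as peaks or oriented streaks, which is one reason a frequency-domain view can be informative about writing style without segmenting individual characters.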