132 research outputs found
Chinese calligraphy: character style recognition based on full-page document
Calligraphy plays a very important role in the history of China, and its beauty has been
passed down from ancient times to the present. Its varied styles and structures have
made calligraphy an art form in its own right within the field of writing. However, the
recognition of calligraphy styles and fonts has remained largely unexplored in computer
science, and the structural complexity of the different styles poses many challenges for
recognition technology. In this research, I review the main recognition techniques and
the popular machine learning algorithms used in this field over more than 20 years, in
order to find a new method for recognizing Chinese calligraphy styles and to explore its
feasibility.
In our research, we surveyed research papers from the past 20 years. Most of the results
concern the content recognition of modern Chinese characters. We first analyze the
development of Chinese characters and basic Chinese character theory. In analyzing
current computer recognition of Chinese characters (including online and offline
handwriting), we focus on the algorithms and their results, on how the experimental
data are used, and on how the data sets used for testing are constructed.
Research on image processing methods for Chinese calligraphy works is very limited, as
are the data collections available for testing calligraphy recognition. The test data sets
used by different recognition techniques also differ widely. Nevertheless, such research
has far-reaching significance for inheriting and carrying forward traditional Chinese
culture, and it is necessary to develop and promote the recognition of Chinese
characters by computer techniques. In current applications, font recognition of Chinese
calligraphy can help library administrators classify copybooks, so that calligraphy fonts
no longer have to be identified manually from subjective experience alone, which is
difficult.
Over the past 10 years, some techniques for recognizing the font of single Chinese
calligraphy characters have been proposed. Most of them consist of pre-processing the
calligraphy characters, extracting stroke primitives, extracting style features, and a
final machine learning classification step that yields the classification probability for
the calligraphy work. Such techniques place heavy demands on complex Chinese
characters: the segmentation and recognition workload is large, and many complex
characters are difficult to segment accurately. As a result, the recognition rate is low,
or the accuracy for specific characters is high while the overall font recognition
accuracy remains low.
Chinese calligraphy clearly has research value. In the field of recognition, many
research papers on the analysis of Chinese calligraphy are based on the study of
calligraphy characters and strokes. In contrast, we propose a new method for font
recognition that is based on the whole page of the document. The study proceeds in
three steps: first, we apply the Fourier transform to some Chinese calligraphy images
and analyze the results; second, we apply a CNN to different data sets and report the
results; finally, we make some improvements to the CNN structure. The experimental
results of the thesis show that the proposed full-page document recognition method
achieves high accuracy with the support of CNN technology and can effectively
distinguish five styles of Chinese calligraphy. Compared with traditional analysis
methods, our experimental results show that the method based on the full-page
document is feasible and avoids the cumbersome character segmentation problem,
which makes it more efficient and more accurate.
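
The first step above, applying the Fourier transform to a whole page image, can be illustrated with a minimal sketch. It is not the thesis's implementation: the tiny synthetic "page", the function names `dft2_magnitude` and `dominant_frequency`, and the direct (slow) transform are all assumptions chosen so the example runs with the standard library only. It shows how a regular page-level layout appears as a peak in the magnitude spectrum without any character segmentation.

```python
import cmath
import math


def dft2_magnitude(image):
    """Return the magnitude of the 2D discrete Fourier transform of `image`.

    `image` is a list of equal-length rows of grayscale values.
    This direct O(N^4) form is only practical for tiny images.
    """
    rows, cols = len(image), len(image[0])
    spectrum = []
    for u in range(rows):          # vertical frequency index
        out_row = []
        for v in range(cols):      # horizontal frequency index
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * math.pi * (u * y / rows + v * x / cols)
                    total += image[y][x] * cmath.exp(1j * angle)
            out_row.append(abs(total))
        spectrum.append(out_row)
    return spectrum


def dominant_frequency(spectrum):
    """Return (u, v) of the strongest component, ignoring the DC term (0, 0)."""
    best, best_uv = -1.0, (0, 0)
    for u, row in enumerate(spectrum):
        for v, magnitude in enumerate(row):
            if (u, v) != (0, 0) and magnitude > best:
                best, best_uv = magnitude, (u, v)
    return best_uv


# Hypothetical 8x8 "page": ink on every second pixel column, a crude
# stand-in for text written in regularly spaced vertical columns.
page = [[1 if x % 2 == 0 else 0 for x in range(8)] for _ in range(8)]

# The peak lies on the horizontal-frequency axis (u == 0), at 4 cycles per page width.
print(dominant_frequency(dft2_magnitude(page)))
```

Transposing the toy page (ink on every second row) moves the peak to the vertical-frequency axis, which is the kind of page-level cue a spectrum can expose. For real page images a fast transform such as `numpy.fft.fft2` would be the practical choice.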
Deep Sparse Auto-Encoder Features Learning for Arabic Text Recognition
One of the most challenging recent problems in pattern recognition and artificial intelligence is Arabic text recognition. This research topic is still largely unaddressed because of several factors. Complications arise from the cursive nature of Arabic writing, character similarities, unlimited vocabulary, the use of multiple sizes and mixed fonts, etc. To handle these challenges, automatic Arabic text recognition requires building a robust system that computes discriminative features and applies a rigorous classifier to achieve improved performance. In this work, we introduce a new deep learning based system that recognizes Arabic text contained in images. We propose a novel hybrid network, combining a Bag-of-Feature (BoF) framework for feature extraction based on a deep Sparse Auto-Encoder (SAE), and Hidden Markov Models (HMMs), for sequence recognition. Our proposed system, termed BoF-deep SAE-HMM, is tested on four datasets, namely the printed Arabic line images Printed KHATT (P-KHATT), the benchmark printed word images Arabic Printed Text Image (APTI), the benchmark handwritten Arabic word images IFN/ENIT, and the benchmark handwritten digits images Modified National Institute of Standards and Technology (MNIST).
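
The Bag-of-Feature encoding step mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the codebook here is fixed by hand, whereas the abstract describes learning features with a deep Sparse Auto-Encoder, and the HMM sequence stage is omitted entirely. The names `nearest_codeword` and `bof_histogram` and all numeric values are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(vector, codebook):
    """Index of the codebook entry closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def bof_histogram(local_features, codebook):
    """Normalized histogram of codeword assignments for one image frame.

    Assumes `local_features` is non-empty.
    """
    counts = Counter(nearest_codeword(f, codebook) for f in local_features)
    total = len(local_features)
    return [counts[i] / total for i in range(len(codebook))]


# Hand-picked toy codebook (assumption: a real system would learn it).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# Four toy local descriptors extracted from one frame of a text line.
frame = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]

print(bof_histogram(frame, codebook))  # [0.25, 0.5, 0.25]
```

In a pipeline of this kind, one such histogram per sliding-window frame would form the observation sequence passed to the sequence model.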
Automatic Personality Prediction; an Enhanced Method Using Ensemble Modeling
A person's personality is significantly reflected in the words he or she uses
in speech or writing. As information infrastructures (specifically the
Internet and social media) have spread, human communication has shifted
notably away from face-to-face communication.
Generally, Automatic Personality Prediction (or Perception) (APP) is the
automated forecasting of personality from different types of human
generated/exchanged content (such as text, speech, image, video, etc.). The
major objective of this study is to enhance the accuracy of APP from text. To
this end, we suggest five new APP methods including term frequency
vector-based, ontology-based, enriched ontology-based, latent semantic analysis
(LSA)-based, and deep learning-based (BiLSTM) methods. These methods, as the
base models, complement each other to enhance APP accuracy through
ensemble modeling (stacking), with a hierarchical attention network (HAN) as
the meta-model. The results show that ensemble modeling enhances the accuracy
of APP.
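
The stacking idea described above, in which a meta-model learns from the predictions of several base models, can be sketched in a few lines. This sketch deliberately swaps in plain logistic regression as the meta-model instead of the hierarchical attention network used in the study, and the base-model scores are made-up toy values, not results from the paper. The function names `fit_meta_model` and `predict` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_model(base_outputs, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights on the base models' predictions.

    base_outputs: one row per sample, each a list of base-model scores in [0, 1].
    labels: 0/1 targets for the same samples.
    """
    n_samples = len(labels)
    n_features = len(base_outputs[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, y in zip(base_outputs, labels):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
            err = p - y  # gradient of the log loss with respect to the logit
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict(weights, bias, row):
    """Final ensemble decision for one sample's base-model scores."""
    z = bias + sum(w * x for w, x in zip(weights, row))
    return 1 if sigmoid(z) >= 0.5 else 0


# Toy held-out scores from three hypothetical base models for one trait.
base_outputs = [
    [0.9, 0.4, 0.7],
    [0.8, 0.6, 0.6],
    [0.7, 0.5, 0.8],
    [0.2, 0.5, 0.4],
    [0.3, 0.6, 0.3],
    [0.1, 0.4, 0.2],
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_meta_model(base_outputs, labels)
print([predict(weights, bias, row) for row in base_outputs])
```

In practice the meta-model should be trained on base-model predictions for data the base models did not see during their own training, so that it learns how far to trust each one rather than memorizing their training-set behavior.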
Archives, Access and Artificial Intelligence: Working with Born-Digital and Digitized Archival Collections
Digital archives are transforming the Humanities and the Sciences. Digitized collections of newspapers and books have pushed scholars to develop new, data-rich methods. Born-digital archives are now better preserved and managed thanks to the development of open-access and commercial software. Digital Humanities have moved from the fringe to the center of academia. Yet, the path from the appraisal of records to their analysis is far from smooth. This book explores crossovers between various disciplines to improve the discoverability, accessibility, and use of born-digital archives and other cultural assets