9 research outputs found
Deep Learning-based Recognition of Devanagari Handwritten Characters
Numerous techniques have been developed over the years for handwriting recognition, which is studied in two settings: online and offline. Image recognition is the core of the handwriting recognition process and must carefully account for the image's dimensions, viewing angle, and quality. Machine learning and deep learning are the two main approaches used by developers seeking to make computers more intelligent at this task. A person learns to perform a task by practising it repeatedly until it is remembered; the brain's neurons then work automatically, enabling the task to be carried out quickly. Deep learning works in a broadly similar way, using a variety of neural network architectures to address a range of problems. The convolutional neural network (CNN) is a particularly effective technique for handwriting and image recognition.
A study on idiosyncratic handwriting with impact on writer identification
© 2018 IEEE. In this paper, we study handwriting idiosyncrasy in terms of its structural eccentricity. Our approach is to find idiosyncratic handwritten text components and to model the idiosyncrasy analysis task as a machine learning problem supervised by human cognition, for which we employ the Inception network. The experiments are performed on two publicly available databases and an in-house database of Bengali offline handwritten samples, for which subjective opinion scores of handwriting idiosyncrasy were collected from handwriting experts. We analyze handwriting idiosyncrasy on this corpus, which comprises the perceptual ground-truth opinions. We also investigate the effect of idiosyncratic text on writer identification using SqueezeNet. The performance of our system is promising.
Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval
Recognition and retrieval of textual content from large document collections is a powerful use case for the document image analysis community. The word is often the basic unit for both recognition and retrieval. Systems that rely only on the text recogniser (OCR) output are not robust enough in many situations, especially when word recognition rates are poor, as in the case of historic documents or digital libraries. An alternative has been word-spotting-based methods, which retrieve/match words based on a holistic representation of the word. In this paper, we fuse the noisy output of a text recogniser with a deep embedding representation derived from the entire word. We use average and max fusion to improve the ranked results in the case of retrieval. We validate our methods on a collection of Hindi documents, improving the word recognition rate by 1.4 and retrieval by 11.13 in mAP.
Comment: 15 pages, 8 figures; accepted at the IAPR International Workshop on Document Analysis Systems (DAS) 2020. Project page: http://cvit.iiit.ac.in/research/projects/cvit-projects/fused-text-recogniser-and-deep-embeddings-improve-word-recognition-and-retrieval
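The average and max fusion of recogniser and embedding scores described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code; the function name and the per-candidate score dictionaries are hypothetical.

```python
def fuse_scores(ocr_scores, emb_scores, mode="avg"):
    """Fuse per-candidate similarity scores from a text recogniser and a
    deep-embedding matcher, then return candidates ranked by fused score.

    ocr_scores / emb_scores: dicts mapping candidate word -> score in [0, 1].
    mode: "avg" (average fusion) or "max" (max fusion).
    """
    fused = {}
    for word in set(ocr_scores) | set(emb_scores):
        a = ocr_scores.get(word, 0.0)
        b = emb_scores.get(word, 0.0)
        fused[word] = max(a, b) if mode == "max" else (a + b) / 2.0
    # Higher fused score = better rank for retrieval.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

With noisy OCR scores, the embedding score can rescue a correct candidate (and vice versa), which is the intuition behind fusing the two sources before ranking.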
Cognitive Analysis for Reading and Writing of Bengali Conjuncts
© 2018 IEEE. In this paper, we study the difficulties that human beings face in reading and writing Bengali conjunct characters. Such difficulties appear when the human cognitive system faces certain obstructions to effortless reading/writing. In our computer-based investigation, we treat the reading/writing difficulty analysis task as a machine learning problem supervised by human perception. To this end, we employ two distinct models: (a) an auto-derived feature-based Inception network and (b) a hand-crafted feature-based SVM (Support Vector Machine). Two commonly used Bengali printed fonts and three contemporary handwritten databases are used to collect subjective opinion scores from human readers/writers. We conduct our experiments on this corpus, which contains the perceptual ground-truth opinions of reading/writing complications. The experimental results obtained on various types of conjunct characters are promising.
Design of an Offline Handwriting Recognition System Tested on the Bangla and Korean Scripts
This dissertation presents a flexible and robust offline handwriting recognition system, tested on the Bangla and Korean scripts. Offline handwriting recognition is one of the most challenging and as yet unsolved problems in machine learning. While a few popular scripts (like Latin) have received a lot of attention, many other widely used scripts (like Bangla) have seen very little progress. Features such as connectedness and vowels structured as diacritics make Bangla a challenging script to recognize. A simple and robust design for offline recognition is presented which not only works reliably but can also be applied to almost any alphabetic writing system. The framework has been rigorously tested on Bangla, and experiments on the Korean script, whose two-dimensional arrangement of characters makes it a challenge to recognize, demonstrate how the framework can be adapted to other scripts.
The base of this design is a character spotting network which detects the location of different script elements (such as characters, diacritics) from an unsegmented word image. A transcript is formed from the detected classes based on their corresponding location information. This is the first reported lexicon-free offline recognition system for Bangla and achieves a Character Recognition Accuracy (CRA) of 94.8%. This is also one of the most flexible architectures ever presented. Recognition of Korean was achieved with a 91.2% CRA. Also, a powerful technique of autonomous tagging was developed which can drastically reduce the effort of preparing a dataset for any script. The combination of the character spotting method and the autonomous tagging brings the entire offline recognition problem very close to a singular solution.
Additionally, a database named the Boise State Bangla Handwriting Dataset was developed. This is one of the richest offline datasets currently available for Bangla, and it has been made publicly accessible to accelerate research progress. Many other tools were developed, and experiments were conducted to validate this framework more rigorously by evaluating the method against external datasets (CMATERdb 1.1.1, Indic Word Dataset and REID2019: Early Indian Printed Documents). Offline handwriting recognition is an extremely promising technology, and the outcome of this research moves the field significantly ahead.
Graphonomics and your Brain on Art, Creativity and Innovation : Proceedings of the 19th International Graphonomics Conference (IGS 2019 – Your Brain on Art)
A single track, international forum for discussion on recent advances at the intersection of the creative arts, neuroscience, engineering, media, technology, industry, education, design, forensics, and medicine.
The contributions reviewed the state of the art, identified challenges and opportunities and created a roadmap for the field of graphonomics and your brain on art.
The topics addressed include: integrative strategies for understanding neural, affective and cognitive systems in realistic, complex environments; neural and behavioral individuality and variation; neuroaesthetics (the use of neuroscience to explain and understand aesthetic experiences at the neurological level); creativity and innovation; neuroengineering and brain-inspired art, creative concepts and wearable mobile brain-body imaging (MoBI) designs; creative art therapy; informal learning; education; forensics.
Handwriting Analysis and Personality: A Computerized Study on the Validity of Graphology
Handwriting analysis, also known as graphology, is the analysis of the psychological structure of a human subject through his/her handwriting. It has recently been applied in fields where crucial decisions must be made, such as forensic evidence, criminology, and disease analysis. However, basing crucial decisions on the results of handwriting analysis is scientifically controversial because the validity of graphology rules is still in question.
A few validity studies on handwriting analysis have been conducted, evaluating the correlation between psychological questionnaires and manual handwriting analysis, and they ended with conflicting results. Manual handwriting analysis suffers from several issues that may explain these early inconsistencies. Therefore, this work presents an empirical study that investigates the validity of graphology rules by evaluating the correlation between a psychological test, the Big Five Factor Markers Test, and our proposed automated handwriting analysis system, which measures the level of the same big five personality traits based on graphological rules.
Our automated BFFM system is called Averaging of SMOTE multi-label SVM-CNN (AvgMlSC). It constructs synthetic samples using the Synthetic Minority Oversampling Technique (SMOTE) and averages two learning-based classifiers, a multi-label Support Vector Machine and a multi-label Convolutional Neural Network, based on offline handwriting recognition to produce one optimal predictive model. The model is trained on 1,066 handwriting samples written in English, French, Chinese, Arabic, and Spanish. The results reveal that our proposed model outperformed five traditional models with 93% predictive accuracy, 0.94 AUC, and a 90% F-score.
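The SMOTE step mentioned above generates synthetic minority-class samples by interpolating between a minority point and one of its nearest neighbours. The sketch below illustrates that core idea only; it is not the AvgMlSC implementation, and the function name and parameters are hypothetical.

```python
import random

def smote_sample(minority, k=2, rng=None):
    """Generate one synthetic sample from a list of minority-class points
    (tuples of floats), by interpolating between a randomly chosen point
    and one of its k nearest neighbours -- the essence of SMOTE.
    """
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the remaining minority points by squared Euclidean distance.
    others = sorted((p for p in minority if p is not base),
                    key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
```

Because each synthetic point lies on a segment between two real minority samples, oversampling stays inside the minority region rather than duplicating points exactly.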
Spearman's correlation test reports a statistically significant relationship between the Big Five questionnaire scores and the graphologist's evaluation of the Big Five factors, with varying strengths of relationship: a weak positive relationship for Extraversion, a moderate positive relationship for Conscientiousness and Openness to Experience, a strong positive relationship for Agreeableness, and a very weak positive relationship for the last factor, Emotional Stability.
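Spearman's correlation, used in the analysis above, is the Pearson correlation of the ranks; with no tied ranks it reduces to the classic rank-difference formula. A minimal sketch under that no-ties assumption:

```python
def spearman_rho(xs, ys):
    """Spearman's rank correlation via rho = 1 - 6*sum(d^2) / (n*(n^2-1)),
    where d is the per-item difference between the ranks of xs and ys.
    Assumes no tied values, for brevity.
    """
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, 1):
            r[i] = rank
        return r
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

Perfectly monotone agreement gives rho = 1, perfect reversal gives rho = -1, which is why the coefficient is read as a "strength of relationship" between questionnaire scores and system output.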
Advanced document data extraction techniques to improve supply chain performance
In this thesis, a novel machine learning technique to extract text-based information from scanned images has been developed. This information extraction is performed in the context of scanned invoices and bills used in financial transactions. These financial transactions contain a considerable amount of data that must be extracted, refined, and stored digitally before it can be used for analysis. Converting this data into a digital format is often a time-consuming process. Automation and data optimisation show promise as methods for reducing the time required and the cost of Supply Chain Management (SCM) processes, especially Supplier Invoice Management (SIM), Financial Supply Chain Management (FSCM) and Supply Chain procurement processes. This thesis uses a cross-disciplinary approach involving Computer Science and Operational Management to explore the benefit of automated invoice data extraction in business and its impact on SCM. The study adopts a multimethod approach based on empirical research, surveys, and interviews performed on selected companies. The expert system developed in this thesis focuses on two distinct areas of research: Text/Object Detection and Text Extraction. For Text/Object Detection, the Faster R-CNN model was analysed. While this model yields outstanding results in terms of object detection, it is limited by poor performance when image quality is low. A Generative Adversarial Network (GAN) model is proposed in response to this limitation; it comprises a generator implemented with the help of the Faster R-CNN model and a discriminator that relies on PatchGAN. The output of the GAN model is text data with bounding boxes.
For text extraction from the bounding boxes, a novel data extraction framework was designed, consisting of various processes: XML processing (in the case of an existing OCR engine), bounding-box pre-processing, text clean-up, OCR error correction, spell check, type check, pattern-based matching, and finally a learning mechanism for automating future data extraction. Fields that the system extracts successfully are provided in key-value format. The efficiency of the proposed system was validated using existing datasets such as SROIE and VATI. Real-time data was validated using invoices collected by two companies that provide invoice automation services in various countries. Currently, these scanned invoices are sent to an OCR system such as OmniPage, Tesseract, or ABBYY FRE to extract text blocks, and later a rule-based engine is used to extract relevant data. While this methodology is robust, the companies surveyed were not satisfied with its accuracy and sought new, optimised solutions. To confirm the results, the engines were used to return XML-based files with text and metadata identified. The output XML data was then fed into the new system for information extraction. This system uses the existing OCR engine and a novel, self-adaptive, learning-based OCR engine based on the GAN model for better text identification. Experiments were conducted on various invoice formats to further test and refine its extraction capabilities. For cost optimisation and the analysis of spend classification, additional data were provided by another company in London that holds expertise in reducing its clients' procurement costs. This data was fed into our system to obtain a deeper level of spend classification and categorisation.
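The pattern-based matching and key-value output described above can be illustrated with a small sketch. The field names and regular expressions here are hypothetical examples, not the system's actual rules, which would also involve OCR error correction and learned templates.

```python
import re

# Hypothetical patterns for a few common invoice fields.
PATTERNS = {
    "invoice_no": re.compile(r"Invoice\s*(?:No\.?|#)\s*[:\-]?\s*(\S+)", re.I),
    "total": re.compile(r"Total\s*[:\-]?\s*\$?([\d,]+\.\d{2})", re.I),
    "date": re.compile(r"Date\s*[:\-]?\s*(\d{2}/\d{2}/\d{4})", re.I),
}

def extract_fields(text):
    """Run each field pattern over the OCR text and return whichever
    fields matched, in key-value format (unmatched fields are omitted)."""
    out = {}
    for key, pat in PATTERNS.items():
        m = pat.search(text)
        if m:
            out[key] = m.group(1)
    return out
```

A type-check pass (e.g. validating that the captured date parses) and a spell-check pass on OCR output would sit before and after this matching stage in the full pipeline.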
This helped the company to reduce its reliance on human effort and allowed for greater efficiency compared with performing similar tasks manually using Excel sheets and Business Intelligence (BI) tools. The intention behind the development of this novel methodology was threefold: first, to test and develop a novel solution that does not depend on any specific OCR technology; second, to increase information extraction accuracy over that of existing methodologies; and third, to evaluate the real-world need for the system and the impact it would have on SCM. This newly developed method is generic and can extract text from any given invoice, making it a valuable tool for optimising SCM. In addition, the system uses a template-matching approach to ensure the quality of the extracted information.
Offline cursive Bengali word recognition using CNNs with a recurrent model
© 2016 IEEE. This paper deals with offline handwritten word recognition of a major Indic script: Bengali. Due to the structure of this script, the characters (mostly ortho-syllables) frequently overlap and are hard to segment, especially when the writing is cursive. Recognizing individual characters and combining the outputs can increase the likelihood of errors. Instead, a better approach can be sending the whole word to a suitable recognizer. Here we use a Convolutional Neural Network (CNN) integrated with a recurrent model for this purpose. Long short-term memory blocks are used as hidden units, and the CNN-derived features are employed in the recurrent model with a CTC (Connectionist Temporal Classification) layer to get the output. We have tested our method on three datasets: (a) a publicly available dataset, (b) a new dataset generated by our research group, and (c) an unconstrained dataset. Dataset (a) contains 17,091 words, our dataset (b) contains 107,550 words in total, and dataset (c) comprises 5,223 words. We have compared our results with those of earlier work in the area and found improved performance, which is due to the novel integration of CNNs with the recurrent model.
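The CTC layer's decoding step, collapsing the recurrent model's per-frame predictions into a label sequence, can be sketched as follows. This shows best-path (greedy) decoding only, assuming the network's per-frame argmax labels as input with 0 as the blank symbol; the function name is illustrative.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame best-path labelling into an output sequence,
    as a CTC decoder does: merge consecutive repeats, then drop blanks.
    A blank between two identical labels keeps them as separate outputs.
    """
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out
```

This alignment-free decoding is what lets the whole word be fed to the recognizer without segmenting the overlapping ortho-syllables first.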