DeepScribe: Localization and Classification of Elamite Cuneiform Signs Via Deep Learning
Twenty-five hundred years ago, the paperwork of the Achaemenid Empire was
recorded on clay tablets. In 1933, archaeologists from the University of
Chicago's Oriental Institute (OI) found tens of thousands of these tablets and
fragments during the excavation of Persepolis. Many of these tablets have been
painstakingly photographed and annotated by expert cuneiformists, and now
provide a rich dataset consisting of over 5,000 annotated tablet images and
100,000 cuneiform sign bounding boxes. We leverage this dataset to develop
DeepScribe, a modular computer vision pipeline capable of localizing cuneiform
signs and providing suggestions for the identity of each sign. We investigate
the difficulty of learning subtasks relevant to cuneiform tablet transcription
on ground-truth data, finding that a RetinaNet object detector can achieve a
localization mAP of 0.78 and a ResNet classifier can achieve a top-5 sign
classification accuracy of 0.89. The end-to-end pipeline achieves a top-5
classification accuracy of 0.80. As part of the classification module,
DeepScribe groups cuneiform signs into morphological clusters. We consider how
this automatic clustering approach differs from the organization of standard,
printed sign lists and what we may learn from it. These components, trained
individually, are sufficient to produce a system that can analyze photos of
cuneiform tablets from the Achaemenid period and provide useful transliteration
suggestions to researchers. We evaluate the model's end-to-end performance on
locating and classifying signs, providing a roadmap to a linguistically-aware
transliteration system, then consider the model's potential utility when
applied to other periods of cuneiform writing.
Comment: Currently under review in the ACM JOCC
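The top-5 classification accuracy reported above counts a prediction as correct when the true sign is among the five highest-scoring classes. The following is a minimal sketch of that metric, not the DeepScribe code: the function name `top_k_accuracy` and the toy scores are assumptions made for illustration.

```python
import numpy as np

def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: (n_samples, n_classes) array of classifier scores (hypothetical input).
    labels: (n_samples,) array of true class indices.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Indices of the k largest scores in each row; their order within the k is irrelevant.
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Toy example: 3 samples, 4 classes, evaluated at k=2 instead of k=5.
toy_scores = [
    [0.10, 0.50, 0.30, 0.05],
    [0.70, 0.05, 0.15, 0.10],
    [0.20, 0.25, 0.50, 0.05],
]
toy_labels = [2, 1, 2]
toy_top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

In an end-to-end setting, a score of this kind would also depend on the detector, since a sign that is never localized cannot be classified; how the paper accounts for that is not stated in the abstract.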
Handwritten text generation and strikethrough characters augmentation
We introduce two data augmentation techniques which, used with a Resnet-BiLSTM-CTC network, significantly reduce Word Error Rate and Character Error Rate beyond the best reported results on handwriting text recognition tasks. We apply a novel augmentation that simulates strikethrough text (HandWritten Blots) and a handwritten text generation method based on printed text (StackMix), which proved very effective in handwriting text recognition tasks. StackMix uses a weakly supervised framework to obtain character boundaries. Because these data augmentation techniques are independent of the network used, they could also be applied to enhance the performance of other networks and approaches to handwriting text recognition. Extensive experiments on ten handwritten text datasets show that HandWritten Blots augmentation and StackMix significantly improve the quality of handwriting text recognition models.
Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics
This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal builds on planar homography, to propose regions of interest in which to look for objects, and on recursive Bayesian filtering, to integrate observations over time. The proposal is evaluated on six virtual indoor environments, covering the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves the recall and the F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) compared to a two-stage video object detection method used as baseline, at the cost of a small time overhead (120 ms) and precision loss (0.92).
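The two ingredients named above can be illustrated generically. This is a sketch under stated assumptions, not the paper's implementation: the helper names `warp_box`, `bayes_update`, and `entropy`, the translation-only homography, and the per-frame likelihoods are all hypothetical.

```python
import numpy as np

def warp_box(H, box):
    """Map an axis-aligned box (x1, y1, x2, y2) through a 3x3 homography H.

    Returns the axis-aligned box enclosing the four warped corners,
    usable as a region of interest in the next frame.
    """
    x1, y1, x2, y2 = box
    corners = np.array([[x1, y1, 1.0], [x2, y1, 1.0],
                        [x2, y2, 1.0], [x1, y2, 1.0]]).T  # shape (3, 4)
    warped = np.asarray(H, dtype=float) @ corners
    xs, ys = warped[:2] / warped[2]  # back from homogeneous coordinates
    return (float(xs.min()), float(ys.min()), float(xs.max()), float(ys.max()))

def bayes_update(prior, likelihood):
    """One recursive Bayes step over object categories: posterior ∝ likelihood × prior."""
    posterior = np.asarray(prior, dtype=float) * np.asarray(likelihood, dtype=float)
    total = posterior.sum()
    if total == 0:
        raise ValueError("likelihood is incompatible with the prior")
    return posterior / total

def entropy(p):
    """Shannon entropy in bits of a categorical distribution."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())

# Assumed camera motion: a pure image-plane translation of (+5, -2) pixels.
H_shift = [[1, 0, 5], [0, 1, -2], [0, 0, 1]]
roi = warp_box(H_shift, (0, 0, 10, 10))

# Start from a uniform belief over three categories and fuse two frames
# whose (made-up) detector scores both favour category 0.
belief = np.full(3, 1 / 3)
for frame_likelihood in ([0.6, 0.3, 0.1], [0.6, 0.3, 0.1]):
    belief = bayes_update(belief, frame_likelihood)
```

Fusing consistent observations concentrates the belief on one category, which is the sense in which integrating detections over time can lower categorization entropy.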
Deep Learning Based Real Time Devanagari Character Recognition
Advances in the technology behind optical character recognition (OCR) have made it one of the technologies with many uses across industry. Today, OCR is available for several languages and can recognize characters in real time, but for some languages the technology has not developed much. These advancements have been possible because of the introduction of concepts like artificial intelligence and deep learning. Deep neural networks have proven to be the best choice for recognition tasks, and many algorithms and models can be used for this purpose. This project attempts to implement and optimize a deep learning-based model that can recognize characters of the Devanagari script in real time by analyzing hand movements.