
    Human Reading Based Strategies for off-line Arabic Word Recognition

    This paper summarizes some techniques proposed for off-line Arabic word recognition. The point of view developed here concerns human reading, which favors an interactive mechanism between global memorization and local checking that eases the recognition of complex scripts such as Arabic. In light of this, some representative papers are analyzed and their strategies commented on.
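
    A minimal, self-contained toy can make the abstract's central idea concrete: a global pass recalls whole-word shapes from a lexicon, and a local pass checks character-level details to decide among them. Everything below (the lexicon, the shape features, the scoring) is an illustrative assumption, not a method taken from the surveyed papers.

    ```python
    # Toy sketch of the global/local reading strategy (all details assumed).
    LEXICON = ["kitab", "katib", "kataba", "maktab"]

    def global_shape(word):
        # Holistic cue: word length plus a crude "ascender" count stands in
        # for the overall word envelope a reader memorizes.
        return (len(word), sum(c in "bdkl" for c in word))

    def local_score(observed, candidate):
        # Local checking: position-by-position agreement of fine details.
        return sum(a == b for a, b in zip(observed, candidate))

    def recognize(observed, k=3):
        # Global memorization: shortlist the k lexicon words whose holistic
        # shape is closest to the observed one.
        t = global_shape(observed)
        shortlist = sorted(
            LEXICON,
            key=lambda w: abs(global_shape(w)[0] - t[0]) + abs(global_shape(w)[1] - t[1]),
        )[:k]
        # Local checking then picks the best-verified candidate.
        return max(shortlist, key=lambda w: local_score(observed, w))

    print(recognize("katab"))  # -> "kataba"
    ```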

    Deep Aramaic: Towards a synthetic data paradigm enabling machine learning in epigraphy

    Epigraphy is witnessing a growing integration of artificial intelligence, notably through its subfield of machine learning (ML), especially in tasks like extracting insights from ancient inscriptions. However, scarce labeled data for training ML algorithms severely limits current techniques, especially for ancient scripts like Old Aramaic. Our research pioneers a methodology for generating synthetic training data tailored to Old Aramaic letters. Our pipeline synthesizes photo-realistic Aramaic letter datasets, incorporating textural features, lighting, damage, and augmentations to mimic real-world inscription diversity. Despite minimal real examples, we engineer a dataset of 250,000 training and 25,000 validation images covering the 22 letter classes in the Aramaic alphabet. This comprehensive corpus provides a robust volume of data for training a residual neural network (ResNet) to classify highly degraded Aramaic letters. The ResNet model demonstrates 95% accuracy in classifying real images from the 8th-century BCE Hadad statue inscription. Additional experiments validate performance on varying materials and styles, demonstrating effective generalization. Our results confirm the model's ability to handle diverse real-world scenarios, proving the viability of our synthetic data approach and removing the dependence on scarce training data that has constrained epigraphic analysis. Our framework thus improves interpretation accuracy on damaged inscriptions, enhancing knowledge extraction from these historical resources.
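
    The generation pipeline itself is not reproduced in the abstract, but the training stage it describes can be sketched. The following assumes a torchvision ResNet-50 (the abstract does not state the depth), an ImageFolder-style directory of synthetic letter images, and illustrative augmentations and hyperparameters; all names and values are assumptions, not details from the paper.

    ```python
    # Hedged sketch: train a ResNet to classify 22 Aramaic letter classes
    # from synthetic images (directory layout and hyperparameters assumed).
    import torch
    import torch.nn as nn
    from torchvision import datasets, models, transforms

    train_tf = transforms.Compose([
        transforms.Grayscale(num_output_channels=3),           # stone inscriptions are near-monochrome
        transforms.RandomRotation(10),                         # varied carving/photo angles
        transforms.ColorJitter(brightness=0.4, contrast=0.4),  # lighting variation
        transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),   # framing and partial damage
        transforms.ToTensor(),
    ])

    # Assumed layout: synthetic/train/<letter_class>/*.png (22 class folders)
    train_ds = datasets.ImageFolder("synthetic/train", transform=train_tf)
    loader = torch.utils.data.DataLoader(train_ds, batch_size=64, shuffle=True)

    model = models.resnet50(weights=None)           # depth is an assumption
    model.fc = nn.Linear(model.fc.in_features, 22)  # 22 Aramaic letter classes

    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for epoch in range(10):                          # epoch count assumed
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    ```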

    NarDis: Narrativizing Disruption - How exploratory search can support media researchers to interpret ‘disruptive’ media events as lucid narratives

    This project investigates how CLARIAH’s exploratory search and linked open data (LOD) browser DIVE+ supports media researchers in constructing narratives about events, especially ‘disruptive’ events such as terrorist attacks and natural disasters. It approaches this question through user studies examining how researchers use and create narratives with exploratory search tools, particularly DIVE+, to understand media events. These user studies were organized as workshops (using co-creation as an iterative approach to map search practices and storytelling data, including focus groups and interviews; tasks and talk-aloud protocols; surveys/questionnaires; and research diaries) and included more than 100 (digital) humanities researchers across Europe. Insights from these workshops show that exploratory search does facilitate the development of new research questions around disruptive events: DIVE+ triggers academic curiosity by suggesting alternative connections between entities. Besides yielding insight into the research practices of (digital) humanities researchers and how these can be supported with digital tools, the pilot also led to improvements to the DIVE+ browser. It helped optimize the browser’s functionalities, making it possible for users to annotate paths of search narratives and save these in CLARIAH’s overarching, personalised user space. The pilot was widely promoted at national and international conferences, and DIVE+ won the international LODLAM (Linked Open Data in Libraries, Archives and Museums) Challenge Grand Prize in Venice (2017).

    Biblia Arabica: An Update on the State of Research

    The aim of this contribution is to review some of the major areas of current research on the Arabic Bible, along with the factors and trends contributing to them. We also present some of the tools currently under development by the Biblia Arabica team in Munich. We provide a very condensed survey of the transmission of traditions, as well as the ways biblical manuscripts in Arabic have been analysed and classified, covering both the Old Testament/Hebrew Bible and the New Testament. Overall, the lack of critical editions for Arabic biblical texts reflects not just the overwhelming number of versions and manuscripts, but also the fundamental challenge these translations present on the level of textuality: standard paradigms of authorship and transmission break down in the face of the complex reuse, revision, and layering of paratexts seen in these texts. It is the careful study of manuscripts, not simply as texts but also as physical objects, that holds promise for reconstructing the practices of producers and consumers of the Arabic Bible. A union catalogue of Arabic Bible manuscripts will gather the paleographic and codicological information necessary for further research. Moreover, it will link manuscripts, translators, and scribes to the online Bibliography of the Arabic Bible, intended as a comprehensive, classified, and searchable reference tool for secondary literature. In conclusion, scholarship on the Arabic Bible now has considerable momentum, but it must keep its fundamental resource, the manuscripts, in the foreground of research.

    Learning to Read Bushman: Automatic Handwriting Recognition for Bushman Languages

    The Bleek and Lloyd Collection contains notebooks that document the tradition, language and culture of the Bushman people who lived in South Africa in the late 19th century. Transcriptions of these notebooks would allow for the provision of services such as text-based search and text-to-speech. However, these notebooks are currently only available in the form of digital scans, and the manual creation of transcriptions is a costly and time-consuming process. Thus, automatic methods could serve as an alternative approach to creating transcriptions of the text in the notebooks. In order to evaluate the use of automatic methods, a corpus of Bushman texts and their associated transcriptions was created. The creation of this corpus involved: the development of a custom method for encoding the Bushman script, which contains complex diacritics; the creation of a tool for creating and transcribing the texts in the notebooks; and the running of a series of workshops in which the tool was used to create the corpus. The corpus was used to evaluate various techniques for automatically transcribing the texts, in order to determine which approaches were best suited to the complex Bushman script. These techniques included the use of Support Vector Machines, Artificial Neural Networks and Hidden Markov Models as machine learning algorithms, coupled with different descriptive features. The effect of the texts used for training the machine learning algorithms was also investigated, as was the use of a statistical language model. It was found that, for Bushman word recognition, a Support Vector Machine with Histograms of Oriented Gradient features resulted in the best performance and, for Bushman text line recognition, Marti & Bunke features resulted in the best performance when used with Hidden Markov Models. The automatic transcription of the Bushman texts proved to be difficult, and the performance of the different recognition systems was largely affected by the complexities of the Bushman script. It was also found that, besides influencing which techniques may be most appropriate for automatic handwriting recognition, the texts used in an automatic handwriting recognition system also play a large role in determining whether or not automatic recognition should be attempted at all.
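
    The best word-recognition configuration named in the abstract, HOG features with a Support Vector Machine, can be sketched with scikit-image and scikit-learn. The corpus layout, image size, and SVM parameters below are illustrative assumptions, not values reported in the thesis.

    ```python
    # Hedged sketch: Bushman word classification with HOG features and an SVM.
    import glob
    import os

    import numpy as np
    from skimage.feature import hog
    from skimage.io import imread
    from skimage.transform import resize
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    def hog_features(path, shape=(64, 128)):
        # Normalize each word image to a fixed size, then extract a HOG descriptor.
        img = resize(imread(path, as_gray=True), shape)
        return hog(img, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

    # Assumed corpus layout: bushman_corpus/<word_label>/<image>.png
    paths = glob.glob("bushman_corpus/*/*.png")
    labels = [os.path.basename(os.path.dirname(p)) for p in paths]

    X = np.array([hog_features(p) for p in paths])
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)

    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(X_tr, y_tr)
    print("held-out word accuracy:", clf.score(X_te, y_te))
    ```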