Search CORE

1,215 research outputs found

A tool for semiautomatic cataloguing of an islamic digital library: a use case from the Digital Maktaba project (short paper)

Author: Martoglia R.
Sala L.
Vanzini M.
Vigliermo R.
Publication date: 01/01/2022

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Novel Perspectives for the Management of Multilingual and Multialphabetic Heritages through Automatic Knowledge Extraction: The DigitalMaktaba Approach

Author: Federico Ruozzi
Luca Sala
Matteo Vanzini
Riccardo Amerigo Vigliermo
Riccardo Martoglia
Sonia Bergamaschi
Stefania De Nardis
Publication venue: 'MDPI AG'
Publication date: 01/01/2022

The linguistic and social impact of multiculturalism can no longer be neglected in any sector, creating an urgent need for systems and procedures for managing and sharing cultural heritages in both supranational and multi-literate contexts. To achieve this goal, text sensing appears to be one of the most crucial research areas. The long-term objective of the DigitalMaktaba project, born from interdisciplinary collaboration between computer scientists, historians, librarians, engineers and linguists, is to establish procedures for the creation, management and cataloguing of archival heritage in non-Latin alphabets. In this paper, we discuss the ongoing design of an innovative workflow and tool in the area of text sensing, for the automatic extraction of knowledge and cataloguing of documents written in non-Latin languages (Arabic, Persian and Azerbaijani). The current prototype leverages different OCR, text processing and information extraction techniques to provide both highly accurate extracted text and rich metadata content (including automatically identified cataloguing metadata), overcoming typical limitations of current state-of-the-art approaches. The initial tests provide promising results. The paper includes a discussion of future steps (e.g., AI-based techniques that further leverage the extracted data/metadata and make the system learn from user feedback) and of the many foreseen advantages of this research, from both a technical and a broader cultural-preservation and sharing point of view.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Inviting AI into the archives: The reception of handwritten recognition technology into historical manuscript transcription

Author: Terras Melissa
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 21/04/2022

Edinburgh Research Explorer

Offline MODI Character Recognition Using Complex Moments

Author: Prashant L. Borde
Pravin L. Yannawar
Ramesh R. Manza
Sadanand A. Kulkarni
Publication venue: The Authors. Published by Elsevier B.V.
Publication date: 31/12/2015

Abstract: The varying writing styles and critical representation of characters in Indian scripts make Handwritten Optical Character Recognition (HOCR) challenging, and this has attracted researchers to contribute to the domain. The ‘MODI’ script is a cursive form of writing associated with Devanagari and Marathi, in which a large number of historical documents are available and need to be digitally explored. The principal objective of this research work is to describe the efficiency of Zernike complex moments and Zernike moments with different zoning patterns for offline recognition of handwritten ‘MODI’ characters. Every character was divided using six zoning patterns with 37 zones. Geometrical shapes were used to create the zoning patterns. The work achieved a 94.92% correct recognition rate using Zernike moments and 94.78% using Zernike complex moments, with an integrated approach for heterogeneous zones.
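As a rough illustration of the features named in this abstract, the sketch below computes Zernike moments over zones of a character image. It is not the authors' implementation: the function names (`radial_poly`, `zernike_moment`, `zoned_features`), the square rectangular grid of zones, and the chosen moment orders are all assumptions made for the example, whereas the paper builds its zoning patterns from geometrical shapes.

```python
import math
import numpy as np

def radial_poly(n, m, rho):
    """Zernike radial polynomial R_n^|m|(rho); needs |m| <= n and n - |m| even."""
    m = abs(m)
    out = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s)
                    * math.factorial((n + m) // 2 - s)
                    * math.factorial((n - m) // 2 - s)))
        out += coeff * rho ** (n - 2 * s)
    return out

def zernike_moment(img, n, m):
    """Complex Zernike moment of a square image mapped onto the unit disk."""
    size = img.shape[0]
    # Pixel centres, placed symmetrically in (-1, 1).
    coords = (2 * np.arange(size) + 1 - size) / size
    x, y = np.meshgrid(coords, coords)
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0  # only pixels inside the unit disk contribute
    # Complex conjugate of the Zernike basis function V_nm.
    basis = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
    pixel_area = (2.0 / size) ** 2
    return (n + 1) / math.pi * np.sum(img[inside] * basis[inside]) * pixel_area

def zoned_features(img, grid=2, orders=((2, 0), (2, 2), (3, 1))):
    """Moment magnitudes per zone; the rectangular grid is an assumption."""
    size = img.shape[0] // grid
    feats = []
    for r in range(grid):
        for c in range(grid):
            zone = img[r * size:(r + 1) * size, c * size:(c + 1) * size]
            feats.extend(abs(zernike_moment(zone, n, m)) for n, m in orders)
    return np.array(feats)
```

The magnitude of a Zernike moment does not change when the image is rotated, which is one reason such moments are commonly used as shape descriptors. A feature vector like the one returned by `zoned_features` would then be passed to a classifier, which this sketch does not cover.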

Elsevier - Publisher Connector

Segmentation-free Word Spotting for Handwritten Arabic Documents

Author: Chenouni Driss
El Yacoubi Mounîm
Elfakir Youssef
Khaissidi Ghizlane
Mrabti Mostafa
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 07/07/2021

In this paper we present an unsupervised, segmentation-free method for word spotting and query search in images of handwritten Arabic documents. Histograms of Oriented Gradients (HOGs) are used as the feature vectors to represent the query and document images. We then compress the descriptors with the product quantization method. Finally, a better representation of the query is obtained by using Support Vector Machines (SVM).
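The compression step mentioned in this abstract can be sketched as follows. This is a minimal product quantization example under stated assumptions, not the authors' code: the descriptors are assumed to be already available as a NumPy array (HOG extraction and the SVM step are not shown), the helper names (`kmeans`, `pq_train`, `pq_encode`, `pq_decode`, `pq_distances`) are hypothetical, and a small hand-written k-means stands in for whatever clustering the paper uses.

```python
import numpy as np

def kmeans(data, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns a (k, dim) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def pq_train(vectors, n_sub, k):
    """Learn one codebook per sub-vector; dim must be divisible by n_sub."""
    subs = np.split(vectors, n_sub, axis=1)
    return [kmeans(s, k, seed=i) for i, s in enumerate(subs)]

def pq_encode(vectors, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    subs = np.split(vectors, len(codebooks), axis=1)
    codes = [((s[:, None, :] - cb[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
             for s, cb in zip(subs, codebooks)]
    return np.stack(codes, axis=1).astype(np.uint8)  # assumes k <= 256

def pq_decode(codes, codebooks):
    """Approximate reconstruction by concatenating the selected centroids."""
    return np.hstack([cb[codes[:, i]] for i, cb in enumerate(codebooks)])

def pq_distances(query, codes, codebooks):
    """Squared distances from an uncompressed query to every encoded vector."""
    subs = np.split(query, len(codebooks))
    # One lookup table per sub-space: distance from the query part to each centroid.
    tables = [((cb - q) ** 2).sum(axis=1) for q, cb in zip(subs, codebooks)]
    return sum(t[codes[:, i]] for i, t in enumerate(tables))
```

With `n_sub` codebooks of at most 256 centroids each, every descriptor is stored as `n_sub` bytes instead of a full floating-point vector, and a query can be compared against all stored descriptors through table lookups alone.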

Re-UNIR

Access to recorded interviews: A research agenda

Author: Heeren W.F.L.
Jong F.M.G. de
Oard D.W.
Ordelman R.J.F.
Publication venue: ACM
Publication date: 01/01/2008

Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.

University of Twente Research Information