
    Αναζήτηση Λέξεων σε Εικόνες Ιστορικών Εγγράφων

    In this PhD thesis, innovative methods of word spotting on historical printed documents are presented. In particular, two methods based on document segmentation at the word level have been developed. The first method uses a hybrid feature scheme, based on zones and projections, for word matching. It also provides a process for creating query keyword images for any word from synthetic data: the synthetic words are assembled from images of individual characters taken from the processed documents. The method additionally incorporates user feedback in order to improve the final results. The second method uses the Dynamic Time Warping (DTW) algorithm for comparing word images. It eases the transition from synthetic-to-synthetic to synthetic-to-real comparison: because synthetic and real word images differ, DTW absorbs local irregularities and allows a better alignment between the features of the two images. Again, feedback can be applied to improve the results. Furthermore, a method that uses no segmentation of the document images has also been developed. It detects query keyword images directly on entire document page images, thereby overcoming the problem of incorrect segmentation, which significantly affects the final result. It also allows for partial matching, such as detecting words that are contained in larger ones, as happens with compound words. The evaluation of these methods showed satisfactory results, outperforming competitive methods of word spotting on historical documents.
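The DTW comparison described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the sequences stand in for zone/projection features of two word images, and the values are invented for the example.

```python
# Minimal Dynamic Time Warping (DTW) sketch between two 1-D feature
# sequences, of the kind used to absorb local distortions when comparing
# a synthetic query word image with a real segmented word image.

def dtw_distance(a, b):
    """Return the DTW alignment cost between feature sequences a and b."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])              # local feature distance
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]

# A word profile and a locally stretched version of the same profile:
query = [0, 2, 5, 5, 3, 0]
word  = [0, 2, 2, 5, 5, 5, 3, 0]
print(dtw_distance(query, word))  # → 0.0 despite the different lengths
```

The warping path lets one query frame align with several word frames, which is exactly what makes the comparison tolerant to the stretch between synthetic and real glyphs.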

    Classifying Human Leg Motions with Uniaxial Piezoelectric Gyroscopes

    This paper provides a comparative study on the different techniques of classifying human leg motions that are performed using two low-cost uniaxial piezoelectric gyroscopes worn on the leg. A number of feature sets, extracted from the raw inertial sensor data in different ways, are used in the classification process. The classification techniques implemented and compared in this study are: Bayesian decision making (BDM), a rule-based algorithm (RBA) or decision tree, the least-squares method (LSM), the k-nearest neighbor algorithm (k-NN), dynamic time warping (DTW), support vector machines (SVM), and artificial neural networks (ANN). A performance comparison of these classification techniques is provided in terms of their correct differentiation rates, confusion matrices, computational cost, and training and storage requirements. Three different cross-validation techniques are employed to validate the classifiers. The results indicate that BDM, in general, results in the highest correct classification rate with relatively small computational cost.
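Among the compared classifiers, k-NN is the simplest to state. The sketch below is illustrative only: the toy feature vectors and class labels stand in for features extracted from the gyroscope signals and are not the paper's data.

```python
# Minimal k-nearest-neighbor (k-NN) classifier sketch: label a query
# feature vector by majority vote among its k closest training vectors.
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """train: list of (feature_vector, label) pairs; returns the majority
    label among the k training vectors nearest to query."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D features for two leg-motion classes:
train = [((0.10, 0.20), "walk"), ((0.20, 0.10), "walk"),
         ((0.15, 0.25), "walk"), ((0.90, 0.80), "squat"),
         ((1.00, 0.90), "squat")]
print(knn_classify(train, (0.12, 0.18)))  # → walk
```

The storage cost the paper measures is visible here: k-NN keeps the whole training set, whereas BDM only needs per-class distribution parameters.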

    Arbitrary Keyword Spotting in Handwritten Documents

    Despite the existence of electronic media in today’s world, a considerable amount of written communication remains in paper form, such as books, bank cheques, contracts, etc. There is an increasing demand for the automation of information extraction, classification, search, and retrieval of documents. The goal of this research is to develop a complete methodology for the spotting of arbitrary keywords in handwritten document images. We propose a top-down approach to the spotting of keywords in document images. Our approach is composed of two major steps: segmentation and decision. In the former, we generate the word hypotheses. In the latter, we decide whether a generated word hypothesis is a specific keyword or not. We carry out the decision step through a two-level classification where first, we assign an input image to a keyword or non-keyword class, and then transcribe the image if it is passed as a keyword. By reducing the problem from the image domain to the text domain, we address not only the search problem in handwritten documents, but also classification and retrieval, without the need for the transcription of the whole document image. The main contribution of this thesis is the development of a generalized minimum edit distance for handwritten words, and the proof that this distance is equivalent to an Ergodic Hidden Markov Model (EHMM). To the best of our knowledge, this work is the first to present an exact 2D model for the temporal information in handwriting while satisfying practical constraints. Other contributions of this research include: 1) removal of page margins based on corner detection in projection profiles; 2) removal of noise patterns in handwritten images using expectation maximization and fuzzy inference systems; 3) extraction of text lines based on fast Fourier-based steerable filtering; 4) segmentation of characters based on skeletal graphs; and 5) merging of broken characters based on graph partitioning.
Our experiments with a benchmark database of handwritten English documents and a real-world collection of handwritten French documents indicate that, even without any word/document-level training, our results are comparable with two state-of-the-art word spotting systems for English and French documents.
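The generalized minimum edit distance at the heart of the thesis builds on the classic Levenshtein recurrence sketched below. The generalization (and its equivalence to an ergodic HMM) replaces the unit costs of this toy version with learned, character-dependent edit costs.

```python
# Minimal Levenshtein minimum edit distance: the fewest single-character
# insertions, deletions, and substitutions turning string s into string t.

def edit_distance(s, t):
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i                               # delete all of s[:i]
    for j in range(m + 1):
        d[0][j] = j                               # insert all of t[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return d[n][m]

print(edit_distance("kitten", "sitting"))  # → 3
```

Note the structural similarity to DTW: both fill a dynamic-programming table whose cell (i, j) depends on its three upper-left neighbors, which is one way to see why an edit distance can be recast as a probabilistic (HMM) alignment model.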

    Arabic Manuscript Layout Analysis and Classification


    Character Recognition

    Character recognition is one of the pattern recognition technologies most widely used in practical applications. This book presents recent advances relevant to character recognition, from technical topics such as image processing, feature extraction, and classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field.

    Music Synchronization, Audio Matching, Pattern Detection, and User Interfaces for a Digital Music Library System

    Over the last two decades, growing efforts to digitize our cultural heritage have been observed. Most of these digitization initiatives pursue one or both of the following goals: to conserve the documents - especially those threatened by decay - and to provide remote access on a grand scale. For music documents these trends are observable as well, and by now several digital music libraries are in existence. An important characteristic of these music libraries is an inherent multimodality resulting from the large variety of available digital music representations, such as scanned score, symbolic score, audio recordings, and videos. In addition, for each piece of music there exists not only one document of each type, but many. Considering and exploiting this multimodality and multiplicity, the DFG-funded digital library initiative PROBADO MUSIC aimed at developing a novel user-friendly interface for content-based retrieval, document access, navigation, and browsing in large music collections. The implementation of such a front end requires the multimodal linking and indexing of the music documents during preprocessing. As the considered music collections can be very large, automated or at least semi-automated calculation of these structures is desirable. The field of music information retrieval (MIR) is particularly concerned with the development of suitable procedures, and it was the goal of PROBADO MUSIC to include existing and newly developed MIR techniques to realize the envisioned digital music library system. In this context, the present thesis discusses the following three MIR tasks: music synchronization, audio matching, and pattern detection. We identify particular issues in these fields and provide algorithmic solutions as well as prototypical implementations. In music synchronization, for each position in one representation of a piece of music, the corresponding position in another representation is calculated.
This thesis focuses on the task of aligning scanned score pages of orchestral music with audio recordings. Here, a previously unconsidered piece of information is the textual specification of transposing instruments provided in the score. Our evaluations show that neglecting this information can result in a measurable loss of synchronization accuracy. Therefore, we propose an OCR-based approach for detecting and interpreting the transposition information in orchestral scores. For a given audio snippet, audio matching methods automatically calculate all musically similar excerpts within a collection of audio recordings. In this context, subsequence dynamic time warping (SSDTW) is a well-established approach, as it allows for local and global tempo variations between the query and the retrieved matches. Moving to real-life digital music libraries with larger audio collections, however, the quadratic runtime of SSDTW results in untenable response times. To improve the response time, this thesis introduces a novel index-based approach to SSDTW-based audio matching. We combine the idea of inverted file lists introduced by Kurth and Müller (Efficient index-based audio matching, 2008) with the shingling techniques often used in the audio identification scenario. In pattern detection, all repeating patterns within one piece of music are determined. Pattern detection usually operates on symbolic score documents and is often used in the context of computer-aided motivic analysis. Envisioned as a new feature of the PROBADO MUSIC system, this thesis proposes a string-based approach to pattern detection and a novel interactive front end for result visualization and analysis.
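The subsequence variant of DTW mentioned above differs from plain DTW in one detail: the query may start and end anywhere in the longer recording. A minimal sketch, using toy scalar features rather than real audio features:

```python
# Minimal subsequence DTW (SSDTW) sketch: find the best cost of aligning
# a short query against ANY subsequence of a longer feature sequence.
# The only changes from plain DTW are the zero-cost first row (the match
# may start anywhere) and the minimum over the last row (it may end anywhere).

def ssdtw_cost(query, seq):
    """Best DTW cost of query against any subsequence of seq."""
    n, m = len(query), len(seq)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        cost[0][j] = 0.0                 # the match may start at any frame
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - seq[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return min(cost[n][1:])              # ...and end at any frame

recording = [9, 9, 1, 3, 3, 5, 2, 9, 9]  # toy "chroma" sequence
query = [1, 3, 5, 2]                      # the motif, without the stretch
print(ssdtw_cost(query, recording))       # → 0.0: motif found despite stretch
```

Filling the table costs O(nm), i.e. quadratic in the collection size, which is exactly the runtime the index-based approach with inverted file lists and shingles is designed to avoid.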

    Framework for Automatic Identification of Paper Watermarks with Chain Codes

    Title from PDF of title page viewed May 21, 2018. Dissertation advisor: Reza Derakhshani. Vita. Includes bibliographical references (pages 220-235). Thesis (Ph.D.)--School of Computing and Engineering, University of Missouri--Kansas City, 2017.
In this dissertation, I present a new framework for the automated description, archiving, and identification of paper watermarks found in historical documents and manuscripts. Early paper manufacturers introduced the embedding of identifying marks and patterns as a sign of a distinct origin and perhaps as a signature of quality. Thousands of watermarks have been studied, classified, and archived. Most of the classification categories are based on image similarity and are searchable through a set of defined contextual descriptors. The novel method presented here performs automatic classification, identification (matching), and retrieval of watermark images based on chain code (CC) descriptors. The approach for generating unique CCs includes a novel image preprocessing method that provides a rotation- and scale-invariant representation of watermarks. The unique codes are truly reversible, providing high-ratio lossless compression, fast searching, and image matching. The development of a novel distance measure for CC comparison is also presented. Examples of the complete process are given using the recently acquired watermarks digitized with hyperspectral imaging of Summa Theologica, the work of Antonino Pierozzi (1389 – 1459). The performance of the algorithm on large datasets is demonstrated using watermark datasets from well-known library catalogue collections.
Introduction -- Paper and paper watermarks -- Automatic identification of paper watermarks -- Rotation, scale and translation invariant chain code -- Comparison of RST-invariant chain code -- Automatic identification of watermarks with chain codes -- Watermark composite feature vector -- Summary -- Appendix A. Watermarks from the Bernstein Collection used in this study -- Appendix B. The original and transformed images of watermarks -- Appendix C. The transformed and scaled images of watermarks -- Appendix D. Example of chain code
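The chain code descriptor the framework builds on can be illustrated with the classic Freeman 8-direction encoding. This sketch omits the dissertation's contributions (rotation/scale normalization and the dedicated distance measure) and uses an invented toy boundary.

```python
# Minimal Freeman 8-direction chain code sketch: encode an ordered
# boundary of adjacent pixels as a sequence of direction numbers.
# Direction 0 = east, numbered counter-clockwise in 45-degree steps,
# in image coordinates where y grows downward.

DIRS = {(1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
        (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7}

def chain_code(points):
    """Encode an ordered boundary (list of (x, y) pixels, each step
    moving to an 8-neighbor) as a list of Freeman direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        codes.append(DIRS[(x1 - x0, y1 - y0)])
    return codes

# A tiny closed square traced clockwise (in image coordinates):
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
print(chain_code(square))  # → [0, 6, 4, 2]
```

Because each code fixes the next pixel relative to the current one, the boundary can be reconstructed exactly from the start point and the code sequence, which is what makes such descriptors reversible and compact.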