157 research outputs found

    Optical Music Recognition: State of the Art and Major Challenges

    Get PDF
    Optical Music Recognition (OMR) is concerned with transcribing sheet music into a machine-readable format. The transcribed copy should allow musicians to compose, play and edit music by taking a picture of a music sheet. Complete transcription of sheet music would also enable more efficient archival. OMR facilitates examining sheet music statistically or searching for patterns of notations, thus helping use cases in digital musicology too. Recently, there has been a shift in OMR from using conventional computer vision techniques towards a deep learning approach. In this paper, we review relevant works in OMR, including fundamental methods and significant outcomes, and highlight different stages of the OMR pipeline. These stages often lack standard input and output representation and standardised evaluation. Therefore, comparing different approaches and evaluating the impact of different processing methods can become rather complex. This paper provides recommendations for future work, addressing some of the highlighted issues and represents a position in furthering this important field of research

    Off-line signature verification

    Get PDF
    In todayā€™s society signatures are the most accepted form of identity verification. However, they have the unfortunate side-effect of being easily abused by those who would feign the identification or intent of an individual. This thesis implements and tests current approaches to off-line signature verification with the goal of determining the most beneficial techniques that are available. This investigation will also introduce novel techniques that are shown to significantly boost the achieved classification accuracy for both person-dependent (one-class training) and person-independent (two-class training) signature verification learning strategies. The findings presented in this thesis show that many common techniques do not always give any significant advantage and in some cases they actually detract from the classification accuracy. Using the techniques that are proven to be most beneficial, an effective approach to signature verification is constructed, which achieves approximately 90% and 91% on the standard CEDAR and GPDS signature datasets respectively. These results are significantly better than the majority of results that have been previously published. Additionally, this approach is shown to remain relatively stable when a minimal number of training signatures are used, representing feasibility for real-world situations

    Text Line Segmentation of Historical Documents: a Survey

    Full text link
    There is a huge amount of historical documents in libraries and in various National Archives that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines),automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade, and dedicated to documents of historical interest.Comment: 25 pages, submitted version, To appear in International Journal on Document Analysis and Recognition, On line version available at http://www.springerlink.com/content/k2813176280456k3

    Hidden Markov models for robust recognition of vehicle licence plates

    Get PDF
    In this dissertation the problem of recognising vehicle licence plates of which the symĀ¬bols can not be segmented by standard image processing techniques is addressed. Most licence plate recognition systems proposed in the literature do not compensate for disĀ¬torted, obscured and damaged licence plates. We implemented a novel system which uses a neural network/ hidden Markov model hybrid for licence plate recognition. We implemented a region growing algorithm, which was shown to work well when used to extract the licence plate from a vehicle image. Our vertical edges algorithm was not as successful. We also used the region growing algorithm to separate the symbols in the licence plate. Where the region growing algorithm failed, possible symbol borders were identified by calculating local minima of a vertical projection of the region. A multilayer perceptron neural network was used to estimate symbol probabilities of all the possible symbols in the region. The licence plate symbols were the inputs of the neural network, and were scaled to a constant size. We found that 7 x 12 gave the best character recognition rate. Out of 2117 licence plate symbols we achieved a symbol recognition rate of 99.53%. By using the vertical projection of a licence plate image, we were able to separate the licence plate symbols out of images for which the region growing algorithm failed. Legal licence plate sequences were used to construct a hidden Markov model containĀ¬ing all allowed symbol orderings. By adapting the Viterbi algorithm with sequencing constraints, the most likely licence plate symbol sequences were calculated, along with a confidence measure. The confidence measure enabled us to use more than one licence plate and symbol segmentation technique. Our recognition rate increased dramatically when we comĀ¬bined the different techniques. The results obtained showed that the system developed worked well, and achieved a licence plate recognition rate of 93.7%.Dissertation (MEng (Computer Engineering))--University of Pretoria, 2002.Electrical, Electronic and Computer Engineeringunrestricte

    Geometric correction of historical Arabic documents

    Get PDF
    Geometric deformations in historical documents significantly influence the success of both Optical Character Recognition (OCR) techniques and human readability. They may have been introduced at any time during the life cycle of a document, from when it was first printed to the time it was digitised by an imaging device. This Thesis focuses on the challenging domain of geometric correction of Arabic historical documents, where background research has highlighted that existing approaches for geometric correction of Latin-script historical documents are not sensitive to the characteristics of text in Arabic documents and therefore cannot be applied successfully. Text line segmentation and baseline detection algorithms have been investigated to propose a new more suitable one for warped Arabic historical document images. Advanced ideas for performing dewarping and geometric restoration on historical Arabic documents, as dictated by the specific characteristics of the problem have been implemented.In addition to developing an algorithm to detect accurate baselines of historical printed Arabic documents the research also contributes a new dataset consisting of historical Arabic documents with different degrees of warping severity.Overall, a new dewarping system, the first for Historical Arabic documents, has been developed taking into account both global and local features of the text image and the patterns of the smooth distortion between text lines. By using the results of the proposed line segmentation and baseline detection methods, it can cope with a variety of distortions, such as page curl, arbitrary warping and fold

    Content-based image retrieval of museum images

    Get PDF
    Content-based image retrieval (CBIR) is becoming more and more important with the advance of multimedia and imaging technology. Among many retrieval features associated with CBIR, texture retrieval is one of the most difficult. This is mainly because no satisfactory quantitative definition of texture exists at this time, and also because of the complex nature of the texture itself. Another difficult problem in CBIR is query by low-quality images, which means attempts to retrieve images using a poor quality image as a query. Not many content-based retrieval systems have addressed the problem of query by low-quality images. Wavelet analysis is a relatively new and promising tool for signal and image analysis. Its time-scale representation provides both spatial and frequency information, thus giving extra information compared to other image representation schemes. This research aims to address some of the problems of query by texture and query by low quality images by exploiting all the advantages that wavelet analysis has to offer, particularly in the context of museum image collections. A novel query by low-quality images algorithm is presented as a solution to the problem of poor retrieval performance using conventional methods. In the query by texture problem, this thesis provides a comprehensive evaluation on wavelet-based texture method as well as comparison with other techniques. A novel automatic texture segmentation algorithm and an improved block oriented decomposition is proposed for use in query by texture. Finally all the proposed techniques are integrated in a content-based image retrieval application for museum image collections
    • ā€¦
    corecore