9 research outputs found

    A Multiclassifier Approach for Drill Wear Prediction

    Classification methods have been widely used in recent years to predict patterns and trends of interest in data. This paper presents a multiclassifier approach that combines the outputs of some of the most popular data mining algorithms. The approach is based on voting: the confidence distribution of each algorithm is estimated individually, and the distributions are then combined using one of three methods: confidence voting, weighted voting, or majority voting. To illustrate its applicability to a real problem, drill wear detection in the machine-tool sector is addressed. The accuracy of each individual classifier is compared with the performance of the multiclassifier when characterizing the patterns of interest in the drilling process and predicting drill wear. Experimental results show that, in general, the false positives produced by the individual classifiers can be slightly reduced by using the multiclassifier approach.
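The abstract names three voting schemes but does not define them. The following is a minimal Python sketch under common textbook interpretations, not the authors' implementation: each classifier is assumed to output a dictionary mapping class labels to confidences, and the labels `"worn"` and `"sharp"`, the function names, and the weights are all hypothetical.

```python
from collections import Counter

def majority_vote(dists):
    """Each classifier votes for its most confident class; most votes wins."""
    votes = [max(d, key=d.get) for d in dists]
    return Counter(votes).most_common(1)[0][0]

def confidence_vote(dists):
    """Sum per-class confidences over all classifiers; largest total wins."""
    totals = Counter()
    for d in dists:
        totals.update(d)  # adds the confidence values class by class
    return max(totals, key=totals.get)

def weighted_vote(dists, weights):
    """Each classifier's top class receives that classifier's weight
    (assumed here to be, e.g., its validation accuracy)."""
    totals = Counter()
    for d, w in zip(dists, weights):
        totals[max(d, key=d.get)] += w
    return max(totals, key=totals.get)

# Hypothetical confidence distributions from three classifiers.
dists = [
    {"worn": 0.90, "sharp": 0.10},
    {"worn": 0.40, "sharp": 0.60},
    {"worn": 0.45, "sharp": 0.55},
]
print(majority_vote(dists))                    # two of three top votes
print(confidence_vote(dists))                  # summed confidences
print(weighted_vote(dists, [0.95, 0.6, 0.3]))  # weight-scaled votes
```

With these made-up inputs the schemes can disagree: majority voting follows the two weakly confident classifiers, while confidence voting and weighted voting follow the single highly confident one.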

    A discrete hidden Markov model for the recognition of handwritten Farsi words

    Handwriting recognition systems (HRS) have been researched for more than 50 years. Designing a system to recognize specific words in a clean handwritten document is still a difficult task, and the challenge is to achieve a high recognition rate. Previously, most of the research in the handwriting recognition domain was conducted on Chinese and Latin scripts, while more recently interest has grown in recognition systems for Indo-Iranian scripts. In this thesis, we present an automatic handwriting recognition system for Farsi words. The system was trained, validated and tested on the CENPARMI Farsi Dataset, which was gathered during this research. CENPARMI's Farsi Dataset is unique in terms of its large number of images (432,357 combined grayscale and binary), its inclusion of all possible handwriting types (Dates, Words, Isolated Characters, Isolated Digits, Numeral Strings, Special Symbols, Documents), the variety of cursive styles, the number of writers (400), and the exclusive participation of native Farsi speakers in the gathering of data. The words were first preprocessed. Concavity and Distribution features were then extracted, and the codebook was calculated by the vector quantization method. A Discrete Hidden Markov Model was chosen as the classifier because of the cursive nature of the Farsi script. Finally, encouraging recognition rates of 98.76% and 96.02% have been obtained for the Training and Testing sets, respectively.

    RECONNAISSANCE DE FORMES APPLIQUEE A L’ECRITURE ARABE MANUSCRITE PAR DES MULTICLASSIFIEURS

    This work studies pattern recognition applied to handwritten Arabic script using multiple classifiers. It first gives a general overview of pattern recognition, then reviews the literature on existing systems and prior research in the field, then examines the morphological and structural characteristics of Arabic script, and then surveys commonly used classification systems along with the basic concepts of parallel classifier combination. Finally, it proposes a multiclassifier system for recognizing Arabic words from a defined lexicon.

    Arbitrary Keyword Spotting in Handwritten Documents

    Despite the existence of electronic media in today’s world, a considerable amount of written communication is still in paper form, such as books, bank cheques, and contracts. There is an increasing demand for the automation of information extraction, classification, search, and retrieval of documents. The goal of this research is to develop a complete methodology for the spotting of arbitrary keywords in handwritten document images. We propose a top-down approach composed of two major steps: segmentation and decision. In the former, we generate the word hypotheses. In the latter, we decide whether a generated word hypothesis is a specific keyword or not. We carry out the decision step through a two-level classification: first, we assign an input image to a keyword or non-keyword class; then we transcribe the image if it is passed as a keyword. By reducing the problem from the image domain to the text domain, we address not only the search problem in handwritten documents but also classification and retrieval, without the need to transcribe the whole document image. The main contribution of this thesis is the development of a generalized minimum edit distance for handwritten words, together with a proof that this distance is equivalent to an Ergodic Hidden Markov Model (EHMM). To the best of our knowledge, this work is the first to present an exact 2D model for the temporal information in handwriting while satisfying practical constraints. Other contributions of this research include: 1) removal of page margins based on corner detection in projection profiles; 2) removal of noise patterns in handwritten images using expectation maximization and fuzzy inference systems; 3) extraction of text lines based on fast Fourier-based steerable filtering; 4) segmentation of characters based on skeletal graphs; and 5) merging of broken characters based on graph partitioning.
    Our experiments with a benchmark database of handwritten English documents and a real-world collection of handwritten French documents indicate that, even without any word/document-level training, our results are comparable with two state-of-the-art word spotting systems for English and French documents.

    Fusion of Multiple Handwritten Word Recognition Techniques

    Fusion of multiple handwritten word recognition techniques is described. A novel Borda count for fusion based on ranks and confidence values is proposed. Three techniques with two different conventional segmentation algorithms, in conjunction with backpropagation and radial basis function neural networks, have been used in this research. Development has taken place at the University of Missouri and Griffith University. All experiments were performed on real-world handwritten words taken from the CEDAR benchmark database. The word recognition results are very promising and are the highest (91%) among published results for handwritten words.
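The abstract does not spell out the novel Borda count it proposes. As background only, here is a minimal Python sketch of the classic Borda count, with an optional confidence scaling that is purely an illustrative assumption and not the authors' formula; the function name `borda_fuse`, the candidate words, and the confidence values are hypothetical.

```python
def borda_fuse(rankings, confidences=None):
    """Fuse ranked candidate lists from several recognizers.

    Classic Borda count: in a list of n candidates, the candidate at
    0-based position p earns n - 1 - p points; points are summed over
    recognizers. If `confidences` is given (one dict per recognizer),
    each award is scaled by that recognizer's confidence in the word.
    The scaling is an assumed variant, shown only for illustration.
    """
    scores = {}
    for i, ranking in enumerate(rankings):
        n = len(ranking)
        for pos, word in enumerate(ranking):
            points = n - 1 - pos
            if confidences is not None:
                points *= confidences[i].get(word, 0.0)
            scores[word] = scores.get(word, 0.0) + points
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))

# Hypothetical ranked outputs from three word recognizers.
rankings = [
    ["cat", "cut", "cot"],
    ["cut", "cat", "cot"],
    ["cat", "cot", "cut"],
]
confidences = [
    {"cat": 0.50, "cut": 0.40, "cot": 0.10},
    {"cut": 0.90, "cat": 0.10, "cot": 0.00},
    {"cat": 0.40, "cot": 0.35, "cut": 0.25},
]
print(borda_fuse(rankings))               # rank-only fusion
print(borda_fuse(rankings, confidences))  # confidence-scaled fusion
```

Rank-only fusion discards how sure each recognizer was; scaling by confidence lets one very confident recognizer outweigh two lukewarm ones, which is one plausible motivation for combining ranks with confidence values.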