515 research outputs found

    Online Handwriting Recognition using HMM

    Handwriting recognition can be divided into two categories: offline and online. An online handwriting recognition system can produce highly accurate output under predefined constraints such as vocabulary size, writer dependency, and printed writing style. Hidden Markov models increase the success rate of online recognition systems. Online handwriting recognition provides additional timing information that is not available in offline systems. A Markov process is a random process whose future behavior depends only on its present state and not on its past states; that is, it satisfies the Markov condition. A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with hidden states. HMMs can be viewed as extensions of discrete-state Markov processes. Online handwriting recognition technology can drastically improve human-machine interaction, since writing by hand with a digital pen or similar device is more natural than using a keyboard. HMMs build effective mathematical models for characterizing the variance in both time and signal space present in a speech signal.
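
    To make the HMM description concrete, here is a minimal sketch of the forward algorithm, which computes the likelihood of an observation sequence under a discrete HMM. It is not taken from the paper: the two-state model, its probabilities, and the function name `forward_likelihood` are illustrative assumptions.

```python
def forward_likelihood(start, trans, emit, observations):
    """Return P(observations) under a discrete HMM via the forward algorithm."""
    n_states = len(start)
    # alpha[i] = P(o_1..o_t, state at time t is i)
    alpha = [start[i] * emit[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        # Markov condition: the next state depends only on the current state.
        alpha = [
            sum(alpha[j] * trans[j][i] for j in range(n_states)) * emit[i][obs]
            for i in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state model with two observation symbols (0 and 1).
start = [0.6, 0.4]                 # initial state probabilities
trans = [[0.7, 0.3], [0.4, 0.6]]   # trans[j][i] = P(next state i | state j)
emit = [[0.9, 0.1], [0.2, 0.8]]    # emit[i][o] = P(symbol o | state i)

likelihood = forward_likelihood(start, trans, emit, [0, 1, 0])
```

    In a recognizer of this kind, one such model would typically be trained per character or word, and the model with the highest likelihood for the pen trajectory would be chosen.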

    Feature Trajectory Dynamic Time Warping for Clustering of Speech Segments

    Dynamic time warping (DTW) can be used to compute the similarity between two sequences of generally differing length. We propose a modification to DTW that performs individual and independent pairwise alignment of feature trajectories. The modified technique, termed feature trajectory dynamic time warping (FTDTW), is applied as a similarity measure in the agglomerative hierarchical clustering of speech segments. Experiments using MFCC and PLP parametrisations extracted from TIMIT and from the Spoken Arabic Digit Dataset (SADD) show consistent and statistically significant improvements in the quality of the resulting clusters in terms of F-measure and normalised mutual information (NMI). Comment: 10 page
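
    The idea of aligning each feature trajectory independently can be sketched as follows. This is a loose illustration rather than the paper's formulation: the absolute-difference local cost, the summation of per-feature costs, and the names `dtw_distance` and `per_feature_dtw` are all assumptions.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (absolute-difference local cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def per_feature_dtw(x, y):
    """Align every feature trajectory on its own and sum the costs.

    x and y are lists of frames; each frame is a list of feature values.
    Summing the per-feature costs is an assumed combination rule.
    """
    n_features = len(x[0])
    total = 0.0
    for k in range(n_features):
        traj_x = [frame[k] for frame in x]
        traj_y = [frame[k] for frame in y]
        total += dtw_distance(traj_x, traj_y)
    return total
```

    Standard DTW would instead compute one warping path over whole frames, so every feature is forced to share the same alignment.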

    Standard Yorùbá context dependent tone identification using Multi-Class Support Vector Machine (MSVM)

    Most state-of-the-art large-vocabulary continuous speech recognition systems employ context-dependent (CD) phone units; however, CD phone units are not efficient at capturing the long-term spectral dependencies of tone in most tone languages. Standard Yorùbá (SY) is a language composed of syllables with tones and requires a different method for acoustic modeling. In this paper, a context-dependent tone acoustic model was developed. The tone unit is assumed to be the syllable. The amplitude magnified difference function (AMDF) was used to derive the utterance-wide F0 contour, followed by automatic syllabification and tri-syllable forced alignment with the SPPAS (speech phonetization alignment and syllabification) tool. For classification of the CD tone, the slope and intercept of the F0 values were extracted from each segmented unit. A supervised clustering scheme was used to partition the CD tri-tones by category, and the features were normalized based on some statistics to derive the acoustic feature vectors. A multi-class support vector machine (MSVM) was used for tri-tone training. The experimental results showed that the word recognition accuracy of the MSVM tri-tone system, based on dynamic-programming tone-embedded features, was comparable to that obtained with phone features. The best parameter tuning was obtained with 10-fold cross-validation, with an overall accuracy of 97.5678%. In terms of word error rate (WER), the MSVM CD tri-tone system outperforms the hidden Markov model tri-phone system, with a WER of 44.47%. Keywords: Syllabification, Standard Yorùbá, Context Dependent Tone, Tri-tone Recognitio
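
    The slope and intercept features mentioned in this abstract can be illustrated with an ordinary least-squares line fit over the F0 values of one segment. This is a sketch under assumptions: using the frame index as the time axis and the function name `slope_intercept` are illustrative choices, not details from the paper.

```python
def slope_intercept(f0_values):
    """Least-squares line fit of F0 against frame index; returns (slope, intercept).

    Requires at least two F0 values.
    """
    n = len(f0_values)
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(f0_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, f0_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# A rising contour gives a positive slope, a falling contour a negative one.
rising = slope_intercept([100.0, 110.0, 120.0])
falling = slope_intercept([120.0, 110.0, 100.0])
```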

    Génération d'indicateurs de maintenance par une approche semi-paramétrique et par une approche markovienne

    Maintenance strategies and their evaluation remain a particularly strong concern for companies today. The socio-economic stakes tied to each company's competitiveness are increasingly closely linked to the activity and quality of maintenance interventions. A particular sequence of events may warn the expert of an upcoming failure. Our study attempts to capture this "signature" using a hidden Markov model. We offer the expert two strategies for estimating the degradation level of the maintained system. The first strategy uses non-parametric degradation laws. The second strategy uses a Markovian approach.

    Real time speaker recognition using MFCC and VQ

    Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information contained in speech waves. It is one of the most useful biometric recognition techniques in a world where insecurity is a major threat. Many organizations, such as banks, institutions, and industries, currently use this technology to provide greater security for their vast databases. Speaker recognition mainly involves two modules: feature extraction and feature matching. Feature extraction extracts a small amount of data from the speaker's voice signal that can later be used to represent that speaker. Feature matching is the procedure that identifies the unknown speaker by comparing the features extracted from his or her voice input with those already stored in our speech database. In feature extraction, we compute the Mel Frequency Cepstrum Coefficients, which are based on the known variation of the human ear's critical bandwidths with frequency; these are vector quantized using the LBG algorithm, resulting in a speaker-specific codebook. In feature matching, we compute the VQ distortion between the input utterance of an unknown speaker and the codebooks stored in our database. Based on this VQ distortion, we decide whether to accept or reject the unknown speaker's identity. The system I implemented in my work is 80% accurate in recognizing the correct speaker. In the second phase, we implement real-time speaker recognition using MFCC and VQ on a TMS320C6713 DSP board. We analyze the workload and identify the most time-consuming operations.
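
    The feature-matching step can be sketched as below, assuming the codebooks have already been trained (for example with LBG, which is not shown). The squared Euclidean distance, the acceptance threshold, and the names `vq_distortion` and `identify_speaker` are illustrative assumptions, not details from the paper.

```python
def vq_distortion(features, codebook):
    """Average squared Euclidean distance from each feature vector to its nearest codeword."""
    total = 0.0
    for vec in features:
        total += min(
            sum((v - c) ** 2 for v, c in zip(vec, code)) for code in codebook
        )
    return total / len(features)


def identify_speaker(features, codebooks, threshold):
    """Return the speaker with the lowest distortion, or None if it exceeds the threshold."""
    best = min(codebooks, key=lambda name: vq_distortion(features, codebooks[name]))
    if vq_distortion(features, codebooks[best]) > threshold:
        return None  # reject: no stored codebook is close enough
    return best


# Hypothetical 2-D codebooks; real MFCC vectors would have more dimensions.
codebooks = {
    "alice": [[0.0, 0.0], [1.0, 1.0]],
    "bob": [[5.0, 5.0], [6.0, 6.0]],
}
accepted = identify_speaker([[0.1, 0.0], [0.9, 1.0]], codebooks, threshold=0.5)
rejected = identify_speaker([[10.0, 10.0]], codebooks, threshold=0.5)
```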

    Automatic signature verification system

    Philosophiae Doctor - PhD. In this thesis, we explore dynamic signature verification systems. Unlike other signature models, we use genuine signatures in this project, as they are more appropriate in real-world applications. Signature verification systems are typical examples of biometric devices that use physical and behavioral characteristics to verify that a person really is who he or she claims to be. Other popular biometric examples include fingerprint scanners and hand geometry devices. Handwritten signatures have been used for some time to endorse financial transactions and legal contracts, although little or no verification of signatures is done. This sets signatures apart from other biometrics, as they are a well-accepted method of authentication. Until recently, only hidden Markov models were used for model construction. Ongoing research on signature verification has revealed that more accurate results can be achieved by combining the results of multiple models. We also propose using combinations of multiple single-variate models instead of the single multivariate models currently adopted by many systems. In addition, the proposed system is an attractive way to make financial transactions more secure and to authenticate electronic documents, as it can be easily integrated into existing transaction procedures and electronic communication.

    Annotated Bibliography for the MATADOR Project

    Template Based Recognition of On-Line Handwriting

    Software for handwriting recognition has been available for several decades, and research on the subject has produced several different strategies for achieving competitive recognition accuracy, especially in the case of isolated single characters. The problem of recognizing samples of handwriting with arbitrary connections between constituent characters (unconstrained handwriting) adds considerable complexity in the form of the segmentation problem. In other words, a recognition system that is not constrained to the isolated single-character case needs to be able to recognize where in the sample one letter ends and another begins. In the research community, and probably also in commercial systems, the most common technique for recognizing unconstrained handwriting comprises neural networks for partial character matching along with hidden Markov modeling for combining partial results into string hypotheses. Neural networks are often favored by the research community because the recognition functions are more or less automatically inferred from a training set of handwritten samples. From a commercial perspective, a downside of this property is the lack of control, since there is no explicit information on the types of samples that the system can recognize correctly. In a template-based system, each style of writing a particular character is explicitly modeled, which provides some intuition regarding the types of errors (confusions) that the system is prone to make. Most template-based recognition methods today only work for the isolated single-character recognition problem, and extensions to unconstrained recognition are usually not straightforward. This thesis presents a step-by-step recipe for producing a template-based recognition system that extends naturally to unconstrained handwriting recognition through simple graph techniques. A system based on this construction has been implemented and tested for the difficult case of unconstrained online Arabic handwriting recognition, with good results.