17 research outputs found

    A probabilistic approach to melodic similarity

    Melodic similarity is an important research topic in music information retrieval. Representing symbolic music as trees has proven suitable for melodic similarity computation, because trees encode rhythm in their structure, leaving pitch as the only degree of freedom to code. To compare trees, different edit distances have previously been used. In this paper, stochastic k-testable tree models, formerly used in other domains such as structured document compression or natural language processing, are used to compute a similarity measure between melody trees as a probability, and their performance is compared to a classical tree edit distance. This work is supported by the Spanish Ministry projects DPI2006-15542-C04 and TIN2006-14932-C02, both partially supported by EU ERDF, the Consolider Ingenio 2010 research programme (project MIPRCV, CSD2007-00018), and the Pascal Network of Excellence.
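
    As a rough illustration of the idea only (not the paper's implementation), the sketch below estimates a 2-testable stochastic tree model from example trees and scores a query tree with add-alpha smoothing; the tuple-based tree encoding, the smoothing constants, and all names are hypothetical.

        # Illustrative sketch: a 2-testable stochastic tree model.
        # A "fork" is a node label together with the labels of its children.
        from collections import Counter
        import math

        def forks(tree):
            """Yield (node label, tuple of child labels) for every node of a tuple tree."""
            label, *children = tree
            yield (label, tuple(c[0] for c in children))
            for c in children:
                yield from forks(c)

        def train(trees):
            fork_counts = Counter(f for t in trees for f in forks(t))
            label_totals = Counter()
            for (label, _kids), n in fork_counts.items():
                label_totals[label] += n
            return fork_counts, label_totals

        def tree_log_prob(tree, fork_counts, label_totals, alpha=1.0, n_outcomes=50):
            """Smoothed (add-alpha) log-probability of a tree under the fork model.
            n_outcomes is an assumed number of possible fork types per label."""
            lp = 0.0
            for label, kids in forks(tree):
                num = fork_counts.get((label, kids), 0) + alpha
                den = label_totals.get(label, 0) + alpha * n_outcomes
                lp += math.log(num / den)
            return lp

        # Toy usage: rhythm lives in the tree structure, pitches sit at the leaves.
        melodies = [("w", ("h", ("C",), ("D",)), ("h", ("E",))),
                    ("w", ("h", ("C",), ("E",)), ("h", ("D",)))]
        counts, totals = train(melodies)
        print(tree_log_prob(("w", ("h", ("C",), ("D",)), ("h", ("E",))), counts, totals))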

    Empirical Bayes estimation of software failures

    The empirical Bayes estimator is applied to software failure production. The time-between-failures data registered up to a given time are used to estimate the probability that a failure appears during the next time interval. The method is similar to the estimation of n-grams in natural language processing. A modified expression for the estimator usually used in language and speech processing is introduced in order to follow the failure production curve. Simulation results that compare well with experimental data are also shown. Sociedad Argentina de Informática e Investigación Operativa (SADIO)
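
    A minimal sketch of the general flavour of such an estimate (not the paper's modified estimator): the probability that the next interval contains a failure is taken as the posterior mean of a Beta-Binomial model over past intervals, with the prior parameters playing the smoothing role that discounting plays in n-gram estimation. All names and numbers are illustrative.

        # Illustrative Bayes-style estimate of the next-interval failure probability.
        def next_interval_failure_prob(failure_times, horizon, interval, a=1.0, b=1.0):
            """failure_times: times of observed failures within [0, horizon].
            a, b are Beta prior parameters; empirically they could be fit from other projects."""
            n_intervals = int(horizon // interval)
            hits = sum(
                1 for k in range(n_intervals)
                if any(k * interval <= t < (k + 1) * interval for t in failure_times)
            )
            # Posterior mean of the probability that an interval contains a failure.
            return (hits + a) / (n_intervals + a + b)

        # Example: failures observed at hours 3, 10, 11 and 30 within a 40-hour window.
        print(next_interval_failure_prob([3, 10, 11, 30], horizon=40, interval=5))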

    Error Checking for Chinese Query by Mining Web Log

    Erroneous query input is a common phenomenon for search engines. This paper uses web logs as the training set for query error checking. Queries are analyzed and checked with an n-gram language model trained on the web logs. Features including the query words and their number are introduced into the model. A data smoothing algorithm is also applied to address the data sparseness problem, which improves the overall accuracy of the n-gram model. The experimental results show that the approach is effective.
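
    A hedged sketch of the general approach (the paper's exact features and model are not reproduced here): a character-bigram model with add-k smoothing is trained on logged queries and used to flag queries whose average log-probability is anomalously low. The threshold and all names are made up.

        # Character-bigram query checker with add-k smoothing (illustrative only).
        from collections import Counter
        import math

        def train_bigrams(queries):
            bi, uni = Counter(), Counter()
            for q in queries:
                chars = ["<s>"] + list(q)
                uni.update(chars)
                bi.update(zip(chars, chars[1:] + ["</s>"]))
            return bi, uni

        def avg_logprob(query, bi, uni, k=0.5):
            vocab = len(uni) + 1                     # +1 for the </s> symbol
            chars = ["<s>"] + list(query)
            pairs = list(zip(chars, chars[1:] + ["</s>"]))
            lp = sum(math.log((bi[p] + k) / (uni[p[0]] + k * vocab)) for p in pairs)
            return lp / len(pairs)

        def looks_misspelled(query, bi, uni, threshold=-4.0):
            # The threshold would be tuned on held-out queries in practice.
            return avg_logprob(query, bi, uni) < threshold

        # Tiny example log; a real system would train on millions of queries.
        bi, uni = train_bigrams(["天气预报", "天气怎么样", "weather forecast"])
        print(avg_logprob("天气预报", bi, uni), avg_logprob("xzqj", bi, uni))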

    Study of Jacobian Normalization for VTLN

    The divergence between the theory and practice of vocal tract length normalization (VTLN) is addressed, with particular emphasis on the role of the Jacobian determinant. VTLN is placed in a Bayesian setting, which brings in the concept of a prior on the warping factor. The form of the prior, together with acoustic scaling and numerical conditioning, is then discussed and evaluated. It is concluded that the Jacobian determinant is important in VTLN, especially for the high-dimensional features used in HMM-based speech synthesis, and that difficulties normally associated with the Jacobian determinant can be attributed to the prior and to scaling.
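
    For intuition, a minimal sketch (not the paper's derivation) of why the Jacobian term matters: if features are warped by a linear transform A(alpha), the log-likelihood of the original observation must include log |det A|, and a Gaussian prior on the warping factor can be added to the same score. The warp matrix, prior width, and numbers below are assumptions.

        # Scoring a warping factor with the Jacobian term and a prior (illustrative).
        import numpy as np
        from scipy.stats import multivariate_normal

        def warped_loglik(x, A, mean, cov, alpha, prior_std=0.1):
            y = A @ x                                          # warped feature vector
            ll = multivariate_normal.logpdf(y, mean, cov)      # model likelihood of warped features
            ll += np.log(abs(np.linalg.det(A)))                # Jacobian determinant term
            ll += -0.5 * (alpha - 1.0) ** 2 / prior_std ** 2   # Gaussian log-prior on the warp factor
            return ll

        # Compare a few warp factors for one frame (all numbers are made up).
        x = np.array([1.0, -0.5, 0.2])
        mean, cov = np.zeros(3), np.eye(3)
        for alpha in (0.9, 1.0, 1.1):
            A = np.eye(3) * alpha                              # stand-in linear warp matrix
            print(alpha, warped_loglik(x, A, mean, cov, alpha))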

    Improving on-line handwritten recognition in interactive machine translation

    On-line handwritten text recognition (HTR) could be used as a more natural way of interaction in many interactive applications. However, current HTR technology is far from producing error-free systems and, consequently, its use in many applications is limited. Despite this, there are many scenarios, such as correcting the errors of fully automatic systems with HTR in a post-editing step, in which information from the specific task makes it possible to constrain the search and therefore to improve HTR accuracy. For example, in machine translation (MT), the on-line HTR system can also be used to correct translation errors. The HTR system can take advantage of information from the translation problem, such as the source sentence being translated, the portion of the translated sentence already supervised by the human, or the translation error to be amended. Empirical experimentation suggests that this is valuable information for improving the robustness of the on-line HTR system, achieving remarkable results. The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under Grant agreement no. 287576 (CasMaCat), from the EC (FEDER/FSE), and from the Spanish MEC/MICINN under the Active2Trans (TIN2012-31723) project. It is also supported by the Generalitat Valenciana under Grant ALMPR (Prometeo/2009/01) and GV/2010/067. Alabau Gonzalvo, V.; Sanchis Navarro, J. A.; Casacuberta Nolla, F. (2014). Improving on-line handwritten recognition in interactive machine translation. Pattern Recognition, 47(3), 1217-1228. https://doi.org/10.1016/j.patcog.2013.09.035

    A new model for bi-gram estimation in speech recognition

    A new method for smoothing N-grams is presented, using regularization in a maximum entropy model. The regularization is carried out by introducing a term into the objective function in the style of support vector machines. Associated with this term, a variable is included that acts as a probability discount in the estimator, similar to the discount used in other language model smoothing methods, but here it is treated as another variable to be optimized. The model was evaluated on a speech recognition task using bi-gram language models. The results were tested on the Latino-40 database, measuring perplexity and word recognition rate, and were significantly better than a state-of-the-art model. Sociedad Argentina de Informática e Investigación Operativa
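
    For context, the sketch below shows classical absolute discounting for bigrams, the kind of fixed-discount smoothing that the proposed approach generalizes by treating the discount as a variable optimized inside a regularized maximum-entropy objective; it is not the paper's method, and all names are illustrative.

        # Classical absolute-discounting bigram smoothing (baseline sketch).
        from collections import Counter

        def absolute_discount_bigram(corpus_sentences, d=0.75):
            bi, uni = Counter(), Counter()
            for sent in corpus_sentences:
                words = ["<s>"] + sent + ["</s>"]
                uni.update(words[:-1])                  # contexts only
                bi.update(zip(words, words[1:]))
            total = sum(uni.values())
            def prob(w_prev, w):
                followers = sum(1 for (a, _) in bi if a == w_prev)   # distinct continuations
                lam = d * followers / uni[w_prev] if uni[w_prev] else 1.0
                p_uni = (uni[w] + 1) / (total + len(uni))            # smoothed unigram back-off
                p_bi = max(bi[(w_prev, w)] - d, 0) / uni[w_prev] if uni[w_prev] else 0.0
                return p_bi + lam * p_uni
            return prob

        # Example on a toy corpus.
        prob = absolute_discount_bigram([["el", "gato", "come"], ["el", "perro", "come"]])
        print(prob("el", "gato"), prob("el", "pez"))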

    Unsupervised Clustering and Automatic Language Model Generation for ASR

    The goal of an automatic speech recognition system is to enable the computer to understand human speech and act accordingly. In order to realize this goal, language modeling plays an important role: it works as a knowledge source by mimicking the human mechanism for comprehending language. Among many other approaches, statistical language modeling is widely used in automatic speech recognition systems. However, generating a reliable and robust statistical model is a very difficult task, especially for a large-vocabulary system, where the performance of the language model degrades as the vocabulary size increases. Hence, the performance of the speech recognition system also degrades, due to the increased complexity and mutual confusion among the candidate words in the language model. To solve these problems, the language model size must be reduced and the mutual confusion between words minimized. In our work, we have employed clustering techniques, using a self-organizing map, to build topical language models. Moreover, in order to capture the inherent semantics of sentences, a lexical dictionary, WordNet, has been used in the clustering process. This thesis focuses on various aspects of clustering, language model generation, extraction of task-dependent acoustic parameters, and their implementation within the framework of the CMU Sphinx3 speech engine decoder. The preliminary results presented in this thesis show the effectiveness of the topical language models.
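
    A minimal self-organizing map sketch in NumPy, illustrative rather than the thesis implementation: documents represented as term-frequency vectors are mapped onto a small grid of units, and each unit's member documents would then feed a topic-specific language model. Grid size, learning schedule, and all names are assumptions.

        # Minimal SOM for topical clustering of document vectors (illustrative).
        import numpy as np

        def train_som(X, grid=(4, 4), epochs=20, lr0=0.5, sigma0=1.5, seed=0):
            rng = np.random.default_rng(seed)
            n_units = grid[0] * grid[1]
            W = rng.normal(scale=0.1, size=(n_units, X.shape[1]))
            coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])])
            for epoch in range(epochs):
                lr = lr0 * (1 - epoch / epochs)
                sigma = sigma0 * (1 - epoch / epochs) + 1e-3
                for x in X[rng.permutation(len(X))]:
                    bmu = np.argmin(((W - x) ** 2).sum(axis=1))        # best matching unit
                    dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)  # distance on the grid
                    h = np.exp(-dist2 / (2 * sigma ** 2))              # neighborhood function
                    W += lr * h[:, None] * (x - W)
            return W

        def assign_topics(X, W):
            """Cluster (topic) id for each document: index of its best matching unit."""
            return np.argmin(((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2), axis=1)

        # Example: cluster 200 random tf vectors into 16 topical groups.
        X = np.random.default_rng(1).random((200, 300))
        topics = assign_topics(X, train_som(X))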

    A Statistical Framework for Discrete Visual Features Modeling and Classification

    Multimedia content is mostly described in discrete form, so analyzing discrete data is an important task in many image processing and computer vision applications. One of the most widely used approaches for discrete data modeling is the finite mixture of multinomial distributions, which assumes that the events to model are independent. It fails, however, to capture the true nature of sparse data and generally leads to poor, biased estimates. Different smoothing techniques that reflect prior background knowledge have been proposed to overcome this issue. The generalized Dirichlet distribution has a suitable covariance structure and offers flexibility in parameter estimation, which makes it a favorable choice as a prior. This specific choice, however, has its problems, mainly in the estimation of the parameters, which is a laborious task and can degrade the accuracy of the estimates when the maximum likelihood (ML) approach is used. In this thesis, we propose an unsupervised statistical approach to learn the structure of this kind of data. The central ingredient in our model is the introduction of a generalized Dirichlet mixture as a prior to the multinomial. An estimation algorithm for the parameters, based on the leave-one-out (LOO) likelihood and empirical Bayesian inference, is developed. This estimation algorithm can be viewed as a hybrid expectation-maximization (EM) procedure which alternates EM iterations with Newton-Raphson iterations using the Hessian matrix. We also propose the use of our model as a parametric basis for support vector machines (SVM) within a hybrid generative/discriminative framework. Through a series of experiments involving scene modeling and classification using visual words and color texture modeling, we show the efficiency of the proposed approaches.
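
    A much simplified sketch for intuition only: smoothing sparse multinomial count vectors with a plain symmetric Dirichlet prior via the posterior mean. The thesis goes further, using a mixture of generalized Dirichlet priors fit with a hybrid EM / Newton-Raphson procedure; that machinery is not reproduced here, and the names below are hypothetical.

        # Dirichlet-smoothed multinomial estimates for sparse visual-word histograms.
        import numpy as np

        def dirichlet_smoothed_multinomial(counts, alpha=0.5):
            """counts: (n_images, vocab_size) visual-word histograms.
            Returns the posterior-mean probabilities under a symmetric Dirichlet(alpha) prior."""
            counts = np.asarray(counts, dtype=float)
            return (counts + alpha) / (counts.sum(axis=1, keepdims=True) + alpha * counts.shape[1])

        # Example: a 3-word vocabulary where zero counts still get nonzero probabilities.
        print(dirichlet_smoothed_multinomial([[5, 0, 1], [0, 0, 2]]))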